Qwen-AgentWorld: Language World Models for General Agents
Qwen-AgentWorld-35B-A3B and Qwen-AgentWorld-397B-A17B are the first language world models that simulate agentic environments across seven domains using long chain-of-thought reasoning. Trained via a three-stage pipeline—CPT, SFT, and RL—these models outperform existing frontier models on AgentWorldBench, a benchmark derived from real-world interactions of five models on nine established tasks.