unsloth/Qwen-AgentWorld-35B-A3B-GGUF
Summary
Unsloth released a GGUF quantization of Qwen-AgentWorld-35B-A3B, a native language world model that simulates agentic environments across seven domains (MCP, Search, Terminal, SWE, Android, Web, OS) using long chain-of-thought reasoning and trained via CPT, SFT, and RL.
Cached at: 06/28/26, 11:21 AM
Source: https://huggingface.co/unsloth/Qwen-AgentWorld-35B-A3B-GGUF
This repository contains the model weights and configuration files for Qwen-AgentWorld-35B-A3B, a native language world model trained for agentic environment simulation. These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, etc.
Qwen-AgentWorld is the first language world model to cover seven agent interaction domains within a single model. It simulates agentic environments via long chain-of-thought reasoning, predicting the next environment state given an agent’s action and interaction history. It is trained through a three-stage pipeline: CPT injects environment knowledge, SFT activates next-state-prediction reasoning, and RL sharpens simulation fidelity. Qwen-AgentWorld is a native world model: environment modeling is the training objective from the CPT stage onward, not a post-hoc add-on.
Highlights
- **Seven Unified Domains.** A single model covers MCP (tool calling), Search, Terminal, SWE (software engineering), Android, Web, and OS, spanning both text and GUI interaction environments.
- **Native World Model.** Environment modeling is the objective from CPT onward, rather than a post-hoc adaptation of a general-purpose LLM.
- **Generalizable, Scalable & Controllable Simulator.** Zero-shot generalization to OOD environments (e.g., OpenClaw); controllable perturbations and fictional-world construction surpass real-environment training.
- **Agent Foundation Model.** LWM RL warm-up on single-turn, non-agentic trajectories transfers to multi-turn, tool-calling agentic tasks across 7 benchmarks, including 3 entirely out-of-domain.
Model Overview
- Type: Causal Language Model (Language World Model)
- Base Model: Qwen3.5-35B-A3B-Base
- Training Stage: Continual Pre-Training (CPT) → Supervised Fine-Tuning (SFT) → Reinforcement Learning (RL, GSPO)
- Number of Parameters: 35B in total and 3B activated
- Hidden Dimension: 2048
- Token Embedding: 248320 (Padded)
- Number of Layers: 40
- Hidden Layout: 10 × (3 × (Gated DeltaNet → MoE) → 1 × (Gated Attention → MoE))
- Gated DeltaNet:
  - Number of Linear Attention Heads: 32 for V and 16 for QK
  - Head Dimension: 128
- Gated Attention:
  - Number of Attention Heads: 16 for Q and 2 for KV
  - Head Dimension: 256
  - Rotary Position Embedding Dimension: 64
- Mixture of Experts:
  - Number of Experts: 256
  - Number of Activated Experts: 8 Routed + 1 Shared
  - Expert Intermediate Dimension: 512
- Context Length: 262,144 tokens
- Disclaimer: No outputs from external API services are included in the training pipeline.
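
The "Hidden Layout" entry above is compact, so here is a small sketch that expands it into a flat layer list and checks it against the stated layer count. The numbers are taken from the overview; the variable names and layer labels are illustrative only and do not come from the model's configuration files.

```python
# Expand "10 x (3 x (Gated DeltaNet -> MoE) -> 1 x (Gated Attention -> MoE))"
# into a flat list of token-mixing layers. Labels are illustrative names.
BLOCKS = 10               # outer repeat count from the overview
DELTANET_PER_BLOCK = 3    # Gated DeltaNet -> MoE layers per block
ATTENTION_PER_BLOCK = 1   # Gated Attention -> MoE layers per block

layers = []
for _ in range(BLOCKS):
    layers += ["gated_deltanet"] * DELTANET_PER_BLOCK
    layers += ["gated_attention"] * ATTENTION_PER_BLOCK

num_layers = len(layers)
num_deltanet = layers.count("gated_deltanet")
num_attention = layers.count("gated_attention")

# 0-based positions of the full-attention layers in this expansion
attention_positions = [i for i, kind in enumerate(layers) if kind == "gated_attention"]

print(num_layers, num_deltanet, num_attention)  # 40 30 10
```

Under this reading, every fourth layer uses Gated Attention and the remaining 30 use Gated DeltaNet, which matches the listed total of 40 layers.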
Performance
AgentWorldBench (Open-Ended Evaluation)
Five-dimensional rubric mean per domain, normalized to a 0-100 scale.

| Model | MCP | Search | Term. | SWE | Android | Web | OS | Overall |
|---|---|---|---|---|---|---|---|---|
| GPT-5.4 | 70.10 | 37.26 | 53.69 | 66.29 | 60.00 | 51.80 | 68.58 | 58.25 |
| Claude Opus 4.8 | 54.93 | 35.14 | 59.18 | 64.10 | 61.50 | 54.66 | 66.62 | 56.59 |
| Claude Opus 4.6 | 69.90 | 29.30 | 57.51 | 64.55 | 61.74 | 51.42 | 70.20 | 57.80 |
| Gemini 3.1 Pro | 59.07 | 30.21 | 52.47 | 59.07 | 61.40 | 52.83 | 66.92 | 54.57 |
| Claude Sonnet 4.6 | 70.00 | 28.79 | 56.98 | 64.52 | 58.03 | 50.78 | 63.17 | 56.04 |
| DeepSeek-V4-Pro | 63.27 | 27.61 | 51.26 | 59.44 | 55.17 | 50.32 | 63.70 | 52.97 |
| GLM-5.1 | 67.60 | 22.46 | 47.32 | 52.07 | 59.10 | 51.50 | 59.13 | 51.31 |
| Kimi K2.6 | 65.23 | 27.48 | 52.54 | 58.77 | 58.93 | 50.20 | 60.80 | 53.42 |
| MiniMax-M2.7 | 55.82 | 27.30 | 41.62 | 37.44 | 52.40 | 50.52 | 57.73 | 46.12 |
| Qwen3.5-35B-A3B | 57.87 | 25.98 | 46.13 | 47.58 | 53.18 | 47.10 | 56.27 | 47.73 |
| Qwen3.5-397B-A17B | 68.31 | 30.81 | 55.30 | 64.44 | 54.90 | 48.55 | 60.85 | 54.74 |
| Qwen3.6-Plus | 55.28 | 21.94 | 50.58 | 59.08 | 57.65 | 50.78 | 60.33 | 50.81 |
| Qwen-AgentWorld-35B-A3B | 64.79 | 36.69 | 53.96 | 65.63 | 58.17 | 49.55 | 65.92 | 56.39 |
| Qwen-AgentWorld-397B-A17B | 68.24 | 37.82 | 57.73 | 68.49 | 60.20 | 50.98 | 67.89 | 58.71 |

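
The source does not say how the Overall column is derived. As a sanity check, the sketch below recomputes an unweighted mean of the seven domain scores for three rows copied from the results above. Treat the unweighted mean as an assumption about the aggregation, not a documented fact.

```python
# Domain scores (MCP, Search, Term., SWE, Android, Web, OS) and the listed
# Overall value, copied from the AgentWorldBench results for three rows.
rows = {
    "GPT-5.4": ([70.10, 37.26, 53.69, 66.29, 60.00, 51.80, 68.58], 58.25),
    "Qwen-AgentWorld-35B-A3B": ([64.79, 36.69, 53.96, 65.63, 58.17, 49.55, 65.92], 56.39),
    "Qwen-AgentWorld-397B-A17B": ([68.24, 37.82, 57.73, 68.49, 60.20, 50.98, 67.89], 58.71),
}

def unweighted_mean(scores):
    """Plain average over the seven domains (assumed aggregation)."""
    return sum(scores) / len(scores)

recomputed = {name: round(unweighted_mean(scores), 2) for name, (scores, _) in rows.items()}

for name, (_, listed) in rows.items():
    print(f"{name}: recomputed={recomputed[name]:.2f} listed={listed:.2f}")
```

For the first two rows the recomputed mean matches the listed Overall to two decimals. For Qwen-AgentWorld-397B-A17B it comes out at 58.76 against a listed 58.71, so the published figure may use a slightly different aggregation or rounding.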
Quickstart
Deployment
Qwen-AgentWorld-35B-A3B can be served via APIs with popular inference frameworks. In the following, we show example commands to launch OpenAI-compatible API servers.
The model has a default context length of 262,144 tokens. If you encounter out-of-memory (OOM) errors, consider reducing the context window. However, because Qwen-AgentWorld leverages extended context for multi-turn environment simulation, we advise maintaining a context length of at least 128K tokens.
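
To get a feel for how much the context window contributes to memory use, here is a rough back-of-the-envelope sketch based on the Model Overview numbers. It rests on several assumptions that are not stated in the source: only the 10 Gated Attention layers keep a per-token KV cache (the Gated DeltaNet layers are assumed to hold a fixed-size state), cache entries are 16-bit, and a single sequence is served. Weights, activations, and framework overhead are ignored.

```python
def kv_cache_bytes(context_tokens, attention_layers=10, kv_heads=2,
                   head_dim=256, bytes_per_value=2):
    """Rough KV-cache size for one sequence.

    Defaults follow the Model Overview (2 KV heads, head dimension 256) and the
    assumption that 10 of the 40 layers are full-attention layers with a cache.
    """
    per_token = 2 * attention_layers * kv_heads * head_dim * bytes_per_value  # 2 = K and V
    return context_tokens * per_token

GIB = 1024 ** 3
full = kv_cache_bytes(262_144) / GIB   # default context length
half = kv_cache_bytes(131_072) / GIB   # the advised 128K minimum

print(f"262,144 tokens: {full:.1f} GiB")  # 5.0 GiB
print(f"131,072 tokens: {half:.1f} GiB")  # 2.5 GiB
```

Under these assumptions, halving the context window halves the cache, but the absolute saving is modest compared with the model weights, so OOM errors may also call for other adjustments.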
SGLang
SGLang is a fast serving framework for large language models.

    python -m sglang.launch_server \
        --model-path Qwen/Qwen-AgentWorld-35B-A3B \
        --port 8000 \
        --tp-size 4 \
        --context-length 262144 \
        --reasoning-parser qwen3

An OpenAI-compatible API will be available at http://localhost:8000/v1.
vLLM
vLLM is a high-throughput and memory-efficient inference engine for LLMs.

    vllm serve Qwen/Qwen-AgentWorld-35B-A3B \
        --port 8000 \
        --tensor-parallel-size 4 \
        --max-model-len 262144 \
        --reasoning-parser qwen3 \
        --trust-remote-code

An OpenAI-compatible API will be available at http://localhost:8000/v1.
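
Before sending simulation requests, it can help to confirm that the server is up and see which model name it exposes. The sketch below queries the OpenAI-style model listing endpoint with the standard library only. The helper names are illustrative, and it assumes the server answers `GET /v1/models` with the usual `{"data": [{"id": ...}]}` shape.

```python
import json
import urllib.request

def parse_model_ids(payload):
    """Extract model ids from an OpenAI-style model listing response body."""
    return [entry["id"] for entry in payload.get("data", [])]

def served_model_ids(base_url="http://localhost:8000/v1", timeout=5.0):
    """Ask a running server which models it serves (requires the server to be up)."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        return parse_model_ids(json.load(resp))

# Offline illustration with a hand-written payload of the assumed shape.
sample_payload = {"object": "list", "data": [{"id": "Qwen/Qwen-AgentWorld-35B-A3B", "object": "model"}]}
print(parse_model_ids(sample_payload))  # ['Qwen/Qwen-AgentWorld-35B-A3B']
```

With a server running, `served_model_ids()` should return the name to pass as `model` in later requests, which by default is typically the path given on the command line.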
Inference with Transformers

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "Qwen/Qwen-AgentWorld-35B-A3B"

    # Load the tokenizer and model; dtype and device placement are chosen automatically
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",
        device_map="auto",
    )

    # The system prompt sets the simulated environment; the user turn carries the agent's action
    messages = [
        {
            "role": "system",
            "content": "You are a language world model simulating a Linux terminal environment. "
                       "Given the user's command, predict the terminal output."
        },
        {
            "role": "user",
            "content": "Action: execute_bash\nCommand: ls -la /home/user/project/"
        }
    ]

    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # do_sample=True makes sure the temperature setting is actually applied
    outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.6)

    # Decode only the newly generated tokens, skipping the prompt
    response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
    print(response)

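
Because the model reasons in a `<think>...</think>` block before producing the predicted observation (see Best Practices), raw Transformers output usually needs to be split before the observation is handed back to an agent. The helper below is a minimal sketch with an illustrative name. It assumes the thinking tags survive decoding as plain text, and it also handles the case where the opening tag was part of the prompt so only `</think>` appears in the output.

```python
def split_thinking(text):
    """Return (reasoning, observation) from raw generated text.

    If no closing tag is present, the whole text is treated as the observation;
    note that this also happens when generation was cut off mid-reasoning.
    """
    close_tag = "</think>"
    if close_tag not in text:
        return "", text.strip()
    reasoning, _, observation = text.partition(close_tag)
    reasoning = reasoning.replace("<think>", "", 1).strip()
    return reasoning, observation.strip()

reasoning, observation = split_thinking("<think>The directory holds two files.</think>\ntotal 8")
print(observation)  # total 8
```

When the model is served with `--reasoning-parser qwen3`, the server is expected to perform this separation itself, so the helper is mainly relevant for direct Transformers inference.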
Using via the Chat Completions API

    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8000/v1",
        api_key="EMPTY",
    )

    # Terminal domain example
    messages = [
        {
            "role": "system",
            "content": "You are a language world model simulating a Linux terminal environment. "
                       "Given the user's command, predict the terminal output."
        },
        {
            "role": "user",
            "content": "Action: execute_bash\nCommand: ls -la /home/user/project/"
        }
    ]

    response = client.chat.completions.create(
        model="Qwen/Qwen-AgentWorld-35B-A3B",
        messages=messages,
        max_tokens=32768,
        temperature=0.6,
    )
    print(response.choices[0].message.content)

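
The example above covers a single turn, while the model is described as predicting the next state from an agent's action and the interaction history. A minimal sketch of how a multi-turn history might be packed into `messages` follows. The role mapping (agent actions as `user`, earlier predicted observations as `assistant`) is extrapolated from the single-turn example and is an assumption; the prompt templates in the GitHub repository may prescribe a different layout.

```python
def build_messages(system_prompt, history, next_action):
    """Build a chat message list for the next simulation step.

    history is a list of (action, observation) pairs from earlier turns.
    """
    messages = [{"role": "system", "content": system_prompt}]
    for action, observation in history:
        messages.append({"role": "user", "content": action})            # agent action
        messages.append({"role": "assistant", "content": observation})  # simulated result
    messages.append({"role": "user", "content": next_action})
    return messages

history = [("Action: execute_bash\nCommand: mkdir demo", "")]
messages = build_messages(
    "You are a language world model simulating a Linux terminal environment.",
    history,
    "Action: execute_bash\nCommand: ls",
)
print(len(messages))  # 4
```

Only the final observation text, not the reasoning, would normally be appended to the history, which keeps the context shorter across long trajectories.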
We provide domain-specific world model system prompt templates in `prompts/` of the GitHub repository for all 7 domains. These serve as general-purpose system prompts when using Qwen-AgentWorld as an environment simulator. Each domain folder contains a `system_prompt.txt` (world model system prompt) and a `judge_system_prompt.txt` (evaluation prompt).
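
Loading one of these templates could look like the sketch below. The two file names come from the description above, but the domain folder name (`terminal` here) is hypothetical, since the exact folder names are not listed. The demo writes a stand-in directory so the snippet runs without cloning the repository.

```python
import tempfile
from pathlib import Path

def load_system_prompt(prompts_dir, domain, judge=False):
    """Read the world-model (or judge) system prompt for one domain folder."""
    filename = "judge_system_prompt.txt" if judge else "system_prompt.txt"
    return (Path(prompts_dir) / domain / filename).read_text(encoding="utf-8").strip()

# Stand-in layout for illustration only; "terminal" is a hypothetical folder name.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "terminal"
    demo_dir.mkdir()
    (demo_dir / "system_prompt.txt").write_text("demo prompt\n", encoding="utf-8")
    demo_prompt = load_system_prompt(tmp, "terminal")

print(demo_prompt)  # demo prompt
```

In practice `prompts_dir` would point at the `prompts/` directory of a local clone, and the returned text would replace the hand-written system message in the earlier examples.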
Evaluate on AgentWorldBench
AgentWorldBench evaluates language world models by scoring each predicted environment observation on 5 dimensions: Format, Factuality, Consistency, Realism, and Quality.
Setup

    # Clone the evaluation repository
    git clone https://github.com/QwenLM/Qwen-AgentWorld.git
    cd Qwen-AgentWorld

    # Download the benchmark
    huggingface-cli download Qwen/AgentWorldBench --repo-type dataset --local-dir ./AgentWorldBench

    # Install dependencies
    pip install openai

Run Evaluation
The evaluation follows a three-step pipeline:

    cd eval

    # Step 1: Run world model inference
    python eval.py infer \
        --data-dir ../AgentWorldBench \
        --model-base-url http://localhost:8000/v1 \
        --model-name Qwen/Qwen-AgentWorld-35B-A3B \
        --output-dir ./results

    # Step 2: Run LLM judge scoring
    export OPENAI_API_KEY="your-api-key"
    python eval.py judge \
        --predictions ./results/predictions.jsonl \
        --judge-base-url https://api.openai.com/v1 \
        --judge-model gpt-5.2-2025-12-11 \
        --output-dir ./results

    # Step 3: Aggregate and display scores
    python eval.py score --predictions ./results/judged.jsonl

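
To make the scoring step more concrete, the sketch below shows one way a five-dimension rubric mean could be normalized to 0-100 and averaged per domain. It is not the repository's `eval.py`: the record layout (`domain`, `scores`), the lowercase dimension keys, the raw scale maximum of 5, and the normalization formula are all hypothetical choices made for illustration.

```python
import json
from collections import defaultdict

# The five rubric dimensions named in the benchmark description (keys are hypothetical).
DIMENSIONS = ["format", "factuality", "consistency", "realism", "quality"]

def aggregate(jsonl_lines, scale_max=5.0):
    """Average the per-record rubric mean, scaled to 0-100, for each domain.

    scale_max is an assumed raw score ceiling, not a documented value.
    """
    per_domain = defaultdict(list)
    for line in jsonl_lines:
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        rubric_mean = sum(record["scores"][d] for d in DIMENSIONS) / len(DIMENSIONS)
        per_domain[record["domain"]].append(100.0 * rubric_mean / scale_max)
    return {domain: sum(values) / len(values) for domain, values in per_domain.items()}

# Two made-up judged records for a hypothetical "terminal" domain.
sample_lines = [
    json.dumps({"domain": "terminal", "scores": {d: 4 for d in DIMENSIONS}}),
    json.dumps({"domain": "terminal", "scores": {d: 5 for d in DIMENSIONS}}),
]
result = aggregate(sample_lines)
print(result)
```

The actual file format and aggregation are defined by the evaluation scripts in the repository, which remain the reference for reproducing the published numbers.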
Best Practices
- Sampling Parameters: We recommend `temperature=0.6`, `top_p=0.95`, `top_k=20` for world model inference. The model uses thinking mode by default (`<think>...</think>`) to reason about environment state transitions before producing the predicted observation.
- Adequate Output Length: We recommend an output length of 32,768 tokens for most queries. For long, multi-step trajectories, you may increase the max output length to accommodate detailed environment observations.
- Domain-Specific System Prompts: For optimal simulation fidelity, use the domain-specific system prompts provided in the `prompts/` directory of the GitHub repository.
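
The recommended sampling settings can be bundled into one request payload for the Chat Completions API, as sketched below. `top_k` is not a standard Chat Completions field, so this sketch passes it through the OpenAI Python client's `extra_body` argument, which vLLM accepts. Whether another server honors it is an assumption to verify. The helper name is illustrative.

```python
def recommended_request(model, messages, max_tokens=32768):
    """Keyword arguments for client.chat.completions.create with the suggested sampling."""
    return {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,         # recommended output length for most queries
        "temperature": 0.6,
        "top_p": 0.95,
        "extra_body": {"top_k": 20},      # non-standard field, forwarded as-is to the server
    }

request = recommended_request(
    "Qwen/Qwen-AgentWorld-35B-A3B",
    [{"role": "user", "content": "Action: execute_bash\nCommand: pwd"}],
)
print(request["temperature"], request["top_p"], request["extra_body"])
```

With a client configured as in the Chat Completions example, the call would be `client.chat.completions.create(**request)`.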
Citation
If you find our work helpful, please cite it.
@article{zuo2026qwen,
title={Qwen-agentworld: language world models for general agents},
author={Zuo, Yuxin and Xiao, Zikai and Sheng, Li and Huang, Fei and Tu, Jianhong and Liu, Yuxuan and Tang, Tianyi and Hu, Xiaomeng and Su, Yang and Lan, Qingfeng and others},
journal={arXiv preprint arXiv:2606.24597},
year={2026}
}