prefeitura-rio/Rio-3.5-Open-397B
Summary
Rio 3.5 Open 397B is an open-source, frontier-class AI model post-trained from Qwen 3.5 397B, featuring SwiReasoning for dynamic explicit/latent reasoning switching, achieving state-of-the-art performance across agentic coding, reasoning, and multilingual benchmarks.
Cached at: 06/14/26, 07:35 AM
Source: https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B

Rio 3.5 Open 397B is a frontier-class general-purpose AI model developed by IplanRIO, the municipal IT company of Rio de Janeiro’s city government. Post-trained from Qwen 3.5 397B, Rio 3.5 Open 397B reports state-of-the-art open-model performance across agentic coding, mathematics, STEM, multilingual, and multimodal benchmarks, improving on its base model by up to 19.8 points (Apex) and competing with the world’s best open and proprietary models.
Rio 3.5 Open 397B features SwiReasoning, a training-free inference framework based on Shi et al. (2025) that dynamically switches between explicit chain-of-thought and latent-space reasoning, guided by entropy-based confidence signals. This enables both higher accuracy and dramatically improved token efficiency. Although the framework itself requires no training, the model was explicitly post-trained to maximize the efficiency gained from latent reasoning.
Key Features
- 397B total / 17B active parameters (Mixture-of-Experts)
- 1,010,000-token (1M) context window
- SwiReasoning integration: dynamic explicit/latent reasoning switching for Pareto-superior accuracy and efficiency
- General-purpose: strong agentic coding, reasoning, instruction-following, and multimodal performance
- Post-trained from Qwen 3.5 397B
- Multilingual: strong performance in Portuguese, English, Chinese, and dozens of other languages
- MIT License: fully open for commercial and research use
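As a rough back-of-the-envelope reading of the parameter counts listed above, the sketch below computes the share of parameters active per token and the memory needed for the weights alone. The 2-bytes-per-parameter figure (bf16/fp16) is an assumption, since the model card does not state a storage format, and the helper names are hypothetical. KV cache and activations are not counted.

```python
TOTAL_PARAMS = 397e9    # total parameters, from the feature list
ACTIVE_PARAMS = 17e9    # parameters active per token, from the feature list
BYTES_PER_PARAM = 2     # ASSUMPTION: bf16/fp16 weights; not stated in the model card


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters used for each token in the MoE forward pass."""
    return active / total


def weight_memory_gb(total: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes (no KV cache, no activations)."""
    return total * bytes_per_param / 1e9


fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
memory_gb = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM)
print(f"active fraction: {fraction:.1%}")
print(f"weights at {BYTES_PER_PARAM} bytes/param: {memory_gb:.0f} GB")
```

Under that assumption, only about 4% of the parameters take part in each token's computation, but all of the weights still have to be resident in memory, which is why multi-GPU serving is the expected deployment mode.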
Benchmark Results
Agentic Coding & Software Engineering

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| Terminal-Bench 2.1 | 70.8 | 52.5 | 70.3 | 67.9 | 66.7 | 78.2 |
| DeepSWE | 23.0 | 6.0 | – | 8.0 | 24.0 | 70.0 |
| SWE-Bench Pro | 58.1 | 50.9 | 57.6 | 59.0 | 59.5 | 58.6 |
| SWE-Bench Verified | 80.2 | 76.2 | 77.7 | 80.6 | 80.2 | 82.9 |
| SWE-Bench Multilingual | 77.0 | 69.3 | 75.8 | 76.2 | 76.7 | – |

Knowledge & Reasoning

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| GPQA Diamond | 90.9 | 88.4 | 90.3 | 90.1 | 90.5 | 93.6 |
| HLE | 36.5 | 28.7 | 34.7 | 37.7 | 36.4 | 41.4 |
| MMLU-Pro | 88.0 | 87.8 | 88.5 | 87.5 | 87.1 | – |
| MMLU-Redux | 94.6 | 94.9 | 94.5 | 94.8 | 95.3 | – |
| SuperGPQA | 72.3 | 70.4 | 71.4 | 69.9 | 71.3 | – |
| Apex | 29.2 | 9.4 | 22.7 | 38.3 | 24.0 | 80.2 |

Mathematics

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| HMMT 2026 Feb | 93.9 | 87.9 | 92.9 | 95.2 | 92.7 | 98.5 |
| IMOAnswerBench | 89.5 | 80.9 | 86.0 | 89.8 | 86.0 | – |

Multilingual

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| MMMLU | 89.8 | 88.5 | 89.0 | 87.9 | 87.5 | – |
| MMLU-ProX | 85.6 | 84.7 | 85.4 | 83.9 | 83.7 | – |

Multimodal

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| MMMU-Pro | 78.4 | 79.0 | 79.0 | – | 79.4 | 81.2 |
| MathVision | 89.1 | 88.6 | 90.3 | – | 87.4 | – |
| VideoMMMU | 81.6 | 84.7 | 85.4 | – | – | 86.4 |

Agents & Instruction Following

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| MCP-Atlas | 74.2 | 74.2 | 73.2 | 73.6 | 66.6 | 75.3 |
| IFBench | 78.4 | 76.5 | 79.1 | 77.0 | 76.0 | 76.0 |
| IFEval | 93.4 | 92.6 | 94.6 | 91.9 | 94.5 | – |

Economic Value

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| GDPval (estimated) | 1533 | 1200 | 1520 | 1554 | 1482 | 1769 |

Gains Over Base Model (Qwen 3.5 397B)

| Benchmark | Base Model | Rio 3.5 Open 397B | Δ |
|---|---|---|---|
| Terminal-Bench 2.1 | 52.5 | 70.8 | +18.3 |
| DeepSWE | 6.0 | 23.0 | +17.0 |
| SWE-Bench Pro | 50.9 | 58.1 | +7.2 |
| SWE-Bench Verified | 76.2 | 80.2 | +4.0 |
| SWE-Bench Multilingual | 69.3 | 77.0 | +7.7 |
| GPQA Diamond | 88.4 | 90.9 | +2.5 |
| HLE | 28.7 | 36.5 | +7.8 |
| HMMT 2026 Feb | 87.9 | 93.9 | +6.0 |
| IMOAnswerBench | 80.9 | 89.5 | +8.6 |
| Apex | 9.4 | 29.2 | +19.8 |
| GDPval (estimated) | 1200 | 1533 | +333 |

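The Δ column is simply the Rio score minus the base-model score. As a small consistency check, the sketch below recomputes each difference from the numbers reported in the table and flags any row where the stated Δ disagrees. The helper name `mismatches` and the 0.05 tolerance are illustrative choices, not part of the model card.

```python
# (benchmark, base score, Rio score, reported delta), copied from the gains table
ROWS = [
    ("Terminal-Bench 2.1", 52.5, 70.8, 18.3),
    ("DeepSWE", 6.0, 23.0, 17.0),
    ("SWE-Bench Pro", 50.9, 58.1, 7.2),
    ("SWE-Bench Verified", 76.2, 80.2, 4.0),
    ("SWE-Bench Multilingual", 69.3, 77.0, 7.7),
    ("GPQA Diamond", 88.4, 90.9, 2.5),
    ("HLE", 28.7, 36.5, 7.8),
    ("HMMT 2026 Feb", 87.9, 93.9, 6.0),
    ("IMOAnswerBench", 80.9, 89.5, 8.6),
    ("Apex", 9.4, 29.2, 19.8),
    ("GDPval (estimated)", 1200, 1533, 333),
]


def mismatches(rows, tolerance=0.05):
    """Return the benchmarks whose reported delta differs from rio - base."""
    return [
        name
        for name, base, rio, reported in rows
        if abs((rio - base) - reported) > tolerance
    ]


print(mismatches(ROWS))
```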
SwiReasoning: Latent/Explicit Reasoning
Rio 3.5 Open 397B integrates SwiReasoning (Shi et al., 2025), a training-free inference framework that dynamically alternates between two reasoning modes:
- Explicit reasoning: standard chain-of-thought in natural language, where the model commits tokens to a single reasoning path
- Latent reasoning: continuous reasoning in hidden space, where the model explores multiple implicit paths simultaneously without emitting tokens
The switching is governed by block-wise confidence estimated from entropy trends in the next-token distribution. When confidence is low (entropy trending upward), the model enters latent mode to explore alternatives. When confidence recovers, it switches back to explicit mode to commit to a solution.
This approach achieves a Pareto-superior trade-off: higher accuracy at unlimited budgets and dramatically better token efficiency under constrained budgets. As with previous Rio generations, the model was post-trained to maximize the gains obtained from latent reasoning.
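To make the switching rule more concrete, here is a deliberately simplified toy sketch of an entropy-driven mode controller. It is not the SwiReasoning implementation and is not taken from the model's inference code. It assumes, for illustration only, that confidence is judged by comparing the current next-token entropy against a reference entropy recorded at the most recent switch. The function names `entropy` and `choose_modes` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def choose_modes(step_distributions, start_mode="explicit"):
    """Toy controller that picks a reasoning mode for each decoding step.

    ASSUMPTION (illustrative only): the reference entropy is reset at every
    switch. Entropy rising above the reference is read as falling confidence
    (switch to latent); entropy dropping below it is read as recovering
    confidence (switch back to explicit).
    """
    mode = start_mode
    reference = None
    modes = []
    for probs in step_distributions:
        h = entropy(probs)
        if reference is None:
            reference = h                   # first step defines the initial reference
        elif mode == "explicit" and h > reference:
            mode, reference = "latent", h   # confidence dropped: explore in latent space
        elif mode == "latent" and h < reference:
            mode, reference = "explicit", h  # confidence recovered: commit tokens again
        modes.append(mode)
    return modes


steps = [
    [0.90, 0.05, 0.05],  # peaked distribution: confident
    [0.40, 0.30, 0.30],  # flatter distribution: entropy rises
    [0.34, 0.33, 0.33],  # near-uniform: entropy still high
    [0.97, 0.02, 0.01],  # peaked again: entropy falls
]
print(choose_modes(steps))  # ['explicit', 'latent', 'latent', 'explicit']
```

A practical controller would need additional safeguards that this sketch leaves out, such as a minimum number of steps between switches so the mode does not oscillate on noisy entropy values.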
How to Use
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prefeitura-rio/Rio-3.5-Open-397B"

# Load the tokenizer and model; device_map="auto" spreads the weights across available devices.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Build a single-turn chat prompt with the model's chat template.
prompt = "Write a poem about Rio de Janeiro."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# do_sample=True ensures temperature and top_p are applied rather than ignored under greedy decoding.
outputs = model.generate(
    **inputs,
    max_new_tokens=81920,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
Using with vLLM
# --max-model-len is set to the 1,010,000-token context window listed under Key Features.
vllm serve prefeitura-rio/Rio-3.5-Open-397B \
    --tensor-parallel-size 8 \
    --max-model-len 1010000 \
    --trust-remote-code
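Once the server is running, it can be queried over vLLM's OpenAI-compatible HTTP API. The sketch below uses only the Python standard library and assumes the server listens on vLLM's default address, `http://localhost:8000`. The helper names `build_chat_request` and `chat`, and the `max_tokens` value, are illustrative choices rather than anything specified by the model card. The sampling settings mirror the Transformers example.

```python
import json
import urllib.request

# ASSUMPTION: the vLLM server runs locally on its default port (8000).
BASE_URL = "http://localhost:8000/v1"
MODEL = "prefeitura-rio/Rio-3.5-Open-397B"


def build_chat_request(prompt, base_url=BASE_URL, max_tokens=512):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # illustrative limit, not from the model card
        "temperature": 0.6,
        "top_p": 0.95,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt):
    """Send the prompt to the running server and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Requires the server to be running:
# print(chat("Write a poem about Rio de Janeiro."))
```

The same request shape should also work against an SGLang server, which exposes an OpenAI-compatible endpoint as well, after changing `BASE_URL` to that server's address (SGLang listens on port 30000 by default).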
Using with SGLang
# --context-length is set to the 1,010,000-token context window listed under Key Features.
python -m sglang.launch_server \
    --model-path prefeitura-rio/Rio-3.5-Open-397B \
    --tp 8 \
    --context-length 1010000 \
    --trust-remote-code
Model Details
- Developer: IplanRIO (Empresa Municipal de Informática e Planejamento S.A.)
- Base Model: Qwen 3.5 397B
- Architecture: Mixture-of-Experts (MoE) Transformer
- Total Parameters: ~397B
- Active Parameters: ~17B
- Context Length: 1,010,000 tokens (1M)
- Training Method: Post-training
- Inference Enhancement: SwiReasoning (latent/explicit switching)
- License: MIT
- Languages: Multilingual (en, pt, zh, ja, ko, fr, de, es, ar, and more)
Citation
If you use SwiReasoning, please also cite:
@misc{shi2025swireasoning,
  title={SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs},
  author={Dachuan Shi and others},
  year={2025},
  eprint={2510.05069},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
Acknowledgments
Rio 3.5 Open 397B is built upon the exceptional work of the Qwen Team and their Qwen 3.5 model family. We also acknowledge the authors of SwiReasoning for their innovative inference framework.
Developed in Rio de Janeiro 🇧🇷 by IplanRIO.