prefeitura-rio/Rio-3.5-Open-397B

Summary

Rio 3.5 Open 397B is an open-source, frontier-class AI model post-trained from Qwen 3.5 397B. It features SwiReasoning for dynamic switching between explicit and latent reasoning, and reports state-of-the-art open-model performance across agentic coding, reasoning, and multilingual benchmarks.

Task: image-text-to-text
Tags: transformers, safetensors, qwen3_5_moe, image-text-to-text, conversational, pt, en, arxiv:2510.05069, base_model:Qwen/Qwen3.5-397B-A17B, base_model:finetune:Qwen/Qwen3.5-397B-A17B, license:mit, endpoints_compatible, region:us
Cached at: 06/14/26, 07:35 AM

Source: https://huggingface.co/prefeitura-rio/Rio-3.5-Open-397B

[Figure: Rio 3.5 Open 397B benchmark results]

Rio 3.5 Open 397B is a frontier-class general-purpose AI model developed by IplanRIO, the municipal IT company of Rio de Janeiro’s city government. Post-trained from Qwen 3.5 397B, Rio 3.5 Open 397B delivers state-of-the-art open-model performance across agentic coding, mathematics, STEM, multilingual, and multimodal benchmarks. It surpasses its base model on most of the benchmarks reported here, by as much as 19.8 points on Apex, and competes with the world’s best open and proprietary models.

Rio 3.5 Open 397B features SwiReasoning, a training-free inference framework based on Shi et al. (2025) that dynamically switches between explicit chain-of-thought and latent-space reasoning, guided by entropy-based confidence signals. This enables both higher accuracy and dramatically improved token efficiency. Although the framework itself requires no training, this model was explicitly post-trained to maximize the efficiency gained via latent reasoning.

Key Features

  • 397B total / 17B active parameters (Mixture-of-Experts)
  • 1,010,000 token (1M) context window
  • SwiReasoning integration: dynamic explicit/latent reasoning switching for Pareto-superior accuracy and efficiency
  • General-purpose: strong agentic coding, reasoning, instruction-following, and multimodal performance
  • Post-trained from Qwen 3.5 397B
  • Multilingual: strong performance in Portuguese, English, Chinese, and dozens of other languages
  • MIT License: fully open for commercial and research use
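To put the parameter counts above in hardware terms, here is a rough weight-memory estimate. It is a back-of-the-envelope sketch, not a sizing guide: it assumes 16-bit weights (the card does not state the stored precision), counts weights only (no KV cache or activations), and assumes an 8-way split matching the tensor-parallel size of 8 used in the serving commands in this card.

```python
# Back-of-the-envelope estimate of weight memory only
# (KV cache and activations are NOT included).
TOTAL_PARAMS = 397e9    # total parameters, from the model card
ACTIVE_PARAMS = 17e9    # parameters active per token, from the model card
BYTES_PER_PARAM = 2     # ASSUMPTION: 16-bit (bf16/fp16) weights
NUM_GPUS = 8            # matches the tensor-parallel size used in this card


def weight_gb(params: float, bytes_per_param: int) -> float:
    """Size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


total_gb = weight_gb(TOTAL_PARAMS, BYTES_PER_PARAM)
per_gpu_gb = total_gb / NUM_GPUS
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"weights: {total_gb:.0f} GB total, {per_gpu_gb:.2f} GB per GPU")
print(f"active per token: {active_fraction:.1%} of all parameters")
```

Because all experts must be resident in memory even though only a fraction is active per token, the Mixture-of-Experts design reduces compute per token rather than the memory needed to hold the model.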

Benchmark Results

Agentic Coding & Software Engineering

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| Terminal-Bench 2.1 | 70.8 | 52.5 | 70.3 | 67.9 | 66.7 | 78.2 |
| DeepSWE | 23.0 | 6.0 | – | 8.0 | 24.0 | 70.0 |
| SWE-Bench Pro | 58.1 | 50.9 | 57.6 | 59.0 | 59.5 | 58.6 |
| SWE-Bench Verified | 80.2 | 76.2 | 77.7 | 80.6 | 80.2 | 82.9 |
| SWE-Bench Multilingual | 77.0 | 69.3 | 75.8 | 76.2 | 76.7 | – |

Knowledge & Reasoning

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| GPQA Diamond | 90.9 | 88.4 | 90.3 | 90.1 | 90.5 | 93.6 |
| HLE | 36.5 | 28.7 | 34.7 | 37.7 | 36.4 | 41.4 |
| MMLU-Pro | 88.0 | 87.8 | 88.5 | 87.5 | 87.1 | – |
| MMLU-Redux | 94.6 | 94.9 | 94.5 | 94.8 | 95.3 | – |
| SuperGPQA | 72.3 | 70.4 | 71.4 | 69.9 | 71.3 | – |
| Apex | 29.2 | 9.4 | 22.7 | 38.3 | 24.0 | 80.2 |

Mathematics

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| HMMT 2026 Feb | 93.9 | 87.9 | 92.9 | 95.2 | 92.7 | 98.5 |
| IMOAnswerBench | 89.5 | 80.9 | 86.0 | 89.8 | 86.0 | – |

Multilingual

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| MMMLU | 89.8 | 88.5 | 89.0 | 87.9 | 87.5 | – |
| MMLU-ProX | 85.6 | 84.7 | 85.4 | 83.9 | 83.7 | – |

Multimodal

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| MMMU-Pro | 78.4 | 79.0 | 79.0 | – | 79.4 | 81.2 |
| MathVision | 89.1 | 88.6 | 90.3 | – | 87.4 | – |
| VideoMMMU | 81.6 | 84.7 | 85.4 | – | – | 86.4 |

Agents & Instruction Following

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| MCP-Atlas | 74.2 | 74.2 | 73.2 | 73.6 | 66.6 | 75.3 |
| IFBench | 78.4 | 76.5 | 79.1 | 77.0 | 76.0 | 76.0 |
| IFEval | 93.4 | 92.6 | 94.6 | 91.9 | 94.5 | – |

Economic Value

| Benchmark | Rio 3.5 Open 397B | Qwen 3.5 397B (base) | Qwen 3.7 Plus | DeepSeek V4 Pro | Kimi-K2.6 | GPT 5.5 |
|---|---|---|---|---|---|---|
| GDPval (estimated) | 1533 | 1200 | 1520 | 1554 | 1482 | 1769 |

Gains Over Base Model (Qwen 3.5 397B)

| Benchmark | Base Model | Rio 3.5 Open 397B | Δ |
|---|---|---|---|
| Terminal-Bench 2.1 | 52.5 | 70.8 | +18.3 |
| DeepSWE | 6.0 | 23.0 | +17.0 |
| SWE-Bench Pro | 50.9 | 58.1 | +7.2 |
| SWE-Bench Verified | 76.2 | 80.2 | +4.0 |
| SWE-Bench Multilingual | 69.3 | 77.0 | +7.7 |
| GPQA Diamond | 88.4 | 90.9 | +2.5 |
| HLE | 28.7 | 36.5 | +7.8 |
| HMMT 2026 Feb | 87.9 | 93.9 | +6.0 |
| IMOAnswerBench | 80.9 | 89.5 | +8.6 |
| Apex | 9.4 | 29.2 | +19.8 |
| GDPval (estimated) | 1200 | 1533 | +333 |
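The Δ column is the Rio score minus the base-model score. As a quick sanity check, the short script below recomputes each difference from the two score columns and compares it with the published value. The numbers are copied from the table above; the names `ROWS` and `recompute_deltas` are made up for this illustration.

```python
# (base score, Rio 3.5 Open 397B score, published delta), copied from the table.
ROWS = {
    "Terminal-Bench 2.1": (52.5, 70.8, 18.3),
    "DeepSWE": (6.0, 23.0, 17.0),
    "SWE-Bench Pro": (50.9, 58.1, 7.2),
    "SWE-Bench Verified": (76.2, 80.2, 4.0),
    "SWE-Bench Multilingual": (69.3, 77.0, 7.7),
    "GPQA Diamond": (88.4, 90.9, 2.5),
    "HLE": (28.7, 36.5, 7.8),
    "HMMT 2026 Feb": (87.9, 93.9, 6.0),
    "IMOAnswerBench": (80.9, 89.5, 8.6),
    "Apex": (9.4, 29.2, 19.8),
    "GDPval (estimated)": (1200.0, 1533.0, 333.0),
}


def recompute_deltas(rows):
    """Return {benchmark: rio - base}, rounded to one decimal place."""
    return {name: round(rio - base, 1) for name, (base, rio, _) in rows.items()}


deltas = recompute_deltas(ROWS)

# Benchmarks whose recomputed delta disagrees with the published one.
mismatches = [
    name for name, (_, _, published) in ROWS.items() if deltas[name] != published
]

for name, delta in deltas.items():
    print(f"{name:24s} {delta:+.1f}")
print("mismatches:", mismatches)
```

Note that GDPval is reported on a different scale from the percentage-style benchmarks, so its delta is not directly comparable with the others.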

SwiReasoning: Latent/Explicit Reasoning

Rio 3.5 Open 397B integrates SwiReasoning (Shi et al., 2025), a training-free inference framework that dynamically alternates between two reasoning modes:

  • Explicit reasoning: standard chain-of-thought in natural language, where the model commits tokens to a single reasoning path
  • Latent reasoning: continuous reasoning in hidden space, where the model explores multiple implicit paths simultaneously without emitting tokens

The switching is governed by block-wise confidence estimated from entropy trends in the next-token distribution. When confidence is low (entropy trending upward), the model enters latent mode to explore alternatives. When confidence recovers, it switches back to explicit mode to commit to a solution.
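The switching rule described above can be illustrated with a toy controller. This is a deliberately simplified sketch, not the published SwiReasoning algorithm or this model's implementation: the block size, the comparison of consecutive block means, and the names `entropy` and `switch_modes` are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def switch_modes(
    entropies: Sequence[float],
    block_size: int = 2,
    start_mode: str = "explicit",
) -> List[str]:
    """Assign a reasoning mode to each block of decoding steps.

    ASSUMED rule (a simplification): compare the mean entropy of each block
    with that of the previous block. Rising entropy means falling confidence,
    so switch to "latent"; falling entropy means recovering confidence, so
    switch to "explicit"; an unchanged mean keeps the current mode.
    """
    modes: List[str] = []
    mode = start_mode
    prev_mean = None
    for start in range(0, len(entropies), block_size):
        block = entropies[start:start + block_size]
        mean = sum(block) / len(block)
        if prev_mean is not None:
            if mean > prev_mean:
                mode = "latent"
            elif mean < prev_mean:
                mode = "explicit"
        modes.append(mode)
        prev_mean = mean
    return modes


# Per-step entropies for six decoding steps (toy values, not measurements).
steps = [0.2, 0.2, 0.9, 1.1, 0.3, 0.1]
print(switch_modes(steps, block_size=2))
```

In a real decoder the per-step entropies would come from the model's softmax output at each step, and a practical controller would likely also limit how often it is allowed to switch.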

This approach achieves a Pareto-superior trade-off: higher accuracy at unlimited budgets and dramatically better token efficiency under constrained budgets. As with previous Rio generations, the model was post-trained to maximize the gains obtained from latent reasoning.
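"Pareto-superior" has a precise meaning: one configuration is at least as good as another on every axis and strictly better on at least one. The sketch below spells that out for the two axes discussed here, tokens used and accuracy. The two data points are invented purely to exercise the function and are not measurements from this model card; `Point` and `dominates` are hypothetical names.

```python
from typing import NamedTuple


class Point(NamedTuple):
    tokens: int      # average tokens generated per problem (lower is better)
    accuracy: float  # fraction of problems solved (higher is better)


def dominates(a: Point, b: Point) -> bool:
    """True if `a` is Pareto-superior to `b`.

    `a` must be no worse than `b` on both axes and strictly better on one.
    """
    no_worse = a.tokens <= b.tokens and a.accuracy >= b.accuracy
    strictly_better = a.tokens < b.tokens or a.accuracy > b.accuracy
    return no_worse and strictly_better


# ILLUSTRATIVE values only; these are not results reported for Rio 3.5.
method_a = Point(tokens=9_000, accuracy=0.82)
method_b = Point(tokens=12_000, accuracy=0.80)

print(dominates(method_a, method_b))
print(dominates(method_b, method_a))
```

A method that uses fewer tokens but loses accuracy would not dominate under this definition; it would simply sit at a different point on the trade-off curve.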

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prefeitura-rio/Rio-3.5-Open-397B"

# Load the tokenizer and model; device_map="auto" places the weights
# across the available devices.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

prompt = "Write a poem about Rio de Janeiro."

messages = [
    {"role": "user", "content": prompt}
]

# Render the conversation with the model's chat template, then tokenize it.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# temperature and top_p only take effect when sampling is enabled.
outputs = model.generate(
    **inputs,
    max_new_tokens=81920,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)

Using with vLLM

vllm serve prefeitura-rio/Rio-3.5-Open-397B \
    --tensor-parallel-size 8 \
    --max-model-len 1010000 \
    --trust-remote-code
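Once the server is running, it can be queried over vLLM's OpenAI-compatible HTTP API. The sketch below uses only the Python standard library. It assumes the server is listening on vLLM's default address, `http://localhost:8000`, and reuses the sampling settings from the Transformers example; the helper names `build_chat_request` and `chat` are made up for this illustration, and the `max_tokens` default is an arbitrary choice.

```python
import json
import urllib.request

MODEL = "prefeitura-rio/Rio-3.5-Open-397B"
BASE_URL = "http://localhost:8000/v1"  # ASSUMPTION: vLLM default host and port


def build_chat_request(
    prompt: str,
    base_url: str = BASE_URL,
    max_tokens: int = 1024,  # arbitrary illustrative limit
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
        "top_p": 0.95,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str, base_url: str = BASE_URL) -> str:
    """Send the prompt to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example (requires the server started with the command above):
#   print(chat("Write a poem about Rio de Janeiro."))
```

Passing a different `base_url` lets the same helper target any other server that exposes the same OpenAI-compatible route.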

Using with SGLang

python -m sglang.launch_server \
    --model-path prefeitura-rio/Rio-3.5-Open-397B \
    --tp 8 \
    --context-length 1010000 \
    --trust-remote-code

Model Details

| Field | Value |
|---|---|
| Developer | IplanRIO (Empresa Municipal de Informática e Planejamento S.A.) |
| Base Model | Qwen 3.5 397B |
| Architecture | Mixture-of-Experts (MoE) Transformer |
| Total Parameters | ~397B |
| Active Parameters | ~17B |
| Context Length | 1,010,000 tokens (1M) |
| Training Method | Post-training |
| Inference Enhancement | SwiReasoning (latent/explicit switching) |
| License | MIT |
| Languages | Multilingual (en, pt, zh, ja, ko, fr, de, es, ar, and more) |

Citation

If you use SwiReasoning, please also cite:

@misc{shi2025swireasoning,
    title={SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs},
    author={Dachuan Shi and others},
    year={2025},
    eprint={2510.05069},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Acknowledgments

Rio 3.5 Open 397B is built upon the exceptional work of the Qwen Team and their Qwen 3.5 model family. We also acknowledge the authors of SwiReasoning for their innovative inference framework.

Developed in Rio de Janeiro 🇧🇷 by IplanRIO.
