Scaling the former VibeThinker-1.5B to 3B: now it reaches frontier math & coding performance
Summary
The VibeThinker-3B model achieves state-of-the-art math and coding reasoning performance, scoring 94.3 on AIME'26 and 96.1% on unseen LeetCode problems, demonstrating that small models can reach frontier-level reasoning in verifiable domains.
Similar Articles
WeiboAI/VibeThinker-3B
VibeThinker-3B is a 3B-parameter model that achieves frontier-level reasoning performance on math, coding, and STEM benchmarks by optimizing the Spectrum-to-Signal Principle (SSP) post-training pipeline, reaching performance comparable to much larger models.
VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models
VibeThinker-3B is a compact 3B-parameter model that achieves frontier-level performance on verifiable reasoning tasks through a specialized training pipeline, matching larger models like DeepSeek V3.2 and Gemini 3 Pro.
Why Weibo's tiny VibeThinker-3B has the AI world arguing over benchmarks again (15 minute read)
Weibo's VibeThinker-3B, a 3B-parameter model, is claimed to match or exceed the reasoning performance of much larger models like DeepSeek V3.2 and Gemini 3 Pro on math and coding benchmarks, sparking debate over benchmark reliability and the necessity of scaling.
VibeThinker-3B: what is this witchcraft? Killing it at MathQA like it has ~30B parameters
VibeThinker-3B is a 3B-parameter model that performs comparably to ~30B-parameter models on the MathQA benchmark, despite having roughly one-tenth the parameters.
@f14bertolotti: Stellar performance from a 3B model. These results were achieved primarily through post-training refinements on Qwen2.5…
This technical report introduces VibeThinker-3B, a 3B-parameter model that achieves frontier-level verifiable reasoning performance through post-training refinements on Qwen2.5-Coder: curriculum-based supervised fine-tuning, multi-domain reinforcement learning, and offline self-distillation. It matches or exceeds much larger models like DeepSeek V3.2.