@nv_pavlichenko: Today we're releasing Mellum2: our first "serious" LLM. This is a 12B A2.5B MoE LLM pre-trained on ~11T tokens and post…
Summary
Releases Mellum2, a 12B A2.5B MoE LLM pretrained on ~11T tokens and post-trained with RLVR. Base, SFT, and RL checkpoints are released with a technical report.
Cached at: 06/01/26, 03:46 PM
Today we’re releasing Mellum2: our first “serious” LLM.
This is a 12B A2.5B MoE LLM pre-trained on ~11T tokens and post-trained with RLVR. I’m proud to be leading the team that has been working on it for the last 6 months.
We release base/SFT/RL checkpoints along with a tech report: https://t.co/Zj2GusGmYP
Similar Articles
JetBrains's Mellum 2 (49 minute read)
JetBrains releases Mellum 2, a 12B-parameter open-weight Mixture-of-Experts language model specialized in software engineering, with competitive performance in code generation, reasoning, and tool use, available under Apache 2.0.
Mellum2 Technical Report
Mellum 2 is a 12B-parameter open-weight MoE language model by JetBrains with 2.5B active parameters, specialized in software engineering tasks and optimized for efficient inference on commodity GPUs.
Mellum 2 12B A2.5B
JetBrains released Mellum 2 12B A2.5B, a coding-focused small MoE model with reasoning performance comparable to Qwen 3.5 9B but weaker in other tasks.
@vllm_project: vLLM v0.21.0 is out! 367 commits from 202 contributors (49 new). Highlights: KV Offload + HMA, spec decode with thinkin…
vLLM v0.21.0 has been released with KV Offload + HMA, speculative decoding with thinking budget for reasoning models, TOKENSPEED_MLA on Blackwell for DSR1/Kimi K2.5, Mooncake distributed KV, DeepSeek V4 pipeline parallelism, and a C++20 + Transformers v5 baseline.
@vllm_project: Today we're excited to introduce vime — a simple, stable, and efficient RL framework for LLM post-training in the vLLM …
vime is a new open-source RL framework for LLM post-training, built on slime's training design and vLLM's inference engine, providing a simple, stable, and efficient pipeline within the vLLM ecosystem.