@nv_pavlichenko: Today we're releasing Mellum2: our first "serious" LLM. This is a 12B A2.5B MoE LLM pre-trained on ~11T tokens and post…

X AI KOLs Timeline 06/01/26, 01:24 PM Models

llm mixture-of-experts rlvr open-source pre-training fine-tuning

Summary

JetBrains releases Mellum2, a Mixture-of-Experts (MoE) LLM with 12B total and 2.5B active parameters (12B A2.5B), pre-trained on ~11T tokens and post-trained with RLVR. Base, SFT, and RL checkpoints are released alongside a technical report.

Today we're releasing Mellum2: our first "serious" LLM. This is a 12B A2.5B MoE LLM pre-trained on ~11T tokens and post-trained with RLVR. I'm proud to be leading the team that was working on it for the last 6 months. We release base/SFT/RL checkpoints along with a tech https://t.co/Zj2GusGmYP

Cached at: 06/01/26, 03:46 PM

Similar Articles

JetBrains's Mellum 2 (49 minute read)

TLDR AI

JetBrains releases Mellum 2, a 12B-parameter open-weight Mixture-of-Experts language model specialized in software engineering, with competitive performance in code generation, reasoning, and tool use, available under Apache 2.0.

Mellum2 Technical Report

Hugging Face Daily Papers

Mellum 2 is a 12B-parameter open-weight MoE language model by JetBrains with 2.5B active parameters, specialized in software engineering tasks and optimized for efficient inference on commodity GPUs.

Mellum 2 12B A2.5B

Reddit r/LocalLLaMA

JetBrains released Mellum 2 12B A2.5B, a coding-focused small MoE model with reasoning performance comparable to Qwen 3.5 9B but weaker in other tasks.

@vllm_project: vLLM v0.21.0 is out! 367 commits from 202 contributors (49 new). Highlights: KV Offload + HMA, spec decode with thinkin…

X AI KOLs Following

vLLM v0.21.0 has been released with KV Offload + HMA, speculative decoding with thinking budget for reasoning models, TOKENSPEED_MLA on Blackwell for DSR1/Kimi K2.5, Mooncake distributed KV, DeepSeek V4 pipeline parallelism, and a C++20 + Transformers v5 baseline.

@vllm_project: Today we're excited to introduce vime — a simple, stable, and efficient RL framework for LLM post-training in the vLLM …