Mellum 2 12B A2.5B

Reddit r/LocalLLaMA 06/01/26, 01:23 PM Models

Summary

JetBrains released Mellum 2 12B A2.5B, a small coding-focused MoE model. JetBrains claims the reasoning variant's coding performance is around that of Qwen 3.5 9B; according to the post, it is worse than Qwen 3.5 4B at everything else.

Coding-focused small MoE from JetBrains. They claim coding performance around Qwen 3.5 9B for the reasoning model. Worse than Qwen 3.5 4B in everything else.

Models: [https://huggingface.co/collections/JetBrains/mellum-2](https://huggingface.co/collections/JetBrains/mellum-2)

Technical report: [https://arxiv.org/abs/2605.31268](https://arxiv.org/abs/2605.31268)
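For readers unfamiliar with the "12B A2.5B" naming, here is a rough sketch of the arithmetic it seems to imply. It assumes the common MoE convention that the first figure is the total parameter count and the "A" figure is the parameters active per token; the post itself does not spell this out, and `parse_moe_name` is a hypothetical helper written for illustration, not part of any JetBrains tooling.

```python
import re


def parse_moe_name(name: str) -> tuple[float, float]:
    """Return (total_params_B, active_params_B) from a name like '12B A2.5B'.

    Assumption: '<total>B A<active>B' follows the usual MoE naming convention
    (total parameters, then parameters active per token).
    """
    match = re.search(r"(\d+(?:\.\d+)?)B[\s-]+A(\d+(?:\.\d+)?)B", name)
    if match is None:
        raise ValueError(f"no '<total>B A<active>B' pattern in {name!r}")
    return float(match.group(1)), float(match.group(2))


total_b, active_b = parse_moe_name("Mellum 2 12B A2.5B")
active_fraction = active_b / total_b
print(f"{active_b}B of {total_b}B parameters active ({active_fraction:.1%})")
```

Under that reading, roughly a fifth of the weights take part in each forward pass, which may explain why the post compares the model against much smaller dense models (4B and 9B) rather than 12B ones. Memory requirements would still be driven by the full 12B.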

Original Article

Mellum 2 12B A2.5B

Similar Articles

Mellum2 Technical Report

JetBrains's Mellum 2 (49 minute read)

JetBrains/Mellum2-12B-A2.5B-Thinking

Introducing Mellum2: A 12B Mixture-of-Experts Model by JetBrains

Mellum2 Goes Open Source: A Fast Model for AI Workflows | The JetBrains AI Blog
