Why is there a lack of new 100B-120B models?

Reddit r/LocalLLaMA 06/15/26, 11:35 AM News

model-sizes large-language-models open-source ai-trends moe-models model-releases

Summary

Analysis of the trend in AI model sizes, noting a gap in the 100-120B parameter range with recent releases focusing on smaller (25-35B) or larger (200B+) models.

GPT-OSS-120B was the first model of this family, followed by GLM-4.5-Air, Nemotron-3-Super, Qwen3.5-122B, and Mistral-Small-4-119B. However, all of these models are at least 3 months old (10 months for GPT-OSS-120B), and the latest releases are all either 25B-35B (Gemma4, Qwen3.6) or 200B+ (Step 3.5/3.7 Flash, DeepSeek-V4-Flash, MiniMax-M3, Nemotron-3-Ultra). Did the ~120B MoE family "die" like the 70B/80B one, or are new releases likely in H2 2026?


Similar Articles

A 4b model is now beating 30b ones at web research and the reason is not size

Reddit r/artificial

A 4 billion parameter open model from the Apodex family outperforms 30 billion parameter models on web research benchmarks, attributed to careful training data and self-verification techniques rather than raw scale, suggesting a more democratic trajectory for AI capability.

We need an 80-160B model urgently. The unified memory device market needs more models.

Reddit r/LocalLLaMA

The author argues that there is an urgent need for AI models in the 80-160B parameter range to support users with unified memory devices (e.g., high-RAM Apple/AMD systems), as recent models are either too small or too large for their hardware.

@LottoLabs: There’s so much demand for a good small model, look at top downloaded qwen models All < 9b

X AI KOLs Following

Observation that there is high demand for small AI models, as seen in the top downloads of Qwen models under 9B parameters.

@ChrisGPotts: We take for granted that larger models are better than smaller ones, but why is this so? Our new paper, led by Jing Hua…

X AI KOLs Following

This paper investigates why larger models outperform smaller ones, attributing it to data-induced competition for neural resources through formal analysis and experiments.

AMD & Intel, now onwards it's your turn to release your own models