@ash_csx: We’re dropping two open source SLMs this week. 1. One of them matches SOTA accuracy at up to 93x smaller. 2. The other …
Summary
Two new open-source small language models are being released: one matches state-of-the-art accuracy at up to 93x smaller size, and the other outperforms a recent OpenAI model. The first model drops tomorrow.
Cached at: 05/12/26, 10:50 AM
We’re dropping two open source SLMs this week.
- One of them matches SOTA accuracy at up to 93x smaller.
- The other one beats a recent OpenAI model.
Model #1 drops tomorrow 👀 https://t.co/NBXSlhGsUi
Similar Articles
@cjzafir: VLMs (Vertical Language Models) are beating top LLMs. These small 7B to 15B niche-focused models are beating SoTA model…
The author demonstrates that small vertical language models (7B-15B) can outperform top LLMs on niche benchmarks through cost-effective fine-tuning, using open-source models and Codex orchestration, with a dataset built for about $300.
Introducing gpt-oss
OpenAI releases gpt-oss-120b and gpt-oss-20b, two state-of-the-art open-weight language models under the Apache 2.0 license that achieve near-parity with proprietary models while remaining deployable on consumer hardware and edge devices. Both models demonstrate strong reasoning and tool-use capabilities and ship with comprehensive safety evaluations.
@raphaelsrty: We're releasing LateOn and DenseOn today. Two open retrieval models, 149M parameters each. LateOn (ColBERT, multi-vecto…
Raphael released two open-source retrieval models, LateOn (ColBERT multi-vector) and DenseOn (single-vector), each 149M parameters and outperforming 4× larger models on BEIR.
@ClementDelangue: OpenAI dropped a new model on HF today!
OpenAI released a new model on Hugging Face today.
@AlexJonesax: Two open-source MLX inference servers worth knowing about if you run LLMs on Mac: MTPLX (@youssofal) Uses a model's own…
This article highlights two open-source MLX inference servers for Mac: MTPLX, which speeds up token generation using speculative decoding without a separate draft model, and oMLX, which improves workflow efficiency with persistent KV caches for coding agents.