Two new open-source small language models are being released: one reportedly matches state-of-the-art accuracy while being up to 93x smaller, and the other is claimed to outperform a recent OpenAI model. The first model drops tomorrow.
A new open-source memory layer called Memvid claims to outperform all existing RAG systems, beating the prior state of the art by 35% on LoCoMo and by 76% on multi-hop reasoning, all packaged as a single .mv2 file.
Xiaomi launched MiMo-V2.5-Pro, claiming state-of-the-art performance.
Moonshot has open-sourced Kimi K2.6, which supports 4,000 tool calls in a single session and 300 parallel sub-agents. The model claims state-of-the-art results on long-horizon coding and agent swarm benchmarks, including SWE-Bench Pro, with performance reportedly on par with Claude Opus 4.6 and GPT-5.4.
LongCoT introduces two new agent leaderboards (Restricted & Open Harness), with GPT-5.2 RLM topping the Open Harness at 25.12%.