Nemotron 3 Ultra is an open-weight release that combines a hybrid Mamba-2/attention stack with LatentMoE. It is larger than the previous Super variant and is noted for its capability-to-efficiency ratio.
Ideogram 4.0 is released as an open-weight model with layout control for generating design-ready images.
Ideogram 4 is an open-weight text-to-image model trained from scratch, featuring structured JSON prompting, best-in-class multilingual text rendering, bounding-box layout controls, color-palette controls, and native 2K resolution output.
Sebastian Raschka highlights four recent additions to the open-weight local LLM ecosystem that can run on consumer hardware.
MiniMax released M3, a model with a 1M-token context window and native multimodal input, via API. The company promises an open-weight release and a technical report within 10 days.
JetBrains releases Mellum 2, a 12B-parameter open-weight Mixture-of-Experts language model specialized in software engineering, with competitive performance in code generation, reasoning, and tool use, available under Apache 2.0.
Luke J. Huang's new blog post surveys asynchronous reinforcement learning theory and infrastructure across 8 open-weight frontier labs, addressing algorithmic techniques and systems fixes for train-inference mismatch.
MiniMax releases M3, an open-weight model with frontier coding and agentic capabilities, a 1M context window, and native multimodal capabilities. It reports top benchmark results on coding and agentic tasks, with autonomous task decomposition and long-context support.
Miles Brundage comments on the lack of quantitative analysis of how distillation affects the capability gap between open-weight and proprietary AI models, referencing Epoch AI's claim that open-weight models lag by four months.
Epoch AI Research analyzed the capability gap between open-weight and proprietary AI models, finding that open-weight models have been trailing the state of the art by approximately four months since the start of the year.
Mellum 2 is a 12B-parameter open-weight MoE language model by JetBrains with 2.5B active parameters, specialized in software engineering tasks and optimized for efficient inference on commodity GPUs.
NuMind released NuExtract3, a 4B open-weight vision-language model based on Qwen3.5-4B, designed for converting document images to Markdown, OCR, and structured data extraction. It is Apache-2.0 licensed and self-hostable, with quantized versions for low-VRAM setups.
The Qwen 3.7 open-weight model has been released, generating significant hype in the AI community as a new top-tier model.
Stability AI released Stable Audio 3.0, an open-weight model family for variable-length audio generation up to six minutes, with support for LoRA fine-tuning and audio inpainting, trained on fully licensed data.
Stability AI has released Stable Audio 3.0, an open-weight model family for generative audio, designed for artistic experimentation and integration into digital audio workstations (DAWs) through tools like gary4juce.
MiroThinker-1.7 is an open-weight deep research agent built on Qwen3 MoE, with a mini version (30B total, 3B active) designed for consumer hardware; the team shares benchmarks and seeks feedback on local deployment.
Infinity releases two open-weight models, Infinity-Parser2-Pro (35B) and Infinity-Parser2-Flash (2B), which top the ParseBench leaderboard for document understanding, leveraging a synthetic data engine and a novel joint RL algorithm.
Santiago (@svpino) highlights MiniMax-M2.7, a 230B open-weight model that rivals top proprietary models like Opus 4.6 and GPT-5.4, achieving 440+ tokens/s inference on SambaNova at low cost.
Poolside is hosting a two-day model research hackathon in London to push an open-weight agent model further using RL and fine-tuning on Laguna XS.2. Partners include NVIDIA, Prime Intellect, and Hugging Face, and the prize is an NVIDIA DGX Spark.
Hebatron is a new open-weight, Hebrew-specialized large language model built on NVIDIA's Nemotron-3 Mixture-of-Experts architecture, achieving strong reasoning performance with efficient inference. It is the first language-specific adaptation of this architecture and supports native long-context processing.