@victormustar: Before the week ends, let's acknowledge one of the most INSANE weeks ever for open AI, with 25+ notable open-weight drop…

X AI KOLs Following 06/05/26, 09:59 PM News

open-source open-weights large-language-models vision audio weekly-roundup ai-news

Summary

A recap of an extraordinary week in open AI, featuring over 25 open-weight model releases across LLMs, image generation, audio/speech, vision, and video/3D, with notable contributions from NVIDIA, Google, and others.

Original Article

Cached at: 06/05/26, 11:21 PM

Before the week ends, let’s acknowledge one of the most INSANE weeks ever for open AI, with 25+ notable open-weight drops across every modality:

LLMs

→ NVIDIA Nemotron 3 Ultra: 550B hybrid Mamba-MoE, only 55B active, 1M context, MMLU 89.1. NVFP4 variant claims ~5x throughput on Blackwell. First openly-weighted 550B hybrid Mamba-Transformer, closing the gap with frontier closed models.

→ Google Gemma 4 12B: fully open dense any-to-any (text/image/audio/video), 256k context, encoder-free, 140+ languages, AIME 2026 at 77.5. Shipped with a 23-checkpoint QAT wave (mobile ONNX + MLX). Most deployable model of the week.

→ StepFun Step-3.7-Flash: 198B sparse MoE VLM, ~11B active, SWE-Bench PRO 56.3. Apache 2.0.

→ Liquid AI LFM2.5-8B-A1B: edge MoE, just 1.5B active, 128k ctx, MATH500 88.8, MLX-ready. Best on-device option this week.

→ JetBrains Mellum2-12B-A2.5B-Thinking: their first open MoE, near-Qwen3-14B coding at 2.5B active. Apache 2.0.
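
To put the "active parameter" figures above in perspective, here is a minimal sketch that computes the share of weights each MoE model uses per token. It relies only on the numbers quoted in this list. The totals for LFM2.5-8B-A1B (8B) and Mellum2-12B-A2.5B-Thinking (12B) are read off the model names, which is an assumption, and Step-3.7-Flash's active count is approximate ("~11B"). The names `MODELS`, `active_fraction`, and `report` are illustrative and do not come from any library.

```python
# Total and active parameter counts in billions, as quoted in the list above.
# Totals for LFM2.5-8B-A1B and Mellum2-12B-A2.5B-Thinking are inferred from
# the model names (an assumption); Step-3.7-Flash's active count is "~11B".
MODELS = {
    "Nemotron 3 Ultra": (550.0, 55.0),
    "Step-3.7-Flash": (198.0, 11.0),
    "LFM2.5-8B-A1B": (8.0, 1.5),
    "Mellum2-12B-A2.5B-Thinking": (12.0, 2.5),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters used per token, between 0 and 1."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected total > 0 and 0 <= active <= total")
    return active_b / total_b


def report(models=MODELS):
    """Return one summary line per model, sparsest model first."""
    ranked = sorted(models.items(), key=lambda kv: active_fraction(*kv[1]))
    return [
        f"{name}: {active:g}B of {total:g}B active "
        f"({active_fraction(total, active):.1%})"
        for name, (total, active) in ranked
    ]


if __name__ == "__main__":
    print("\n".join(report()))
```

On these figures, Step-3.7-Flash is the sparsest of the four and the two small edge/coding MoEs are the densest, though the ratio alone says nothing about quality or memory footprint.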

Image gen (the surprise of the week)

→ Ideogram 4: their FIRST-EVER open weights. 9.3B flow-matching DiT trained from scratch. #2 overall behind GPT Image 2, top open-weight model on Design Arena + LMArena. Strongest open checkpoint for text-rich images, full stop. It has taste. Still can’t believe this is open weights.

Audio & Speech (a breakout week for open TTS, 4 labs shipped)

→ Boson Higgs Audio v3 4B: 102 languages, 21 emotions, singing/whispering/shouting, sub-second TTFA.

→ RedNote dots.tts: the only fully continuous (no codec) open TTS pipeline, Apache 2.0.

→ Google Magenta RealTime 2: real-time music gen, <200ms latency, text+audio+MIDI. multimodalart ported it to PyTorch within hours with live ZeroGPU demos.

→ NVIDIA Nemotron-3.5 ASR: 600M streaming, 17x more concurrent streams vs Parakeet RNNT 1.1B.

Vision & VLMs

→ PaddleOCR-VL-1.6: SOTA document parsing at 1B params, Apache 2.0.

→ Baidu NAVA: 6.3B joint audio-video gen, best-in-class A/V sync, Apache 2.0.

Video, 3D & World Models

→ NVIDIA Cosmos3-Super: 64B omnimodal world model coupling action trajectories with video+audio gen, for Physical AI.

→ JD JoyAI-Echo: up to 5-min multi-shot text-to-video on LTX-2.3.

→ ByteDance Bernini-R + VAST TripoSplat (single-image-to-3D Gaussian splats, MIT).

Similar Articles

@ClementDelangue: So much great work lately from Nvidia, the "King of American Open-source AI"! - Crossed 1,000 total public repositories…

@dair_ai: https://x.com/dair_ai/status/2058537927823556668

AI News: A Huge Week for AI Apps (Anthropic, OpenAI, Google)

@dair_ai: The Top AI Papers of the Week (May 11 - May 17) - AEvo - δ-mem - AutoTTS - AI Co-Mathematician - Lighthouse Attention -…

National Robotics Week — Latest Physical AI Research, Breakthroughs and Resources
