@victormustar: Before the week ends, let's acknowledge one of the most INSANE weeks ever for open AI, with 25+ notable open-weight drop…
Summary
A recap of an extraordinary week in open AI, featuring over 25 open-weight model releases across LLMs, image generation, audio/speech, vision, and video/3D, with notable contributions from NVIDIA, Google, and others.
Cached at: 06/05/26, 11:21 PM
Before the week ends, let’s acknowledge one of the most INSANE weeks ever for open AI, with 25+ notable open-weight drops across every modality:
LLMs
→ NVIDIA Nemotron 3 Ultra: 550B hybrid Mamba-MoE, only 55B active, 1M context, MMLU 89.1. NVFP4 variant claims ~5x throughput on Blackwell. First openly-weighted 550B hybrid Mamba-Transformer, closing the gap with frontier closed models.
→ Google Gemma 4 12B: fully open dense any-to-any (text/image/audio/video), 256k context, encoder-free, 140+ languages, AIME 2026 at 77.5. Shipped with a 23-checkpoint QAT wave (mobile ONNX + MLX). Most deployable model of the week.
→ StepFun Step-3.7-Flash: 198B sparse MoE VLM, ~11B active, SWE-Bench PRO 56.3. Apache 2.0.
→ Liquid AI LFM2.5-8B-A1B: edge MoE, just 1.5B active, 128k ctx, MATH500 88.8, MLX-ready. Best on-device option this week.
→ JetBrains Mellum2-12B-A2.5B-Thinking: their first open MoE, near-Qwen3-14B coding at 2.5B active. Apache 2.0.
Image gen (the surprise of the week)
→ Ideogram 4: their FIRST-EVER open weights. 9.3B flow-matching DiT trained from scratch. #2 overall behind GPT Image 2, top open-weight model on Design Arena + LMArena. Strongest open checkpoint for text-rich images, full stop. It has taste. Still can’t believe this is open weights.
Audio & Speech (a breakout week for open TTS, 4 labs shipped)
→ Boson Higgs Audio v3 4B: 102 languages, 21 emotions, singing/whispering/shouting, sub-second TTFA.
→ RedNote dots.tts: the only fully continuous (no codec) open TTS pipeline, Apache 2.0.
→ Google Magenta RealTime 2: real-time music gen, <200ms latency, text+audio+MIDI. multimodalart ported it to PyTorch within hours with live ZeroGPU demos.
→ NVIDIA Nemotron-3.5 ASR: 600M streaming, 17x more concurrent streams vs Parakeet RNNT 1.1B.
Vision & VLMs
→ PaddleOCR-VL-1.6: SOTA document parsing at 1B params, Apache 2.0.
→ Baidu NAVA: 6.3B joint audio-video gen, best-in-class A/V sync, Apache 2.0.
Video, 3D & World Models
→ NVIDIA Cosmos3-Super: 64B omnimodal world model coupling action trajectories with video+audio gen, for Physical AI.
→ JD JoyAI-Echo: up to 5-min multi-shot text-to-video on LTX-2.3.
→ ByteDance Bernini-R + VAST TripoSplat (single-image-to-3D Gaussian splats, MIT).