NVIDIA's Jim Fan spoke at Sequoia AI Ascent 2026, declaring the vision-language-action (VLA) architecture obsolete and proposing World Action Models (WAM) as a new paradigm for robotics. He introduced key technologies including DreamZero, EgoScale, and the neural simulator Dream Dojo.
Allen AI releases MolmoAct2, an open-weight Vision-Language-Action model designed for real-world robotic deployment, featuring new datasets, an open action tokenizer, and adaptive reasoning to reduce latency.
NVIDIA and Hugging Face publish a hands-on demo showing Gemma 4 running as a vision-language-action model entirely on the Jetson Orin Nano Super, using local STT/TTS and webcam input.
FlashDrive reduces reasoning vision-language-action model inference latency from 716 ms to 159 ms on RTX PRO 6000—up to 5.7× faster—with zero accuracy loss, enabling real-time autonomous applications.
LeRobot v0.5.0 is a major release featuring support for Unitree G1 humanoid robots, new policy architectures (Pi0-FAST VLAs, Real-Time Chunking), streaming video encoding for 3× faster training, and EnvHub for loading simulation environments from Hugging Face Hub.