@itsPaulAi: Woow Nvidia has just released a 2.6B open-source world model You can turn a single image, text prompt and trajectory in…

X AI KOLs Timeline 05/15/26, 09:39 PM Models

nvidia open-source world-model single-image text-prompt trajectory embodied-ai

Summary

Nvidia released a 2.6B-parameter open-source world model that generates controllable worlds from a single image, a text prompt, and a trajectory, and that runs on a single GPU.

Original Article

Cached at: 05/16/26, 07:14 AM

Woow Nvidia has just released a 2.6B open-source world model

You can turn a single image, text prompt and trajectory into controllable worlds…

And on a single GPU!

Code available on GitHub
Paper as well on arxiv

You can use it for many things like embodied AI and robotics research, simulations, etc.

Because it can run on a single GPU (like an RTX 5090 or H100) it makes world models accessible to basically everyone!

Similar Articles

nvidia/Cosmos3-Super-Text2Image

Hugging Face Models Trending

NVIDIA released Cosmos3-Super-Text2Image, a text-to-image model that is part of the Cosmos3 omnimodal world model platform for Physical AI, which aims to let machines understand and simulate the physical world.

NVIDIA OmniDreams: Real-Time Generative World Model for Closed-Loop Autonomous Vehicle Simulation

Hugging Face Daily Papers

NVIDIA presents OmniDreams, a generative world model built from the Cosmos diffusion model for real-time action-conditioned video generation, enabling closed-loop simulation for autonomous driving policy evaluation in complex unseen scenarios.

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Hugging Face Daily Papers

HY-World 2.0 is a multi-modal world model framework that generates high-fidelity 3D Gaussian Splatting scenes from text, images, and videos through specialized modules for panorama generation, trajectory planning, and scene composition, achieving state-of-the-art performance among open-source approaches.

@dhruvtwt_: Why is no one talking about this? @nvidia is offering around 80 AI models via hosted APIs absolutely for free. You get …

X AI KOLs Timeline

Nvidia quietly provides ~80 free hosted AI model APIs including MiniMax M2.7, GLM 5.1, Kimi 2.5, DeepSeek 3.2, GPT-OSS-120B, ready to integrate with popular dev tools like OpenClaude and Zed IDE.

nvidia/Cosmos3-Super-Image2Video