This paper introduces RLA-WM, a visual feature-based world model that leverages residual latent actions and flow matching to efficiently predict future visual states. The method outperforms existing video-diffusion and feature-based approaches while enabling novel robot learning techniques from offline, actionless demonstration videos.
This paper introduces TT4D, a novel pipeline and large-scale dataset for reconstructing table tennis gameplay in 4D from monocular videos. It features a unique lift-first approach that estimates 3D ball trajectories and spin before time segmentation, enabling robust reconstruction even with occlusions.
This comprehensive survey reviews the literature on world models for robot learning, covering their roles in policy learning, planning, and simulation. It highlights key paradigms, benchmarks, and future directions for predictive modeling in embodied agents.
During National Robotics Week, NVIDIA highlights breakthroughs in physical AI and robotics. Announcements include NVIDIA Isaac GR00T open models for understanding natural-language instructions, Cosmos world models for synthetic data generation, the Newton 1.0 physics engine, and expanded simulation capabilities in Isaac Sim 6.0 and Isaac Lab 3.0, all aimed at accelerating robot development from training to real-world deployment.
The article discusses why programming robots by hand is difficult, as captured by Moravec's paradox, and proposes robot learning as a way to enable embodied intelligence.
OpenAI hosted its first Robotics Symposium on April 27, 2019, bringing together robotics and machine learning experts to discuss learning robots and to demonstrate its humanoid robot hand solving manipulation tasks using vision and reinforcement learning.
OpenAI proposes an asymmetric actor-critic method for robot learning that leverages full state observability in simulators to train policies that operate on partial observations (RGBD images), enabling effective sim-to-real transfer without real-world training data.
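The asymmetry described in the last summary can be illustrated with a minimal sketch: the critic scores actions using the full simulator state, while the actor only ever sees a partial observation. Everything here is an assumption made for illustration and is not taken from the summary: the linear models, the dimensions, the hyperparameters, the `observe` function (a stand-in for rendering an RGBD image), and the deterministic-policy-gradient style update.

```python
import random

# Hypothetical sizes and hyperparameters, chosen only for illustration.
STATE_DIM, OBS_DIM, ACT_DIM = 6, 3, 2
GAMMA, LR = 0.99, 0.01


def observe(state):
    """Partial observation: a stand-in for an RGBD render of the full state."""
    return state[:OBS_DIM]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


class LinearActor:
    """Policy a = W @ obs. It never receives the full state."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(OBS_DIM)]
                  for _ in range(ACT_DIM)]

    def act(self, obs):
        return [dot(row, obs) for row in self.W]


class LinearCritic:
    """Q(s, a) = w_s . s + w_a . a, evaluated on the full simulator state."""

    def __init__(self):
        self.w_s = [0.0] * STATE_DIM
        self.w_a = [0.0] * ACT_DIM

    def q(self, state, action):
        return dot(self.w_s, state) + dot(self.w_a, action)


def update(actor, critic, state, action, reward, next_state):
    """One TD step for the critic, then one policy-gradient step for the actor."""
    # The actor picks the next action from the partial observation only.
    next_action = actor.act(observe(next_state))
    # The critic bootstraps from the full next state.
    td_error = (reward + GAMMA * critic.q(next_state, next_action)
                - critic.q(state, action))
    critic.w_s = [w + LR * td_error * x for w, x in zip(critic.w_s, state)]
    critic.w_a = [w + LR * td_error * x for w, x in zip(critic.w_a, action)]
    # For this linear critic dQ/da = w_a; chain rule through a = W @ obs.
    obs = observe(state)
    actor.W = [[w + LR * g * o for w, o in zip(row, obs)]
               for row, g in zip(actor.W, critic.w_a)]
    return td_error
```

At deployment only `LinearActor.act(observe(state))` is needed, which is why a policy trained this way can run on camera input alone; the critic and its access to full state are used only during training in simulation.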