OASIS: From Simulation Data Collection to Real-World Humanoid Loco-Manipulation

Hugging Face Daily Papers

Summary

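OASIS is a simulation-data-driven framework for humanoid loco-manipulation that combines object assets reconstructed by a 3D generative model with a hierarchical visuomotor policy. Under zero-shot deployment on a real humanoid robot, the policy trained on its simulation data achieves higher success rates on most tasks than one trained on real-robot teleoperation data, a gain the authors attribute largely to the lighting and environmental variation covered by domain-randomized simulation rendering.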

Recent progress in robot manipulation has been largely driven by learning from large-scale demonstrations. For humanoid robot loco-manipulation tasks, however, existing data sources force an unsatisfying tradeoff between trajectory quality and scalability. Real-world teleoperation provides the highest-quality trajectories but requires dedicated physical space and time-consuming scene resets. Simulation offers an alternative way out of this dilemma: it can produce clean, embodiment-aligned data at scale without any physical hardware. In this paper, we propose OASIS, a simulation-data-driven framework for humanoid loco-manipulation. OASIS automatically reconstructs realistic object assets from real-world images using a 3D generative model. Based on these assets, trajectories are first collected through teleoperation in simulation, and then augmented under diverse domain randomizations in a post-processing stage. With the resulting simulation data, we further design a hierarchical visuomotor policy for humanoid loco-manipulation. Extensive experiments on the real humanoid robot show that, under zero-shot deployment, the policy trained on our simulation data achieves higher success rates on most tasks than that trained on real-robot teleoperation data, owing largely to the broad lighting and environmental variations covered by our simulation rendering, which real-robot data fails to capture. The project page is available at https://oasis-humanoid.github.io/.
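
The post-processing augmentation described above can be illustrated with a minimal sketch. The abstract does not specify which parameters OASIS randomizes or over what ranges, so every name here (`RenderConfig`, `sample_render_config`, `augment_trajectory`) and every numeric range is a hypothetical placeholder, not the authors' implementation. The sketch only shows the general idea: one teleoperated trajectory is paired with several independently sampled rendering configurations, so the same motion can be re-rendered under varied lighting and scene conditions.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderConfig:
    """Hypothetical rendering parameters; not taken from the paper."""
    light_intensity: float    # relative scale, 1.0 = nominal lighting
    light_azimuth_deg: float  # direction of the main light source
    camera_jitter_m: float    # magnitude of camera position noise
    background_id: int        # index into an assumed set of backgrounds


def sample_render_config(rng: random.Random, num_backgrounds: int = 8) -> RenderConfig:
    """Draw one rendering configuration from assumed uniform ranges."""
    return RenderConfig(
        light_intensity=rng.uniform(0.3, 2.0),
        light_azimuth_deg=rng.uniform(0.0, 360.0),
        camera_jitter_m=rng.uniform(0.0, 0.02),
        background_id=rng.randrange(num_backgrounds),
    )


def augment_trajectory(trajectory, num_variants: int, seed: int = 0):
    """Pair one recorded trajectory with several sampled rendering configs.

    The trajectory itself is left unchanged; only the rendering
    conditions vary between the returned variants.
    """
    rng = random.Random(seed)
    return [(sample_render_config(rng), trajectory) for _ in range(num_variants)]


# Example: a placeholder trajectory of three (illustrative) joint-state frames.
demo_trajectory = [(0.0, 0.1), (0.1, 0.2), (0.2, 0.3)]
variants = augment_trajectory(demo_trajectory, num_variants=4, seed=42)
```

In a real pipeline each `(config, trajectory)` pair would be handed to the simulator's renderer to produce new camera observations for the same actions; that step depends on the simulator and is omitted here.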
Cached at: 06/09/26, 08:43 AM

Source: https://huggingface.co/papers/2606.08548
