@NVIDIAAI: Most motion papers tailor one controller to one specific task. This year at SIGGRAPH, our research team asks: can motor…
Summary
NVIDIA research presents Generative Pretrained Controllers (GPC), a method to pretrain motor control using discrete tokens and transformer-based next-token prediction, enabling fine-tuning for new tasks. Trained on 600+ hours of motion, GPC runs in real-time physics simulation for interactive control.
Cached at: 06/30/26, 09:47 PM
Most motion papers tailor one controller to one specific task. This year at SIGGRAPH, our research team asks: can motor control itself be pretrained and reused?
Generative Pretrained Controllers, or GPC, turn motor skills into a vocabulary of discrete tokens and train a transformer-based generative controller through next-token prediction. Just like GPT, the same pretrained controller can then be fine-tuned to solve new tasks.
Trained on 600+ hours of motion, GPC runs in real time inside a physics simulation, producing natural and physically grounded behaviors for interactive control.
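To make the "vocabulary of discrete tokens" plus "next-token prediction" idea concrete, here is a minimal toy sketch. It is not NVIDIA's implementation: the post does not describe GPC's tokenizer, architecture, or training setup. The sketch assumes a fixed nearest-neighbor codebook for tokenization, and it swaps the transformer for a smoothed bigram count model so that it runs with the standard library alone. Every name in it (`quantize`, `fit_bigram`, `next_token_loss`, the codebook and frame values) is hypothetical.

```python
import math
from collections import Counter, defaultdict

# ASSUMPTION: a tiny hand-written codebook stands in for a learned motion vocabulary.
CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

def quantize(frames, codebook):
    """Map each continuous frame to the index of its nearest codebook vector."""
    tokens = []
    for frame in frames:
        dists = [sum((a - b) ** 2 for a, b in zip(frame, code)) for code in codebook]
        tokens.append(dists.index(min(dists)))
    return tokens

def next_token_pairs(tokens):
    """Pair every token with the token that follows it (input, target)."""
    return list(zip(tokens[:-1], tokens[1:]))

def fit_bigram(tokens, vocab_size, smoothing=1.0):
    """ASSUMPTION: a smoothed bigram model replaces the transformer in this toy."""
    counts = defaultdict(Counter)
    for prev, nxt in next_token_pairs(tokens):
        counts[prev][nxt] += 1
    probs = {}
    for prev in range(vocab_size):
        total = sum(counts[prev].values()) + smoothing * vocab_size
        probs[prev] = [(counts[prev][n] + smoothing) / total for n in range(vocab_size)]
    return probs

def next_token_loss(tokens, probs):
    """Average cross-entropy of predicting each token from the one before it."""
    pairs = next_token_pairs(tokens)
    return -sum(math.log(probs[prev][nxt]) for prev, nxt in pairs) / len(pairs)

# Made-up 2-D "motion frames" that cycle through three poses.
frames = [(0.1, 0.0), (0.9, 0.1), (0.1, 0.9), (0.0, 0.1), (1.1, 0.0), (0.0, 1.2)]
tokens = quantize(frames, CODEBOOK)
model = fit_bigram(tokens, vocab_size=len(CODEBOOK))
loss = next_token_loss(tokens, model)
uniform_loss = math.log(len(CODEBOOK))  # loss of a model that guesses uniformly
```

On this toy sequence the fitted model's `loss` falls below `uniform_loss`, which is all that "learning to predict the next token" means here. A real system would learn the codebook from data and use a sequence model with much longer context than a single previous token.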