@LeRobotHF: VLA-JEPA just dropped in LeRobot What makes this model special is that it does not just learn what action to take from …
Summary
VLA-JEPA, a new model that uses a JEPA world model to learn action-relevant dynamics, has been released in LeRobot. It can be pretrained on human videos, performs well after minimal fine-tuning, and runs in real time on NVIDIA DGX Spark.
Cached at: 06/08/26, 05:14 AM
VLA-JEPA just dropped in LeRobot
What makes this model special is that it does not just learn what action to take from a given observation, it also leverages a JEPA world model to learn action-relevant dynamics.
During training, the VLA leverages V-JEPA2 by conditioning its predictor. This clever trick adds a world modeling objective to the training, which also allows pretraining on human videos. At inference, the world model is dropped entirely, keeping only a standard VLA architecture: Qwen backbone and action head.
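A toy sketch of the train/inference split described above, in plain Python. This is not the LeRobot or V-JEPA2 API: every class, function, and parameter name here (`ToyVLA`, `ToyPredictor`, `training_loss`, `wm_weight`) is hypothetical. It also assumes the video latents come from a separate encoder and that the two losses are combined as a weighted sum, neither of which the post states.

```python
import random


def mse(a, b):
    """Mean squared error between two equal-length lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


class ToyVLA:
    """Hypothetical stand-in for the VLA: a backbone plus an action head."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.w_backbone = [rng.uniform(-1, 1) for _ in range(dim)]
        self.w_action = [rng.uniform(-1, 1) for _ in range(dim)]

    def encode(self, obs):
        # Backbone features (elementwise scaling keeps the toy simple).
        return [w * o for w, o in zip(self.w_backbone, obs)]

    def act(self, obs):
        # Action head applied on top of the backbone features.
        return [w * h for w, h in zip(self.w_action, self.encode(obs))]


class ToyPredictor:
    """Hypothetical stand-in for a JEPA-style latent predictor.

    It predicts the next latent from the current latent, conditioned on
    features produced by the VLA backbone.
    """

    def predict(self, latent_now, condition):
        return [z + c for z, c in zip(latent_now, condition)]


def training_loss(vla, predictor, obs, expert_action,
                  latent_now, latent_next, wm_weight=0.5):
    """Action imitation loss plus an auxiliary world-modeling loss.

    latent_now and latent_next are assumed to come from a separate video
    encoder; wm_weight is an assumed weighting, not a value from the post.
    """
    action_loss = mse(vla.act(obs), expert_action)
    predicted_next = predictor.predict(latent_now, vla.encode(obs))
    world_model_loss = mse(predicted_next, latent_next)
    return action_loss + wm_weight * world_model_loss


def inference(vla, obs):
    """At inference only the VLA runs; the predictor is not needed."""
    return vla.act(obs)


if __name__ == "__main__":
    vla, predictor = ToyVLA(dim=3), ToyPredictor()
    obs = [0.2, -0.4, 0.9]
    loss = training_loss(vla, predictor, obs,
                         expert_action=[0.0, 0.1, -0.1],
                         latent_now=[0.5, 0.5, 0.5],
                         latent_next=[0.6, 0.4, 0.7])
    print("training loss:", loss)
    print("action at inference:", inference(vla, obs))
```

The point of the sketch is that `inference` never touches the predictor, so the world-modeling term shapes the backbone during training without adding any cost at deployment.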
The demo here was fine-tuned on only 13 examples, which shows how much the pretraining contributes, and it runs in real time on @NVIDIARobotics DGX Spark!
VLA-JEPA is the first world model to be ported to LeRobot, and I feel like it won’t be the last.
@Thom_Wolf @ClementDelangue