@LeRobotHF: VLA-JEPA just dropped in LeRobot What makes this model special is that it does not just learn what action to take from …

X AI KOLs Following 06/06/26, 08:08 AM Models

vla-jepa lerobot world-model robotics v-jepa fine-tuning nvidia

Summary

VLA-JEPA, a new model that integrates a JEPA world model to learn action-relevant dynamics, has been released in LeRobot. The world-modeling objective enables pretraining on human videos, and the demo model was fine-tuned on only 13 examples while running in real time on NVIDIA DGX Spark.



Cached at: 06/08/26, 05:14 AM

VLA-JEPA just dropped in LeRobot

What makes this model special is that it does not just learn what action to take from a given observation; it also leverages a JEPA world model to learn action-relevant dynamics.

During training, the VLA leverages V-JEPA2 by conditioning its predictor. This clever trick adds a world-modeling objective to training, which also allows pretraining on human videos. At inference, the world model is dropped entirely, leaving only a standard VLA architecture: a Qwen backbone and an action head.
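To make the train-versus-inference split concrete, here is a minimal toy sketch in plain Python. It is not the LeRobot or VLA-JEPA implementation: the class `ToyVLAJEPA`, its methods, the layer sizes, the `wm_weight` parameter, and the use of mean squared error are all hypothetical choices made for illustration. The `future_latent` argument stands in for a target embedding that a frozen video encoder such as V-JEPA2 would presumably provide. The sketch also assumes that human videos carry no action labels, so only the world-modeling term applies to them.

```python
import random


def linear(weights, x):
    """Multiply a matrix (stored as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def rand_matrix(rows, cols, rng):
    """Random dense layer; a stand-in for learned weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class ToyVLAJEPA:
    """Hypothetical toy model, not the real architecture."""

    def __init__(self, obs_dim=4, hid_dim=3, act_dim=2, latent_dim=5, seed=0):
        rng = random.Random(seed)
        self.backbone = rand_matrix(hid_dim, obs_dim, rng)      # stand-in for the Qwen backbone
        self.action_head = rand_matrix(act_dim, hid_dim, rng)   # kept at inference
        self.predictor = rand_matrix(latent_dim, hid_dim, rng)  # training-only world-model branch

    def act(self, obs):
        """Inference path: backbone and action head only."""
        return linear(self.action_head, linear(self.backbone, obs))

    def training_losses(self, obs, future_latent, target_action=None, wm_weight=1.0):
        """Return (total, action_loss, world_model_loss) for one sample."""
        hidden = linear(self.backbone, obs)
        wm_loss = mse(linear(self.predictor, hidden), future_latent)
        if target_action is None:
            # Assumption: action-free video contributes only the world-model term.
            action_loss = 0.0
        else:
            action_loss = mse(linear(self.action_head, hidden), target_action)
        return action_loss + wm_weight * wm_loss, action_loss, wm_loss

    def drop_world_model(self):
        """Discard the predictor; act() never used it."""
        self.predictor = None


model = ToyVLAJEPA()
obs = [0.1, -0.2, 0.3, 0.4]
future_latent = [0.0] * 5  # placeholder for a frozen encoder's embedding of a future frame

# Robot sample: both objectives are active.
robot_total, robot_action, robot_wm = model.training_losses(
    obs, future_latent, target_action=[0.5, -0.5]
)
# Human-video sample: no action label, world-model objective only.
video_total, video_action, video_wm = model.training_losses(obs, future_latent)

action_before = model.act(obs)
model.drop_world_model()
action_after = model.act(obs)
```

Because `act` never touches the predictor, removing that branch leaves the predicted action unchanged in this toy setup, which mirrors the claim that the deployed model is an ordinary backbone plus action head.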

The demo here was fine-tuned on only 13 examples, which points to strong pretraining, and it runs in real time on @NVIDIARobotics DGX Spark!

VLA-JEPA is the first world model to be ported to LeRobot, and I feel like it won’t be the last.

@Thom_Wolf @ClementDelangue


