@BasileTerv987: Accepted to TMLR, with reproducibility certification. v2 of our JEPA-WM study (arXiv:2512.24497) is out, with new data-s…

X AI KOLs Following Papers

Summary

Basile Terver and colleagues' paper on Joint-Embedding Predictive World Models (JEPA-WM) for robotics has been accepted to TMLR with a reproducibility certification. The updated version includes new data-scaling experiments, a Lipschitz analysis of multistep rollout training, and extended discussions.

Accepted to TMLR, with reproducibility certification. v2 of our JEPA-WM study (arXiv:2512.24497) is out, with new data-scaling experiments, a Lipschitz analysis of multistep rollout training, and extended discussions. Recap + what's new, w/ @JimmyTYYang1, Jean Ponce, @AdrienBardes, @ylecun

Cached at: 05/25/26, 02:40 PM

Accepted to TMLR, with reproducibility certification

v2 of our JEPA-WM study (arXiv:2512.24497) is out, with new data-scaling experiments, a Lipschitz analysis of multistep rollout training, and extended discussions.

Recap + what’s new

w/ @JimmyTYYang1, Jean Ponce, @AdrienBardes, @ylecun

Basile Terver (@BasileTerv987): My first PhD paper is out! 🎓

“What Drives Success in Physical Planning with Joint-Embedding Predictive World Models?”

tl;dr: JEPA-WMs for robotics: learn dynamics on top of visual encoders, optimize actions towards a goal 👇

w/ @JimmyTYYang1, Jean Ponce, @AdrienBardes, @ylecun
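
To make the tl;dr concrete, here is a minimal sketch of the general recipe it describes: encode the current and goal observations, roll a learned predictor forward in latent space, and search for the action sequence whose final latent lands closest to the goal. Everything here is an illustrative assumption rather than the paper's setup: the linear `encode`, the additive `predict`, the dimensions, and the use of the cross-entropy method as the optimizer are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for a frozen visual encoder and a learned latent predictor.
W_enc = rng.normal(size=(4, 8))        # 8-dim "observation" -> 4-dim latent
B_act = rng.normal(size=(4, 2)) * 0.5  # effect of a 2-dim action on the latent


def encode(obs):
    """Toy frozen encoder: a fixed linear map (assumption, not a real vision model)."""
    return W_enc @ obs


def predict(z, a):
    """Toy latent dynamics: next latent from current latent and action."""
    return z + B_act @ a


def rollout_cost(z0, actions, z_goal):
    """Unroll the predictor over an action sequence; cost is squared distance to the goal latent."""
    z = z0
    for a in actions:
        z = predict(z, a)
    return float(np.sum((z - z_goal) ** 2))


def plan(z0, z_goal, horizon=5, n_samples=256, n_iters=10, n_elite=32):
    """Cross-entropy method: sample action sequences, keep the best, refit the sampler."""
    mean = np.zeros((horizon, 2))
    std = np.ones((horizon, 2))
    for _ in range(n_iters):
        candidates = mean + std * rng.normal(size=(n_samples, horizon, 2))
        costs = np.array([rollout_cost(z0, c, z_goal) for c in candidates])
        elite = candidates[np.argsort(costs)[:n_elite]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean


# Usage: plan from a random "observation" towards a random "goal observation".
obs, goal_obs = rng.normal(size=8), rng.normal(size=8)
z0, z_goal = encode(obs), encode(goal_obs)
planned = plan(z0, z_goal)
print("cost with no actions:", rollout_cost(z0, np.zeros((5, 2)), z_goal))
print("cost with planned actions:", rollout_cost(z0, planned, z_goal))
```

In this toy the planner never touches pixels: all search happens in the encoder's latent space, which is the property the tl;dr highlights.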

Similar Articles

@artemZholus: thanks! in the second paper (https://arxiv.org/abs/2605.06388) we used your (and RAE's) recipe and it worked.

X AI KOLs Following

This paper systematically compares reconstruction-based and semantic latent spaces for action-conditioned latent diffusion world models in robotics. It finds that semantic encoders like V-JEPA 2.1 generally outperform reconstruction encoders on policy-relevant metrics, advocating for semantic latent spaces as a stronger foundation for robotics world models.

Representation Without Reward: A JEPA Audit for LLM Fine-Tuning

arXiv cs.LG

This paper audits joint-embedding predictive architectures (JEPA) for LLM fine-tuning on a natural-language-to-regex task, testing twenty-two auxiliary objectives. The results show that hidden-state representation improvements are only weakly coupled to decoded-task accuracy, and no auxiliary objective survives family-wise correction.

LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels

Papers with Code Trending

LeWorldModel introduces a stable, end-to-end Joint-Embedding Predictive Architecture that trains directly from pixels with minimal hyperparameters and provable anti-collapse guarantees. It achieves significant speedups in planning compared to foundation models while maintaining competitive performance on robotic manipulation tasks.