Sub-JEPA: Subspace Gaussian Regularization for Stable End-to-End World Models
Summary
The authors introduce Sub-JEPA, a method using Subspace Gaussian Regularization to improve the stability of end-to-end world models like LeWM, showing consistent performance gains on continuous-control benchmarks.
Source: https://huggingface.co/papers/2605.09241
We’re releasing Sub-JEPA 🌐
LeWM (from LeCun’s group) is the first end-to-end trainable JEPA world model — it uses isotropic Gaussian regularization to prevent representation collapse. Clean and effective.
Our take: latent representations sit on low-dimensional manifolds, so enforcing a full-space Gaussian is too strong a bias.
We propose Subspace Gaussian Regularization: instead of constraining the full embedding space, we project latents into multiple orthogonal subspaces and apply Gaussian constraints there. Simple change, better inductive bias.
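The idea above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the paper's implementation: it uses the simplest orthogonal decomposition (axis-aligned chunks of the embedding), whereas the actual method may use learned or random orthogonal projections; the function name and penalty form are assumptions.

```python
import torch

def subspace_gaussian_reg(z: torch.Tensor, num_subspaces: int = 4) -> torch.Tensor:
    """Hypothetical sketch of Subspace Gaussian Regularization.

    Splits the embedding into `num_subspaces` disjoint coordinate
    subspaces (a trivially orthogonal decomposition) and penalizes each
    one for deviating from a standard Gaussian: zero mean and identity
    covariance. A collapsed representation (near-constant z) incurs a
    high penalty because its covariance is far from identity.
    """
    b, d = z.shape
    assert d % num_subspaces == 0, "embedding dim must split evenly"
    loss = z.new_zeros(())
    for chunk in z.chunk(num_subspaces, dim=1):  # (b, d // num_subspaces)
        k = chunk.shape[1]
        mu = chunk.mean(dim=0)                   # subspace mean
        cov = torch.cov(chunk.T)                 # (k, k) sample covariance
        # Penalize mean shift and covariance deviation from identity.
        loss = loss + mu.pow(2).sum() + (cov - torch.eye(k)).pow(2).sum()
    return loss / num_subspaces
```

Compared with a single full-space Gaussian constraint, each subspace penalty only involves a (d/k, d/k) covariance, so the constraint is applied blockwise rather than over the full d-dimensional embedding.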
Results on 4 continuous-control benchmarks show Sub-JEPA consistently outperforming LeWM, with gains that correlate with reductions in effective rank: the lower the task’s intrinsic dimensionality, the larger the gain.
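For readers unfamiliar with the metric, effective rank is commonly computed as the exponential of the entropy of the normalized singular value distribution (Roy & Vetterli, 2007); the snippet below is a small self-contained sketch, not taken from the paper's code.

```python
import torch

def effective_rank(z: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Effective rank of a (batch, dim) latent matrix.

    Computed as exp(H(p)), where p is the singular value spectrum of z
    normalized to sum to 1 and H is its Shannon entropy. A rank-1 matrix
    gives a value near 1; an isotropic Gaussian batch approaches dim.
    """
    s = torch.linalg.svdvals(z)          # singular values, descending
    p = s / (s.sum() + eps)              # normalize to a distribution
    entropy = -(p * (p + eps).log()).sum()
    return entropy.exp()
```

A lower effective rank of a task's latents indicates lower intrinsic dimensionality, which is the quantity the reported gains are said to correlate with.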
Similar Articles
LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels
LeWorldModel introduces a stable, end-to-end Joint-Embedding Predictive Architecture that trains directly from pixels with minimal hyperparameters and provable anti-collapse guarantees. It achieves significant speedups in planning compared to foundation models while maintaining competitive performance on robotic manipulation tasks.
CGM-JEPA: Learning Consistent Continuous Glucose Monitor Representations via Predictive Self-Supervised Pretraining
Introduces CGM-JEPA, a self-supervised pretraining framework for continuous glucose monitor data that improves cross-modal and cross-cohort performance through masked latent prediction and distributional objectives.
AeroJEPA: Learning Semantic Latent Representations for Scalable 3D Aerodynamic Field Modeling
This paper introduces AeroJEPA, a Joint-Embedding Predictive Architecture for scalable 3D aerodynamic field modeling. It addresses limitations in current surrogate models by predicting semantic latent representations of flow fields, enabling efficient high-fidelity analysis and design optimization.
GitHub - keon/jepa: implementing minimal versions of joint-embedding predictive architecture (JEPA)
A GitHub repository providing minimal, standalone PyTorch reimplementations of JEPA family models (I-JEPA, V-JEPA, V-JEPA 2, C-JEPA) for educational purposes, including tutorials and visualization tools.
Learning Visual Feature-Based World Models via Residual Latent Action
This paper introduces RLA-WM, a visual feature-based world model that leverages residual latent actions and flow matching to efficiently predict future visual states. The method outperforms existing video-diffusion and feature-based approaches while enabling novel robot learning techniques from offline, actionless demonstration videos.