latent-representations

VisualPatchWorld: Code World Models as Latent Structured Representations for Planning

arXiv cs.CL · yesterday

VisualPatchWorld introduces a method for learning world dynamics as code, enabling inspectable and editable simulators from data. It achieves strong planning success in navigation and manipulation tasks.

Training Continuous Chain of Thought Models: A Tale of Two Regimes

arXiv cs.AI · 2026-07-21

This paper introduces C-MTP, a direct-supervision method for training continuous chain-of-thought models that compresses reasoning traces into latent representations. C-MTP is competitive on simple tasks, but both direct and indirect supervision struggle on complex, long reasoning traces, where performance drops by about 65%.

The Rank-One Corner: How Much Value Equivalence Does a Task Need from a World Model?

arXiv cs.LG · 2026-07-09

This paper investigates how much structure a task needs from a world model, showing that the objective's dimensionality determines how many predictive directions the model installs, with the common scalar reward objective being only the rank-one corner of value equivalence.

@alesfav: AI needs vastly more data than we do. One idea might close the gap: don't predict raw signals (tokens), predict your ow…

X AI KOLs Following · 2026-05-29

This thread presents a theoretical result showing that predicting abstract latent representations (as in JEPA and data2vec) instead of raw tokens can exponentially reduce the data gap between AI and human learning.

Learned Relay Representations for Forward-Thinking Discrete Diffusion Models

arXiv cs.LG · 2026-05-25

This paper introduces Learned Relay Representations (Relay), a method that allows masked diffusion models to propagate latent information across denoising steps, overcoming the hard reset problem and improving performance-latency trade-offs. The method is shown to outperform standard supervised finetuning on coding tasks while reducing inference latency by up to 32%.

Sub-JEPA: a simple fix to LeCun group's LeWorldModel that consistently improves performance [P]

Reddit r/MachineLearning · 2026-05-18

Sub-JEPA improves LeWorldModel by applying Gaussian regularization in frozen random orthogonal subspaces. It consistently outperforms the original on benchmarks, by up to 10.7 percentage points.

Next-Latent Prediction Transformers Learn Compact World Models

Papers with Code Trending · 2025-11-08

This paper introduces Next-Latent Prediction (NextLat), a self-supervised objective that trains transformers to predict their next latent state, encouraging compact internal world models and improving generalization across sequence modeling tasks.
