Looped World Models
Summary
Looped World Models introduce iterative latent state refinement through shared transformer blocks, achieving up to 100x parameter efficiency while adapting computational depth to prediction complexity.
Source: https://huggingface.co/papers/2606.18208 Published on Jun 16
Submitted by https://huggingface.co/hongyyyyy
lu on Jun 17
Abstract
Current world models face a fundamental tension: faithful long-horizon simulation demands deep computation, but deeper models are expensive to deploy and prone to compounding errors. We resolve this by introducing Looped World Models (LoopWM), the first looped architectures for world modelling. Our method iteratively refines latent environment states through a parameter-shared transformer block. This yields up to 100x parameter efficiency over conventional approaches, with adaptive computation that automatically scales depth to match the complexity of each prediction step. Orthogonal to scaling model size and training data, LoopWM establishes iterative latent depth as a new scaling axis for world simulation, which might significantly push the community forward.
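The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: one shared block is applied repeatedly to a latent state, and the number of passes depends on how quickly the state settles. Everything here is an assumption made for illustration, not LoopWM itself: the names `make_shared_block` and `looped_predict` are hypothetical, a small tanh-of-linear map stands in for the transformer block, and the halting rule (stop when the state changes by less than `tol`) is one simple choice among many.

```python
import math
import random


def make_shared_block(dim, seed=0):
    """Build one set of weights that every loop iteration reuses.

    Hypothetical stand-in for a parameter-shared transformer block.
    Weights are kept small (Frobenius norm <= 0.5) so that repeated
    application is a contraction and settles on a fixed point.
    """
    rng = random.Random(seed)
    scale = 0.5 / dim
    weights = [[rng.uniform(-scale, scale) for _ in range(dim)]
               for _ in range(dim)]

    def block(state, action):
        # new_state[i] = tanh(sum_j W[i][j] * state[j] + action[i])
        return [math.tanh(sum(w * s for w, s in zip(row, state)) + a)
                for row, a in zip(weights, action)]

    return block


def looped_predict(block, state, action, max_loops=32, tol=1e-4):
    """Refine the latent state by re-applying the same block.

    Stops early once the largest per-coordinate change drops below
    `tol` (an assumed halting rule), so easy steps use fewer loops.
    Returns the refined state and the number of loops used.
    """
    loops = 0
    for loops in range(1, max_loops + 1):
        new_state = block(state, action)
        delta = max(abs(n - s) for n, s in zip(new_state, state))
        state = new_state
        if delta < tol:
            break
    return state, loops


# Usage: the same weights serve every loop and every prediction step.
block = make_shared_block(dim=4)
start = [0.0, 0.0, 0.0, 0.0]
# A step where nothing changes settles after a single pass.
easy_state, easy_loops = looped_predict(block, start, [0.0, 0.0, 0.0, 0.0])
# A step with a large input needs more passes before the state settles.
hard_state, hard_loops = looped_predict(block, start, [1.0, -1.0, 0.5, 2.0])
```

In this toy setting the parameter count is fixed by the single block regardless of how many loops run, which is the sense in which depth and parameter count are decoupled; the specific efficiency figures in the abstract are the authors' and are not reproduced by this sketch.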
Similar Articles
@DorothyDDU: LoopCoder-v2 is out Loop Transformers reuse the same block for recurrent hidden-state refinement — letting models “thin…
This paper introduces LoopCoder-v2, a family of 7B parameter parallel loop transformers for code generation, and studies the optimal number of loops, finding that two loops yield significant gains while more loops cause degradation.
World Machine: Towards Generative World Modeling for Time-Series
World Machine proposes a transformer-based generative world modeling architecture for time series that uses latent states to adapt to varying context lengths, addressing the quadratic memory cost of traditional transformers. Experiments on a synthetic dataset validate its feasibility and show improvements over conventional transformers.
@askalphaxiv: Another cool research on Looped Transformers They ask the question: "Can we loop a frozen, off-the-shelf checkpoint dir…
This research introduces a technique to loop frozen, off-the-shelf transformer checkpoints at inference time by using damped Runge-Kutta substeps, treating transformer layers as Euler steps in a residual ODE. This allows extra latent compute without fine-tuning, architecture changes, or new weights, showing gains on knowledge tasks like MMLU-Pro, GPQA, and ARC.
Memory-Efficient Looped Transformer: Decoupling Compute from Memory in Looped Language Models
Proposes Memory-Efficient Looped Transformer (MELT), a novel recurrent LLM architecture that decouples reasoning depth from memory consumption by sharing a single KV cache across loops and using chunk-wise training with interpolated transition and attention-aligned distillation.
Next-Latent Prediction Transformers Learn Compact World Models
Introduces Next-Latent Prediction (NextLat), a self-supervised objective that trains transformers to predict their next latent state, encouraging compact internal world models and improving generalization across sequence modeling tasks.