DecMem: Towards Minute-Long Consistent World Generation with Decoupled Memory

Hugging Face Daily Papers 05/29/26, 12:00 AM Papers

Summary

DecMem introduces a decoupled memory architecture with Sparse Global Memory and Anchored Local Memory to achieve consistent minute-long video generation, outperforming state-of-the-art methods.

Recent advances in video generative models have promoted rapid progress in controllable world models. However, maintaining fine-grained spatio-temporal consistency under long-horizon reasoning remains a key challenge. In this work, we move beyond explicit 3D memory and coarse frame-level implicit modeling, and propose a fine-grained, learnable, and scalable memory for consistent world generation. We first identify two fundamental limitations of naïve learnable memory architectures in long-horizon extrapolation, namely computational inefficiency and attention dispersion. Through a systematic analysis of attention dispersion, we propose DecMem, a decoupled memory architecture that employs Sparse Global Memory for efficient fine-grained access to global history and Anchored Local Memory for stable and high-quality extrapolation. Extensive experiments demonstrate that DecMem significantly outperforms current state-of-the-art methods. By ensuring precise and efficient long-term memory and achieving superior extrapolation capabilities, DecMem enables minute-level controllable long video generation with high fidelity and consistency.
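The abstract names attention dispersion as a limitation but does not describe the mechanism. As a rough illustration only, the toy sketch below assumes one relevant memory token competing with a growing number of equally scored distractors under softmax attention, and compares dense attention with a generic top-k selection. The function names, the scores, and the use of top-k are assumptions made for this example; they are not taken from the paper and do not represent how Sparse Global Memory is implemented.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dense_weight_on_relevant(n_tokens, relevant_score=5.0, distractor_score=0.0):
    # Hypothetical setup: token 0 is the only relevant memory entry,
    # every other entry is a distractor with a lower, identical score.
    scores = [relevant_score] + [distractor_score] * (n_tokens - 1)
    return softmax(scores)[0]

def topk_weight_on_relevant(n_tokens, k, relevant_score=5.0, distractor_score=0.0):
    # Generic top-k sparse attention (an assumed stand-in, not DecMem's method):
    # keep only the k highest-scoring entries before applying softmax.
    scores = [relevant_score] + [distractor_score] * (n_tokens - 1)
    kept = sorted(range(n_tokens), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in kept])
    return weights[kept.index(0)]

# Print the attention weight on the relevant token as the memory grows.
for n in (64, 1024, 16384):
    dense = dense_weight_on_relevant(n)
    sparse = topk_weight_on_relevant(n, k=16)
    print(f"memory={n:6d}  dense={dense:.4f}  top-16={sparse:.4f}")
```

In this toy setting the dense weight on the relevant token shrinks as the memory grows, while the top-k weight depends only on k. That is one plausible reading of why a sparse access pattern could help over long horizons, under the stated assumptions.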

Original Article

Cached at: 06/01/26, 11:20 AM

Source: https://huggingface.co/papers/2605.31336

Abstract

A novel decoupled memory architecture called DecMem is introduced for consistent long-horizon video generation, addressing computational inefficiency and attention dispersion issues in learnable memory systems.

Recent advances in video generative models have promoted rapid progress in controllable world models. However, maintaining fine-grained spatio-temporal consistency under long-horizon reasoning remains a key challenge. In this work, we move beyond explicit 3D memory and coarse frame-level implicit modeling, and propose a fine-grained, learnable, and scalable memory for consistent world generation. We first identify two fundamental limitations of naïve learnable memory architectures in long-horizon extrapolation, namely computational inefficiency and attention dispersion. Through a systematic analysis of attention dispersion, we propose DecMem, a decoupled memory architecture that employs Sparse Global Memory for efficient fine-grained access to global history and Anchored Local Memory for stable and high-quality extrapolation. Extensive experiments demonstrate that DecMem significantly outperforms current state-of-the-art methods. By ensuring precise and efficient long-term memory and achieving superior extrapolation capabilities, DecMem enables minute-level controllable long video generation with high fidelity and consistency.

Get this paper in your agent:

hf papers read 2605.31336

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper: 1

#### KlingTeam/DecMem Video-to-Video • Updated about 4 hours ago • 2

Similar Articles

DimMem: Dimensional Structuring for Efficient Long-Term Agent Memory

Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory

SimpleMem: Efficient Lifelong Memory for LLM Agents

MemForest: An Efficient Agent Memory System with Hierarchical Temporal Indexing

Long Video Generation (4 minute read)
