Tag
Echo-Forcing introduces a scene memory framework for interactive long video generation, using hierarchical temporal memory, scene recall frames, and difference-aware memory decay to handle prompt switching and long-term recall. The method is training-free and achieves strong performance on VBench-Long.
AnyRecon proposes a scalable framework for 3D reconstruction from arbitrary sparse inputs using a video diffusion model with persistent scene memory and geometry-aware conditioning.