@AdinaYakup: JD just released JoyAI-Echo An interesting long video generation model 5 minute multi shot video generation Cross modal…


Summary

JD released JoyAI-Echo, a long video generation model that produces 5-minute multi-shot videos, uses cross-modal memory to keep characters and voices consistent, generates audio and video natively, and runs 7.5× faster via DMD distillation, which the announcement says comes without quality loss.

JD just released JoyAI-Echo 📹 An interesting long video generation model

✨ 5 minute multi shot video generation
✨ Cross modal memory for character & voice consistency
✨ Native audio + video generation
✨ 7.5× faster via DMD distillation (without quality loss)

https://t.co/qIel5Gc8qX

Cached at: 06/03/26, 01:51 PM


Similar Articles

jdopensource/JoyAI-Echo

Hugging Face Models Trending

JD Open Source releases JoyAI-Echo (Echo-LongVideo), a text-to-audio-video diffusion model capable of generating minute-level multi-shot videos with consistent character identity and voice, using DMD distillation for 7.5x speedup.

Long Video Generation (4 minute read)

TLDR AI

The article introduces A²RD, a novel architecture for generating consistent long videos using agentic autoregressive diffusion. It proposes a Retrieve–Synthesize–Refine–Update cycle and a new benchmark, LVBench-C, to address semantic drift in long-horizon video synthesis.