One Sentence, One Drama: Personalized Short-Form Drama Generation via Multi-Agent Systems
Summary
A hierarchical multi-agent framework generates short dramas from single sentences by enforcing narrative pacing, ensuring spatial consistency, and implementing quality control through iterative refinement and reviewer loops. It introduces a new benchmark, Short-Drama-Bench, for evaluation.
Cached at: 05/22/26, 06:27 AM
Source: https://huggingface.co/papers/2605.22144
Abstract
Existing approaches for digital short-drama production typically rely on one-shot LLM generated scripts and loosely coupled pipelines, which fail to satisfy three key requirements of short-drama generation: (1) narrative pacing, resulting in weak hooks, insufficient escalation, and unattractive endings; (2) spatial consistency, leading to drifting scene layouts and inconsistent character positions across clips; and (3) production-level quality control, requiring extensive manual review and correction across script and visual stages. We present One Sentence, One Drama, a hierarchical multi-agent framework that transforms a user’s single-sentence idea into a fully produced short drama through structured intermediate modules and iterative refinement. Our approach is built upon three key components: (1) a multi-agent debate-based story generation module that enforces short-drama pacing and narrative coherence; (2) a 3D-grounded first-frame generation mechanism that establishes a shared spatial reference for consistent character positioning and scene layout across clips; and (3) multi-stage reviewer loops that perform comprehensive error detection and targeted revision across script, visual, and video generation stages. We also introduce scene-level BGM matching and scene transition planning to improve the audience’s immersive experience. To systematically evaluate this task, we introduce Short-Drama-Bench, a benchmark that extends standard video quality metrics with short-drama-specific criteria. Experimental results demonstrate that our method significantly outperforms existing pipelines in narrative quality, cross-clip consistency, and overall viewing experience.
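The abstract names reviewer loops that detect errors and apply targeted revisions, but it gives no interfaces or stopping rules. The following is a minimal sketch of that general pattern only, not the authors' implementation. Every name in it (`reviewer_loop`, `ReviewResult`, `check_hook`, `check_ending`, `toy_revise`, `max_rounds`) is hypothetical, and the string-matching reviewers stand in for what would presumably be LLM-based agents.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interfaces: the abstract does not define any of these.
Reviewer = Callable[[str], List[str]]      # draft -> list of issue descriptions
Reviser = Callable[[str, List[str]], str]  # (draft, issues) -> revised draft


@dataclass
class ReviewResult:
    draft: str
    rounds: int  # number of revision passes actually applied
    remaining_issues: List[str] = field(default_factory=list)


def collect_issues(draft: str, reviewers: List[Reviewer]) -> List[str]:
    """Run every reviewer on the draft and pool the reported issues."""
    return [issue for review in reviewers for issue in review(draft)]


def reviewer_loop(draft: str, reviewers: List[Reviewer], revise: Reviser,
                  max_rounds: int = 3) -> ReviewResult:
    """Review, revise against the reported issues, and repeat until the
    draft is clean or the (assumed) round budget is exhausted."""
    for round_idx in range(max_rounds):
        issues = collect_issues(draft, reviewers)
        if not issues:
            return ReviewResult(draft, round_idx)
        draft = revise(draft, issues)
    # Budget exhausted: report whatever the reviewers still flag.
    return ReviewResult(draft, max_rounds, collect_issues(draft, reviewers))


# Toy reviewers for the pacing criteria the abstract mentions (hook, ending).
def check_hook(script: str) -> List[str]:
    return [] if script.startswith("HOOK:") else ["missing opening hook"]


def check_ending(script: str) -> List[str]:
    ok = script.rstrip().endswith("[CLIFFHANGER]")
    return [] if ok else ["missing cliffhanger ending"]


def toy_revise(script: str, issues: List[str]) -> str:
    """Targeted revision: fix only what was flagged."""
    if "missing opening hook" in issues:
        script = "HOOK: " + script
    if "missing cliffhanger ending" in issues:
        script = script.rstrip() + " [CLIFFHANGER]"
    return script


result = reviewer_loop(
    "A chef discovers her rival is her sister.",
    [check_hook, check_ending],
    toy_revise,
)
print(result.draft, result.rounds, result.remaining_issues)
```

In this sketch the revision step receives only the flagged issues, which is one way to read "targeted revision"; the round cap is an assumption added so the loop always terminates.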