SmartDirector: Keyframe-Conditioned Cinematic Video Generation with Narrative Pacing Control

Hugging Face Daily Papers, 05/27/26, 12:00 AM

Summary

SmartDirector is a video generation framework that conditions on multiple keyframes to give finer control over narrative structure and temporal pacing. It runs in two stages: low-resolution generation followed by high-resolution refinement.

The narrative quality of a video fundamentally determines its perceptual value. Although existing video generation methods can produce visually appealing content, they predominantly rely on sparse conditioning signals such as text prompts or first/last frames, which limits precise control over narrative structure and temporal pacing. In this paper, we propose SmartDirector, a framework that enhances the narrative capacity of video generation models through multiple keyframes. SmartDirector supports flexible generation scenarios including single-shot generation, multi-shot narrative synthesis, and video extension. The framework operates in two stages: Director-Gen generates a low-resolution video conditioned on the provided keyframes, and Director-SR refines the output by exploiting high-resolution keyframes as semantic anchors to recover fine-grained details. To enable robust multi-keyframe training, we construct a data pipeline that curates single-shot and multi-shot sequences from movies. Extensive experiments demonstrate that SmartDirector substantially outperforms existing state-of-the-art approaches. We will release the code to facilitate further research.
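The abstract does not describe how keyframes are placed on the timeline or how the two stages exchange data, so the following is only a toy sketch of one plausible reading, not the authors' implementation. Every name here (`Keyframe`, `keyframe_schedule`, `conditioning_mask`, `segment_lengths`, `two_stage_stub`) is hypothetical, and strings stand in for images. It shows timestamps being snapped to frame indices, a per-frame conditioning mask, the gaps between keyframes as a rough proxy for pacing, and a stage 1 to stage 2 data flow in which keyframe slots are re-anchored to their high-resolution versions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Keyframe:
    time_s: float  # hypothetical: when the keyframe should appear in the clip
    label: str     # stand-in for the actual image


def keyframe_schedule(keyframes: List[Keyframe], duration_s: float, fps: int) -> Dict[int, Keyframe]:
    """Snap each keyframe timestamp to a frame index in [0, num_frames - 1]."""
    num_frames = int(round(duration_s * fps))
    schedule: Dict[int, Keyframe] = {}
    for kf in sorted(keyframes, key=lambda k: k.time_s):
        if not 0.0 <= kf.time_s <= duration_s:
            raise ValueError(f"keyframe {kf.label!r} lies outside the clip")
        idx = min(num_frames - 1, int(round(kf.time_s * fps)))
        if idx in schedule:
            raise ValueError(f"keyframes {schedule[idx].label!r} and {kf.label!r} share frame {idx}")
        schedule[idx] = kf
    return schedule


def conditioning_mask(schedule: Dict[int, Keyframe], num_frames: int) -> List[int]:
    """1 where a frame is pinned to a keyframe, 0 where the model is free to generate."""
    return [1 if i in schedule else 0 for i in range(num_frames)]


def segment_lengths(schedule: Dict[int, Keyframe]) -> List[int]:
    """Frames between consecutive keyframes; shorter gaps suggest faster pacing."""
    idxs = sorted(schedule)
    return [b - a for a, b in zip(idxs, idxs[1:])]


def two_stage_stub(schedule: Dict[int, Keyframe], num_frames: int) -> List[str]:
    """Toy data flow only: stage 1 fills frames, stage 2 re-anchors keyframe slots."""
    if not schedule:
        raise ValueError("at least one keyframe is required")
    # Stage 1 stand-in: each frame copies the label of the latest keyframe at or
    # before it (frames before the first keyframe use the first keyframe).
    current = schedule[min(schedule)].label
    low_res: List[str] = []
    for i in range(num_frames):
        if i in schedule:
            current = schedule[i].label
        low_res.append(f"lr:{current}")
    # Stage 2 stand-in: keyframe slots take their high-resolution anchor, and the
    # remaining frames are marked as super-resolved versions of the stage 1 output.
    return [
        f"hr:{schedule[i].label}" if i in schedule else f"sr({frame})"
        for i, frame in enumerate(low_res)
    ]
```

For example, calling `keyframe_schedule` with keyframes at 0.0 s, 1.0 s, and 1.5 s for a 2-second clip at 4 fps pins frames 0, 4, and 6 of 8. The second segment is half as long as the first, which is the sense in which keyframe spacing could steer pacing in this sketch.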

Cached at: 05/29/26, 03:00 AM

Source: https://huggingface.co/papers/2605.27891
