PersonaLive! Expressive Portrait Image Animation for Live Streaming

Papers with Code Trending 12/12/25, 03:24 AM Papers

portrait-animation diffusion-models real-time live-streaming image-to-video computer-vision

Summary

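PersonaLive is a diffusion-based framework for real-time expressive portrait animation in live streaming. It uses hybrid implicit signals for motion control, and gains its efficiency from fewer-step appearance distillation and autoregressive micro-chunk streaming generation, reporting up to 7-22x speedup over prior diffusion-based portrait animation models.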

Current diffusion-based portrait animation models predominantly focus on enhancing visual quality and expression realism, while overlooking generation latency and real-time performance, which restricts their application range in the live streaming scenario. We propose PersonaLive, a novel diffusion-based framework towards streaming real-time portrait animation with multi-stage training recipes. Specifically, we first adopt hybrid implicit signals, namely implicit facial representations and 3D implicit keypoints, to achieve expressive image-level motion control. Then, a fewer-step appearance distillation strategy is proposed to eliminate appearance redundancy in the denoising process, greatly improving inference efficiency. Finally, we introduce an autoregressive micro-chunk streaming generation paradigm equipped with a sliding training strategy and a historical keyframe mechanism to enable low-latency and stable long-term video generation. Extensive experiments demonstrate that PersonaLive achieves state-of-the-art performance with up to 7-22x speedup over prior diffusion-based portrait animation models.

Original Article


Cached at: 05/08/26, 08:42 AM

Paper page - PersonaLive! Expressive Portrait Image Animation for Live Streaming

Source: https://huggingface.co/papers/2512.11253

Abstract

PersonaLive is a diffusion-based portrait animation framework that improves real-time performance through hybrid implicit signals, appearance distillation, and autoregressive streaming generation.

Current diffusion-based portrait animation models predominantly focus on enhancing visual quality and expression realism, while overlooking generation latency and real-time performance, which restricts their application range in the live streaming scenario. We propose PersonaLive, a novel diffusion-based framework towards streaming real-time portrait animation with multi-stage training recipes. Specifically, we first adopt hybrid implicit signals, namely implicit facial representations and 3D implicit keypoints, to achieve expressive image-level motion control. Then, a fewer-step appearance distillation strategy is proposed to eliminate appearance redundancy in the denoising process, greatly improving inference efficiency. Finally, we introduce an autoregressive micro-chunk streaming generation paradigm equipped with a sliding training strategy and a historical keyframe mechanism to enable low-latency and stable long-term video generation. Extensive experiments demonstrate that PersonaLive achieves state-of-the-art performance with up to 7-22x speedup over prior diffusion-based portrait animation models.
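The abstract gives no implementation details, so the following is only a toy sketch of the general idea behind chunked autoregressive streaming with a bounded keyframe history. It is not the authors' code. Every name here (`stream_in_micro_chunks`, `toy_generate_chunk`, `chunk_size`, `max_keyframes`) is hypothetical, frames are stood in for by plain floats, and the policy of keeping the last output of each chunk as a keyframe is an assumption made for illustration.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List, Sequence


def stream_in_micro_chunks(
    driving_frames: Iterable[float],
    generate_chunk: Callable[[Sequence[float], Sequence[float]], List[float]],
    chunk_size: int = 4,
    max_keyframes: int = 3,
) -> Iterator[float]:
    """Yield output frames one small chunk at a time.

    Each chunk is generated from its driving signals plus a bounded
    history of earlier outputs, so output is available after every
    chunk instead of after the whole sequence.
    """
    # Bounded history; the oldest keyframe is dropped automatically.
    keyframes: deque = deque(maxlen=max_keyframes)
    chunk: List[float] = []
    for frame in driving_frames:
        chunk.append(frame)
        if len(chunk) == chunk_size:
            outputs = generate_chunk(chunk, tuple(keyframes))
            # Assumed policy: remember the last output of each chunk.
            keyframes.append(outputs[-1])
            yield from outputs
            chunk = []
    if chunk:
        # Flush a final partial chunk when the stream ends.
        yield from generate_chunk(chunk, tuple(keyframes))


def toy_generate_chunk(
    motion: Sequence[float], history: Sequence[float]
) -> List[float]:
    """Stand-in for a real generator: offset motion by the mean keyframe."""
    anchor = sum(history) / len(history) if history else 0.0
    return [m + anchor for m in motion]
```

Because the function is a generator, a consumer can display each chunk as soon as it is produced, which is the latency benefit the abstract attributes to streaming generation. A real system would replace `toy_generate_chunk` with a model call and choose keyframes by some learned or heuristic criterion.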

