zai-org/SCAIL-2 · Hugging Face

Reddit r/LocalLLaMA 06/09/26, 06:43 PM Models

character-animation video-generation end-to-end open-source hugging-face controlled-animation multi-character

Summary

SCAIL-2 is an open-source model for end-to-end controlled character animation that animates a reference character with a driving video, supporting character replacement and multi-character scenarios without intermediate pose representations.

# SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning

SCAIL-2 is an open-source model for **end-to-end controlled character animation**. It animates a reference character with a driving video, and also supports character replacement and multi-character scenarios without relying on intermediate pose representations.

## Overview

Prior approaches to character animation depend heavily on intermediate representations such as skeleton maps or inpainting masks. These intermediates are ambiguous under complex motion, restrict driving sources to human movements, and limit the reach of replacement and multi-character animation.

SCAIL-2 removes this dependence and achieves **End-to-end Driving**. 60K motion pairs were synthesized with several off-the-shelf models (SCAIL-Preview, Wan-Animate, MoCha), and the model was trained on them through a Unified Motion Transfer Interface with dedicated masking channels and a RoPE design. Combined with this unification, the reverse driving training recipe lets the model learn capabilities beyond its teacher models, yielding emergent abilities such as:

* Cross-identity character replacement
* Animal-driving scenarios
* Zero-shot support for advanced control intermediates like SAM3D-Body mesh rendering

Original Article

Cached at: 06/10/26, 12:21 AM

Source: https://huggingface.co/zai-org/SCAIL-2

# SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning

SCAIL-2 is an open-source model for **end-to-end controlled character animation**. It animates a reference character with a driving video, and also supports character replacement and multi-character scenarios without relying on intermediate pose representations.

Teaser

## 🔎 Overview

Prior approaches to character animation depend heavily on intermediate representations such as skeleton maps or inpainting masks. These intermediates are ambiguous under complex motion, restrict driving sources to human movements, and limit the reach of replacement and multi-character animation.

SCAIL-2 removes this dependence and achieves **End-to-end Driving**. 60K motion pairs were synthesized with several off-the-shelf models (SCAIL-Preview, Wan-Animate, MoCha), and the model was trained on them through a Unified Motion Transfer Interface with dedicated masking channels and a RoPE design. Combined with this unification, the reverse driving training recipe lets the model learn capabilities beyond its teacher models, yielding emergent abilities such as:

* Cross-identity character replacement
* Animal-driving scenarios
* Zero-shot support for advanced control intermediates like SAM3D-Body mesh rendering

pipeline

## 📦 Model

| Item | Detail |
| --- | --- |
| Resolutions | End-to-end driving supports both 512p and 704p; pose-driven and replacement perform better at 704p |
| Constraints | H and W must both be divisible by 32 (e.g. 704×1280) |
| Training | Mixed resolutions and fps |
| Bundled modules | Wan VAE and T5 are integrated into the checkpoint for convenience |

File layout after download:

SCAIL-2/
├── Wan2.1_VAE.pth
├── model
│   ├── 1
│   │   └── fsdp2_rank_0000_checkpoint.pt
│   └── latest
└── umt5-xxl
    └── ...
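
As a quick sanity check after downloading, the layout above can be compared against a local directory. The sketch below is not part of the official tooling: the function name `missing_entries` and the constant `EXPECTED_ENTRIES` are hypothetical, and the expected paths are taken only from the tree shown above (the contents of `umt5-xxl` are elided there, so they are not checked).

```python
from pathlib import Path

# Entries copied from the file tree in the model card.
# Assumption: only existence is checked, since the card does not say
# whether `model/latest` is a file or a directory.
EXPECTED_ENTRIES = (
    "Wan2.1_VAE.pth",
    "model/1/fsdp2_rank_0000_checkpoint.pt",
    "model/latest",
    "umt5-xxl",
)


def missing_entries(root):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_ENTRIES if not (root / rel).exists()]
```

Calling `missing_entries("SCAIL-2")` on a complete download should return an empty list; any returned path points at a file or folder that still needs to be fetched.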

## 🚀 Usage

Inference code, environment setup, and detailed instructions are provided in the project repository. Please refer to the Project Page and the code repo for how to run the model.
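
Before running inference, the input size has to satisfy the constraint stated in the Model section: H and W must both be divisible by 32. The sketch below shows one way to check a size and to shrink it to the nearest valid one. The helper names (`is_valid_size`, `snap_down`, `fit_size`) are hypothetical and not taken from the project repository, and rounding down is an assumption; the official preprocessing may resize differently.

```python
def is_valid_size(height: int, width: int, multiple: int = 32) -> bool:
    """True if both dimensions are positive and divisible by `multiple`."""
    return (
        height > 0
        and width > 0
        and height % multiple == 0
        and width % multiple == 0
    )


def snap_down(value: int, multiple: int = 32) -> int:
    """Round `value` down to the nearest multiple of `multiple`."""
    if value < multiple:
        raise ValueError(f"{value} is smaller than the required multiple {multiple}")
    return (value // multiple) * multiple


def fit_size(height: int, width: int, multiple: int = 32) -> tuple[int, int]:
    """Shrink (height, width) so that both satisfy the divisibility constraint."""
    return snap_down(height, multiple), snap_down(width, multiple)
```

For example, the card's own 704×1280 passes `is_valid_size`, while a 720×1280 source does not and `fit_size(720, 1280)` returns `(704, 1280)`.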

## 📄 Citation

@article{yan2025scail,
  title={SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations},
  author={Yan, Wenhao and Ye, Sheng and Yang, Zhuoyi and Teng, Jiayan and Dong, ZhenHui and Wen, Kairui and Gu, Xiaotao and Liu, Yong-Jin and Tang, Jie},
  journal={arXiv preprint arXiv:2512.05905},
  year={2025}
}

Similar Articles

SCAIL-2: Unifying Controlled Character Animation with End-to-end In-Context Conditioning

meituan-longcat/LongCat-Video-Avatar-1.5 · Hugging Face

@Saboo_Shubham_: INSANE...this is an Open Source Video model available for free on Hugging Face. LongCat just dropped an amazing video a…

Single Reference to a Fully Rigged 3D Character Using AI 3D Generation

@victormustar: New: LongCat just dropped an excellent open-source talking-avatar model (probably SOTA) + MIT licensed Made a Hugging F…
