Implicit Preference Alignment for Human Image Animation

Hugging Face Daily Papers 05/08/26, 12:00 AM Papers

Summary

This paper introduces Implicit Preference Alignment (IPA), a data-efficient post-training framework that improves hand motion generation in human image animation without requiring paired preference data. It utilizes implicit reward maximization and hand-aware local optimization to enhance generation quality while reducing data curation costs.

Human image animation has witnessed significant advancements, yet generating high-fidelity hand motions remains a persistent challenge due to their high degrees of freedom and motion complexity. While reinforcement learning from human feedback, particularly direct preference optimization, offers a potential solution, it necessitates the construction of strict preference pairs. However, curating such pairs for dynamic hand regions is prohibitively expensive and often impractical due to frame-wise inconsistencies. In this paper, we propose Implicit Preference Alignment (IPA), a data-efficient post-training framework that eliminates the need for paired preference data. Theoretically grounded in implicit reward maximization, IPA aligns the model by maximizing the likelihood of self-generated high-quality samples while penalizing deviations from the pretrained prior. Furthermore, we introduce a Hand-Aware Local Optimization mechanism to explicitly steer the alignment process toward hand regions. Experiments demonstrate that our method achieves effective preference optimization to enhance hand generation quality, while significantly lowering the barrier for constructing preference data. Codes are released at https://github.com/mdswyz/IPA

Original Article

Cached at: 05/13/26, 08:11 AM

Source: https://huggingface.co/papers/2605.07545

Abstract

Implicit Preference Alignment (IPA) addresses hand motion generation challenges through data-efficient post-training that eliminates the need for paired preference data, while using hand-aware local optimization to improve quality.

Human image animation has witnessed significant advancements, yet generating high-fidelity hand motions remains a persistent challenge due to their high degrees of freedom and motion complexity. While reinforcement learning from human feedback, particularly direct preference optimization, offers a potential solution, it necessitates the construction of strict preference pairs. However, curating such pairs for dynamic hand regions is prohibitively expensive and often impractical due to frame-wise inconsistencies. In this paper, we propose Implicit Preference Alignment (IPA), a data-efficient post-training framework that eliminates the need for paired preference data. Theoretically grounded in implicit reward maximization, IPA aligns the model by maximizing the likelihood of self-generated high-quality samples while penalizing deviations from the pretrained prior. Furthermore, we introduce a Hand-Aware Local Optimization mechanism to explicitly steer the alignment process toward hand regions. Experiments demonstrate that our method achieves effective preference optimization to enhance hand generation quality, while significantly lowering the barrier for constructing preference data. Codes are released at https://github.com/mdswyz/IPA
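The abstract describes the objective only in words, so the following is a rough sketch of its general shape, not the authors' loss. It assumes a toy setup in which predictions are flat lists of numbers, a squared-error "fit" term stands in for the likelihood of a self-generated high-quality sample, a squared-error "drift" term stands in for the penalty on deviating from the pretrained prior, and a binary mask up-weights hand regions. The names `hand_weighted_mse`, `ipa_style_loss`, `beta`, and `hand_weight` are all hypothetical and do not come from the paper or its repository.

```python
from typing import Sequence


def hand_weighted_mse(
    pred: Sequence[float],
    target: Sequence[float],
    hand_mask: Sequence[int],
    hand_weight: float = 2.0,
) -> float:
    """Weighted mean squared error; masked (hand) elements count hand_weight times."""
    total = 0.0
    weight_sum = 0.0
    for p, t, m in zip(pred, target, hand_mask):
        w = hand_weight if m else 1.0
        total += w * (p - t) ** 2
        weight_sum += w
    return total / weight_sum


def ipa_style_loss(
    policy_pred: Sequence[float],
    ref_pred: Sequence[float],
    target: Sequence[float],
    hand_mask: Sequence[int],
    beta: float = 0.1,
    hand_weight: float = 2.0,
) -> float:
    """Illustrative objective (an assumption, not the paper's formula).

    fit   : pulls the trained model toward a self-generated high-quality sample,
            with extra weight on hand regions.
    drift : penalizes moving away from the frozen pretrained model's prediction.
    """
    fit = hand_weighted_mse(policy_pred, target, hand_mask, hand_weight)
    no_mask = [0] * len(policy_pred)  # the drift term is unweighted in this sketch
    drift = hand_weighted_mse(policy_pred, ref_pred, no_mask, 1.0)
    return fit + beta * drift


if __name__ == "__main__":
    policy = [1.0, 2.0]   # prediction of the model being aligned
    ref = [0.0, 2.0]      # prediction of the frozen pretrained model
    target = [1.0, 4.0]   # self-generated sample judged to be high quality
    mask = [0, 1]         # second element lies in a hand region
    print(ipa_style_loss(policy, ref, target, mask, beta=0.1, hand_weight=3.0))
```

In this toy form, raising `hand_weight` concentrates the update on hand regions, while raising `beta` keeps the aligned model closer to the pretrained one. How the paper actually realizes either idea should be checked against the released code.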

Get this paper in your agent:

hf papers read 2605.07545

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Similar Articles

IAPO: Input Attribution-Aware Policy Optimization for Tool Use in Small Multimodal Agents

See Before You Code: Learning Visual Priors for Spatially Aware Educational Animation Generation

From Correctness to Preference: A Framework for Personalized Agentic Reinforcement Learning

Offline Preference Optimization for Rectified Flow with Noise-Tracked Pairs

Learning from human preferences
