This repository is the official implementation of ID-Animator, a zero-shot identity-preserving human video generation framework that produces high-quality, identity-specific human videos from a single reference image.
ID-Animator: Zero-Shot Identity-Preserving Human Video Generation
Xuanhua He,
Quande Liu*,
Shengju Qian,
Xin Wang,
Tao Hu,
Ke Cao,
Keyu Yan,
Man Zhou,
Jie Zhang*
(*Corresponding Author)
Inference script: see the inference script provided in this repository.
- Release ID-Animator checkpoints and inference scripts
- Release ID-Animator dataset and training scripts
- Release ID-Animator SDXL version
Generating high-fidelity human videos with specified identities has attracted significant attention in the content generation community. However, existing techniques struggle to strike a balance between training efficiency and identity preservation, either requiring tedious case-by-case finetuning or losing identity details during video generation. In this study, we present ID-Animator, a zero-shot human video generation approach that performs personalized video generation from a single reference facial image without further training. ID-Animator inherits existing diffusion-based video generation backbones and adds a face adapter that encodes ID-relevant embeddings from learnable facial latent queries. To facilitate the extraction of identity information for video generation, we introduce an ID-oriented dataset construction pipeline, which incorporates a decoupled human attribute and action captioning technique over a constructed facial image pool. Based on this pipeline, a random face reference training method is further devised to precisely capture ID-relevant embeddings from reference images, improving the fidelity and generalization capacity of our model for ID-specific video generation. Extensive experiments demonstrate the superiority of ID-Animator over previous models in generating personalized human videos. Moreover, our method is highly compatible with popular pre-trained T2V models such as AnimateDiff and various community backbone models, showing strong extensibility in real-world video generation applications where identity preservation is desired.
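The abstract describes a face adapter that pools ID-relevant features through learnable latent queries, trained with random face references drawn from an identity's image pool. A minimal numpy sketch of that idea is shown below; the class name, dimensions, and the frozen-face-encoder features it consumes are illustrative assumptions, not the released implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class FaceAdapterSketch:
    """Toy face adapter (hypothetical): a set of learnable latent queries
    cross-attends to face-image patch embeddings and pools them into a
    fixed number of ID tokens that a video diffusion backbone could
    consume through additional cross-attention layers."""

    def __init__(self, num_queries=16, dim=64, seed=0):
        rng = np.random.default_rng(seed)
        # In training these would be learnable parameters.
        self.queries = rng.standard_normal((num_queries, dim)) * 0.02
        self.w_k = rng.standard_normal((dim, dim)) * 0.02
        self.w_v = rng.standard_normal((dim, dim)) * 0.02
        self.dim = dim

    def __call__(self, face_embeds):
        # face_embeds: (num_patches, dim) features from a frozen face encoder.
        k = face_embeds @ self.w_k
        v = face_embeds @ self.w_v
        attn = softmax(self.queries @ k.T / np.sqrt(self.dim))
        return attn @ v  # (num_queries, dim) ID-relevant tokens

# Random-face-reference training (sketch): at each step, condition on a
# face drawn at random from the same identity's image pool, so the ID
# tokens capture identity rather than any single reference view.
rng = np.random.default_rng(1)
pool = [rng.standard_normal((49, 64)) for _ in range(5)]  # hypothetical pool
adapter = FaceAdapterSketch()
id_tokens = adapter(pool[rng.integers(len(pool))])
```

The pooled `id_tokens` would then be injected alongside the text tokens in the backbone's cross-attention, which is what makes the approach zero-shot: only the adapter is trained, the reference image is swapped at inference time.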
Gallery (demo videos omitted here):
- Single reference image with multiple output videos
- Two reference images combined into output videos
- Reference image plus a sketch image or sketch sequence, with output videos
Xuanhua He: hexuanhua@mail.ustc.edu.cn
Quande Liu: qdliu0226@gmail.com
Shengju Qian: thesouthfrog@gmail.com