What is LivePortrait and what is it used for?

LivePortrait is an AI model that creates animated facial movements from a single portrait photo. It brings photos to life by transferring expressions from a source video to the target portrait. It is used in social media content, educational videos, VTuber, and digital avatar applications.

Can LivePortrait work in real-time?

Yes, LivePortrait can perform real-time portrait animation. It can generate live animation from webcam input with low latency on GPU. This feature can be used in VTuber and live streaming applications for interactive content creation.
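Whether a model is "real-time" comes down to a frame budget: each frame must be produced in less time than the interval between frames. The sketch below is a minimal illustration of that arithmetic, using the 12.8 ms per frame figure that this page quotes for an RTX 4090. The function names are hypothetical. The figure covers model inference only, so an end-to-end webcam pipeline (capture, face cropping, display) would be expected to run slower.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / target_fps


def max_fps(latency_ms: float) -> float:
    """Upper bound on frame rate if every frame takes latency_ms to produce."""
    return 1000.0 / latency_ms


def fits_realtime(latency_ms: float, target_fps: float) -> bool:
    """True when the per-frame latency fits inside the frame budget."""
    return latency_ms <= frame_budget_ms(target_fps)


if __name__ == "__main__":
    model_ms = 12.8  # per-frame inference figure quoted on this page (RTX 4090)
    print(f"budget at 30 FPS: {frame_budget_ms(30):.1f} ms")
    print(f"upper bound at {model_ms} ms/frame: {max_fps(model_ms):.1f} FPS")
    print("fits a 30 FPS stream:", fits_realtime(model_ms, 30))
```

The same check can be repeated with a measured end-to-end latency to see how much headroom remains for the rest of the pipeline.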

What is the difference between LivePortrait and D-ID or HeyGen?

LivePortrait is open source and can be run locally for free. D-ID and HeyGen are commercial SaaS solutions that typically offer additional features like voice cloning and multi-language support. Because it runs locally, LivePortrait has data privacy and cost advantages.

What hardware is needed to run LivePortrait?

LivePortrait can run on a GPU with at least 4 GB of VRAM. RTX-series cards are recommended for real-time animation. It also runs in CPU mode, but not at real-time speed. ComfyUI and Gradio interfaces are available.

What types of photos give the best results with LivePortrait?

Portrait photos with plain backgrounds, good lighting, and a clearly visible face give the best results. It is recommended that the face is shot from the front or at a slight angle. Accessories like glasses and masks may negatively affect results.

Does LivePortrait raise ethical concerns?

Yes, portrait animation technologies raise deepfake concerns. Using LivePortrait with other people's photos without permission can lead to ethical and legal issues. It is recommended to follow responsible use principles and to indicate that content is AI-generated.

LivePortrait

Open Source

4.5

Kuaishou

LivePortrait is an efficient AI portrait animation model developed by Kuaishou Technology that generates expressive and lifelike facial animations from a single static portrait photograph. The model takes a source portrait image and a driving video containing facial movements, then transfers the expressions, head rotations, eye movements, and mouth gestures from the video onto the portrait while maintaining the original person's identity and appearance. Built on an implicit keypoint detection architecture with warping-based rendering, LivePortrait achieves real-time inference speeds that make it practical for interactive applications and live content creation. The model introduces stitching and retargeting modules that prevent common artifacts in portrait animation such as face boundary distortion, neck disconnection, and unnatural eye movements, producing seamless results that preserve the natural appearance of the subject. LivePortrait handles diverse portrait types including photographs, paintings, illustrations, and even cartoon characters, adapting its animation approach to different artistic styles. The model supports fine-grained control over individual facial action units, allowing selective animation of specific facial features like eyebrow raises, eye blinks, or smile intensity independently. Released under the MIT license, LivePortrait is fully open source and has been integrated into ComfyUI and other creative tools. Common applications include creating animated avatars for social media and messaging, producing animated portrait NFTs, generating facial animations for virtual presenters and digital humans, creating engaging content from historical photographs, and building interactive portrait experiences for museums and exhibitions.

Face Generation

Image to Video

Key Highlights

Real-Time Portrait Animation

Creates real-time portrait animation from a single photo, enabling live expression transfer.

Expression and Facial Transfer

Creates live animation by naturally transferring facial expressions from source video to target portrait.

Seamless Stitching

Produces natural results by seamlessly blending the animated face region with the original image.

Eye and Lip Tracking

Provides realistic animation by precisely tracking eye movements and lip synchronization.

About

LivePortrait is an AI model developed by Kuaishou Technology in 2024 that generates lively and expressive animations from a single portrait photograph. The user provides a source portrait and a driving video; the model transfers the facial movements and expressions from the video onto the portrait to produce a natural-looking animation. Compared with earlier portrait animation models, LivePortrait offers faster inference and higher-quality results, which makes real-time interactive applications practical.

From a technical architecture perspective, LivePortrait employs an implicit keypoint-based approach. The model first extracts canonical keypoints from the source portrait and estimates transformation parameters (rotation, translation, expression deformation) for both the source portrait and each frame of the driving video. A warping module then applies spatial deformations to the source portrait's feature maps using these parameters to transfer the motion. Finally, a decoder network synthesizes the animated frame from the deformed feature maps. The lightweight architecture takes 256x256 inputs and delivers real-time performance, with approximately 12.8 ms inference time per frame on an RTX 4090 GPU.
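The keypoint step can be illustrated with a small sketch. This page does not give the formula, so the code assumes the form commonly used for implicit-keypoint models, x = s * (x_c R + delta) + t, where x_c are canonical keypoints, R is a head rotation, delta is an expression offset, s is a scale, and t is a translation. All function names and values are hypothetical, and the real model predicts these quantities with neural networks rather than taking them as hand-set numbers.

```python
import math
from typing import List, Sequence

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]


def yaw_rotation(degrees: float) -> List[List[float]]:
    """3x3 rotation about the vertical (y) axis, for row-vector points."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def transform_keypoints(
    canonical: Sequence[Vec3],
    rotation: Mat3,
    expression: Sequence[Vec3],
    scale: float,
    translation: Vec3,
) -> List[List[float]]:
    """Apply x = scale * (x_c @ R + delta) + t to every keypoint."""
    out = []
    for kp, delta in zip(canonical, expression):
        # Row vector times matrix: rotate the canonical keypoint.
        rotated = [sum(kp[i] * rotation[i][j] for i in range(3)) for j in range(3)]
        out.append([scale * (rotated[j] + delta[j]) + translation[j] for j in range(3)])
    return out


if __name__ == "__main__":
    canonical = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # toy canonical keypoints
    neutral = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    raised = [[0.0, 0.1, 0.0], [0.0, 0.0, 0.0]]      # toy expression offset
    # Source pose: frontal, neutral expression.
    source_kp = transform_keypoints(canonical, yaw_rotation(0), neutral, 1.0, [0.0, 0.0, 0.0])
    # Driving pose: same canonical keypoints, but the driving frame's motion.
    driving_kp = transform_keypoints(canonical, yaw_rotation(20), raised, 1.0, [0.05, 0.0, 0.0])
    for s_kp, d_kp in zip(source_kp, driving_kp):
        print([round(v, 3) for v in s_kp], "->", [round(v, 3) for v in d_kp])
```

Under this assumption, the warping module would use the pairs of source and driving keypoints to decide how to deform the source features. Reusing the source's canonical keypoints for the driving pose is what keeps the identity fixed while the motion changes.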

LivePortrait achieves remarkable results in balancing animation quality with processing speed. It naturally transfers subtle movements including eye blinking, eyebrow motion, mouth opening and closing, head turning, and facial expression changes. The stitching module seamlessly blends the animated face region with the rest of the original image, avoiding the jarring boundaries common in earlier approaches. The model produces successful results across both photographic portraits and illustrations including anime-style images, expanding its practical versatility across different visual styles and artistic mediums.
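The idea of blending an animated region back into the original image can be shown with plain alpha compositing. The sketch below is a generic illustration only, not LivePortrait's learned stitching module: it assumes a soft mask that is 1 inside the face region and fades to 0 at the border, so that no hard seam appears. The function names and the tiny grayscale "images" are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale rows with values in [0, 1]


def soft_mask_row(width: int, left: int, right: int, feather: int) -> List[float]:
    """1 inside columns [left, right], ramping linearly to 0 over `feather` pixels."""
    row = []
    for x in range(width):
        if left <= x <= right:
            row.append(1.0)
        else:
            dist = left - x if x < left else x - right
            row.append(max(0.0, 1.0 - dist / (feather + 1)))
    return row


def blend(original: Image, animated: Image, mask: Image) -> Image:
    """Per-pixel composite: mask=1 keeps the animated pixel, mask=0 the original."""
    return [
        [m * a + (1.0 - m) * o for o, a, m in zip(orow, arow, mrow)]
        for orow, arow, mrow in zip(original, animated, mask)
    ]


if __name__ == "__main__":
    original = [[0.2] * 7]   # one-row stand-in for the untouched photo
    animated = [[0.8] * 7]   # one-row stand-in for the animated face crop
    mask = [soft_mask_row(7, left=3, right=3, feather=2)]
    print([round(v, 2) for v in blend(original, animated, mask)[0]])
```

The feathered edge is the design choice that matters here: pixel values change gradually from the original image to the animated region instead of jumping at a rectangular boundary.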

The applications range from entertainment to professional use cases across multiple industries. Social media content creators use LivePortrait to produce engaging and attention-grabbing animated content, educational platforms bring historical figures to life for immersive learning experiences, game developers generate NPC facial animations, and marketing teams create dynamic advertising content with talking spokespersons. Additional applications include face animation for virtual assistants and chatbot interfaces, lip-sync effects for music videos, and personal gift creation such as animating old family photographs to create memorable keepsakes.

LivePortrait is published as open source under the MIT license on GitHub. The PyTorch-based implementation includes pretrained weights and a Gradio-based web interface for immediate use. ComfyUI integration is provided through community-developed custom nodes. The model can be accelerated on NVIDIA GPUs using ONNX Runtime and TensorRT optimizations for production deployments. An interactive demo is available on Hugging Face Spaces. A minimum of 4 GB VRAM is sufficient for basic operation, though 8 GB or more is recommended for optimal performance and higher resolution processing.
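The VRAM guidance above can be written down as a simple decision rule. The sketch below is a hypothetical helper, not part of LivePortrait: the thresholds (4 GB minimum, 8 GB recommended) come from this page, while the function name and the returned labels are assumptions made for illustration.

```python
from typing import Optional

MIN_VRAM_GB = 4.0          # minimum stated on this page for basic operation
RECOMMENDED_VRAM_GB = 8.0  # recommended on this page for best performance


def recommend_setup(vram_gb: Optional[float]) -> str:
    """Map available GPU memory (None = no GPU) to a rough setup label."""
    if vram_gb is None or vram_gb < MIN_VRAM_GB:
        return "cpu"              # works, but not at real-time speed
    if vram_gb < RECOMMENDED_VRAM_GB:
        return "gpu-basic"        # meets the stated minimum
    return "gpu-recommended"      # meets the stated recommendation


if __name__ == "__main__":
    for vram in (None, 2, 4, 6, 8, 24):
        print(vram, "->", recommend_setup(vram))
```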

In the portrait animation domain, LivePortrait offers a strong balance between speed and quality. Building upon predecessors such as First Order Motion Model and Face-vid2vid, the model enables interactive applications through its real-time inference capability. The seamless blending provided by the stitching module and the precise retargeting controls make LivePortrait an accessible and practical portrait animation solution for real-world applications. Through this project, Kuaishou helps move portrait animation from research results toward mobile and real-time consumer applications.

Use Cases

Virtual Presentation and Training

Transforming static photos into talking animations for educational videos and presentations.

Social Media Content Creation

Creating portrait animations for entertaining and viral social media content.

Digital Avatar and VTuber

Real-time facial expression transfer for VTuber and digital avatar applications.

Historical and Artistic Revival

Reviving old photographs and historical portraits for museum and educational use.

Pros & Cons

Pros

Live portrait animation from a single photo — 12.8ms/frame on RTX 4090
Open-source project developed by Kuaishou
Trained on 69 million high-quality frames
Precise facial animation with eye and lip retargeting control
Solution adopted by major platforms like Kuaishou, Douyin, WeChat Channels

Cons

Requires driving video — no fully text-based control
Artifacts at profile angles and extreme movements
Additional fine-tuning needed for animal animation
Background and body animation not supported — face only

Technical Details

Parameters

Unknown

Architecture

Implicit Keypoints + Warping

Training Data

VoxCeleb + proprietary

License

MIT

Features

Portrait animation
Expression transfer
Stitching
Retargeting
Real-time capable
Eye tracking
Lip sync

Benchmark Results

Metric	Value	Compared To	Source
Expression Transfer Accuracy (AKD)	1.47	Face Vid2Vid: 2.12 (lower is better)	LivePortrait Paper (arXiv:2407.03168)
Identity Preservation (CSIM)	0.79	DaGAN: 0.72	LivePortrait Paper
Processing Speed	~30 FPS (RTX 4090)	SadTalker: ~15 FPS	GitHub Repository
Resolution	512×512 (native)	N/A	Hugging Face Model Card
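For readers unfamiliar with the two quality metrics in the table, the sketch below shows how they are usually defined: AKD as the mean Euclidean distance between corresponding facial keypoints of a generated frame and a reference frame (lower is better), and CSIM as the cosine similarity between identity embeddings (higher is better). This is a minimal illustration under those standard definitions. In practice the keypoints come from a landmark detector and the embeddings from a face recognition network, both of which are omitted here, and the exact evaluation protocol behind the table values is not described on this page.

```python
import math
from typing import Sequence


def average_keypoint_distance(
    pred: Sequence[Sequence[float]], target: Sequence[Sequence[float]]
) -> float:
    """Mean Euclidean distance between corresponding keypoints (AKD)."""
    if not pred or len(pred) != len(target):
        raise ValueError("keypoint lists must be non-empty and of equal length")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors (used for CSIM)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    generated = [(0.0, 0.0), (1.0, 1.0)]   # toy 2D keypoints
    reference = [(3.0, 4.0), (1.0, 1.0)]
    print("AKD:", average_keypoint_distance(generated, reference))
    print("CSIM:", cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```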

Available Platforms

GitHub

ComfyUI

Replicate

Frequently Asked Questions

Related Models

This Person Does Not Exist

Philip Wang|N/A

This Person Does Not Exist is a web-based demonstration created by Uber software engineer Philip Wang that generates photorealistic portraits of entirely fictional people using NVIDIA's StyleGAN technology. Launched in February 2019, the website became a viral sensation by producing a new AI-generated human face each time the page is refreshed, showcasing the capability of generative adversarial networks to synthesize convincing portraits indistinguishable from real photographs. The underlying model was trained on the FFHQ dataset containing 70,000 high-resolution photographs of real human faces, learning to generate novel facial compositions with realistic skin textures, hair patterns, lighting, eye reflections, and natural asymmetries. The generated faces span diverse demographics including various ages, ethnicities, and genders, demonstrating the model's understanding of facial diversity. While outputs are convincing at first glance, careful examination occasionally reveals telltale artifacts such as asymmetric earrings, distorted backgrounds, or inconsistencies in hair at image edges. The project serves multiple purposes beyond demonstration: it has been widely used in discussions about deepfake technology and media literacy, serves as a privacy-preserving source of placeholder portraits for design mockups and UI prototyping, and provides stock-photo-like imagery without licensing concerns. The website itself is proprietary, though the underlying StyleGAN architecture is open source. This Person Does Not Exist remains one of the most recognized public demonstrations of GAN capabilities and continues to spark conversations about AI-generated media authenticity and digital trust in an era of increasingly sophisticated synthetic content.

Proprietary

4.3

StyleGAN3

NVIDIA|N/A

StyleGAN3 is the third generation of NVIDIA's StyleGAN series of generative adversarial networks, designed to produce high-quality, photorealistic images with fine-grained control over visual attributes. Presented at NeurIPS 2021, StyleGAN3 addresses a fundamental limitation of its predecessors by eliminating texture sticking artifacts that occurred during continuous transformations and animations. Previous GAN architectures suffered from features that appeared fixed to pixel coordinates rather than moving naturally with objects, creating noticeable visual glitches during interpolation. StyleGAN3 solves this through alias-free generation using continuous signal processing principles, ensuring that fine details move smoothly and naturally with the underlying content. The architecture introduces rotation and translation equivariance, meaning generated features transform correctly and consistently when the image undergoes geometric transformations. This makes StyleGAN3 particularly suited for video generation, animation, and any application requiring smooth transitions between generated frames. The model supports configurable output resolutions and maintains the style mixing capabilities from earlier versions, allowing granular control over coarse features like pose and face shape independently from fine details like hair texture and skin quality. StyleGAN3 has been trained on various domains including human faces (FFHQ dataset), animal faces (AFHQv2), and other image categories. The official PyTorch implementation is available on GitHub under the NVIDIA Source Code License, which limits use to research and other non-commercial purposes. It continues to serve as a benchmark reference for unconditional image generation quality and has influenced numerous subsequent GAN architectures in the generative AI landscape.

Open Source

4.5

ProGAN

NVIDIA|N/A

ProGAN (Progressive Growing of GANs) is a generative adversarial network architecture developed by NVIDIA researchers Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen, introduced in 2017, that pioneered progressively growing both generator and discriminator networks during training to produce high-resolution face images. Instead of training at the target resolution directly, ProGAN starts at 4x4 pixels and incrementally adds layers handling progressively higher resolutions, smoothly fading in each detail level. This progressive strategy stabilizes training by learning large-scale structure before fine details, reduces training time compared to full-resolution training from scratch, and enables much higher resolution output than previously possible with GANs. ProGAN was the first GAN to convincingly generate 1024x1024 photorealistic face images, a milestone that captured widespread attention. The model was trained on CelebA-HQ, a high-quality celebrity faces dataset curated for this research. Beyond faces, ProGAN successfully generated high-resolution images of bedrooms, cars, and other categories, demonstrating versatility. The architecture introduced minibatch standard deviation for output diversity and equalized learning rate for training stability. ProGAN is fully open source with official TensorFlow implementations and community PyTorch ports. While subsequent architectures like StyleGAN built upon ProGAN's progressive training foundation to achieve higher quality and controllability, ProGAN remains a landmark contribution that changed how high-resolution GANs are trained and inspired an entire generation of improved generative models.

Open Source

4.0

DCGAN Face

Radford et al.|N/A

DCGAN (Deep Convolutional Generative Adversarial Network) Face is a pioneering architecture introduced by Alec Radford, Luke Metz, and Soumith Chintala in their influential 2015 paper that established foundational principles for using convolutional neural networks in GAN architectures. DCGAN was among the first models to demonstrate that deep convolutional networks could reliably generate coherent images, particularly human faces, moving GANs beyond simple fully-connected architectures into practical image generation. The architecture introduces key design guidelines that became standard practice: replacing pooling layers with strided convolutions in the discriminator and fractional-strided convolutions in the generator, using batch normalization to stabilize training, removing fully connected hidden layers, and applying ReLU activation in the generator with LeakyReLU in the discriminator. Trained on the CelebA celebrity faces dataset, DCGAN Face produces 64x64 pixel facial images that, while modest by modern standards, were groundbreaking at publication. The model also demonstrated meaningful latent space arithmetic, showing that vector operations produce semantically meaningful results such as combining features from different faces. This work has become one of the most cited papers in GAN literature and remains essential reading in deep learning education. DCGAN is fully open source with implementations in PyTorch, TensorFlow, and other frameworks. While surpassed in quality by ProGAN, StyleGAN, and diffusion models, DCGAN remains historically significant as the architecture that proved convolutional GANs were viable for image generation and established design patterns still used in modern generative models.

Open Source

3.5

Quick Info

Parameters: Unknown

Type: Implicit Keypoints

License: MIT

Released: 2024-07

Architecture: Implicit Keypoints + Warping

Rating: 4.5 / 5

Creator: Kuaishou
