What types of face degradation can GFPGAN fix?

GFPGAN handles a wide range of face degradation types including blur from motion or focus issues, JPEG compression artifacts, low resolution and pixelation, noise from high ISO photography, color degradation, and general quality loss from multiple re-encoding cycles. It excels at restoring fine details like individual eyelashes, skin pores, teeth definition, and hair strands that are lost in degraded images.
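
The degradation types listed above are commonly modelled as a chain of operations applied to a clean face image. As far as I recall from the GFPGAN paper, its training pairs are synthesised from high-quality faces with a model of roughly the following form. The symbols are notation introduced here for illustration and are not used elsewhere on this page.

```latex
x = \Big[ \, (y \circledast k_{\sigma}) \downarrow_{r} \; + \; n_{\delta} \, \Big]_{\mathrm{JPEG}_{q}}
```

Here $y$ is the clean image, $k_{\sigma}$ a Gaussian blur kernel, $\downarrow_{r}$ downsampling by a factor $r$, $n_{\delta}$ additive Gaussian noise, and $\mathrm{JPEG}_{q}$ compression at quality $q$; $x$ is the degraded result. Real photographs combine these effects in unknown proportions, which is why the task is described as blind restoration.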

How does GFPGAN preserve identity?

GFPGAN uses a channel-split spatial feature transform architecture that balances realistic facial detail against identity fidelity. The model draws on both the degraded input and StyleGAN2 generative priors: the spatial transform layers preserve identity-specific features such as facial structure, nose shape, and eye characteristics while texture and detail quality are enhanced.
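
The channel-split idea can be sketched in a few lines. The version below is a simplified illustration in plain Python, not the project's implementation: the function name `cs_sft` and the list-of-lists layout are assumptions made here, and in the real model the scale and shift maps are predicted by convolution layers from features of the degraded input. Half of the generative-prior channels pass through untouched, and the other half are modulated by a per-position scale and shift.

```python
def cs_sft(features, scale, shift):
    """Channel-split spatial feature transform (simplified sketch).

    features: list of C channels, each a flat list of spatial values
              (standing in for the StyleGAN2 feature map).
    scale, shift: lists of C // 2 channels with the same spatial size,
              standing in for the maps predicted from the degraded input.
    """
    half = len(features) // 2
    if len(scale) != len(features) - half or len(shift) != len(features) - half:
        raise ValueError("scale and shift must match the modulated half")

    # First half: identity path, keeps the generative prior's detail as-is.
    identity_part = [list(channel) for channel in features[:half]]

    # Second half: element-wise scale and shift, steering features toward the input.
    modulated_part = [
        [a * f + b for f, a, b in zip(channel, scale_ch, shift_ch)]
        for channel, scale_ch, shift_ch in zip(features[half:], scale, shift)
    ]

    # Concatenate along the channel axis.
    return identity_part + modulated_part


if __name__ == "__main__":
    feats = [[1.0, 2.0], [3.0, 4.0]]  # 2 channels, 2 spatial positions
    print(cs_sft(feats, scale=[[2.0, 0.5]], shift=[[1.0, -1.0]]))
```

Modulating only part of the channels is what gives the trade-off described above: the untouched channels keep the prior's realism, while the modulated channels carry information tied to the specific input face.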

What is the difference between GFPGAN and CodeFormer?

Both are face restoration models, but they use different approaches. GFPGAN uses GAN-based generative priors from StyleGAN2 for restoration, while CodeFormer uses a discrete codebook and transformer architecture. GFPGAN tends to produce sharper, more detailed results but may occasionally generate artifacts. CodeFormer often produces more natural-looking results with better identity preservation but slightly softer details.

Can GFPGAN process video?

GFPGAN itself processes individual images, but it can be applied to video frame-by-frame through various pipeline tools. Real-ESRGAN includes GFPGAN integration for video processing. The single forward pass design makes GFPGAN fast enough for practical video enhancement, typically processing each face in 30-100ms on a modern GPU, enabling near-real-time face restoration for standard video resolutions.
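
To relate the 30-100ms per-face figure to video frame rates, the arithmetic can be sketched as follows. This is a back-of-the-envelope estimate only: the function name is hypothetical, faces are assumed to be processed one after another, and face detection, video decoding, and encoding overhead are ignored.

```python
def restoration_fps(ms_per_face: float, faces_per_frame: int = 1) -> float:
    """Upper bound on frames per second when face restoration dominates the cost."""
    if ms_per_face <= 0 or faces_per_frame < 1:
        raise ValueError("ms_per_face must be positive and faces_per_frame at least 1")
    return 1000.0 / (ms_per_face * faces_per_frame)


if __name__ == "__main__":
    for latency_ms in (30, 100):
        for faces in (1, 2):
            fps = restoration_fps(latency_ms, faces)
            print(f"{latency_ms} ms/face, {faces} face(s) per frame: {fps:.1f} fps")
```

Under these assumptions, one face per frame gives roughly 10 to 33 frames per second, and the rate drops in proportion to the number of faces in each frame.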

What hardware is needed for GFPGAN?

GFPGAN is very lightweight and efficient. It can run on GPUs with as little as 2GB VRAM for single face processing. CPU inference is also possible though slower. The model processes a single face in approximately 30-100ms on a modern NVIDIA GPU. For batch processing or video applications, 4-6GB VRAM provides comfortable performance. The model weights are approximately 330MB in size.

Is GFPGAN open source?

Yes. GFPGAN was developed by Tencent ARC Lab and is open source under the Apache 2.0 license. The model weights, training code, and inference pipeline are all available on GitHub. GFPGAN is also integrated into the Real-ESRGAN project for combined face and background restoration. The permissive license allows use in both research and commercial applications, subject to its attribution and notice requirements.
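
As a rough illustration of what using the published inference pipeline looks like, here is a minimal sketch built on the `gfpgan` Python package. It assumes the package, OpenCV, and a downloaded weights file are available; the file paths and both helper function names are hypothetical, and the `GFPGANer` arguments follow the repository's inference script as I remember it, so check them against the installed version.

```python
def restorer_options(model_path, upscale=2):
    """Keyword arguments for GFPGANer (values assumed from the repo's inference script)."""
    return {
        "model_path": model_path,   # e.g. a downloaded GFPGANv1.4.pth (path is hypothetical)
        "upscale": upscale,         # final upsampling factor of the whole image
        "arch": "clean",            # architecture variant used by the v1.3/v1.4 weights
        "channel_multiplier": 2,
        "bg_upsampler": None,       # no background upsampling; faces only
    }


def restore_image(input_path, output_path, model_path):
    """Restore every detected face in one image and write the result to disk."""
    # Imported lazily so this module still loads where the packages are not installed.
    import cv2
    from gfpgan import GFPGANer

    restorer = GFPGANer(**restorer_options(model_path))
    image = cv2.imread(input_path, cv2.IMREAD_COLOR)
    if image is None:
        raise FileNotFoundError(input_path)

    # enhance() returns the cropped faces, the restored faces, and the full
    # image with the restored faces pasted back in.
    _cropped, _restored_faces, restored_image = restorer.enhance(
        image, has_aligned=False, only_center_face=False, paste_back=True
    )
    cv2.imwrite(output_path, restored_image)
    return restored_image
```

A call such as `restore_image("old_photo.jpg", "restored.jpg", "weights/GFPGANv1.4.pth")` would then process a single file; applying the same function to extracted frames is one way to build a frame-by-frame video pipeline.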

GFPGAN

Open Source

4.5

Tencent ARC

GFPGAN is a practical face restoration algorithm developed by Tencent ARC that uses the generative facial priors embedded in a pre-trained StyleGAN2 model to restore severely degraded face images. First released in December 2021, GFPGAN addresses blind face restoration, where input images suffer from unknown combinations of low resolution, blur, noise, compression artifacts, and other forms of degradation. The architecture combines a degradation removal module with a StyleGAN2-based generative prior, connected by channel-split spatial feature transform layers that balance fidelity to the original face against the high-quality facial details supplied by the generative model. This lets GFPGAN reconstruct fine facial details, including skin texture, eye clarity, hair strands, and tooth definition, that are no longer visible in the degraded input. A U-Net encoder extracts multi-resolution features from the degraded image, and these features modulate the StyleGAN2 decoder's feature maps to produce an output that preserves the original identity while substantially improving quality. Typical applications include restoring old photos, enhancing low-resolution surveillance footage, improving video call quality, recovering damaged family photographs, and preparing low-quality source material for professional use. The model is open source under Apache 2.0, available on Hugging Face and Replicate, and is integrated into numerous creative AI tools and pipelines. Its ability to handle real-world degradation patterns rather than just synthetic corruption makes it particularly valuable for the practical restoration tasks faced by photographers, archivists, and content creators.

Image to Image

Key Highlights

StyleGAN2 Prior Integration

Generates realistic skin texture, sharp eyes, and natural teeth by leveraging pretrained StyleGAN2 features as generative priors.

Single-Pass Fast Processing

Operates in a single forward pass, which makes face restoration much faster than iterative methods and suitable for real-time workflows.

Identity-Preserving Restoration

Prioritizes preserving the person's identity features and recognizability while enhancing facial details and restoring quality.

Universal Tool Integration

Integrates with widely used AI image tools, including ComfyUI, Automatic1111, Fooocus, and Real-ESRGAN.

About

GFPGAN (Generative Facial Prior GAN) is a pioneering open-source face restoration model developed by Tencent ARC Lab, first released in 2021 and continuously refined through 2023. The model specializes in restoring severely degraded facial images by correcting blur, noise, JPEG compression artifacts, low resolution, and general degradation while faithfully preserving the subject's identity and producing photorealistic facial details. With over 35,000 GitHub stars, GFPGAN has established itself as one of the most widely adopted face restoration tools in the AI art, photography, and image processing communities worldwide.

The architecture employs a novel approach that leverages pretrained face GAN priors derived from StyleGAN2 weights, combined with channel-split spatial feature transform (CS-SFT) layers. This innovative design enables the model to extract high-level texture and detail information from GAN priors while preserving low-level geometric structure. The U-Net backbone in the encoder-decoder architecture generates multi-scale feature maps that are fused with StyleGAN2 features through CS-SFT layers. Operating in a single forward pass, GFPGAN achieves remarkable speed compared to iterative restoration methods, producing results in well under a second on an average GPU.

In terms of performance benchmarks, GFPGAN consistently achieves strong scores across standard metrics including PSNR, SSIM, and LPIPS. The v1.3 and v1.4 releases brought significant improvements in eye restoration clarity, teeth sharpness, and hair strand detail reconstruction. In the blind face restoration category, GFPGAN demonstrates superior results compared to prior methods such as DFDNet and PSFRGAN in both quantitative metrics and user studies. The model processes face crops at 512x512 resolution while accepting input images at any resolution.
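
Of the metrics named above, PSNR is the simplest to compute directly, which makes it a useful reference point for reading benchmark tables. The sketch below is a plain-Python illustration over flat lists of pixel values; the function name and the 8-bit default for `max_value` are choices made here, and it is not the evaluation code used for GFPGAN. SSIM and LPIPS need considerably more machinery and are not shown.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and the same length")

    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [100, 100, 100, 100]
    restored = [110, 90, 100, 100]
    print(f"PSNR: {psnr(ground_truth, restored):.2f} dB")
```

Higher PSNR means the restored image is numerically closer to the ground truth. Because it compares pixels one by one, a restoration that looks sharp and plausible can still score lower than a blurry one, which is why perceptual metrics such as LPIPS are reported alongside it.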

The use cases for GFPGAN span a remarkably wide range of applications. It serves as a critical post-processing step in AI image generation workflows, particularly for enhancing facial quality in Stable Diffusion outputs, correcting artifacts after face swapping operations, and restoring old or damaged photographs. Professional photographers use it for improving portraits captured in low-light conditions, forensic imaging specialists leverage it for enhancing low-quality surveillance footage, and family archivists employ it to revitalize decades-old photographs with degraded facial details.

GFPGAN is fully open source under the Apache 2.0 license, making it suitable for both personal and commercial use. Model weights and source code are freely accessible on GitHub. It integrates seamlessly with virtually every major AI art tool including ComfyUI, Automatic1111 WebUI, Fooocus, FaceSwap, and numerous standalone applications. Its direct integration into the Real-ESRGAN project enables combined face and background restoration through a single unified pipeline.

Within the AI face restoration landscape, GFPGAN occupies a unique position balancing speed, quality, and accessibility. While alternatives like CodeFormer offer finer control over the fidelity-quality tradeoff, GFPGAN's single-pass speed and extensive integration support make it indispensable for batch processing and real-time workflows. The model's open-source nature and active community support have played a pivotal role in democratizing face restoration technology across the creative and professional imaging industries.

Use Cases

Post-AI Generation Processing

Enhancing face quality in AI-generated images to achieve more realistic results.

Old Photo Restoration

Restoring faces in old, degraded, or low-resolution family photographs.

Post-Face Swap Enhancement

Enhancing face quality after face swapping with ROOP or similar tools.

Video Frame Restoration

Improving video quality by enhancing faces in low-quality video frames.

Pros & Cons

Pros

Effectively restores old, blurry, low-resolution, compressed, or damaged photographs
Can recover fine details like skin texture, facial hair, and even makeup
Runs fast enough for both batch processing and real-time applications
Open-source structure allows community contributions and customizations

Cons

May still struggle with extremely low-quality images with barely recognizable faces
Can generate features not present in the original image in some cases (hallucination)
High-quality restoration requires GPU and can be resource-intensive
Preserving high fidelity to the original face is difficult because the StyleGAN2 prior spans a vast continuous latent space
May not always match the quality of results produced by GPEN or CodeFormer

Technical Details

Parameters

N/A

Architecture

GAN (StyleGAN2-based)

Training Data

FFHQ (Flickr-Faces-HQ) dataset

License

Apache 2.0

Features

Face Restoration from Degraded Images
Identity-Preserving Enhancement
StyleGAN2 Prior Integration
Single Forward Pass Speed
Eye and Teeth Detail Restoration
Skin Texture Generation
Multi-Resolution Input Support
Real-ESRGAN Integration

Benchmark Results

Metric	Value	Compared To	Source
Face Restoration Quality (FID)	49.51 (CelebA-Test)	DFDNet: 52.58	GFPGAN Paper (CVPR 2021)
LPIPS Score	0.3672	PSFRGAN: 0.4028	GFPGAN Paper (CVPR 2021)
Inference Time	~80ms (GPU), ~1.5s (CPU)	CodeFormer: ~120ms (GPU)	GFPGAN GitHub Benchmarks
Parameter Count	~60M	CodeFormer: ~75M	Tencent ARC / GFPGAN GitHub

Available Platforms

Hugging Face

Replicate

News & References

GFPGAN widely integrated into many AI pipelines

GitHub · 2024-02

Frequently Asked Questions

Related Models

ControlNet

Lvmin Zhang|1.4B

ControlNet is a conditional control framework for Stable Diffusion models that enables precise structural guidance during image generation through various conditioning inputs such as edge maps, depth maps, human pose skeletons, segmentation masks, and normal maps. Developed by Lvmin Zhang and Maneesh Agrawala at Stanford University, ControlNet adds trainable copy branches to frozen diffusion model encoders, allowing the model to learn spatial conditioning without altering the original model's capabilities. This architecture preserves the base model's generation quality while adding fine-grained control over composition, structure, and spatial layout of generated images. ControlNet supports multiple conditioning types simultaneously, enabling complex multi-condition workflows where users can combine pose, depth, and edge information to guide generation with extraordinary precision. The framework revolutionized professional AI image generation workflows by solving the fundamental challenge of maintaining consistent spatial structures across generated images. It has become an essential tool for professional artists and designers who need precise control over character poses, architectural layouts, product placements, and scene compositions. ControlNet is open-source and available on Hugging Face with pre-trained models for various Stable Diffusion versions including SD 1.5 and SDXL. It integrates seamlessly with ComfyUI and Automatic1111. Concept artists, character designers, architectural visualizers, fashion designers, and animation studios rely on ControlNet for production workflows. Its influence has extended beyond Stable Diffusion, inspiring similar control mechanisms in FLUX.1 and other modern image generation models.

Open Source

4.8

InstantID

InstantX Team|N/A

InstantID is a zero-shot identity-preserving image generation framework developed by InstantX Team that can generate images of a specific person in various styles, poses, and contexts using only a single reference photograph. Unlike traditional face-swapping or personalization methods that require multiple reference images or time-consuming fine-tuning, InstantID achieves accurate identity preservation from just one facial photograph through an innovative architecture combining a face encoder, IP-Adapter, and ControlNet for facial landmark guidance. The system extracts detailed facial identity features from the reference image and injects them into the generation process, ensuring that the generated person maintains recognizable facial features, proportions, and characteristics across diverse output scenarios. InstantID supports various creative applications including generating portraits in different artistic styles, placing the person in imagined scenes or contexts, creating profile pictures and avatars, and producing marketing materials featuring consistent character representations. The model works with Stable Diffusion XL as its base and is open-source, available on GitHub and Hugging Face for local deployment. It integrates with ComfyUI through community-developed nodes and can be accessed through cloud APIs. Portrait photographers, social media content creators, marketing teams creating personalized campaigns, game developers designing character variants, and digital artists exploring identity-based creative work all use InstantID. The framework has influenced subsequent identity-preservation models and remains one of the most effective solutions for single-image identity transfer in the open-source ecosystem.

Open Source

4.7

IP-Adapter

Tencent|22M

IP-Adapter is an image prompt adapter developed by Tencent AI Lab that enables image-guided generation for text-to-image diffusion models without requiring any fine-tuning of the base model. The adapter works by extracting visual features from reference images using a CLIP image encoder and injecting these features into the diffusion model's cross-attention layers through a decoupled attention mechanism. This allows users to provide reference images as visual prompts alongside text prompts, guiding the generation process to produce images that share stylistic elements, compositional features, or visual characteristics with the reference while still following the text description. IP-Adapter supports multiple modes of operation including style transfer, where the generated image adopts the artistic style of the reference, and content transfer, where specific subjects or elements from the reference appear in the output. The adapter is lightweight, adding minimal computational overhead to the base model's inference process. It can be combined with other control mechanisms like ControlNet for multi-modal conditioning, enabling sophisticated workflows where pose, style, and content can each be controlled independently. IP-Adapter is open-source and available for various Stable Diffusion versions including SD 1.5 and SDXL. It integrates with ComfyUI and Automatic1111 through community extensions. Digital artists, product designers, brand managers, and content creators who need to maintain visual consistency across generated images or transfer specific aesthetic qualities from reference material particularly benefit from IP-Adapter's capabilities.

Open Source

4.6

IP-Adapter FaceID

Tencent|22M (adapter)

IP-Adapter FaceID is a specialized adapter module developed by Tencent AI Lab that injects facial identity information into the diffusion image generation process, enabling the creation of new images that faithfully preserve a specific person's facial features. Unlike traditional face-swapping approaches, IP-Adapter FaceID extracts face recognition feature vectors from the InsightFace library and feeds them into the diffusion model through cross-attention layers, allowing the model to generate diverse scenes, styles, and compositions while maintaining consistent facial identity. With only approximately 22 million adapter parameters layered on top of existing Stable Diffusion models, FaceID achieves remarkable identity preservation without requiring per-subject fine-tuning or multiple reference images. A single clear face photo is sufficient to generate the person in various artistic styles, different clothing, diverse environments, and novel poses. The adapter supports both SDXL and SD 1.5 base models and can be combined with other ControlNet adapters for additional control over pose, depth, and composition. IP-Adapter FaceID Plus variants incorporate additional CLIP image features alongside face embeddings for improved likeness and detail preservation. Released under the Apache 2.0 license, the model is fully open source and widely integrated into ComfyUI workflows and the Diffusers library. Common applications include personalized avatar creation, custom portrait generation in various artistic styles, character consistency in storytelling and comic creation, personalized marketing content, and social media content creation where maintaining a recognizable likeness across multiple generated images is essential.

Open Source

4.5

Quick Info

Parameters: N/A

Type: GAN

License: Apache 2.0

Released: 2021-12

Architecture: GAN (StyleGAN2-based)

Rating: 4.5 / 5

Creator: Tencent ARC

