
CodeFormer

Open Source
4.6
Tencent ARC

CodeFormer is a state-of-the-art blind face restoration model developed by researchers at Nanyang Technological University in collaboration with Tencent ARC, presented at NeurIPS 2022. The model employs a unique Transformer-based architecture with a discrete codebook lookup mechanism to restore severely degraded facial images with exceptional fidelity. Its most distinguishing feature is an adjustable w parameter ranging from 0.0 to 1.0 that gives users precise control over the balance between identity preservation and restoration quality. Architecturally, CodeFormer consists of three core components: a VQGAN encoder-decoder that learns discrete visual codes from high-quality face datasets, a codebook that stores these learned representations, and a Transformer module that predicts optimal code combinations during restoration. This approach enables the model to produce plausible facial details even under extreme degradation because it draws information from learned priors rather than solely from the corrupted input. In benchmark evaluations on CelebA-HQ and WIDER-Face datasets, CodeFormer achieves superior results across FID, NIQE, and identity similarity metrics compared to previous methods. Practical applications include restoring old family photographs, enhancing faces in AI-generated images, extracting facial details from low-resolution video frames, and professional photo retouching. The model is open source, integrates with popular tools like ComfyUI, AUTOMATIC1111 WebUI, and Fooocus, and offers cloud inference through Replicate API and Hugging Face Spaces demos for accessible experimentation.

Image Upscale

Key Highlights

Controllable Fidelity-Quality Trade-off

Ability to adjust the balance between preserving the original identity and maximizing visual quality according to user needs via the fidelity weight parameter

Discrete Codebook Approach

An innovative restoration method that can generate realistic facial details using a discrete codebook learned from high-quality face images

Severe Degradation Handling

Robust performance that can produce high-quality restoration even on severely degraded faces where other methods fail

Wide Ecosystem Integration

A widely used face restoration solution integrated into the Real-ESRGAN pipeline and other popular image processing tools

About

CodeFormer is a state-of-the-art blind face restoration model developed by researchers at Nanyang Technological University in collaboration with Tencent ARC, presented at the NeurIPS 2022 conference. The model employs a Transformer-based architecture with a discrete codebook lookup mechanism to restore degraded facial images with exceptional fidelity and quality. CodeFormer's most distinguishing feature is its adjustable fidelity weight (w) parameter, which gives users precise control over the balance between identity fidelity and restoration quality, making it an invaluable tool for professional workflows requiring fine-tuned results.

Architecturally, CodeFormer consists of three core components: a VQGAN encoder-decoder, a discrete codebook, and a Transformer module. In the first stage, the VQGAN encoder learns discrete visual codes from a large dataset of high-quality facial images and stores them in a codebook. During restoration, the degraded input image passes through the encoder, and the Transformer module predicts the optimal code combination, retrieving correct facial details from the codebook. This approach enables the model to produce plausible facial details even under severe degradation because it draws information from the learned codebook rather than solely from the corrupted input image.
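The codebook lookup described above can be illustrated with a toy vector-quantization step. This is a minimal sketch, not CodeFormer's implementation: the dimensions are made up for illustration, and in the real model the code indices are predicted by the Transformer from degraded features rather than chosen by nearest-neighbour distance alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy codebook: 8 learned code vectors of dimension 4
# (CodeFormer's actual codebook is far larger).
codebook = rng.normal(size=(8, 4))

def nearest_code_lookup(features, codebook):
    """Replace each feature vector with its nearest codebook entry.

    This mimics plain VQ quantization; CodeFormer instead has its
    Transformer predict the code indices from the degraded features.
    """
    # Pairwise squared distances: shape (n_features, n_codes)
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = d.argmin(axis=1)
    return codebook[indices], indices

features = rng.normal(size=(5, 4))
quantized, idx = nearest_code_lookup(features, codebook)
print(idx.shape, quantized.shape)  # (5,) (5, 4)
```

Because the decoder only ever sees vectors drawn from the codebook, the output is constrained to the space of high-quality face features, which is what makes restoration robust under severe degradation.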

Through its controllable feature transformation module, CodeFormer allows adjustment of the fidelity-quality balance via a w parameter ranging from 0.0 to 1.0. Lower w values produce higher quality and detail at the potential cost of slightly reduced identity fidelity, while higher w values maintain closer adherence to the original face with more conservative restoration. This flexibility enables a single model to address diverse use-case requirements. In benchmark evaluations, CodeFormer achieves superior results on the CelebA-HQ and WIDER-Face datasets across FID, NIQE, and identity similarity metrics, consistently outperforming previous methods.
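The role of the w parameter can be sketched as a blend between encoder features (carrying the input's identity) and codebook-derived features (carrying learned high-quality detail). This is a deliberately simplified stand-in: the actual controllable feature transformation module injects encoder information through learned convolutions, not a plain linear blend.

```python
import numpy as np

def blend_features(encoder_feat, codebook_feat, w):
    """Toy stand-in for CodeFormer's controllable feature transformation.

    w = 1.0 -> rely fully on encoder features from the input (max fidelity)
    w = 0.0 -> rely fully on codebook-derived features (max quality)
    The real module uses learned convolutions rather than this linear mix.
    """
    return w * encoder_feat + (1.0 - w) * codebook_feat

enc = np.full((2, 2), 1.0)   # features carrying the input identity
code = np.full((2, 2), 5.0)  # features reconstructed from the codebook
print(blend_features(enc, code, 1.0))  # identical to enc
print(blend_features(enc, code, 0.0))  # identical to code
```

Intermediate values of w interpolate between the two extremes, which is why a single trained model can serve both conservative and aggressive restoration needs.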

The practical applications of CodeFormer span numerous domains. It excels at restoring old and damaged family photographs, enhancing facial quality in AI-generated image outputs, extracting facial details from low-resolution video frames, and supporting professional photo retouching workflows. It is particularly preferred for challenging restoration tasks where severe degradation is present, as it produces more reliable results than GFPGAN in extreme cases. For video restoration, it can be applied frame by frame to enhance facial quality in vintage films and archival footage.

CodeFormer is published as open source on GitHub and is free for research and personal use. The model integrates with popular AI art tools including ComfyUI, Automatic1111 WebUI, and Fooocus. A live demo is available on Hugging Face Spaces, and cloud-based inference is accessible through the Replicate API. Its Python-based installation with pip dependency management and comprehensive documentation makes it accessible to developers across skill levels.

In the face restoration landscape, CodeFormer stands out as the model that best balances controllability with output quality. While it may not match GFPGAN's raw processing speed, it demonstrates superior performance in challenging degradation scenarios and workflows requiring professional-grade control. Its Transformer-based codebook approach has established a new paradigm in face restoration technology, inspiring subsequent research and advancing the field's understanding of how learned priors can be effectively leveraged for image reconstruction tasks.

Use Cases

1

Old Photo Face Restoration

Naturally and realistically restoring degraded faces in old family photographs

2

Video Face Enhancement

Improving video quality by restoring faces in low-quality video frames individually

3

Image Upscaling Support

Using as a complementary component that enhances face region quality in upscaling operations alongside Real-ESRGAN

4

Identity Verification Pre-processing

Improving accuracy of identity verification systems by restoring faces in low-quality security camera images
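The pipeline pattern behind use case 3 — upscale the whole frame, restore only the detected face region, then paste it back — can be sketched with plain arrays. The upscaler and restorer below are toy stand-ins for Real-ESRGAN and CodeFormer, and the face box is assumed to come from a separate face detector.

```python
import numpy as np

def upscale_2x(img):
    """Stand-in for Real-ESRGAN: nearest-neighbour 2x upscale."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def restore_face(face):
    """Stand-in for CodeFormer: a no-op placeholder here."""
    return face.copy()

def restore_frame(frame, box):
    """Upscale the frame, restore the (upscaled) face crop, paste it back.

    box = (top, left, height, width) in the original frame's coordinates;
    a real pipeline would obtain it from a face detector.
    """
    up = upscale_2x(frame)
    t, l, h, w = (2 * v for v in box)  # map box into upscaled coordinates
    up[t:t + h, l:l + w] = restore_face(up[t:t + h, l:l + w])
    return up

frame = np.zeros((4, 4), dtype=np.uint8)
out = restore_frame(frame, (1, 1, 2, 2))
print(out.shape)  # (8, 8)
```

Applying the same function frame by frame gives the video-enhancement workflow of use case 2.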

Pros & Cons

Pros

  • Handles low-quality inputs effectively; addresses low resolution, noise, compression artifacts, and blur simultaneously
  • Adjustable fidelity weight parameter balances quality enhancement with identity preservation
  • Demonstrates superior performance and robustness over prior state-of-the-art methods on both synthetic and real-world datasets
  • Rich detail and facial expression reconstruction through Codebook Lookup Transformer architecture

Cons

  • Generates rich detail from learned priors, which can occasionally come at the cost of identity fidelity
  • Specialized for face restoration; does not address general image enhancement or manipulation tasks
  • Slower processing at roughly 10 seconds per image versus GFPGAN's average of 6 seconds
  • Lower fidelity weights yield higher visual quality but weaker identity preservation; finding the right balance requires user judgment

Technical Details

Parameters

N/A

Architecture

Transformer with learned discrete codebook (VQGAN-based)

Training Data

FFHQ dataset (70K high-quality face images)

License

S-Lab License 1.0 (non-commercial)

Features

  • Discrete Codebook Face Restoration
  • Adjustable Fidelity Weight Parameter
  • Transformer-Based Global Composition
  • Multi-Degradation Type Support
  • Real-ESRGAN Pipeline Integration
  • NeurIPS 2022 Published Research

Benchmark Results

Metric | Value | Compared To | Source
FID Score (CelebA-HQ) | 52.42 | GFPGAN: 56.82, VQFR: 55.13 | CodeFormer Paper (NeurIPS 2022)
PSNR (CelebA-Test) | 24.89 dB | GFPGAN: 23.84 dB | CodeFormer Paper (NeurIPS 2022)
SSIM (CelebA-Test) | 0.68 | GFPGAN: 0.65 | CodeFormer Paper (NeurIPS 2022)
Fidelity-Quality Trade-off (w parameter) | adjustable 0.0 – 1.0 | — | GitHub sczhou/CodeFormer
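The PSNR figure in the table is derived from mean squared error between the restored image and the ground truth. A minimal reference implementation for 8-bit images, for readers who want to reproduce the metric on their own outputs:

```python
import numpy as np

def psnr(reference, restored, max_val=255.0):
    """Peak signal-to-noise ratio in dB for images with peak value max_val."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.zeros((8, 8), dtype=np.uint8)
noisy = ref.copy()
noisy[0, 0] = 16  # one corrupted pixel
print(round(psnr(ref, noisy), 2))  # 42.11
```

Note that PSNR and SSIM reward pixel-wise fidelity, which is why codebook-based methods are additionally evaluated with perceptual metrics such as FID and NIQE.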

Available Platforms

Hugging Face
Replicate
fal.ai

Related Models


Real-ESRGAN

Tencent ARC|N/A

Real-ESRGAN is an open-source image upscaling and restoration model developed by Xintao Wang and collaborators at Tencent ARC Lab that enhances low-resolution, degraded, or compressed images to high-resolution outputs with remarkable detail recovery. Released in 2021 under the BSD license, Real-ESRGAN builds on the original ESRGAN architecture by introducing a high-order degradation modeling approach that simulates the complex, unpredictable quality loss found in real-world images, including compression artifacts, noise, blur, and downsampling. The model uses a U-Net architecture with Residual-in-Residual Dense Blocks as its generator network, trained with a combination of perceptual loss, GAN loss, and pixel loss to produce sharp, natural-looking upscaled results. Real-ESRGAN supports upscaling factors of 2x, 4x, and higher, and includes specialized model variants for anime and illustration content alongside the general-purpose photographic model. The model handles real-world degradations far better than its predecessor ESRGAN, which was trained only on synthetic degradation patterns. Real-ESRGAN has become one of the most widely deployed AI upscaling solutions, integrated into numerous applications including desktop tools, web services, mobile apps, and professional image editing workflows. The model runs efficiently on both CPU and GPU, with the lighter RealESRGAN-x4plus-anime variant optimized for consumer hardware. As a fully open-source project available on GitHub with pre-trained weights, it serves as the backbone for popular tools like Upscayl and various ComfyUI nodes. Real-ESRGAN is essential for photographers, content creators, game developers, and anyone who needs to enhance image resolution while preserving natural appearance and adding realistic detail.

Open Source
4.7

Topaz Gigapixel AI

Topaz Labs|N/A

Topaz Gigapixel AI is a commercial desktop application for AI-powered image upscaling and enhancement developed by Topaz Labs, positioned as an industry-standard tool for professional photographers, graphic designers, and image processing specialists. Available on Windows and macOS, the software uses a proprietary hybrid neural network architecture that combines multiple AI models to upscale images by up to 600 percent while preserving and even enhancing fine details, textures, and sharpness. Topaz Gigapixel AI includes specialized processing modes for different content types including faces, standard photography, computer graphics, and low-resolution sources, with each mode optimized to produce the best possible results for its target content. The software features intelligent face detection and enhancement that improves facial details during upscaling, producing natural-looking results even from very low-resolution source images. Topaz Gigapixel AI supports batch processing for handling large volumes of images and integrates with Adobe Lightroom and Photoshop as a plugin, fitting seamlessly into professional photography workflows. The application processes images locally on the user's machine using GPU acceleration, ensuring privacy and fast processing without requiring an internet connection. Output quality is widely regarded as among the best available in commercial upscaling software, with particular strength in preserving natural textures and avoiding the artificial smoothing common in many AI upscalers. As a proprietary product with a one-time purchase or subscription model, Topaz Gigapixel AI is particularly valued by professional photographers enlarging prints, real estate photographers enhancing property images, forensic analysts improving evidence imagery, and archivists restoring historical photographs to modern resolution standards.

Proprietary
4.6

Upscayl

Upscayl Team|N/A

Upscayl is a free and open-source desktop application for AI-powered image upscaling, built on top of Real-ESRGAN and other super-resolution models. Developed by Nayam Amarshe and TGS963, Upscayl provides a user-friendly graphical interface that makes advanced AI image upscaling accessible to non-technical users on Windows, macOS, and Linux platforms. The application wraps multiple AI upscaling models in an Electron-based desktop app, allowing users to enhance image resolution with just a few clicks without any command-line knowledge or Python environment setup. Upscayl includes several pre-installed upscaling models optimized for different content types including general photography, digital art, anime, and sharpening, with each model producing different aesthetic characteristics suited to its target content. Users can select upscaling factors of 2x, 3x, or 4x and process individual images or entire folders through batch processing. The application supports common image formats including PNG, JPG, and WebP, and provides options for output format and quality settings. Upscayl also supports custom model loading, allowing users to import additional NCNN-compatible upscaling models from the community. Released under the AGPL-3.0 license, Upscayl is fully open source with its code available on GitHub and has accumulated a large community of users and contributors. The application runs entirely locally with no internet connection required, ensuring privacy for sensitive images. Upscayl is particularly popular among photographers, graphic designers, content creators, and hobbyists who need a simple, free solution for enhancing image quality without subscriptions or cloud processing dependencies.

Open Source
4.5

SUPIR

Tencent ARC|N/A

SUPIR is an advanced AI image restoration and upscaling model developed by Tencent ARC researchers in 2024 that harnesses the generative power of SDXL, a large-scale Stable Diffusion model, for photo-realistic image enhancement. SUPIR stands for Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration in the Wild. The model introduces a degradation-aware encoder that analyzes the specific types of quality loss present in an input image and generates intelligent text prompts to guide the restoration process, effectively telling the diffusion model what kind of content needs to be restored and how. This intelligent prompting approach enables SUPIR to produce remarkably detailed and natural-looking upscaled results that go beyond simple pixel interpolation to generate semantically meaningful detail. The model leverages the vast visual knowledge embedded in SDXL's pre-trained weights to synthesize realistic textures, facial features, text, and fine patterns during upscaling. SUPIR excels particularly at restoring severely degraded images where traditional upscaling methods fail, including old photographs, heavily compressed web images, and low-resolution captures. The model supports high upscaling factors while maintaining coherent content and natural appearance. Released under a research-only license, SUPIR is open source with code and weights available on GitHub. While computationally intensive due to its SDXL backbone, the model produces results that represent the current frontier of AI-powered image restoration quality. SUPIR is particularly valuable for professional photographers restoring archival images, forensic analysts enhancing surveillance footage, and digital artists who need maximum quality from limited source material.

Open Source
4.6

Quick Info

Parameters: N/A
Type: Transformer
License: S-Lab License 1.0 (non-commercial)
Released: 2022-06
Architecture: Transformer with learned discrete codebook (VQGAN-based)
Rating: 4.6 / 5
Creator: Tencent ARC

Tags

codeformer
face
restoration
image-upscale