
CodeFormer

Open Source
4.6
Tencent ARC

CodeFormer is a state-of-the-art blind face restoration model developed by researchers at Nanyang Technological University in collaboration with Tencent ARC, presented at NeurIPS 2022. The model employs a unique Transformer-based architecture with a discrete codebook lookup mechanism to restore severely degraded facial images with exceptional fidelity. Its most distinguishing feature is an adjustable w parameter ranging from 0.0 to 1.0 that gives users precise control over the balance between identity preservation and restoration quality. Architecturally, CodeFormer consists of three core components: a VQGAN encoder-decoder that learns discrete visual codes from high-quality face datasets, a codebook that stores these learned representations, and a Transformer module that predicts optimal code combinations during restoration. This approach enables the model to produce plausible facial details even under extreme degradation because it draws information from learned priors rather than solely from the corrupted input. In benchmark evaluations on CelebA-HQ and WIDER-Face datasets, CodeFormer achieves superior results across FID, NIQE, and identity similarity metrics compared to previous methods. Practical applications include restoring old family photographs, enhancing faces in AI-generated images, extracting facial details from low-resolution video frames, and professional photo retouching. The model is open source, integrates with popular tools like ComfyUI, AUTOMATIC1111 WebUI, and Fooocus, and offers cloud inference through Replicate API and Hugging Face Spaces demos for accessible experimentation.

Image Upscale

Key Highlights

Controllable Fidelity-Quality Trade-off

Ability to adjust the balance between preserving the original identity and maximizing visual quality according to user needs via the fidelity weight parameter

Discrete Codebook Approach

An innovative restoration method that can generate realistic facial details using a discrete codebook learned from high-quality face images

Severe Degradation Handling

Robust performance that can produce high-quality restoration even on severely degraded faces where other methods fail

Wide Ecosystem Integration

A widely used face restoration solution integrated into the Real-ESRGAN pipeline and other popular image processing tools

About

CodeFormer is a state-of-the-art blind face restoration model developed by researchers at Nanyang Technological University in collaboration with Tencent ARC, presented at the NeurIPS 2022 conference. The model employs a Transformer-based architecture with a discrete codebook lookup mechanism to restore degraded facial images with exceptional fidelity and quality. CodeFormer's most distinguishing feature is its adjustable fidelity weight (w) parameter, which gives users precise control over the balance between identity fidelity and restoration quality, making it an invaluable tool for professional workflows requiring fine-tuned results.

Architecturally, CodeFormer consists of three core components: a VQGAN encoder-decoder, a discrete codebook, and a Transformer module. In the first stage, the VQGAN encoder learns discrete visual codes from a large dataset of high-quality facial images and stores them in a codebook. During restoration, the degraded input image passes through the encoder, and the Transformer module predicts the optimal code combination, retrieving correct facial details from the codebook. This approach enables the model to produce plausible facial details even under severe degradation because it draws information from the learned codebook rather than solely from the corrupted input image.
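The codebook lookup described above can be illustrated with a toy vector-quantization step. This is a minimal sketch, not CodeFormer's implementation: the dimensions are made up for illustration, and in the real model the code indices are predicted by the Transformer from degraded features rather than chosen by nearest-neighbour distance alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy codebook: 8 learned code vectors of dimension 4
# (CodeFormer's actual codebook is far larger).
codebook = rng.normal(size=(8, 4))

def nearest_code_lookup(features, codebook):
    """Replace each feature vector with its nearest codebook entry.

    This mimics plain VQ quantization; CodeFormer instead has its
    Transformer predict the code indices from the degraded features.
    """
    # Pairwise squared distances: shape (n_features, n_codes)
    d = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = d.argmin(axis=1)
    return codebook[indices], indices

features = rng.normal(size=(5, 4))
quantized, idx = nearest_code_lookup(features, codebook)
print(idx.shape, quantized.shape)  # (5,) (5, 4)
```

Because the decoder only ever sees vectors drawn from the codebook, the output is constrained to the space of high-quality face features, which is what makes restoration robust under severe degradation.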

Through its controllable feature transformation module, CodeFormer allows adjustment of the fidelity-quality balance via a w parameter ranging from 0.0 to 1.0. Lower w values produce higher quality and detail at the potential cost of slightly reduced identity fidelity, while higher w values maintain closer adherence to the original face with more conservative restoration. This flexibility enables a single model to address diverse use-case requirements. In benchmark evaluations, CodeFormer achieves superior results on the CelebA-HQ and WIDER-Face datasets across FID, NIQE, and identity similarity metrics, consistently outperforming previous methods.
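The role of the w parameter can be sketched as a blend between encoder features (carrying the input's identity) and codebook-derived features (carrying learned high-quality detail). This is a deliberately simplified stand-in: the actual controllable feature transformation module injects encoder information through learned convolutions, not a plain linear blend.

```python
import numpy as np

def blend_features(encoder_feat, codebook_feat, w):
    """Toy stand-in for CodeFormer's controllable feature transformation.

    w = 1.0 -> rely fully on encoder features from the input (max fidelity)
    w = 0.0 -> rely fully on codebook-derived features (max quality)
    The real module uses learned convolutions rather than this linear mix.
    """
    return w * encoder_feat + (1.0 - w) * codebook_feat

enc = np.full((2, 2), 1.0)   # features carrying the input identity
code = np.full((2, 2), 5.0)  # features reconstructed from the codebook
print(blend_features(enc, code, 1.0))  # identical to enc
print(blend_features(enc, code, 0.0))  # identical to code
```

Intermediate values of w interpolate between the two extremes, which is why a single trained model can serve both conservative and aggressive restoration needs.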

The practical applications of CodeFormer span numerous domains. It excels at restoring old and damaged family photographs, enhancing facial quality in AI-generated image outputs, extracting facial details from low-resolution video frames, and supporting professional photo retouching workflows. It is particularly preferred for challenging restoration tasks where severe degradation is present, as it produces more reliable results than GFPGAN in extreme cases. For video restoration, it can be applied frame by frame to enhance facial quality in vintage films and archival footage.

CodeFormer is published as open source on GitHub and is free for research and personal use. The model integrates with popular AI art tools including ComfyUI, Automatic1111 WebUI, and Fooocus. A live demo is available on Hugging Face Spaces, and cloud-based inference is accessible through the Replicate API. Its Python-based installation with pip dependency management and comprehensive documentation makes it accessible to developers across skill levels.

In the face restoration landscape, CodeFormer stands out as the model that best balances controllability with output quality. While it may not match GFPGAN's raw processing speed, it demonstrates superior performance in challenging degradation scenarios and workflows requiring professional-grade control. Its Transformer-based codebook approach has established a new paradigm in face restoration technology, inspiring subsequent research and advancing the field's understanding of how learned priors can be effectively leveraged for image reconstruction tasks.

Use Cases

1

Old Photo Face Restoration

Naturally and realistically restoring degraded faces in old family photographs

2

Video Face Enhancement

Improving video quality by restoring faces in low-quality video frames individually

3

Image Upscaling Support

Using as a complementary component that enhances face region quality in upscaling operations alongside Real-ESRGAN

4

Identity Verification Pre-processing

Improving accuracy of identity verification systems by restoring faces in low-quality security camera images
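The pipeline pattern behind use case 3 — upscale the whole frame, restore only the detected face region, then paste it back — can be sketched with plain arrays. The upscaler and restorer below are toy stand-ins for Real-ESRGAN and CodeFormer, and the face box is assumed to come from a separate face detector.

```python
import numpy as np

def upscale_2x(img):
    """Stand-in for Real-ESRGAN: nearest-neighbour 2x upscale."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def restore_face(face):
    """Stand-in for CodeFormer: a no-op placeholder here."""
    return face.copy()

def restore_frame(frame, box):
    """Upscale the frame, restore the (upscaled) face crop, paste it back.

    box = (top, left, height, width) in the original frame's coordinates;
    a real pipeline would obtain it from a face detector.
    """
    up = upscale_2x(frame)
    t, l, h, w = (2 * v for v in box)  # map box into upscaled coordinates
    up[t:t + h, l:l + w] = restore_face(up[t:t + h, l:l + w])
    return up

frame = np.zeros((4, 4), dtype=np.uint8)
out = restore_frame(frame, (1, 1, 2, 2))
print(out.shape)  # (8, 8)
```

Applying the same function frame by frame gives the video-enhancement workflow of use case 2.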

Pros & Cons

Pros

  • Handles low-quality inputs effectively; addresses low resolution, noise, compression artifacts, and blur simultaneously
  • Adjustable fidelity weight parameter balances quality enhancement with identity preservation
  • Demonstrates superior performance and robustness over prior state-of-the-art methods on both synthetic and real-world datasets
  • Rich detail and facial expression reconstruction through Codebook Lookup Transformer architecture

Cons

  • Generates rich detail from learned priors, which can occasionally come at the cost of identity fidelity
  • Specialized for face restoration; does not address general image enhancement or manipulation tasks
  • Slower processing at roughly 10 seconds per image versus GFPGAN's average of 6 seconds
  • Lower fidelity weights yield higher visual quality but weaker identity preservation; finding the right balance requires user judgment

Technical Details

Parameters

N/A

Architecture

Transformer with learned discrete codebook (VQGAN-based)

Training Data

FFHQ dataset (70K high-quality face images)

License

S-Lab License 1.0 (non-commercial)

Features

  • Discrete Codebook Face Restoration
  • Adjustable Fidelity Weight Parameter
  • Transformer-Based Global Composition
  • Multi-Degradation Type Support
  • Real-ESRGAN Pipeline Integration
  • NeurIPS 2022 Published Research

Benchmark Results

Metric | Value | Compared To | Source
FID Score (CelebA-HQ) | 52.42 | GFPGAN: 56.82, VQFR: 55.13 | CodeFormer Paper (NeurIPS 2022)
PSNR (CelebA-Test) | 24.89 dB | GFPGAN: 23.84 dB | CodeFormer Paper (NeurIPS 2022)
SSIM (CelebA-Test) | 0.68 | GFPGAN: 0.65 | CodeFormer Paper (NeurIPS 2022)
Fidelity-Quality Trade-off (w parameter) | adjustable 0.0 – 1.0 | — | GitHub sczhou/CodeFormer
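The PSNR figure in the table is derived from mean squared error between the restored image and the ground truth. A minimal reference implementation for 8-bit images, for readers who want to reproduce the metric on their own outputs:

```python
import numpy as np

def psnr(reference, restored, max_val=255.0):
    """Peak signal-to-noise ratio in dB for images with peak value max_val."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.zeros((8, 8), dtype=np.uint8)
noisy = ref.copy()
noisy[0, 0] = 16  # one corrupted pixel
print(round(psnr(ref, noisy), 2))  # 42.11
```

Note that PSNR and SSIM reward pixel-wise fidelity, which is why codebook-based methods are additionally evaluated with perceptual metrics such as FID and NIQE.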

Available Platforms

Hugging Face
Replicate
fal.ai

Related Models


Real-ESRGAN

Tencent ARC|N/A

Real-ESRGAN is an open-source image upscaling and restoration model developed by Xintao Wang and collaborators at Tencent ARC Lab that enhances low-resolution, degraded, or compressed images to high-resolution outputs with remarkable detail recovery. Released in 2021 under the BSD license, Real-ESRGAN builds on the original ESRGAN architecture by introducing a high-order degradation modeling approach that simulates the complex, unpredictable quality loss found in real-world images, including compression artifacts, noise, blur, and downsampling. The model uses a U-Net architecture with Residual-in-Residual Dense Blocks as its generator network, trained with a combination of perceptual loss, GAN loss, and pixel loss to produce sharp, natural-looking upscaled results. Real-ESRGAN supports upscaling factors of 2x, 4x, and higher, and includes specialized model variants for anime and illustration content alongside the general-purpose photographic model. The model handles real-world degradations far better than its predecessor ESRGAN, which was trained only on synthetic degradation patterns. Real-ESRGAN has become one of the most widely deployed AI upscaling solutions, integrated into numerous applications including desktop tools, web services, mobile apps, and professional image editing workflows. The model runs efficiently on both CPU and GPU, with the lighter RealESRGAN-x4plus-anime variant optimized for consumer hardware. As a fully open-source project available on GitHub with pre-trained weights, it serves as the backbone for popular tools like Upscayl and various ComfyUI nodes. Real-ESRGAN is essential for photographers, content creators, game developers, and anyone who needs to enhance image resolution while preserving natural appearance and adding realistic detail.

Open Source
4.7

Topaz Gigapixel AI

Topaz Labs|N/A

Topaz Gigapixel AI is a commercial desktop application for AI-powered image upscaling and enhancement developed by Topaz Labs, positioned as an industry-standard tool for professional photographers, graphic designers, and image processing specialists. Available on Windows and macOS, the software uses a proprietary hybrid neural network architecture that combines multiple AI models to upscale images by up to 600 percent while preserving and even enhancing fine details, textures, and sharpness. Topaz Gigapixel AI includes specialized processing modes for different content types including faces, standard photography, computer graphics, and low-resolution sources, with each mode optimized to produce the best possible results for its target content. The software features intelligent face detection and enhancement that improves facial details during upscaling, producing natural-looking results even from very low-resolution source images. Topaz Gigapixel AI supports batch processing for handling large volumes of images and integrates with Adobe Lightroom and Photoshop as a plugin, fitting seamlessly into professional photography workflows. The application processes images locally on the user's machine using GPU acceleration, ensuring privacy and fast processing without requiring an internet connection. Output quality is widely regarded as among the best available in commercial upscaling software, with particular strength in preserving natural textures and avoiding the artificial smoothing common in many AI upscalers. As a proprietary product with a one-time purchase or subscription model, Topaz Gigapixel AI is particularly valued by professional photographers enlarging prints, real estate photographers enhancing property images, forensic analysts improving evidence imagery, and archivists restoring historical photographs to modern resolution standards.

Proprietary
4.6

Upscayl

Upscayl Team|N/A

Upscayl is a free and open-source desktop application for AI-powered image upscaling, built on top of Real-ESRGAN and other super-resolution models. Developed by Nayam Amarshe and TGS963, Upscayl provides a user-friendly graphical interface that makes advanced AI image upscaling accessible to non-technical users on Windows, macOS, and Linux platforms. The application wraps multiple AI upscaling models in an Electron-based desktop app, allowing users to enhance image resolution with just a few clicks without any command-line knowledge or Python environment setup. Upscayl includes several pre-installed upscaling models optimized for different content types including general photography, digital art, anime, and sharpening, with each model producing different aesthetic characteristics suited to its target content. Users can select upscaling factors of 2x, 3x, or 4x and process individual images or entire folders through batch processing. The application supports common image formats including PNG, JPG, and WebP, and provides options for output format and quality settings. Upscayl also supports custom model loading, allowing users to import additional NCNN-compatible upscaling models from the community. Released under the AGPL-3.0 license, Upscayl is fully open source with its code available on GitHub and has accumulated a large community of users and contributors. The application runs entirely locally with no internet connection required, ensuring privacy for sensitive images. Upscayl is particularly popular among photographers, graphic designers, content creators, and hobbyists who need a simple, free solution for enhancing image quality without subscriptions or cloud processing dependencies.

Open Source
4.5

SUPIR

Tencent ARC|N/A

SUPIR is an advanced AI image restoration and upscaling model developed by Tencent ARC researchers in 2024 that harnesses the generative power of SDXL, a large-scale Stable Diffusion model, for photo-realistic image enhancement. SUPIR stands for Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration in the Wild. The model introduces a degradation-aware encoder that analyzes the specific types of quality loss present in an input image and generates intelligent text prompts to guide the restoration process, effectively telling the diffusion model what kind of content needs to be restored and how. This intelligent prompting approach enables SUPIR to produce remarkably detailed and natural-looking upscaled results that go beyond simple pixel interpolation to generate semantically meaningful detail. The model leverages the vast visual knowledge embedded in SDXL's pre-trained weights to synthesize realistic textures, facial features, text, and fine patterns during upscaling. SUPIR excels particularly at restoring severely degraded images where traditional upscaling methods fail, including old photographs, heavily compressed web images, and low-resolution captures. The model supports high upscaling factors while maintaining coherent content and natural appearance. Released under a research-only license, SUPIR is open source with code and weights available on GitHub. While computationally intensive due to its SDXL backbone, the model produces results that represent the current frontier of AI-powered image restoration quality. SUPIR is particularly valuable for professional photographers restoring archival images, forensic analysts enhancing surveillance footage, and digital artists who need maximum quality from limited source material.

Open Source
4.6

Quick Info

Parameters: N/A
Type: Transformer
License: S-Lab License 1.0 (non-commercial)
Released: 2022-06
Architecture: Transformer with learned discrete codebook (VQGAN-based)
Rating: 4.6 / 5
Creator: Tencent ARC

Tags

codeformer
face
restoration
image-upscale