IDM-VTON

Open Source
4.5
Yisol Studio

IDM-VTON (Improving Diffusion Models for Virtual Try-On) is a diffusion-based model developed by Yisol Studio that produces realistic virtual clothing try-on images by combining a person's photograph with a garment image. The model uses a two-stage architecture built on Stable Diffusion, with a specialized garment encoder that captures texture, pattern, fabric drape, and structural elements. Given a person image and a flat-lay or mannequin clothing photo, IDM-VTON generates a photorealistic image of the person wearing the garment while preserving their body shape, skin tone, pose, and background.

The model handles clothing ranging from casual wear to formal attire; accessories and layered outfits are harder cases. With over one billion parameters, IDM-VTON reports state-of-the-art results on standard virtual try-on benchmarks. The garment encoding module preserves fine details such as logos, text, buttons, and stitching patterns that previous models often blurred or lost.

The model is released under the CC BY-NC-SA 4.0 license for research and non-commercial use, and it has drawn interest from fashion technology startups, e-commerce platforms, and creative agencies. Applications include online shopping virtual try-on experiences, fashion design prototyping, social media content creation, and catalog generation without physical photo shoots. The model integrates with popular inference frameworks and can be deployed through cloud APIs for scalable production use.

Virtual Try-On

Key Highlights

Realistic Virtual Try-On Experience

Uses a diffusion-based architecture to render fabric texture, wrinkles, and shadows realistically.

Support for Multiple Garment Types

Offers virtual try-on for tops, bottoms, and dresses, with more limited support for accessories.

Advanced Pose Adaptation

Produces realistic results by naturally adapting clothing to different body poses and angles.

Occlusion Management

Provides accurate clothing simulation even when body parts like arms and hair occlude the garment.

About

IDM-VTON (Improving Diffusion Models for Virtual Try-On) is a diffusion-based model for virtual clothing try-on. It takes a photo of a person and an image of a garment, then generates a realistic image of how the garment would look on that person. The model uses a two-stage architecture consisting of GarmentNet and TryOnNet: the first stage extracts garment features, and the second transfers those features onto the target person so that they match the body structure. This design lets the model optimize garment fidelity and body alignment together.

The model's main strength is preserving both the person's body structure and the garment's texture, pattern, and cut during transfer. Challenging elements such as complex patterns, text, logos, and fine fabric details are carried over with little distortion or loss of detail, and results stay largely consistent across body types and poses. This comes from processing garment features along two paths: high-level semantic information and low-level texture details. The IP-Adapter-based garment encoding keeps fabric texture and color tones faithful to the original garment image, while cross-attention places garment details at the right positions on the target person.
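The cross-attention idea mentioned above can be illustrated with a toy sketch. This is not IDM-VTON's implementation: the function names, the tiny hand-written vectors, and the absence of learned projection matrices are all assumptions made for illustration. It only shows the general mechanism, in which each person-side query takes a softmax-weighted average of garment-side values.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention (hypothetical helper).

    queries come from the person side; keys and values come from the
    garment side. Returns the fused features and the attention weights.
    """
    scale = math.sqrt(len(keys[0]))
    fused, all_weights = [], []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        mixed = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        fused.append(mixed)
        all_weights.append(weights)
    return fused, all_weights

# Toy data (made up): 2 person-side queries, 3 garment tokens, feature size 2.
person_queries = [[1.0, 0.0], [0.0, 1.0]]
garment_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
garment_values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

fused, attn = cross_attention(person_queries, garment_keys, garment_values)
for weights, features in zip(attn, fused):
    print([round(w, 3) for w in weights], [round(f, 3) for f in features])
```

Each row of attention weights sums to one, so every person-side location receives a blend of garment features, weighted toward the garment tokens it matches best.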

On the VITON-HD and DressCode benchmark datasets, IDM-VTON reports better FID (Fréchet Inception Distance, lower is better) and SSIM (Structural Similarity Index, higher is better) scores than previous methods. The gap is most pronounced for complex poses, varied body types, and detailed garment patterns, where earlier methods struggle. The model renders realistic folds, shadows, and fabric drape, producing results that can be hard to tell apart from real photographs. In sitting, bending, or dynamic poses, the garment still follows the body naturally.
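To make the SSIM metric more concrete, here is a minimal sketch of a simplified, single-window SSIM over two flattened grayscale images. It is an assumption-laden illustration, not the evaluation code behind the reported numbers: standard SSIM averages the same formula over local sliding windows, and the function name `global_ssim` and the sample pixel values are hypothetical. FID is not sketched because it requires a pretrained Inception network.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equal-length lists of pixel values.

    Simplification: real SSIM implementations compute this formula over
    local windows and average the results.
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(x)
    # Stabilizing constants from the original SSIM formulation.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Made-up 2x2 "images", flattened to lists.
reference = [0.0, 255.0, 0.0, 255.0]
slightly_off = [10.0, 245.0, 10.0, 245.0]
inverted = [255.0, 0.0, 255.0, 0.0]

print(global_ssim(reference, reference))
print(global_ssim(reference, slightly_off))
print(global_ssim(reference, inverted))
```

An image compared with itself scores 1, a slightly perturbed copy scores just below 1, and an inverted copy scores far lower because the covariance term becomes negative.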

IDM-VTON is especially relevant to e-commerce. Online shopping platforms can show customers how products would look on them, creating a virtual fitting room that narrows the gap between physical and online shopping, which can reduce return rates and improve customer satisfaction. Virtual try-on features have been reported to increase conversion rates by over thirty percent. Fashion designers can visualize new collections before physical production, which speeds up the design process and lowers prototype costs. Fashion editors and style consultants can quickly try different combinations when preparing style recommendations for clients.

The model supports upper-body garments, lower-body garments, and one-piece clothing items. It produces high-quality results, with accurate pattern preservation, across dresses, jackets, t-shirts, pants, and skirts. The garment image background is removed automatically, so no manual preprocessing is needed for a clean transfer. The model adapts to different lighting conditions and backgrounds to keep the composition visually coherent, and it can handle some complex scenarios such as fabric transparency and layering effects.

IDM-VTON is available as open source on Hugging Face, where a Gradio demo makes it quick to evaluate. Its Stable Diffusion-based architecture allows community fine-tuning and customization for domain-specific needs. The source code is on GitHub, and researchers can train the model on their own datasets. The model can be integrated through an API into e-commerce platforms and fashion applications and optimized for large-scale production environments that handle thousands of try-on requests per day. Possible future developments include video-based virtual try-on and 3D garment simulation.

Use Cases

1. E-Commerce Product Images

Creating clothing images on different models for online stores to increase product visual variety.

2. Personalized Shopping

Enabling customers to virtually try on clothes with their own photos to facilitate purchase decisions.

3. Fashion Design Prototyping

Enabling designers to visualize new collections digitally before physical production.

4. Social Media Content Creation

Enabling rapid visual content creation with different outfit combinations for influencers and brands.

Pros & Cons

Pros

  • High-quality, image-based virtual try-on
  • Natural transfer of fabric texture, pattern, and folds
  • Garment deformation and fit that adapt to body pose
  • Open-source research project with a demo on Hugging Face

Cons

  • Difficulty with complex garments (layered, accessorized)
  • Inconsistencies across different body types
  • High GPU requirements
  • Not optimized for real-time use

Technical Details

Parameters

1B+

Architecture

Stable Diffusion + Garment Encoding

Training Data

VITON-HD, DressCode

License

CC BY-NC-SA 4.0

Features

  • Realistic try-on
  • Any garment type
  • Pose handling
  • High resolution
  • Occlusion handling
  • Multiple clothing layers

Benchmark Results

Metric                       Value   Compared To            Source
SSIM (VITON-HD)              0.867   GP-VTON: 0.843         IDM-VTON Paper (arXiv:2403.05139)
FID (VITON-HD)               8.56    StableVITON: 9.23      IDM-VTON Paper (arXiv:2403.05139)
LPIPS (VITON-HD)             0.073   OOTDiffusion: 0.081    Papers With Code
Garment Preservation Score   0.92    n/a                    Hugging Face Model Card
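The comparison column above can be read as relative gains. The short sketch below does that arithmetic using the values exactly as listed in the table; the helper name `relative_improvement` is hypothetical. Each row compares against a different baseline model, so the percentages are not comparable across rows.

```python
def relative_improvement(ours, baseline, higher_is_better):
    """Percentage gain over the baseline, signed so that positive means better."""
    delta = ours - baseline if higher_is_better else baseline - ours
    return 100.0 * delta / baseline

# (metric, listed IDM-VTON value, listed baseline value, higher_is_better)
rows = [
    ("SSIM", 0.867, 0.843, True),    # vs. GP-VTON
    ("FID", 8.56, 9.23, False),      # vs. StableVITON
    ("LPIPS", 0.073, 0.081, False),  # vs. OOTDiffusion
]

gains = {
    name: relative_improvement(ours, base, hib)
    for name, ours, base, hib in rows
}
for name, gain in gains.items():
    print(f"{name}: {gain:.2f}% better than the listed baseline")
```

SSIM is a higher-is-better metric, while FID and LPIPS are lower-is-better, which is why the helper flips the sign of the difference for the last two rows.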

Available Platforms

GitHub
HuggingFace
Replicate

Quick Info

Parameters: 1B+
Type: Diffusion
License: CC BY-NC-SA 4.0
Released: 2024-03
Architecture: Stable Diffusion + Garment Encoding
Rating: 4.5 / 5
Creator: Yisol Studio

Tags

virtual-try-on
fashion
e-commerce
clothing