
One-2-3-45

Open Source
4.0
UC San Diego

One-2-3-45 is a single-image 3D reconstruction system developed by researchers at UC San Diego that generates textured 3D meshes from a single input image through a two-stage pipeline combining multi-view generation with sparse-view 3D reconstruction. The name encodes the process: from one image, generate 2D multi-view images, then reconstruct 3D geometry, all in roughly 45 seconds. In the first stage, a fine-tuned Zero123 model generates multiple novel views of the object from different angles based on the single input photograph. In the second stage, these generated multi-view images are fed into a cost-volume-based sparse-view reconstruction network that produces a textured 3D mesh with consistent geometry. Released in June 2023 under the MIT license, One-2-3-45 was among the first systems to demonstrate that combining 2D diffusion models with 3D reconstruction could produce reasonable 3D assets in under a minute. The model handles a variety of object types, including everyday items, animals, vehicles, and artistic objects. Unlike optimization-based approaches such as DreamFusion, which require per-object optimization taking tens of minutes to hours, One-2-3-45 runs in a feed-forward manner and is therefore significantly faster. The output meshes include color and texture information and can be exported for use in standard 3D applications. As a fully open-source project with code available on GitHub, it has served as an influential reference for subsequent research in single-image 3D generation, and it is particularly useful for researchers and developers exploring rapid 3D content creation from limited input data.

Image to 3D

Key Highlights

Pioneering Multi-View-Then-Reconstruct Paradigm

One of the first systems to demonstrate the two-stage approach of generating multi-view images then performing 3D reconstruction, establishing a pattern now standard in the field

45-Second Complete Pipeline

Completes the entire single-image to textured 3D mesh pipeline in approximately 45 seconds, dramatically faster than optimization-based methods requiring hours

Cost-Volume 3D Reconstruction

Uses cost-volume-based reconstruction that aggregates information from multiple generated views, providing robustness against view inconsistencies

MIT Licensed Academic Research

Fully open-source research from UC San Diego under MIT license with reproducible code, serving as an important reference for the 3D generation research community

About

One-2-3-45 converts a single input image into a textured 3D mesh through a two-stage pipeline that combines multi-view generation with sparse-view 3D reconstruction. The name reflects the process: from one image, generate 2D multi-view images, then reconstruct 3D geometry, achieving this in approximately 45 seconds. As one of the pioneering works demonstrating the viability of the multi-view-then-reconstruct paradigm, it has had a lasting influence on the field.

The system operates in two key stages. First, a view-conditioned 2D diffusion model (based on Zero123) generates multiple novel views of the object from different camera poses. Second, a cost-volume-based sparse-view reconstruction module processes these generated multi-view images to produce a 3D mesh with texture maps. This design splits the difficult problem of single-image 3D generation into two manageable sub-problems, view synthesis and 3D reconstruction, allowing each stage to be developed and improved independently.
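The two-stage flow can be sketched as a pair of function calls. The functions below are trivial placeholders standing in for the Zero123-based view generator and the cost-volume reconstruction network, and the camera poses are illustrative; this is not the project's actual API.

```python
# Illustrative sketch of the two-stage design, NOT One-2-3-45's real code:
# stage 1 turns one image into several posed views, stage 2 fuses them.

def generate_views(image, poses):
    """Stage 1 (placeholder): one generated view per requested camera pose.
    In the real system, a view-conditioned diffusion model does this."""
    return [{"pose": p, "pixels": image} for p in poses]

def reconstruct_mesh(views):
    """Stage 2 (placeholder): fuse posed views into a single textured mesh.
    In the real system, a cost-volume-based network does this."""
    return {"n_source_views": len(views), "textured": True}

def image_to_3d(image):
    # Hypothetical (azimuth, elevation) pairs sampled around the object.
    poses = [(az, el) for az in (0, 90, 180, 270) for el in (-10, 30)]
    views = generate_views(image, poses)   # one image -> many views
    return reconstruct_mesh(views)         # many views -> one mesh

mesh = image_to_3d("photo.png")
print(mesh["n_source_views"])  # 8
```

The point of the structure is visible even in the stub: `reconstruct_mesh` never sees the original photograph, only the posed views, which is what lets the two stages be swapped or improved independently.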

One-2-3-45's contribution lies in demonstrating that combining pre-trained 2D diffusion models with multi-view 3D reconstruction is a viable and efficient approach to single-image 3D generation. The system achieves reasonable 3D reconstruction quality in under a minute, which was a significant improvement over optimization-based methods at the time of publication that could take hours per object. This speed advantage made iterative design cycles practical in 3D content creation workflows and provided researchers with the ability to perform rapid experimental iteration.

The reconstruction module uses a cost-volume approach that aggregates information from multiple generated views to estimate 3D geometry. This approach is more robust to inconsistencies between generated views than methods that rely on single-view depth estimation. The cost volume evaluates different depth hypotheses to determine the most likely 3D geometry, coherently combining information from multiple views throughout this process. The resulting meshes include both geometry and texture information, providing usable 3D assets for visualization and prototyping purposes.
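The depth-hypothesis idea behind a cost volume can be illustrated with a toy plane sweep. This sketch assumes the source views have already been warped onto the reference camera for every depth hypothesis, and it uses raw cross-view variance as the matching cost; the actual system learns this matching with a neural network rather than a hand-written rule.

```python
import numpy as np

# Toy cost-volume depth selection (an illustration, not the paper's network):
# for each pixel and each depth hypothesis, measure how well the warped source
# views agree; low variance across views means the hypothesis is likely right.

def select_depth(warped, depths):
    """warped: (D, V, H, W) array, V source views warped at each of D depth
    hypotheses. Returns the per-pixel depth minimizing cross-view variance."""
    cost = warped.var(axis=1)      # (D, H, W): disagreement across views
    best = cost.argmin(axis=0)     # (H, W): index of the cheapest hypothesis
    return depths[best]            # (H, W): selected depth per pixel

# Tiny synthetic check: all views agree exactly at hypothesis index 1.
np.random.seed(0)
D, V, H, W = 3, 4, 2, 2
warped = np.random.rand(D, V, H, W)
warped[1] = 0.5                    # identical pixel values across views
depths = np.array([0.5, 1.0, 1.5])
print(select_depth(warped, depths))  # every entry is 1.0
```

Because the cost aggregates over all views at once, a single inconsistent view raises the variance only slightly, which is the intuition behind the robustness claimed above.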

Among the system's limitations, inconsistencies between the generated views can degrade the final 3D reconstruction, and they tend to be more pronounced for objects with complex geometry or fine detail. The cost-volume aggregation partially mitigates these inconsistencies, so the system maintains reasonable quality across a variety of input types and object categories.

Released under the MIT license, One-2-3-45 is fully open-source with code and pre-trained weights available on GitHub. While newer models like InstantMesh and TripoSR have since achieved higher quality and faster generation, One-2-3-45 remains historically important as one of the first systems to demonstrate the multi-view-then-reconstruct paradigm that has become standard in the field. The architectural design principles established by the model continue to form the foundation of subsequent research and applications in single-image 3D reconstruction.

Use Cases

1

3D Generation Research Baseline

Serves as a standard comparison baseline for evaluating new single-image 3D reconstruction methods in academic research publications

2

Quick 3D Prototyping

Generate rough 3D models from reference images in under a minute for design prototyping and concept visualization purposes

3

Educational Tool for 3D AI

Learn about multi-view 3D reconstruction concepts through a well-documented, accessible implementation with clear two-stage pipeline design

4

Pipeline Architecture Reference

Use as an architectural reference for building custom 3D generation pipelines that combine 2D diffusion with 3D reconstruction modules

Pros & Cons

Pros

  • Creates 3D model from a single 2D image in 45 seconds
  • Zero-shot approach — no retraining needed for each object
  • Consistent angle synthesis with multi-view diffusion
  • Open-source research project

Cons

  • Mesh quality behind commercial tools
  • Quality loss in fine details and edge areas
  • Difficulty with asymmetric objects
  • Limited texture quality

Technical Details

Parameters

N/A

License

MIT

Features

  • Single Image to 3D
  • Multi-View Generation Stage
  • Sparse-View Reconstruction
  • Zero123 Based Pipeline
  • Open-Source MIT License
  • UC San Diego Research
  • Mesh Output with Textures
  • Academic Reference Implementation

Benchmark Results

Metric           | Value          | Compared To          | Source
Novel View PSNR  | 18.8 dB (GSO)  | Unique3D: 20.1 dB    | arXiv 2306.16928
Generation Time  | ~45 seconds    | Wonder3D: ~3 minutes | GitHub One-2-3-45
SSIM (GSO)       | 0.842          | Unique3D: 0.922      | arXiv 2306.16928
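For reference, the PSNR figures above use the standard peak signal-to-noise ratio definition. A minimal sketch of the metric (the evaluation images themselves come from the GSO benchmark and are not reproduced here):

```python
import numpy as np

# Standard PSNR in decibels between two images with values in [0, max_val].

def psnr(pred, gt, max_val=1.0):
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")            # identical images
    return 10 * np.log10(max_val ** 2 / mse)

# Synthetic check: a uniform error of 0.1 gives MSE = 0.01, hence 20 dB.
gt = np.zeros((4, 4))
print(round(psnr(gt + 0.1, gt), 1))    # 20.0
```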

Available Platforms

Hugging Face
Replicate

Related Models


TripoSR

Stability AI & Tripo|N/A

TripoSR is a fast feed-forward 3D reconstruction model jointly developed by Stability AI and Tripo AI that generates detailed 3D meshes from single input images in under one second. Unlike optimization-based methods that require minutes of processing per object, TripoSR uses a transformer-based architecture built on the Large Reconstruction Model framework to predict 3D geometry directly from a single 2D photograph in a single forward pass. The model accepts any standard image as input and produces a textured 3D mesh suitable for use in game engines, 3D modeling software, and augmented reality applications. TripoSR excels at reconstructing everyday objects, furniture, vehicles, characters, and organic shapes with impressive geometric accuracy and surface detail. Released under the MIT license in March 2024, the model is fully open source and can run on consumer-grade GPUs without specialized hardware. It supports batch processing for efficient conversion of multiple images and integrates seamlessly with popular 3D pipelines including Blender, Unity, and Unreal Engine. The model is particularly valuable for game developers, product designers, and e-commerce teams who need rapid 3D asset creation from product photographs. Output meshes can be exported in OBJ and GLB formats with configurable resolution settings. TripoSR represents a significant step toward democratizing 3D content creation by making high-quality reconstruction accessible without expensive scanning equipment or manual modeling expertise.

Open Source
4.5

TRELLIS

Microsoft Research|Unknown

TRELLIS is a revolutionary AI model developed by Microsoft Research that generates high-quality 3D assets from text descriptions or single 2D images using a novel Structured Latent Diffusion architecture. Released in December 2024, TRELLIS represents a fundamental advancement in 3D content generation by operating in a structured latent space that encodes geometry, texture, and material properties simultaneously rather than treating them as separate stages. The model produces complete 3D meshes with detailed PBR (Physically Based Rendering) textures, enabling direct use in game engines, 3D rendering pipelines, and AR/VR applications without extensive manual post-processing. TRELLIS supports both text-to-3D generation where users describe desired objects in natural language and image-to-3D reconstruction where a single photograph is converted into a full 3D model with inferred geometry from occluded viewpoints. The structured latent representation ensures geometric consistency and prevents the common artifacts seen in other 3D generation approaches such as floating geometry, texture seams, and unrealistic proportions. TRELLIS outputs standard 3D formats including GLB and OBJ with UV-mapped textures, making integration with professional tools like Blender, Unity, and Unreal Engine straightforward. Released under the MIT license, the model is fully open source and available on GitHub. Key applications include rapid 3D asset prototyping for game development, architectural visualization, product design mockups, virtual staging for real estate, educational 3D content creation, and metaverse asset generation. The model particularly benefits indie developers and small studios who lack resources for traditional 3D modeling workflows.

Open Source
4.5

Stable Point Aware 3D (SPA3D)

Stability AI|Unknown

Stable Point Aware 3D (SPA3D) is an advanced feed-forward 3D reconstruction model developed by Stability AI that generates high-quality textured 3D meshes from a single input image in seconds. Unlike iterative optimization-based approaches that require minutes of processing, SPA3D uses a direct feed-forward architecture that predicts 3D geometry and texture in a single pass, making it practical for interactive workflows and production pipelines. The model employs point cloud alignment techniques that significantly improve geometric consistency compared to other single-view reconstruction methods, ensuring that generated 3D models maintain accurate proportions and structural integrity from multiple viewpoints. SPA3D produces industry-standard mesh outputs with clean topology and UV-mapped textures, enabling direct import into 3D software including Blender, Unity, Unreal Engine, and professional CAD tools. The model handles diverse object categories from organic shapes like characters and animals to hard-surface objects like furniture and vehicles, adapting its reconstruction approach to the structural characteristics of each input. Released under the Stability AI Community License, the model is open source for personal and commercial use with revenue-based restrictions. Key applications include rapid 3D asset creation for game development, augmented reality content production, 3D printing preparation, virtual product photography, architectural visualization, and e-commerce 3D product displays. SPA3D is particularly valuable for creative professionals who need quick 3D mockups from concept sketches or photographs without investing hours in manual modeling. The model runs on consumer GPUs and is available through cloud APIs for scalable deployment.

Open Source
4.3

Zero123++

Stability AI|N/A

Zero123++ is a multi-view image generation model developed by Stability AI that generates six consistent canonical views of an object from a single input image. Released in 2023 under the Apache 2.0 license, the model extends the original Zero123 approach with significantly improved view consistency and serves as a critical component in modern 3D reconstruction pipelines. Zero123++ takes a single photograph or rendered image of an object and produces six evenly spaced views covering the full 360-degree range around the object, all maintaining consistent geometry, lighting, and appearance. The model is built on a fine-tuned Stable Diffusion backbone with specialized conditioning mechanisms that ensure multi-view coherence. Unlike the original Zero123 which generates views independently and often produces inconsistent results, Zero123++ generates all six views simultaneously in a single diffusion process, dramatically improving 3D consistency. The generated multi-view images serve as input for downstream 3D reconstruction methods like NeRF, Gaussian Splatting, or direct mesh reconstruction, enabling high-quality 3D model creation from a single photograph. Zero123++ is fully open source with pre-trained weights available on Hugging Face, making it accessible to researchers and developers building 3D generation systems. The model has become a foundational component in many state-of-the-art 3D generation pipelines and is widely used in academic research. It is particularly valuable for applications in game development, product visualization, and virtual reality where converting 2D images to 3D assets is a frequent workflow requirement.

Open Source
4.3

Quick Info

ParametersN/A
Typediffusion
LicenseMIT
Released2023-06
Rating4.0 / 5
CreatorUC San Diego

Tags

one-2-3-45
3d
reconstruction
image-to-3d