LGM
LGM (Large Gaussian Model) is a 3D generation model developed by researchers at Peking University that produces high-quality 3D objects from single images or text prompts in approximately five seconds using 3D Gaussian Splatting representation. Released in 2024 under the MIT license, LGM combines multi-view image generation with Gaussian-based 3D reconstruction in an end-to-end framework. The model first generates multiple consistent views of the target object using a multi-view diffusion backbone, then a U-Net-based Gaussian decoder predicts 3D Gaussian parameters from these views to construct the full 3D representation. Unlike mesh-based approaches, the Gaussian Splatting output enables real-time rendering with high visual quality including accurate lighting, transparency, and reflective surface effects. LGM supports resolutions up to 512 pixels for the generated views and produces detailed 3D content with clean geometry and vivid textures. The model can be used for both image-to-3D conversion from photographs and text-to-3D generation when paired with a text-to-image model as a front end. As an open-source project with code and pre-trained weights available on GitHub, LGM is accessible to researchers and developers for both academic study and practical applications. The model is particularly suited for interactive 3D visualization, virtual reality content, game asset prototyping, and any scenario where real-time rendering of generated 3D content is required. LGM demonstrates that Gaussian Splatting provides a compelling alternative to traditional mesh representations for AI-generated 3D content.
Key Highlights
5-Second Gaussian Splatting Generation
Generates complete 3D Gaussian Splatting representations from single images in approximately 5 seconds, balancing generation speed with high visual quality
Real-Time Renderable Output
Produces 3D Gaussian representations that enable real-time rendering from any viewpoint without neural network inference, suitable for interactive 3D applications
Multi-View Consistency
Generates four consistent orthogonal views before reconstruction, ensuring accurate geometry from multiple perspectives for robust 3D shape recovery
Dual Output: Gaussians and Meshes
Supports both 3D Gaussian Splatting format for real-time rendering and mesh extraction pipeline for traditional 3D workflow compatibility
About
LGM (Large Gaussian Model) is a 3D generation model developed by researchers at Peking University that generates 3D objects from single images in approximately 5 seconds using 3D Gaussian Splatting representation. Released in 2024, LGM combines multi-view image generation with Gaussian-based 3D reconstruction to produce high-quality 3D assets that can be rendered in real-time using Gaussian splatting renderers. The model played a pioneering role in the field as one of the first works to successfully demonstrate the integration of Gaussian Splatting into 3D generation pipelines.
The model operates through a two-stage pipeline. First, a multi-view diffusion model generates four consistent orthogonal views of the object from the input image. These four views are then processed by the Large Gaussian Model, which predicts a set of 3D Gaussians that represent the object's geometry, appearance, and transparency. The resulting Gaussian representation can be rendered from any viewpoint in real-time using splatting-based renderers. The asymmetric U-Net architecture directly regresses Gaussian parameters from multi-view images, enabling fast inference without requiring additional optimization steps. Each individual Gaussian element encodes its position in 3D space, scale, orientation, and color information.
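The two-stage data flow described above can be sketched as follows. The functions here are illustrative stubs, not the actual LGM API; the shapes assume the 14-channel per-Gaussian layout (position, opacity, scale, rotation quaternion, RGB color) and the 4 × 128 × 128 = 65,536 Gaussian output cited elsewhere on this page.

```python
import numpy as np

def generate_multiview(image_rgb):
    """Stage 1 (stub): a multi-view diffusion model would synthesize four
    consistent orthogonal views of the object. Here we simply tile the
    input image to show the data flow, not real view synthesis."""
    return np.stack([image_rgb] * 4)            # (4, H, W, 3)

def predict_gaussians(views):
    """Stage 2 (stub): the asymmetric U-Net regresses one Gaussian per
    output pixel of each view. With four 128x128 feature maps this yields
    4 * 128 * 128 = 65,536 Gaussians of 14 channels each:
    xyz (3) + opacity (1) + scale (3) + rotation quaternion (4) + RGB (3)."""
    n_views, res = views.shape[0], 128
    return np.zeros((n_views * res * res, 14))  # (65536, 14)

image = np.zeros((512, 512, 3))                 # single input photograph
views = generate_multiview(image)
gaussians = predict_gaussians(views)
print(gaussians.shape)                          # (65536, 14)
```

Because stage 2 is a single feed-forward regression rather than a per-object optimization loop, total latency stays in the seconds range.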
LGM's use of 3D Gaussian Splatting as the output representation offers several advantages. Gaussian splatting enables real-time rendering without the computational overhead of neural radiance fields, making the generated assets immediately usable in interactive 3D applications. The representation naturally handles view-dependent effects like specular highlights and translucency, providing photorealistic visualization. For applications requiring traditional mesh format, LGM includes a mesh extraction pipeline that converts the Gaussian representation to textured polygonal meshes. This dual output flexibility makes the model suitable for both interactive visualization and traditional 3D workflows.
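The Gaussian-to-mesh conversion can be illustrated in simplified form: one common approach is to accumulate a density field from the Gaussians on a voxel grid and then extract an isosurface from it (for example with marching cubes). The sketch below uses isotropic Gaussians and plain NumPy; it is a conceptual simplification, not LGM's actual extraction pipeline.

```python
import numpy as np

def gaussian_density_grid(centers, scales, opacities, res=32, extent=1.0):
    """Accumulate density from isotropic Gaussians onto a voxel grid.
    An isosurface of this grid (e.g. via marching cubes) yields a mesh."""
    axis = np.linspace(-extent, extent, res)
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    grid = np.stack([gx, gy, gz], axis=-1)      # (res, res, res, 3)
    density = np.zeros((res, res, res))
    for c, s, o in zip(centers, scales, opacities):
        d2 = np.sum((grid - c) ** 2, axis=-1)   # squared distance to center
        density += o * np.exp(-0.5 * d2 / s**2)
    return density

# toy example: two overlapping Gaussians forming a dumbbell-like density
centers = np.array([[-0.3, 0.0, 0.0], [0.3, 0.0, 0.0]])
density = gaussian_density_grid(centers, scales=[0.2, 0.2], opacities=[1.0, 1.0])
occupied = density > 0.5                        # voxels inside the isosurface
print(occupied.sum() > 0)                       # True
```

Extracting the 0.5-level surface of `density` with a marching-cubes routine, then baking colors into a texture, would complete the conversion to a standard polygonal mesh.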
The model's 5-second generation time represents an excellent balance between speed and quality. While not as fast as TripoSR (sub-second), LGM typically produces higher visual quality output, particularly in terms of view consistency and surface detail. The generation time is fast enough for interactive workflows while delivering results competitive with slower optimization-based methods. The network predicts per-Gaussian position, scale, rotation, opacity, and color, and this rich parameter set enables detailed appearance modeling across diverse object types.
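As a minimal illustration of this parameter set, the sketch below splits a raw (N, 14) network output into named fields and maps each to a valid range. The specific activation functions used here are typical Gaussian splatting choices and are illustrative, not necessarily LGM's exact ones.

```python
import numpy as np

def split_gaussian_params(raw):
    """Split raw (N, 14) network output into named Gaussian fields and map
    each to its valid range. Activations (sigmoid for opacity/color,
    softplus for scale, unit-normalized quaternion) are common choices in
    Gaussian splatting pipelines, shown here for illustration."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    softplus = lambda x: np.log1p(np.exp(x))
    pos   = raw[:, 0:3]                          # xyz position, unconstrained
    opac  = sigmoid(raw[:, 3:4])                 # opacity in (0, 1)
    scale = softplus(raw[:, 4:7])                # positive per-axis scale
    quat  = raw[:, 7:11]
    quat  = quat / np.linalg.norm(quat, axis=1, keepdims=True)  # unit rotation
    color = sigmoid(raw[:, 11:14])               # RGB in (0, 1)
    return pos, opac, scale, quat, color

raw = np.random.randn(65536, 14)                 # stand-in for U-Net output
pos, opac, scale, quat, color = split_gaussian_params(raw)
print(pos.shape, quat.shape)                     # (65536, 3) (65536, 4)
```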
LGM was trained on the Objaverse dataset and processes input images at 512x512 resolution. The model performs particularly well on objects with smooth surfaces, while it may show limitations on objects with very fine geometric details or complex internal structures. The spatial distribution of Gaussians adapts to the object's geometry, with tens of thousands of Gaussian elements typically representing each object.
Released under the MIT license, LGM is fully open-source with code and pre-trained weights available on GitHub. The model has been influential in demonstrating the viability of Gaussian splatting as an output format for 3D generation models and has contributed to the growing ecosystem of Gaussian-based 3D tools and applications. The research community continues to build upon LGM's approach to develop higher-resolution and more detailed Gaussian-based 3D generation methods.
Use Cases
Interactive 3D Web Experiences
Generate Gaussian splatting assets for web-based 3D viewers that render in real-time, creating interactive product showcases and virtual galleries
Rapid 3D Asset Prototyping
Create 3D prototypes from concept images in seconds for design review, client presentations, and iterative creative development processes
Real-Time 3D Applications
Feed generated Gaussian assets into real-time applications including AR experiences, interactive demos, and spatial computing environments
3D Content Pipeline Integration
Integrate into automated content pipelines where images are converted to 3D assets at scale for catalogs, inventories, and digital twin creation
Pros & Cons
Pros
- Generates 3D objects from image or text within 5 seconds at 512x512 resolution with up to 65,536 Gaussians
- ECCV 2024 Oral paper — demonstrates superior visual quality compared to DreamGaussian and TriplaneGaussian
- Effectively addresses blurry back views and flat geometry common in prior image-to-3D methods
- Gaussian Splatting representation is more expressive and faster to render than triplane-based NeRFs
- Achieves high-resolution generation (512x512) while maintaining fast 5-second generation speed
Cons
- Output quality is inherently tied to upstream multi-view diffusion model quality — inconsistent inputs degrade results
- May not follow text prompts effectively for unconventional or unusual objects
- Limited to object-centric scenes — cannot handle full scene reconstruction
- Requires multi-view images as input, adding dependency on separate diffusion model for generation pipeline
- Gaussian Splatting output requires additional conversion for use in standard 3D applications
Technical Details
Parameters
N/A
License
MIT
Features
- Single Image to 3D Gaussian
- Ultra-Fast 5-Second Generation
- 3D Gaussian Splatting Output
- High-Quality Multi-View Synthesis
- Mesh Extraction Support
- Open-Source MIT License
- Peking University Research
- Real-Time 3D Rendering
Benchmark Results
| Metric | Value | Compared To | Source |
|---|---|---|---|
| Generation Time | ~5 seconds | InstantMesh: ~10 seconds | ECCV 2024 / arXiv 2402.05054 |
| Training Resolution | 512×512 px | OpenLRM: 256×256 | GitHub 3DTopia/LGM |
| Gaussian Count | ~40K 3D Gaussians | — | ECCV 2024 Paper |
| Novel View PSNR | 21.5 dB (GSO) | InstantMesh: 22.2 dB | arXiv 2402.05054 |
Related Models
TripoSR
TripoSR is a fast feed-forward 3D reconstruction model jointly developed by Stability AI and Tripo AI that generates detailed 3D meshes from single input images in under one second. Unlike optimization-based methods that require minutes of processing per object, TripoSR uses a transformer-based architecture built on the Large Reconstruction Model framework to predict 3D geometry directly from a single 2D photograph in a single forward pass. The model accepts any standard image as input and produces a textured 3D mesh suitable for use in game engines, 3D modeling software, and augmented reality applications. TripoSR excels at reconstructing everyday objects, furniture, vehicles, characters, and organic shapes with impressive geometric accuracy and surface detail. Released under the MIT license in March 2024, the model is fully open source and can run on consumer-grade GPUs without specialized hardware. It supports batch processing for efficient conversion of multiple images and integrates seamlessly with popular 3D pipelines including Blender, Unity, and Unreal Engine. The model is particularly valuable for game developers, product designers, and e-commerce teams who need rapid 3D asset creation from product photographs. Output meshes can be exported in OBJ and GLB formats with configurable resolution settings. TripoSR represents a significant step toward democratizing 3D content creation by making high-quality reconstruction accessible without expensive scanning equipment or manual modeling expertise.
TRELLIS
TRELLIS is a revolutionary AI model developed by Microsoft Research that generates high-quality 3D assets from text descriptions or single 2D images using a novel Structured Latent Diffusion architecture. Released in December 2024, TRELLIS represents a fundamental advancement in 3D content generation by operating in a structured latent space that encodes geometry, texture, and material properties simultaneously rather than treating them as separate stages. The model produces complete 3D meshes with detailed PBR (Physically Based Rendering) textures, enabling direct use in game engines, 3D rendering pipelines, and AR/VR applications without extensive manual post-processing. TRELLIS supports both text-to-3D generation where users describe desired objects in natural language and image-to-3D reconstruction where a single photograph is converted into a full 3D model with inferred geometry from occluded viewpoints. The structured latent representation ensures geometric consistency and prevents the common artifacts seen in other 3D generation approaches such as floating geometry, texture seams, and unrealistic proportions. TRELLIS outputs standard 3D formats including GLB and OBJ with UV-mapped textures, making integration with professional tools like Blender, Unity, and Unreal Engine straightforward. Released under the MIT license, the model is fully open source and available on GitHub. Key applications include rapid 3D asset prototyping for game development, architectural visualization, product design mockups, virtual staging for real estate, educational 3D content creation, and metaverse asset generation. The model particularly benefits indie developers and small studios who lack resources for traditional 3D modeling workflows.
Meshy
Meshy is a proprietary AI-powered 3D generation platform developed by Meshy AI that creates detailed, production-ready 3D models from text descriptions and images. The platform combines text-to-3D and image-to-3D capabilities with advanced AI texturing features, positioning itself as a comprehensive solution for rapid 3D content creation. Meshy uses a transformer-based architecture that generates textured 3D meshes with PBR-compatible materials, making outputs directly usable in game engines like Unity and Unreal Engine without additional processing. The platform offers multiple generation modes including text-to-3D for creating objects from written descriptions, image-to-3D for converting photographs into 3D models, and AI texturing for applying realistic materials to existing untextured meshes. Generated models include proper UV mapping, normal maps, and physically based rendering materials suitable for professional workflows. Meshy provides both a web-based interface and an API for programmatic access, making it accessible to individual artists and scalable for enterprise pipelines. The platform is particularly popular among game developers, animation studios, and AR/VR content creators who need to produce large volumes of 3D assets efficiently. As a proprietary commercial service launched in 2023, Meshy operates on a subscription model with free tier access for limited generations. The platform continuously updates its models to improve output quality, topology optimization, and texture fidelity, competing directly with other AI 3D generation services in the rapidly evolving market.
InstantMesh
InstantMesh is a feed-forward 3D mesh generation model developed by Tencent that creates high-quality textured 3D meshes from single input images through a multi-view generation and sparse-view reconstruction pipeline. Released in April 2024 under the Apache 2.0 license, InstantMesh combines a multi-view diffusion model with a large reconstruction model to achieve both speed and quality in single-image 3D reconstruction. The pipeline first generates multiple consistent views of the input object using a fine-tuned multi-view diffusion model, then feeds these views into a transformer-based reconstruction network that predicts a triplane neural representation, which is finally converted to a textured mesh. This two-stage approach produces significantly higher quality results than single-stage methods while maintaining generation times of just a few seconds. InstantMesh supports both text-to-3D workflows when combined with an image generation model and direct image-to-3D conversion from photographs or artwork. The output meshes include detailed geometry and texture maps compatible with standard 3D software and game engines. The model handles a wide variety of object types including characters, vehicles, furniture, and organic shapes with good geometric fidelity. As an open-source project with code and weights available on GitHub and Hugging Face, InstantMesh has become a popular choice for developers building 3D asset generation pipelines. It is particularly useful for game development, e-commerce product visualization, and rapid prototyping scenarios where fast turnaround and reasonable quality are both important requirements.