
TRELLIS

Open Source
4.5
Microsoft Research

TRELLIS is a revolutionary AI model developed by Microsoft Research that generates high-quality 3D assets from text descriptions or single 2D images using a novel Structured Latent Diffusion architecture. Released in December 2024, TRELLIS represents a fundamental advancement in 3D content generation by operating in a structured latent space that encodes geometry, texture, and material properties simultaneously rather than treating them as separate stages.

The model produces complete 3D meshes with detailed PBR (Physically Based Rendering) textures, enabling direct use in game engines, 3D rendering pipelines, and AR/VR applications without extensive manual post-processing. TRELLIS supports both text-to-3D generation, where users describe desired objects in natural language, and image-to-3D reconstruction, where a single photograph is converted into a full 3D model with inferred geometry for occluded viewpoints. The structured latent representation ensures geometric consistency and prevents common artifacts seen in other 3D generation approaches, such as floating geometry, texture seams, and unrealistic proportions.

TRELLIS outputs standard 3D formats including GLB and OBJ with UV-mapped textures, making integration with professional tools like Blender, Unity, and Unreal Engine straightforward. Released under the MIT license, the model is fully open source and available on GitHub.

Key applications include rapid 3D asset prototyping for game development, architectural visualization, product design mockups, virtual staging for real estate, educational 3D content creation, and metaverse asset generation. The model particularly benefits indie developers and small studios who lack resources for traditional 3D modeling workflows.

Text to 3D
Image to 3D

Key Highlights

3D Model Generation from Text and Image

Capability to create detailed and textured 3D models from both text descriptions and a single image.

High-Detail Textured Meshes

Produces industry-standard textured mesh outputs with PBR material support for professional-ready models.

Multi-View Consistency

Advanced multi-view algorithm ensuring the generated 3D model looks consistent and correct from every angle.

Wide Format Support

Exports in common 3D formats like GLB, OBJ, and FBX for compatibility with Blender, Unity, and Unreal.

About

TRELLIS is an AI model developed by Microsoft Research that generates high-quality 3D assets from a single 2D image. Released in December 2024, TRELLIS achieves strong results in image-to-3D conversion and offers rapid prototyping capabilities particularly suited to game development, product design, and virtual reality applications. The model offers significant advantages over comparable solutions in both speed and quality.

TRELLIS's technical architecture is built on Structured LATents (SLAT), a novel representation format that encodes 3D geometry, texture, and rendering information in a structured latent space for efficient and high-quality generation. The training process utilizes Objaverse and similar large-scale 3D datasets. The diffusion-based generation pipeline follows a multi-stage process working from image to SLAT representations and then to mesh, texture, and radiance field outputs. The model supports two different input modes: text-to-3D and image-to-3D generation.

In terms of performance, TRELLIS delivers impressive results. Image-to-3D generation completes in approximately 12 seconds on an A100 GPU, a significant speed-up over InstantMesh's roughly 30 seconds. On the GSO dataset it achieves an F-score of 0.473, a notable improvement over One-2-3-45's 0.311. Generated 3D models can be output as meshes, texture maps, or radiance fields.
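The F-score and Chamfer distance used in these comparisons both measure how closely a predicted surface matches the ground truth when each is sampled as a point cloud. A minimal numpy sketch of the two metrics (the brute-force nearest-neighbor search and the threshold `tau` are illustrative choices, not the paper's exact evaluation protocol):

```python
import numpy as np

def pairwise_min_dists(a, b):
    # For each point in a, the distance to its nearest neighbor in b.
    diffs = a[:, None, :] - b[None, :, :]          # shape (|a|, |b|, 3)
    return np.sqrt((diffs ** 2).sum(-1)).min(axis=1)

def chamfer_distance(pred, gt):
    # Symmetric average of nearest-neighbor distances in both directions.
    return 0.5 * (pairwise_min_dists(pred, gt).mean()
                  + pairwise_min_dists(gt, pred).mean())

def f_score(pred, gt, tau=0.05):
    # Precision: fraction of predicted points within tau of the ground truth.
    precision = (pairwise_min_dists(pred, gt) < tau).mean()
    # Recall: fraction of ground-truth points within tau of the prediction.
    recall = (pairwise_min_dists(gt, pred) < tau).mean()
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Identical clouds: perfect F-score, zero Chamfer distance.
pts = np.random.rand(256, 3)
print(f_score(pts, pts))           # 1.0
print(chamfer_distance(pts, pts))  # 0.0
```

Real evaluations use KD-trees or GPU nearest-neighbor search and dataset-specific thresholds; the brute-force version above is only practical for small clouds.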

TRELLIS finds applications in game development pipeline asset generation, e-commerce 3D product visualization, architectural visualization, virtual and augmented reality content creation, and digital twin generation. Its fast generation time provides significant time savings in iterative design processes. It offers an efficient solution particularly for projects requiring high-volume 3D content generation.

TRELLIS is available as open-source under the MIT license. Model weights, training code, and inference pipeline are accessible via GitHub. Built on PyTorch, it is optimized for NVIDIA GPUs. Demos and pre-trained models are available through Hugging Face. A user-friendly Gradio interface provides a browser-based 3D generation experience.
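Because the outputs are standard mesh formats, integrating them into a pipeline is ordinary file handling rather than anything TRELLIS-specific. As a toy illustration (not TRELLIS code), this is roughly what writing a triangle mesh to Wavefront OBJ, one of the supported formats, involves; note that OBJ face indices are 1-based:

```python
def write_obj(path, vertices, faces):
    """Write a triangle mesh as Wavefront OBJ (1-based face indices)."""
    with open(path, "w") as f:
        for x, y, z in vertices:
            f.write(f"v {x} {y} {z}\n")
        for a, b, c in faces:
            f.write(f"f {a + 1} {b + 1} {c + 1}\n")

# A single triangle, importable into Blender, Unity, or Unreal.
write_obj("triangle.obj", [(0, 0, 0), (1, 0, 0), (0, 1, 0)], [(0, 1, 2)])
```

GLB, the other common export, is a binary glTF container and would normally be handled through a library rather than written by hand.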

TRELLIS is a significant work demonstrating the power of structured latent representations in single-image 3D generation. Compared to Wonder3D's cross-domain attention approach and SPA3D's point cloud alignment technique, TRELLIS holds a competitive position in both speed and output quality. The model's technical depth and continuous development potential reflect Microsoft Research's strong research infrastructure. Its generation speed and multi-format output support make TRELLIS an ideal choice for professional workflows.

A closer look at TRELLIS's technical innovations makes the advantages of the SLAT representation over other approaches in the field more apparent. SLAT encodes 3D space on a structured voxel grid, preserving both local geometric detail and overall structural coherence. This representation lets the diffusion model operate effectively in 3D space and improves both the mesh quality and the texture detail of generated models.

The model's multi-output support is a significant advantage: users can obtain mesh, Gaussian splatting, and radiance field outputs from the same generation and select the format that suits their needs. TRELLIS also supports conditional generation, allowing guidance through text prompts or reference images. Microsoft Research's active development of TRELLIS means continuous improvements and new features. The project's high star count on GitHub and community participation are concrete indicators of its impact, reflecting strong adoption among researchers and practitioners alike.
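To make the voxel-grid idea concrete, the following toy sketch (an illustrative data structure, not the actual SLAT implementation) stores a structured latent as a sparse set of active voxels with per-voxel feature vectors; different decoder heads then read the same latent to produce different output formats:

```python
import numpy as np

class ToySLAT:
    """Sparse structured latent: active voxel coords -> feature vectors."""

    def __init__(self, grid_size, feat_dim):
        self.grid_size = grid_size
        self.feat_dim = feat_dim
        self.voxels = {}                      # (i, j, k) -> np.ndarray

    def activate(self, coord, feat):
        # Mark a voxel as occupied and attach its latent feature vector.
        assert all(0 <= c < self.grid_size for c in coord)
        feat = np.asarray(feat, dtype=np.float32)
        assert feat.shape == (self.feat_dim,)
        self.voxels[coord] = feat

    def occupancy(self):
        # Dense binary grid: the coarse geometry shared by every decoder.
        grid = np.zeros((self.grid_size,) * 3, dtype=bool)
        for c in self.voxels:
            grid[c] = True
        return grid

    def decode(self, head):
        # One latent, several output formats: each "decoder head" maps
        # per-voxel features to format-specific attributes.
        return {c: head(f) for c, f in self.voxels.items()}

# A 16^3 grid with 8-dim features; two voxels active.
slat = ToySLAT(grid_size=16, feat_dim=8)
slat.activate((2, 3, 4), np.ones(8))
slat.activate((5, 5, 5), np.zeros(8))

# Stand-in "mesh" and "splat" heads reading the same latent.
mesh_attrs = slat.decode(lambda f: float(f.sum()))   # e.g. a surface value
splat_attrs = slat.decode(lambda f: f[:3])           # e.g. an RGB triple
```

The real model decodes with learned networks rather than simple lambdas, but the structural point is the same: coarse geometry lives in which voxels are active, while appearance and format-specific attributes live in the per-voxel features.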

Use Cases

1

Game Asset Creation

Accelerating design iterations by creating rapid 3D asset prototypes during game development.

2

E-Commerce 3D Product Images

Creating 360-degree viewable 3D models of products for online stores.

3

Architectural Visualization

Creating prototypes for rapid 3D modeling and visualization of architectural concepts.

4

Education and Simulation

Producing rapid 3D object and scene models for educational materials and simulation environments.

Pros & Cons

Pros

  • Innovative 3D generation with Microsoft's SLAT (Structured Latent) representation
  • Output as Radiance Fields, 3D Gaussians, and mesh from single image
  • Trained on 500K+ high-quality 3D models
  • Research accepted as CVPR 2025 Spotlight
  • PBR materials, transparency, and detailed texture support

Cons

  • Requires Linux and minimum 24GB GPU memory
  • H100 GPU recommended to run TRELLIS.2 at full capacity
  • Setup and operation require technical knowledge
  • Not yet fast enough for real-time 3D generation

Technical Details

Parameters

Unknown

Architecture

Structured Latent Diffusion

Training Data

Objaverse

License

MIT

Features

  • Text-to-3D
  • Image-to-3D
  • Textured meshes
  • High detail
  • GLB export
  • Multi-view consistency
  • PBR materials

Benchmark Results

Metric | Value | Compared To | Source
Generation Time (Image-to-3D) | ~12 seconds (A100) | InstantMesh: ~30 seconds | TRELLIS Paper (arXiv:2412.01506)
F-Score (GSO Dataset) | 0.473 | CRM: 0.402 | TRELLIS Paper
Novel View PSNR | 22.8 dB | LGM: 20.5 dB | Papers With Code
Mesh Quality (Chamfer Distance) | 0.034 | TripoSR: 0.048 | TRELLIS Paper

Available Platforms

GitHub
HuggingFace


Related Models


TripoSR

Stability AI & Tripo

TripoSR is a fast feed-forward 3D reconstruction model jointly developed by Stability AI and Tripo AI that generates detailed 3D meshes from single input images in under one second. Unlike optimization-based methods that require minutes of processing per object, TripoSR uses a transformer-based architecture built on the Large Reconstruction Model framework to predict 3D geometry directly from a single 2D photograph in a single forward pass. The model accepts any standard image as input and produces a textured 3D mesh suitable for use in game engines, 3D modeling software, and augmented reality applications. TripoSR excels at reconstructing everyday objects, furniture, vehicles, characters, and organic shapes with impressive geometric accuracy and surface detail. Released under the MIT license in March 2024, the model is fully open source and can run on consumer-grade GPUs without specialized hardware. It supports batch processing for efficient conversion of multiple images and integrates seamlessly with popular 3D pipelines including Blender, Unity, and Unreal Engine. The model is particularly valuable for game developers, product designers, and e-commerce teams who need rapid 3D asset creation from product photographs. Output meshes can be exported in OBJ and GLB formats with configurable resolution settings. TripoSR represents a significant step toward democratizing 3D content creation by making high-quality reconstruction accessible without expensive scanning equipment or manual modeling expertise.

Open Source
4.5

Meshy

Meshy AI

Meshy is a proprietary AI-powered 3D generation platform developed by Meshy AI that creates detailed, production-ready 3D models from text descriptions and images. The platform combines text-to-3D and image-to-3D capabilities with advanced AI texturing features, positioning itself as a comprehensive solution for rapid 3D content creation. Meshy uses a transformer-based architecture that generates textured 3D meshes with PBR-compatible materials, making outputs directly usable in game engines like Unity and Unreal Engine without additional processing. The platform offers multiple generation modes including text-to-3D for creating objects from written descriptions, image-to-3D for converting photographs into 3D models, and AI texturing for applying realistic materials to existing untextured meshes. Generated models include proper UV mapping, normal maps, and physically based rendering materials suitable for professional workflows. Meshy provides both a web-based interface and an API for programmatic access, making it accessible to individual artists and scalable for enterprise pipelines. The platform is particularly popular among game developers, animation studios, and AR/VR content creators who need to produce large volumes of 3D assets efficiently. As a proprietary commercial service launched in 2023, Meshy operates on a subscription model with free tier access for limited generations. The platform continuously updates its models to improve output quality, topology optimization, and texture fidelity, competing directly with other AI 3D generation services in the rapidly evolving market.

Proprietary
4.4

InstantMesh

Tencent

InstantMesh is a feed-forward 3D mesh generation model developed by Tencent that creates high-quality textured 3D meshes from single input images through a multi-view generation and sparse-view reconstruction pipeline. Released in April 2024 under the Apache 2.0 license, InstantMesh combines a multi-view diffusion model with a large reconstruction model to achieve both speed and quality in single-image 3D reconstruction. The pipeline first generates multiple consistent views of the input object using a fine-tuned multi-view diffusion model, then feeds these views into a transformer-based reconstruction network that predicts a triplane neural representation, which is finally converted to a textured mesh. This two-stage approach produces significantly higher quality results than single-stage methods while maintaining generation times of just a few seconds. InstantMesh supports both text-to-3D workflows when combined with an image generation model and direct image-to-3D conversion from photographs or artwork. The output meshes include detailed geometry and texture maps compatible with standard 3D software and game engines. The model handles a wide variety of object types including characters, vehicles, furniture, and organic shapes with good geometric fidelity. As an open-source project with code and weights available on GitHub and Hugging Face, InstantMesh has become a popular choice for developers building 3D asset generation pipelines. It is particularly useful for game development, e-commerce product visualization, and rapid prototyping scenarios where fast turnaround and reasonable quality are both important requirements.

Open Source
4.3

Shap-E

OpenAI

Shap-E is a 3D generation model developed by OpenAI that creates 3D objects directly from text descriptions or input images by generating the parameters of implicit neural representations. Unlike its predecessor Point-E which produces point clouds, Shap-E generates Neural Radiance Fields (NeRF) and textured meshes that can be directly rendered and used in 3D applications. The model employs a two-stage training approach where an encoder first learns to map 3D assets to implicit function parameters, then a conditional diffusion model learns to generate those parameters from text or image inputs. This architecture enables fast generation times of just a few seconds on a modern GPU. Shap-E supports both text-to-3D and image-to-3D workflows, making it versatile for different creative pipelines. The generated 3D objects include color and texture information, producing more complete results than geometry-only approaches. Released under the MIT license in May 2023, the model is fully open source with pre-trained weights available on GitHub. While the output quality may not match optimization-heavy methods like DreamFusion that take minutes per object, Shap-E offers a practical balance between speed and quality for rapid prototyping and concept exploration. The model is particularly useful for game developers, 3D artists, and researchers who need quick 3D visualizations from text prompts. As one of OpenAI's contributions to open-source 3D AI research, Shap-E has influenced subsequent work in fast feed-forward 3D generation approaches.

Open Source
4.0

Quick Info

Parameters: Unknown
Type: Diffusion + Structured Latent
License: MIT
Released: 2024-12
Architecture: Structured Latent Diffusion
Rating: 4.5 / 5
Creator: Microsoft Research


Tags

3d
mesh
texture
microsoft