TRELLIS.2

Native and Compact Structured Latents for 3D Generation


An open-source 4B-parameter image-to-3D model that produces PBR-textured assets at resolutions up to 1536³. It is built on native 3D VAEs with 16x spatial compression, delivering efficient, scalable, high-fidelity asset generation.


KEY FEATURES

High Quality, Resolution, Efficiency

Arbitrary Topology

Rich Texture

Minimalist Asset Processing

IMAGE TO 3D ASSET GENERATION

3D ASSET RECONSTRUCTION

TECH INNOVATIONS

[Figure: pipeline overview. 3D assets → (Instant Bidirectional Conversion) → O-Voxel → (Sparse Compression VAE) → SLat → Large-Scale Generative Modelling]

TRELLIS.2's pipeline begins with an Instant Bidirectional Conversion that transforms meshes into our new representation, termed O-Voxel. A Sparse Compression VAE then encodes these voxels into a compact Structured Latent (SLat) space.
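As a rough illustration of the two stages, the toy sketch below quantizes surface samples into a sparse set of occupied voxels and then coarsens that set by a fixed factor to obtain latent sites. This is not the authors' conversion or VAE: the function names `voxelize` and `downsample`, the point-sampling input, and the small resolution are all assumptions chosen to keep the example readable.

```python
def voxelize(points, resolution):
    """Map points in [0, 1)^3 to the set of occupied integer voxel coordinates.

    Hypothetical stand-in for a mesh-to-voxel step: only voxels touched by
    the surface are stored, so the result is sparse.
    """
    occupied = set()
    for x, y, z in points:
        occupied.add((int(x * resolution), int(y * resolution), int(z * resolution)))
    return occupied


def downsample(coords, factor):
    """Coarsen sparse voxel coordinates by an integer factor along each axis."""
    return {(i // factor, j // factor, k // factor) for i, j, k in coords}


# Toy "surface": a flat sheet of samples on the plane z = 0.5.
points = [(u / 64, v / 64, 0.5) for u in range(64) for v in range(64)]

voxels = voxelize(points, resolution=64)       # occupied voxels on a 64^3 grid
latent_sites = downsample(voxels, factor=16)   # coarse sites after 16x downsampling

print(len(voxels), len(latent_sites))  # prints "4096 16"
```

Even in this toy case the sparse set holds 4096 of the 262,144 cells in the 64³ grid, and coarsening leaves 16 latent sites. A learned encoder would attach a feature vector to each site rather than just recording occupancy.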

[Figure: O-Voxel channels. f_shape: Dual Vertices, Intersect Flags, Splitting Weights. f_mat: Base Color, Metallic, Roughness, Alpha.]

O-Voxel: Omni-Voxel Representation

O-Voxel is a novel "field-free" sparse voxel structure designed to encode both precise geometry and complex appearance simultaneously.

Geometry (f_shape)

Uses a Flexible Dual Grids representation to handle arbitrary topologies while preserving sharp edges.

Appearance (f_mat)

Supports full PBR attributes (Base Color, Metallic, Roughness, Alpha) to accurately model rich surface materials.
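One way to picture an O-Voxel is as a sparse map from voxel coordinates to a geometry record and a material record. The sketch below is only a mental model: the channel names come from this page, but the class names, field types, and per-field dimensions (for example, three intersect flags per voxel) are assumptions and may not match the actual data layout.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Coord = Tuple[int, int, int]


@dataclass
class ShapeFeature:
    """Geometry channels (f_shape). Field shapes are assumed for illustration."""
    dual_vertex: Tuple[float, float, float]    # assumed: vertex position inside the voxel
    intersect_flags: Tuple[bool, bool, bool]   # assumed: one flag per axis
    splitting_weight: float                    # assumed: a single scalar


@dataclass
class MaterialFeature:
    """Appearance channels (f_mat): the PBR attributes listed above."""
    base_color: Tuple[float, float, float]
    metallic: float
    roughness: float
    alpha: float


@dataclass
class OVoxelCell:
    shape: ShapeFeature
    mat: MaterialFeature


# Sparse storage: only voxels near the surface have an entry.
grid: Dict[Coord, OVoxelCell] = {}
grid[(3, 5, 7)] = OVoxelCell(
    shape=ShapeFeature((0.5, 0.5, 0.5), (True, False, False), 0.5),
    mat=MaterialFeature((0.8, 0.1, 0.1), metallic=0.0, roughness=0.5, alpha=1.0),
)

print(len(grid), grid[(3, 5, 7)].mat.roughness)  # prints "1 0.5"
```

Keeping geometry and material in the same cell is what lets a single structure describe both shape and appearance. A dictionary keyed by coordinate is used here only because it is the simplest sparse container in Python.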

SC-VAE: Sparse Compression VAE

We introduce a Sparse Compression 3D VAE, employing a Sparse Residual Autoencoding scheme to directly compress voxel data.

16x Downsampling

~9.6K Latent Tokens for 1024³

It encodes a fully textured 3D asset into a highly compact representation with negligible perceptual degradation, enabling efficient large-scale generative modeling.
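The two figures above can be related with a little arithmetic. The sketch below assumes the 16x factor applies along each spatial axis and treats "~9.6K" as 9,600. Both are readings of the numbers on this page, not values taken from the model's code.

```python
resolution = 1024        # voxel grid side length, from "1024³"
downsampling = 16        # assumed to apply along each axis
latent_tokens = 9600     # "~9.6K" on this page, so approximate

latent_side = resolution // downsampling   # side length of the latent grid
dense_sites = latent_side ** 3             # sites a dense latent grid would hold
occupancy = latent_tokens / dense_sites    # fraction of sites actually used

print(latent_side, dense_sites, round(occupancy * 100, 1))  # prints "64 262144 3.7"
```

Under these assumptions the latent grid is 64³, and the roughly 9.6K tokens occupy about 3.7% of it, which is consistent with a sparse representation that stores latents only near the surface.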

[Figure: O-Voxel to SLat Architecture. O-Voxel (f_shape, f_mat) → SC-Enc. → z (SLat) → SC-Dec. → O-Voxel (f_shape, f_mat)]

RESPONSIBLE AI CONSIDERATIONS

TRELLIS.2 is purely a research project. Responsible AI considerations were factored into all stages. The datasets used in this paper are public and have been reviewed to ensure there is no personally identifiable information or harmful content. However, as these datasets are sourced from the Internet, potential bias may still be present.

MATERIAL DISCLAIMER

The materials made available on this page are provided solely for academic and research purposes in connection with the exploration of 3D generation technologies, as described in our tech report. These materials are not intended for commercial exploitation or use. If you believe that any content on this page infringes upon your intellectual property rights, including but not limited to copyright, please notify us by submitting a takedown request via email to jiaoyan (at) microsoft.com.