Stable Diffusion

Discuss matters related to our favourite AI Art generation technology

Structured 3D Latents for Scalable and Versatile 3D Generation (github.com)

submitted 6 months ago by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

Abstract

We introduce a novel 3D generation method for versatile and high-quality 3D asset creation. The cornerstone is a unified Structured LATent (SLAT) representation which allows decoding to different output formats, such as Radiance Fields, 3D Gaussians, and meshes. This is achieved by integrating a sparsely-populated 3D grid with dense multiview visual features extracted from a powerful vision foundation model, comprehensively capturing both structural (geometry) and textural (appearance) information while maintaining flexibility during decoding. We employ rectified flow transformers tailored for SLAT as our 3D generation models and train models with up to 2 billion parameters on a large 3D asset dataset of 500K diverse objects. Our model generates high-quality results with text or image conditions, significantly surpassing existing methods, including recent ones at similar scales. We showcase flexible output format selection and local 3D editing capabilities which were not offered by previous models. Code, model, and data will be released.

Paper: https://arxiv.org/abs/2412.01506

Code: https://github.com/Microsoft/TRELLIS

Demo: https://huggingface.co/spaces/JeffreyXiang/TRELLIS

Project Page: https://trellis3d.github.io/
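
For anyone trying to picture the "sparsely-populated 3D grid" from the abstract, here is a rough sketch of the idea: latent vectors are stored only at voxels near the object's surface, and the rest of the grid stays empty. This is not taken from the TRELLIS code. The class name, the 16-cell grid, the 8-value latent width, and the toy sphere are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]


@dataclass
class StructuredLatent:
    """Illustrative sparse container: one latent vector per active voxel.

    Hypothetical helper for this sketch, not a class from the TRELLIS repo.
    """
    resolution: int  # grid side length (assumed value, chosen for the demo)
    latents: Dict[Voxel, List[float]] = field(default_factory=dict)

    def add(self, voxel: Voxel, latent: List[float]) -> None:
        # Reject coordinates that fall outside the grid.
        if not all(0 <= c < self.resolution for c in voxel):
            raise ValueError(f"voxel {voxel} is outside the grid")
        self.latents[voxel] = latent

    def occupancy(self) -> float:
        """Fraction of grid cells that actually carry a latent."""
        return len(self.latents) / self.resolution ** 3


def sphere_shell(resolution: int, radius: float) -> List[Voxel]:
    """Toy 'surface': voxels within half a cell of a sphere around the grid centre."""
    centre = (resolution - 1) / 2
    shell = []
    for x in range(resolution):
        for y in range(resolution):
            for z in range(resolution):
                dist = ((x - centre) ** 2 + (y - centre) ** 2 + (z - centre) ** 2) ** 0.5
                if abs(dist - radius) < 0.5:
                    shell.append((x, y, z))
    return shell


# Attach a placeholder 8-value latent to each surface voxel.
# In a real model these values would come from an encoder, not zeros.
slat = StructuredLatent(resolution=16)
for voxel in sphere_shell(16, radius=6.0):
    slat.add(voxel, [0.0] * 8)

print(f"active voxels: {len(slat.latents)} of {16 ** 3}")
print(f"occupancy: {slat.occupancy():.1%}")
```

Only the surface voxels carry data, which is what would keep a representation like this cheap compared with a dense grid. Decoding the latents into Gaussians, radiance fields, or meshes is left out here because it depends on the trained decoders.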

top 3 comments

WorkingClassCorpse@hexbear.net 6 points 6 months ago

This is what we should be using ML for. Very cool

pennomi@lemmy.world 5 points 6 months ago

This worked better than any image-to-3D model I’ve tried so far.

leverage 4 points 6 months ago

Pretty impressive. I took a picture of a kid's toy and it generated a passable model. I recall seeing something that would also automatically rig humanoid models, and another that would animate rigged models from a prompt (might have been Disney). Seems like we're not that far away from being able to take a picture of something and have an animation produced. I did a cursory search and didn't find anything, but I wouldn't be shocked if that's already a thing you can do by stringing publicly available models together.