nvidia/Cosmos3-Super-Text2Image (research.nvidia.com)

submitted 5 days ago* (last edited 5 days ago) by Even_Adder@lemmy.dbzer0.com to c/stable_diffusion@lemmy.dbzer0.com

Abstract

We introduce Cosmos 3, a family of omnimodal world models designed to jointly process and generate language, image, video, audio, and action sequences within a unified mixture-of-transformers architecture. By supporting highly flexible input-output configurations, Cosmos 3 seamlessly unifies critical modalities for Physical AI, effectively subsuming vision-language models, video generators, world simulators, and world-action models into a single framework. Our evaluation demonstrates that Cosmos 3 establishes a new state-of-the-art across a diverse suite of understanding and generation tasks, demonstrating omnimodal world models as scalable, general-purpose backbones for embodied agents. Our post-trained Cosmos 3 models were ranked as the best open-source Text-to-Image and Image-to-Video models by Artificial Analysis, and the best policy model by RoboArena at the time the technical report was written. To accelerate open research and deployment in Physical AI, we make our code, model checkpoints, curated synthetic datasets, and evaluation benchmark available under the Linux Foundation’s OpenMDW-1.1 License at github.com/nvidia/cosmos and huggingface.co/collections/nvidia/cosmos3. The project website is available at research.nvidia.com/labs/cosmos-lab/cosmos3.

Paper: https://research.nvidia.com/labs/cosmos-lab/cosmos3/technical-report.pdf

Code: https://github.com/nvidia/cosmos

Model Collection: https://huggingface.co/collections/nvidia/cosmos3

Project Page: https://research.nvidia.com/labs/cosmos-lab/cosmos3/
