Nvidia releases Cosmos 3, an open AI model for robotics

  • Nvidia launched Cosmos 3 at GTC Taipei, calling it the first fully open omnimodel for physical AI across text, image, video, sound, and action.
  • The mixture-of-transformers architecture ships in two sizes (an 8B-parameter Nano and a 32B-parameter Super), with models open-sourced on Hugging Face and GitHub.
  • Cosmos 3 tops multiple robotics and physics benchmarks among open models, and Nvidia says it can cut physical AI training cycles from months to days.

Nvidia Launches Cosmos 3, an Open Foundation Model for Physical AI and Robotics

Nvidia unveiled Cosmos 3 on Sunday at GTC Taipei, releasing what it calls the world’s first fully open omnimodel for physical AI — a single system that combines vision reasoning, world generation, and action prediction to help robots and autonomous vehicles perceive and act in the real world.

A Unified Architecture for Machines That Think and Move

Built on a mixture-of-transformers architecture, Cosmos 3 pairs a reasoning transformer with an expert generation transformer, allowing the model to understand object interactions, motion, and spatial-temporal relationships before generating video and action trajectories. The system can natively process and generate text, images, video, ambient sound, and actions, so developers no longer need to juggle separate models for different capabilities.

“The big bang of physical AI is just around the corner thanks to breakthroughs in multimodal reasoning language, vision and world models,” said Jensen Huang, Nvidia’s founder and CEO, during his keynote. “The Cosmos 3 family of open, frontier omnimodels gives developers a generational leap in ability to build robots, autonomous vehicles and vision AI that perceive, reason, plan and act in the physical world.”

The release includes two model sizes: Cosmos 3 Nano, an 8-billion-parameter version designed to run on workstation-grade hardware like the RTX PRO 6000 GPU, and Cosmos 3 Super, a 32-billion-parameter model built for large-scale synthetic data generation on Hopper and Blackwell GPUs. A third variant, Cosmos 3 Edge, is coming soon for real-time inference at the edge.
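For a rough sense of why the two sizes target different hardware, the sketch below estimates the memory needed just to hold each model's weights at a few common numeric precisions. The parameter counts come from the article; the precision choices and the weights-only simplification are assumptions for illustration, not Nvidia's published requirements, and real deployments also need memory for activations, caches, and runtime overhead.

```python
# Rough weight-memory estimate. This is an illustrative sketch: it counts only
# model weights and ignores activations, caches, and framework overhead.

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}

# Parameter counts as reported in the article.
MODELS = {"Cosmos 3 Nano": 8e9, "Cosmos 3 Super": 32e9}


def weight_gb(num_params: float, dtype: str) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for name, params in MODELS.items():
        for dtype in BYTES_PER_PARAM:
            print(f"{name} @ {dtype}: {weight_gb(params, dtype):.0f} GB")
```

At 16-bit precision this works out to about 16 GB of weights for Nano and about 64 GB for Super, which is consistent with Nano being pitched at a single workstation GPU and Super at data-center hardware.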

Open Models and a New Coalition

Nvidia is open-sourcing the models, post-training scripts, and datasets for synthetic data generation, making them available on Hugging Face and GitHub. Developers can also deploy the models as Nvidia NIM microservices or access them through cloud partners including Microsoft Azure, CoreWeave, and Nebius.

Alongside the launch, Nvidia announced the Cosmos Coalition, a collaboration with Agile Robots, Black Forest Labs, Generalist, LTX, Runway, and Skild AI to advance open world models. Physical AI developers already building on the platform include Samsung, LG Electronics, Doosan Robotics, and Li Auto.

Benchmark Performance

Among open models, Cosmos 3 ranks first across multiple physical AI benchmarks, including Physics-IQ and PAI-Bench for world generation accuracy, RoboLab and RoboArena for action policy, and the VANTAGE-Bench and TAR leaderboards for vision understanding. Nvidia says the model is designed to reduce physical AI training cycles from months to days by providing a pretrained foundation that requires less data and lowers training costs.
