Nvidia releases Nemotron 3 Ultra, its largest open AI model

Nvidia ↗2.95% released Nemotron 3 Ultra on Wednesday, a hybrid Transformer-Mamba model with 1 million tokens of context for sustained agent reasoning.lmsys
The model claims to deliver 5x faster inference than comparable open models, according to Nvidia, and scores as the top U.S. open-weight model on Artificial Analysis benchmarks.artificialanalysis
Enterprise partners including Glean, CrowdStrike, and Palantir Technologies ↘1.65% are integrating the model for cybersecurity, knowledge work, and agentic workflows.nvidia

Nvidia Releases Nemotron 3 Ultra, a 550-Billion-Parameter Open Model for Agentic AI

Nvidia on Wednesday released Nemotron 3 Ultra, its largest open-weight AI model to date, completing the Nemotron 3 family first announced in December 2025. The 550-billion-parameter Mixture-of-Experts model activates 55 billion parameters per token and is designed to power long-running autonomous AI agents across coding, research, and enterprise workflows.lmsys

The model became available June 4 via Hugging Face, ModelScope, OpenRouter, and Nvidia’s build.nvidia.com platform as NIM microservices, following its unveiling by CEO Jensen Huang during his keynote at GTC Taipei on May 31.nvidia

Architecture and Performance

Nemotron 3 Ultra uses a hybrid Transformer-Mamba architecture that combines traditional attention mechanisms with state-space models optimized for long sequences. The design supports a context window of up to 1 million tokens, allowing agents to sustain reasoning across extended multi-step tasks without losing coherence.youtube

Nvidia claims the model delivers up to 5x faster inference and up to 30% lower cost compared with open frontier models in its class. According to benchmark platform Artificial Analysis, Nemotron 3 Ultra scores 48 on its Intelligence Index, making it the most capable open-weight model from a U.S. lab, though it still trails some Chinese open-weight competitors.artificialanalysis

The model has been post-trained with multi-environment reinforcement learning and is optimized for agent platforms including Hermes Agent, LangChain Deep Agents, OpenHands, and OpenCode.lmsys

Enterprise Adoption and Ecosystem

The release was accompanied by a wave of enterprise integrations. Glean announced support for Nemotron 3 Ultra, stating the model delivers “91% of frontier-model performance” for everyday enterprise agentic work. Aible, CrowdStrike, and Palantir are also integrating Nemotron models into their platforms for use cases spanning knowledge work, cybersecurity, and operational decision-making.x

SGLang and Miles announced day-zero inference support, providing developers with a high-performance serving stack on Blackwell GPUs in both BF16 and NVFP4 precision. Amazon Web Services has previously made other Nemotron 3 family models available on SageMaker JumpStart with one-click deployment.amazon

The release positions Nvidia not just as the dominant AI hardware supplier but as an increasingly competitive model developer. “The world’s software leaders are bringing AI agents into the systems where work gets done,” Huang said at GTC Taipei, framing agents as “digital coworkers” that “amplify human expertise.”nvidia