NVIDIA Launches Nemotron 3: Open Models for Agentic AI
NVIDIA today announced the Nemotron™ 3 family of open models, available in Nano, Super, and Ultra sizes, a new, efficient family designed for building agentic AI applications. These models use a hybrid latent mixture-of-experts (MoE) architecture to address challenges in multi-agent systems, such as communication overhead and inference costs. Notably, Nemotron 3 Nano delivers four times the throughput of its predecessor, Nemotron 2 Nano, and maximizes tokens per second at scale. The release pairs state-of-the-art open models with new datasets and reinforcement learning environments intended to power transparent and efficient AI agent development across industries.

Nemotron 3 Family and Key Features

NVIDIA has introduced the Nemotron 3 family of open models, available in Nano, Super, and Ultra sizes, designed to improve efficiency and accuracy in agentic AI applications. Nemotron 3 Nano, a 30-billion-parameter model, activates up to 3 billion parameters per token for targeted tasks. It delivers 4x higher throughput than Nemotron 2 Nano and offers a 1-million-token context window, improving accuracy and the ability to connect information across complex, multistep tasks. It is currently available on Hugging Face.

Nemotron 3 Super, with approximately 100 billion parameters and up to 10 billion active per token, excels in applications that require many collaborating agents with low latency. The largest model, Nemotron 3 Ultra, has about 500 billion parameters and activates up to 50 billion per token, functioning as an advanced reasoning engine. Both Super and Ultra are trained with NVIDIA's ultra-efficient 4-bit NVFP4 format on the Blackwell architecture, reducing memory demands and accelerating training.

To support the Nemotron 3 family, NVIDIA released three trillion tokens of new pretraining, post-training, and reinforcement learning datasets, providing rich examples for creating domain-specialized agents.
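The "active parameters per token" figures above reflect how a mixture-of-experts layer routes each token through only a small subset of experts. As a rough illustration only (not NVIDIA's actual hybrid latent MoE architecture), a generic top-k MoE routing step can be sketched as:

```python
import numpy as np

def moe_layer(x, expert_weights, router_weights, top_k=2):
    """Illustrative top-k MoE routing: only top_k experts run per token.

    x: (d,) token hidden state
    expert_weights: list of (d, d) matrices, one per expert
    router_weights: (num_experts, d) router projection
    """
    logits = router_weights @ x                 # score every expert for this token
    top = np.argsort(logits)[-top_k:]           # pick the top_k experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                        # softmax over the selected experts
    # Only the selected experts' parameters are "active" for this token.
    return sum(g * (expert_weights[i] @ x) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
d, num_experts = 8, 16
experts = [rng.normal(size=(d, d)) for _ in range(num_experts)]
router = rng.normal(size=(num_experts, d))
y = moe_layer(rng.normal(size=d), experts, router, top_k=2)
# With top_k=2 of 16 experts, only 1/8 of the expert parameters run per token,
# which is the same idea behind "30B total, up to 3B active".
```

The total parameter count grows with the number of experts, but per-token compute is governed by the few experts actually selected, which is why a 30-billion-parameter model can run at roughly the cost of a 3-billion-parameter one.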
Additionally, tools such as NeMo Gym, NeMo RL, and NeMo Evaluator are available on GitHub and Hugging Face, accelerating development and enabling validation of model safety and performance.

Model Sizes: Nano, Super, and Ultra

The Nemotron 3 family comes in three sizes, each targeting a different class of agentic workload. Nemotron 3 Nano is a 30-billion-parameter model activating up to 3 billion parameters per token, optimized for efficient tasks such as software debugging and content summarization. Nemotron 3 Super has approximately 100 billion parameters with up to 10 billion active per token and excels in multi-agent applications. Nemotron 3 Ultra is a 500-billion-parameter reasoning engine for complex AI workflows.

Nemotron 3 Nano delivers 4x higher throughput than Nemotron 2 Nano and reduces reasoning-token generation by up to 60%, lowering inference costs. Its 1-million-token context window improves accuracy and the ability to connect information across lengthy, multistep tasks. Artificial Analysis ranked it as the most open and efficient model of its size, with leading accuracy. The Super and Ultra models are trained with NVIDIA's ultra-efficient 4-bit NVFP4 format on the Blackwell architecture, cutting memory requirements and speeding up training.

Developers can select the appropriately sized model from the family, scaling from dozens to hundreds of agents, which supports faster, more accurate long-horizon reasoning for complex workflows. The models are also designed to optimize tokenomics: tasks can be routed between frontier-level models and Nemotron so that agents get both intelligence and efficiency.

Efficiency and Performance Improvements

The Nemotron 3 family introduces significant efficiency improvements for building agentic AI. Notably, Nemotron 3 Nano delivers 4x higher throughput than Nemotron 2 Nano, maximizing tokens per second for multi-agent systems.
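The routing-for-tokenomics idea, sending cheap, high-volume work to an efficient open model and reserving a frontier model for hard reasoning, can be sketched as a simple dispatcher. Everything here is hypothetical: the model names, prices, and difficulty heuristic are illustrative, not part of NVIDIA's release.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_1k_tokens: float  # illustrative pricing, not real rates

# Hypothetical endpoints: a small open model and an expensive frontier model.
SMALL = Model("nemotron-3-nano", 0.02)
FRONTIER = Model("frontier-model", 0.60)

def route(task: str, difficulty: float) -> Model:
    """Send easy, high-volume tasks to the efficient open model and
    reserve the costly frontier model for hard, long-horizon reasoning."""
    return SMALL if difficulty < 0.7 else FRONTIER

print(route("summarize this log file", difficulty=0.2).name)        # nemotron-3-nano
print(route("multi-step incident root cause", difficulty=0.9).name)  # frontier-model
```

In practice the difficulty signal might come from a classifier or from task metadata; the point is only that per-task routing lets an agent fleet trade cost against capability.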
This is achieved through a hybrid mixture-of-experts (MoE) architecture and is further enhanced by reducing reasoning-token generation by up to 60%, which substantially lowers inference costs for applications such as software debugging and content summarization. The three sizes offer varying total and active parameter counts; the Nano model, with 30 billion parameters activating up to 3 billion at a time, is particularly compute-cost-efficient. NVIDIA's ultra-efficient 4-bit NVFP4 training format on the Blackwell architecture also reduces memory requirements and speeds up training for the Super and Ultra models without compromising accuracy. Independent benchmarking by Artificial Analysis ranked Nemotron 3 Nano as the most open and efficient model of its size, with leading accuracy. The family's efficiency is bolstered by new open resources: three trillion tokens of pretraining, post-training, and reinforcement learning data, plus libraries such as NeMo Gym and NeMo RL, all released to accelerate AI agent customization and development.

New Tools and Data for Agent Customization

NVIDIA is providing new tools and data for customizing AI agents alongside the Nemotron 3 release. This includes three trillion tokens of new datasets for pretraining, post-training, and reinforcement learning, providing rich examples for building specialized agents.
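As a back-of-the-envelope illustration of why a 4-bit format matters at these scales (parameter counts from above; real training memory also includes optimizer state, gradients, and activations, which this sketch ignores):

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory footprint of the model weights alone."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

# Weights for a ~500B-parameter model (Nemotron 3 Ultra scale):
bf16 = weight_memory_gb(500, 16)  # 16-bit baseline
fp4 = weight_memory_gb(500, 4)    # 4-bit NVFP4
print(bf16, fp4)  # 1000.0 250.0 -> a 4x reduction in weight memory
```

The 4x shrink in weight storage is what lets larger models fit in the same GPU memory budget and cuts the bandwidth each training step must move.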
The Nemotron Agentic Safety Dataset offers real-world telemetry for evaluating and strengthening agent safety. These resources aim to empower developers to create highly capable, domain-specific AI. To accelerate development, NVIDIA released the NeMo Gym and NeMo RL open-source libraries, offering training environments and post-training foundations for Nemotron models. Complementing these is NeMo Evaluator, used to validate model safety and performance. These tools and datasets are available on GitHub and Hugging Face, promoting open innovation and accessibility for developers building agentic AI systems.

The Nemotron 3 family itself, in Nano, Super, and Ultra sizes, suits different workloads: Nano, a 30-billion-parameter model, activates up to 3 billion parameters for efficient tasks, while the larger Super and Ultra are trained with NVIDIA's ultra-efficient 4-bit NVFP4 format on the Blackwell architecture, cutting memory requirements and accelerating training without sacrificing accuracy.

Applications and Industry Integration

NVIDIA's Nemotron 3 family is designed to power agentic AI development across industries, with sizes ranging from 30 billion to 500 billion parameters. The models address multi-agent challenges such as communication overhead and inference costs. Nemotron 3 Nano delivers up to 4x higher throughput than Nemotron 2 Nano while reducing reasoning-token generation by up to 60%, increasing efficiency and scalability for tasks like software debugging and content summarization. Early adopters, including Accenture, Deloitte, and ServiceNow, are integrating Nemotron 3 models into their AI workflows across manufacturing, cybersecurity, and other sectors.
Developers can now route tasks between Nemotron and proprietary models to optimize both intelligence and cost, using the open models for efficient tasks and more powerful models when needed. This approach aims to accelerate innovation from prototyping to enterprise deployment, particularly for startups in programs like NVIDIA Inception. NVIDIA has also released a suite of open tools and datasets, including three trillion tokens for training and reinforcement learning, along with libraries like NeMo Gym and NeMo RL. These resources help developers build and customize specialized AI agents while prioritizing safety through datasets such as the Nemotron Agentic Safety Dataset. Support for tools like LM Studio and integration with platforms like Prime Intellect aim to further accelerate development and access to powerful reinforcement learning training.

Source: https://nvidianews.nvidia.com/news/nvidia-debuts-nemotron-3-family-of-open-models?ncid=so-nvsh-759425&es_id=1f7cc5dac9
