Together AI at NVIDIA GTC 2026: Explore our latest innovations across research and products

This year, Together AI is excited to be part of NVIDIA GTC with multiple major announcements and conversations shaping the AI ecosystem — from cutting-edge model releases to new voice AI capabilities, and technical sessions with our research and engineering leaders.

If you’re attending GTC, we’d love to connect.

Key announcements

At GTC 2026, several of the announcements we’re participating in highlight a core theme: AI systems are becoming more open, agentic, and production ready. Together AI, the AI Native Cloud, is designed to support this shift — helping developers train, shape, and deploy large-scale AI systems with the performance and cost-efficiency required for real-world applications. We are making multiple announcements today at GTC.

Use NVIDIA Dynamo 1.0 in Together AI

NVIDIA has launched NVIDIA Dynamo 1.0, an open-source software for generative and agentic inference at scale. We are excited to work with NVIDIA on Dynamo 1.0 and have already been using Dynamo as part of our inference stack to deliver more optimized performance in production use cases. At Together AI, we are committed to open innovation and are looking forward to exploring use cases that Dynamo 1.0 can be applied to.

Connect to Together’s high-performance inference through NVIDIA OpenShell

Together AI and NVIDIA are working together on NVIDIA NemoClaw — an open source stack that simplifies running OpenClaw always-on assistants, more safely, with a single command. As part of the NVIDIA Agent Toolkit, it installs the NVIDIA OpenShell runtime—a secure environment for running autonomous agents, and open source models like NVIDIA Nemotron. Together is excited to host NVIDIA OpenShell runtime created for customers who want high performance models to build agents. Together AI has a model library with over 150 optimized models that can now be easily accessed via NemoClaw. Paired with Together’s dedicated endpoints, developers get the speed and cost efficiency of its inference engine at production scale.

Leverage NVIDIA Nemotron 3 Super for multi-agent workflows

NVIDIA Nemotron 3 Super is a hybrid mixture-of-experts model designed for high-performance reasoning and multi-agent workflows. It combines a Mamba-Transformer architecture with a 1M-token context window to support long-horizon reasoning and complex agent interactions. With 120B total parameters (12B active per token), the model is optimized to run multiple collaborating agents efficiently — even on a single GPU — making it well suited for AI-native workflows like software development agents, financial analysis, and cybersecurity automation. Nemotron 3 Super can be deployed through our Dedicated Model Inference, providing developers with a simple and scalable way to run advanced reasoning models in production.

Build voice agents with NVIDIA Parakeet TDT 0.6B V3

As part of our recent voice solutions launch, NVIDIA Parakeet TDT 0.6b V3 automatic speech recognition (ASR) model is now available in the Together AI Model Library, giving developers access to high-performance, low-latency transcription optimized for real-time voice applications. By combining Parakeet’s ASR accuracy with Together’s high-performance inference infrastructure, AI natives can build production-ready voice agents that deliver fast, reliable, and scalable transcription.

Together sessions

The Together AI team, along with customers like Cursor and Decagon, will share insights across multiple GTC sessions, covering topics from production inference to open AI research.

Sessions include:

Engineering real-world LLM inference: Bridging open-source and production systems
March 17 • 2:00 PM
Yineng Zhang — Senior Director, Together AI
Hard-Won Lessons From Production Inference at Scale
March 17 • 4:00 PM
Yuchen Wu, Engineer, Cursor | Ce Zhang — CTO, Together AI
Build Trust and Discovery Through Open-Source AI in Research
March 18 • 2:00 PM
Percy Liang — Co-Founder, Together AI
Under the Hood of Building and Scaling AI-Native Applications
March 18 • 4:00 PM
Alan Yiu, VP of Product, Decagon | Charles Zedlewski — Chief Product Officer, Together AI

Visit us at booth #1213

Beyond sessions, the Together team will be hosting booth activations and side events throughout the week, including curated executive meetups focused on next-generation AI infrastructure and AI-native applications.

Stop by to:

See live demos of Together AI infrastructure and models
Learn how teams are scaling production inference and agentic systems
Meet researchers and engineers building the future of open AI models and infrastructure

Try Nemotron models now on Together AI serverless endpoints: https://www.together.ai/models

Learn more and request a meeting: https://www.together.ai/gtc-san-jose-2026