Contact Together AI

What would you like to do?

Contact Sales

For inquiries about our products and solutions, connect with the Sales Team

Help Center

Check out our constantly expanding knowledge base and get expert help from our Support Team

© 2025 San Francisco, CA 94114

Together.ai