📺 Deep Dive on NVIDIA Blackwell with Dylan Patel (Semianalysis) and Ian Buck (NVIDIA) on 10/1 | RSVP →
⚡ Together Instant Clusters: self-service NVIDIA GPUs, now generally available →
🐋 DeepSeek-V3.1 is now available on Together AI →
🔥 Announcing DeepSWE: our SOTA software engineering agent →
🔒 Together AI achieves SOC 2 Type 2 compliance →
Model Platform

Products

Serverless Inference

API for inference on open-source models

Dedicated Endpoints

Deploy models on custom hardware

Fine-Tuning

Train & improve high-quality, fast models

Evaluations

Measure model quality

Together Chat

Chat app for open-source AI

Code Execution

Code Sandbox

Build AI development environments

Code Interpreter

Execute LLM-generated code

Tools

Which LLM to Use

Find the ‘right’ model for your use case

Models

See all models →

OpenAI gpt-oss
DeepSeek
Qwen3 235B
Llama
Kimi K2
Cogito

GPU Cloud

Clusters of Any Size

Instant Clusters

Ready to use, self-service GPUs

Reserved Clusters

Dedicated capacity, with expert support

Frontier AI Factory

1K → 10K → 100K+ NVIDIA GPUs

Cloud Services

Data Center Locations

Global GPU power in 25+ cities

Slurm

Cluster management system

GPUs

NVIDIA GB200 NVL72
NVIDIA HGX B200
NVIDIA H200
NVIDIA H100

Solutions

Enterprise

Secure, reliable AI infrastructure

Customer Stories

Testimonials from AI pioneers

Why Open Source

How to own your AI

Industries & Use-Cases

Scale your business with Together AI

Customer Stories

How Hedra Scales Viral AI Video Generation with 60% Cost Savings

When Standard Inference Frameworks Failed, Together AI Enabled 5x Performance Breakthrough

Developers

Documentation

Technical docs for using Together AI

Research

Advancing the open-source AI frontier

Model Library

All our open-source models

Cookbooks

Practical implementation guides

Example Apps

Our open-source demo apps

Videos

DeepSeek-R1: How It Works, Simplified!

Together Code Sandbox: How To Build AI Coding Agents

Pricing

Pricing Overview

Our platform & GPU pricing.

Inference

Per-token & per-minute pricing.

Fine-Tuning

LoRA and full fine-tuning pricing.

GPU Clusters

Hourly rates & custom pricing.

Questions?

We’re here to help!

Talk to us →

Company

About us

Get to know us

Values

Our approach to open-source AI

Team

Meet our leadership

Careers

Join our mission

Resources

Blog

Our latest news & blog posts

Research

Advancing the open-source AI frontier

Events

Explore our events calendar

Knowledge Base

Find answers to your questions

Featured Blog Posts

Together AI Delivers Top Speeds for DeepSeek-R1-0528 Inference on NVIDIA Blackwell

Powering Secure AI: Together AI Achieves SOC 2 Type 2 Compliance

Sign In
Contact Sales
Chat
Docs
Blog
Support
Contact Sales

Subscribe to newsletter

  • Products
  • Solutions
  • Research
  • Blog
  • About
  • Pricing
  • Contact
  • Support
  • Status
  • Trust Center

© 2025 Together AI, San Francisco, CA 94114

  • Consent Preferences
  • Cookie Policy
  • Privacy policy
  • Terms of service
Together.ai