📺 Deep Dive on NVIDIA Blackwell with Dylan Patel (Semianalysis) and Ian Buck (NVIDIA) on 10/1 | RSVP →
⚡ Together Instant Clusters: self-service NVIDIA GPUs, now generally available →
🐋 DeepSeek-V3.1 is now available on Together AI →
🔥 Announcing DeepSWE: our SOTA software engineering agent →
🔒 Together AI achieves SOC 2 Type 2 compliance →
Model Platform

Products

Serverless Inference

API for inference on open-source models

Dedicated Endpoints

Deploy models on custom hardware

Fine-Tuning

Train & improve high-quality, fast models

Evaluations

Measure model quality

Together Chat

Chat app for open-source AI

Code Execution

Code Sandbox

Build AI development environments

Code Interpreter

Execute LLM-generated code

Tools

Which LLM to Use

Find the ‘right’ model for your use case

Models

See all models →

OpenAI gpt-oss
DeepSeek
Qwen3 235B
Llama
Kimi K2
Cogito

GPU Cloud

Clusters of Any Size

Instant Clusters

Ready to use, self-service GPUs

Reserved Clusters

Dedicated capacity, with expert support

Frontier AI Factory

1K → 10K → 100K+ NVIDIA GPUs

Cloud Services

Data Center Locations

Global GPU power in 25+ cities

Slurm

Cluster management system

GPUs

NVIDIA GB200 NVL72
NVIDIA HGX B200
NVIDIA H200
NVIDIA H100

Solutions

Enterprise

Secure, reliable AI infrastructure

Customer Stories

Testimonials from AI pioneers

Why Open Source

How to own your AI

Industries & Use-Cases

Scale your business with Together AI

Customer Stories

How Hedra Scales Viral AI Video Generation with 60% Cost Savings

When Standard Inference Frameworks Failed, Together AI Enabled 5x Performance Breakthrough

Developers

Documentation

Technical docs for using Together AI

Research

Advancing the open-source AI frontier

Model Library

All our open-source models

Cookbooks

Practical implementation guides

Example Apps

Our open-source demo apps

Videos

DeepSeek-R1: How It Works, Simplified!

Together Code Sandbox: How To Build AI Coding Agents

Pricing

Pricing Overview

Our platform & GPU pricing.

Inference

Per-token & per-minute pricing.

Fine-Tuning

LoRA and full fine-tuning pricing.

GPU Clusters

Hourly rates & custom pricing.

Questions?

We’re here to help!

Talk to us →

Company

About us

Get to know us

Values

Our approach to open-source AI

Team

Meet our leadership

Careers

Join our mission

Resources

Blog

Our latest news & blog posts

Research

Advancing the open-source AI frontier

Events

Explore our events calendar

Knowledge Base

Find answers to your questions

Featured Blog Posts

Together AI Delivers Top Speeds for DeepSeek-R1-0528 Inference on NVIDIA Blackwell

Powering Secure AI: Together AI Achieves SOC 2 Type 2 Compliance

Sign In
Contact Sales
Chat
Docs
Blog
Support
Contact Sales

Subscribe to newsletter

  • Products
  • Solutions
  • Research
  • Blog
  • About
  • Pricing
  • Contact
  • Support
  • Status
  • Trust Center

© 2025 Together AI, San Francisco, CA 94114

  • Consent Preferences
  • Cookie Policy
  • Privacy policy
  • Terms of service
Together.ai