Fine-Tuning

Fine-tune open-source models for real production use

Improve accuracy, reduce hallucinations, and control behavior — without managing training infrastructure.

Why fine-tune models with Together AI?

Build models that are faster, more accurate, and fully yours

Reliable infrastructure at any scale

Multi-node orchestration that eliminates job failures. Fine-tune 100B+ models (DeepSeek-V3, Qwen3-235B) that break other platforms, with the reliability to experiment rapidly.

Research-driven performance gains

ML systems research built into every job. Train with 2-4x longer contexts at no extra cost, advanced DPO variants from SOTA recipes, and continuous optimizations that make your runs faster over time.

Universal model compatibility

Fine-tune any open-source model from Hugging Face Hub. No vendor lock-in, no format conversions — seamless integration with your existing workflows.

Fine-tune leading models

Explore top-performing models across text, image, video, code, and voice.

  • Image: Nano Banana Pro (Gemini 3 Pro Image) (new)
  • Chat: GLM-5 (new)
  • Chat: Kimi K2.5 (new)
  • Chat: gpt-oss-120B (new)
  • Code: DeepSeek-V3.2-Exp (new)
  • Chat: Ministral 3 8B Instruct 2512 (new)
  • Audio: MiniMax Speech 2.6 Turbo (new)
  • Chat: LFM2 24B A2B (new)
  • Code: Qwen3-Coder-Next (new)
  • Image: Wan 2.6 Image (new)
  • Image: GPT Image 1.5 (new)
  • Chat: Gemma 3 27B
  • Chat: Llama 4 Maverick
  • Chat: Qwen3 235B A22B Instruct 2507 FP8
  • Video: Google Veo 3.0
  • Image: FLUX.2 [pro]
  • Chat: NIM Llama 3.1 Nemotron 70B Instruct
  • Chat: Kimi K2 Instruct-0905 (new)
  • Video: Sora 2 Pro (new)
  • Chat: Arcee AI Trinity Mini (new)

Fine-tuning options

Choose how fine-tuned models are trained and hosted based on dataset size, cost, and control.

  • LoRA fine-tuning

    Lightweight fine-tuning for fast iteration and lower cost.

    Best for
    Small to medium datasets
    Fast training & deployment
    Easy to update or roll back
    Get started
  • Full fine-tuning

    Train the entire model for maximum control and quality.

    Best for
    Large or complex datasets
    Deeper behavior changes
    Dedicated infrastructure
    Get started
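As a rough sketch of the LoRA path: assemble a small chat-format dataset as JSONL, then launch a job. The "messages" schema and the SDK call names shown in the trailing comment (`Together`, `files.upload`, `fine_tuning.create`, `lora=True`) are assumptions modeled on the public Python SDK; verify them against the current docs before relying on them.

```python
import json

# A minimal chat-format training set, one JSON object per line (JSONL).
# The "messages" layout is an assumption based on common chat fine-tuning
# formats; check the platform docs for the exact schema.
records = [
    {"messages": [
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Open Settings > Security and choose Reset password."},
    ]},
    {"messages": [
        {"role": "user", "content": "Where can I download invoices?"},
        {"role": "assistant", "content": "Billing > Invoices lists every invoice as a PDF."},
    ]},
]

# Serialize to JSONL for upload.
with open("train.jsonl", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")

# Launching a LoRA job would then look roughly like this (hypothetical
# sketch; requires the `together` SDK and an API key):
#
#   from together import Together
#   client = Together()
#   uploaded = client.files.upload(file="train.jsonl", purpose="fine-tune")
#   job = client.fine_tuning.create(
#       training_file=uploaded.id,
#       model="meta-llama/Llama-3.1-8B-Instruct",  # any supported base model
#       lora=True,  # LoRA adapter training instead of full fine-tuning
#   )
```

Because LoRA trains a small adapter rather than the whole model, the resulting artifact is cheap to swap or roll back, which is why it suits fast iteration.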

Everything you need to fine-tune at scale

Fine-tune any open-source model on your data. Deploy securely onto scalable infrastructure.

    • Large frontier model support

      100B+ param models
      Multi-GPU training
      Faster training

      Fine-tune large open-source models like Kimi-K2 and GLM-4.7 for tool use, reasoning, and agentic tasks. Drive advanced model behavior through a single API without managing underlying training infrastructure.

    • Vision fine-tuning

      LoRA & full fine-tuning
      PNG, JPEG, WEBP
      Deploy instantly

      Train vision models directly on raw image data without format changes or special preprocessing. Include images alongside text to fine-tune Llama-4, Qwen3-VL, and Gemma-3 via standard APIs.

    • Tool-calling training

      Use existing agent logs
      Native function calling

      Train models for precise tool execution by integrating function definitions and tool calls directly into datasets. Process existing agent logs as-is to improve accuracy without manually restructuring data.

    • Cost estimation

      See cost estimates
      No surprises

      Estimate training costs before launching any job directly from the UI or CLI. Evaluate resource requirements upfront to eliminate budget surprises.
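To make the "images alongside text" point concrete, a single multimodal training record might look like the sketch below. The content-part layout (`type: "image_url"` with a base64 data URL) is an assumed OpenAI-style schema, and the image bytes are a placeholder, not a real PNG:

```python
import base64
import json

# Placeholder bytes standing in for a real PNG/JPEG/WEBP file.
fake_png = b"\x89PNG\r\n\x1a\nplaceholder-bytes"
encoded = base64.b64encode(fake_png).decode("ascii")

# One hypothetical multimodal training record: the image travels inline
# with the text as a base64 data URL.
record = {
    "messages": [
        {"role": "user", "content": [
            {"type": "text", "text": "What defect is visible on this part?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{encoded}"}},
        ]},
        {"role": "assistant",
         "content": "A hairline crack near the left mounting hole."},
    ]
}

# One line of the JSONL training file.
line = json.dumps(record)
```

Encoding images into the record itself is what lets training run on raw files with no separate preprocessing pipeline.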
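The tool-calling bullet above can be illustrated with one hypothetical training record that keeps the function definition, the assistant's tool call, and the tool result together, roughly the shape an agent log already has. Field names follow the common OpenAI-style schema and should be verified against the platform's dataset documentation:

```python
import json

# Illustrative tool-calling training record (assumed OpenAI-style fields).
record = {
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_order_status",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }],
    "messages": [
        {"role": "user", "content": "Where is order 4417?"},
        # The assistant emits a structured tool call instead of text.
        {"role": "assistant", "content": None, "tool_calls": [{
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_order_status",
                         "arguments": json.dumps({"order_id": "4417"})},
        }]},
        # The tool's result is fed back under the matching call id.
        {"role": "tool", "tool_call_id": "call_1",
         "content": json.dumps({"status": "shipped"})},
        {"role": "assistant", "content": "Order 4417 has shipped."},
    ],
}

# One line of the JSONL training file.
line = json.dumps(record)
```

Training on whole interaction traces like this is what teaches the model when to call a tool, with which arguments, and how to use the result.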

Powered by leading research

Our fine-tuning infrastructure is built on research and optimized for scale, efficiency, and production performance.

  • UPipe

    82.5% less memory

    Long-context training hits a memory wall at the attention layer. UPipe processes attention heads in smaller chunks, cutting peak activation memory by up to 82.5% — enabling 5M token context lengths on a single 8×H100 node.

    (Chart: UPipe vs. other SOTA approaches, FPDT and ALST, on long-context training.)

    Learn more

  • FFT Optimizer

    25% less memory

    Fine-tuning large models is memory-hungry. Our FFT-based optimizer replaces expensive SVD projections with fast Fourier transforms, reducing optimizer memory by up to 25% with no loss in training quality.

    (Chart: FFT Optimizer results, Together AI (DCT) vs. baseline (LD).)

    Learn more

Advanced model shaping capabilities

For teams pushing models beyond standard fine-tuning

  • Speculative decoding

    Accelerate inference with custom speculative decoding, training lightweight draft models to predict multiple tokens.

  • Quantization

    Compress fine-tuned models to lower-precision formats to reduce memory footprint and accelerate serving while preserving output quality.

  • Reinforcement learning

    Leverage PyTorch-based reinforcement learning to shape model policies for reasoning, tool use, and long-horizon agentic behavior.
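The speculative-decoding idea from the list above can be shown with a deliberately tiny sketch: a cheap "draft" model proposes several tokens at once and the large "target" model verifies them, keeping only the prefix it agrees with. Both models here are plain lookup tables standing in for neural networks; real systems sample probabilistically rather than matching exactly.

```python
# Toy next-token tables: the draft and target models mostly agree,
# but diverge after the token "on".
draft_next = {"the": "cat", "cat": "sat", "sat": "on", "on": "a"}
target_next = {"the": "cat", "cat": "sat", "sat": "on", "on": "the"}

def speculate(prompt, k=4):
    """Draft up to k tokens, then keep only the prefix the target confirms."""
    # Phase 1: the cheap draft model proposes k tokens in a row.
    drafted, tok = [], prompt
    for _ in range(k):
        tok = draft_next.get(tok)
        if tok is None:
            break
        drafted.append(tok)
    # Phase 2: the target model checks each drafted token; the first
    # disagreement ends the accepted run.
    accepted, prev = [], prompt
    for tok in drafted:
        if target_next.get(prev) != tok:
            break
        accepted.append(tok)
        prev = tok
    return accepted

result = speculate("the")  # the target accepts "cat sat on" and rejects "a"
```

The speedup in real systems comes from the target model verifying all drafted tokens in one forward pass instead of generating them one at a time.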

Production-grade security and data privacy

We take security and compliance seriously, with strict data privacy controls to keep your information protected. Your data and models remain fully under your ownership, safeguarded by robust security measures.

Learn More

SOC 2 Type II. HIPAA-aligned options available. Encryption in transit and at rest. Deploy storage in regions matching your data residency requirements—North America, Europe, or Asia/Middle East based on your compliance needs.

  • NVIDIA preferred partner
  • AICPA SOC 2 Type II

Customers running inference in production

  • 2-3x

    Cost savings

  • 13%

    Better accuracy

"Together AI does for fine-tuning and inference what Vercel does for LLM-based apps—it removes the infrastructure layer so we can focus on our product. We fine‑tune and deploy customer‑specific models through simple API calls. That lets our existing team move from weekly to daily iteration, cut costs by 2–3×, and improve accuracy from 77% to 87%."

Lamara De Brouwer

Co-Founder & CTO, XY.AI Labs

"The technical challenge was running our multi-stage pipeline reliably at the conversation lengths our therapy models require," explains Daniel Cahn. "Together's platform eliminated the context length constraints and job failures we hit elsewhere, letting us experiment rapidly."

Daniel Cahn

Co-founder & CEO, Slingshot AI

"After thoroughly evaluating multiple LLM infrastructure providers, we’re thrilled to be partnering with Together AI for fine-tuning. The new ability to resume from a checkpoint combined with LoRA serving has enabled our customers to deeply tune our foundation model, ShieldLlama, for their enterprise’s precise risk posture. The level of accuracy would never be possible with vanilla open source or prompt engineering."

Alex Chung

Founder, Protege AI