Black Forest Labs

Deploy FLUX image generation models from Black Forest Labs on Together AI. State-of-the-art photorealistic output, open weights, and a production-ready API.

Why Black Forest Labs on Together AI?

Designed for production workloads that need consistent performance and operational control.

State-of-the-art image generation

FLUX leads on photorealism, text rendering, and prompt precision. The same model family powering production image workloads across the industry.

Full control over deployment

Open weights give you complete flexibility — fine-tune FLUX for your style, run it on your own infrastructure, or scale instantly via Together AI's API. No lock-in.

Production workloads at any scale

SOC 2 Type II certified, HIPAA compliant, and deployed on US-based infrastructure. Consistent throughput for high-volume generation pipelines.

Meet the Black Forest Labs family

Explore the FLUX image generation models available on Together AI.

  • FLUX.2 [pro] — new

  • FLUX.1 Canny [pro]

  • FLUX.1 Kontext [dev]

  • FLUX.2 [dev]

  • FLUX.2 [flex] — new

  • FLUX.2 [max]

  • FLUX.1 [schnell] Free

  • FLUX.1 [pro]

  • FLUX.1 Krea [dev]

  • FLUX.1 Kontext [pro]

  • FLUX1.1 [pro]

  • FLUX.1 Kontext [max]

  • FLUX.1 Schnell [fixedres]

  • FLUX.1 [dev]

  • FLUX.1 [schnell]

Deployment options

Run models using different deployment options depending on latency needs, traffic patterns, and infrastructure control.

Serverless Inference

Real-time

A fully managed inference API that automatically scales with request volume.

Best for

Variable or unpredictable traffic

Rapid prototyping and iteration

Cost-sensitive or early-stage production workloads
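A Serverless Inference call for a FLUX model is a single authenticated HTTP request. The sketch below assembles such a request against Together AI's image generation endpoint; the endpoint URL, model id, image dimensions, and step count are illustrative assumptions, not guaranteed values — check the model page and API reference before use.

```python
import os

# Minimal sketch of a Serverless Inference request for a FLUX model.
# ASSUMPTIONS: endpoint URL, model id, and default parameters below are
# illustrative -- verify them against the Together AI API reference.
API_URL = "https://api.together.xyz/v1/images/generations"

def build_flux_request(prompt: str,
                       model: str = "black-forest-labs/FLUX.1-schnell",
                       width: int = 1024,
                       height: int = 768,
                       steps: int = 4) -> dict:
    """Assemble the JSON payload for one image generation request."""
    return {
        "model": model,
        "prompt": prompt,
        "width": width,
        "height": height,
        "steps": steps,
        "n": 1,
    }

payload = build_flux_request("a lighthouse at dusk, photorealistic")
headers = {"Authorization": f"Bearer {os.environ.get('TOGETHER_API_KEY', '')}"}
# To actually send it (requires a valid key and the `requests` package):
#   import requests
#   resp = requests.post(API_URL, json=payload, headers=headers)
#   image_url = resp.json()["data"][0]["url"]
```

Because serverless endpoints scale with request volume, the same payload works unchanged whether you send one request or thousands.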

Batch

Process massive workloads of up to 30 billion tokens asynchronously, at up to 50% lower cost than real-time inference.

Best for

Classifying large datasets

Offline summarization

Synthetic data generation
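Batch workloads are typically submitted as a JSONL file, one request per line, and processed asynchronously. The sketch below assembles such a file; the exact field names (`custom_id`, `body`) follow the common OpenAI-style batch convention and are an assumption here — consult the Batch API documentation for the authoritative schema.

```python
import json

# Hedged sketch: build an OpenAI-style JSONL batch file of image
# generation requests. ASSUMPTION: the "custom_id"/"body" line schema
# is illustrative -- check the Batch API docs before submitting.
def build_batch_lines(prompts, model="black-forest-labs/FLUX.1-schnell"):
    """Return one JSONL line per prompt, each tagged with a custom_id
    so results can be matched back to inputs after the batch completes."""
    lines = []
    for i, prompt in enumerate(prompts):
        lines.append(json.dumps({
            "custom_id": f"img-{i}",
            "body": {"model": model, "prompt": prompt, "n": 1},
        }))
    return lines

lines = build_batch_lines(["a red fox in snow", "a cabin at night"])
# Write to disk, then upload the file when creating the batch job:
# with open("batch.jsonl", "w") as f:
#     f.write("\n".join(lines))
```

Tagging each line with a `custom_id` matters because batch results may return out of order.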

Dedicated Model Inference

An inference endpoint backed by reserved, isolated compute resources and the Together AI inference engine.

Best for

Predictable or steady traffic

Latency-sensitive applications

High-throughput production workloads

Dedicated Container Inference

Run inference with your own engine and model on fully managed, scalable infrastructure.

Best for

Generative media models

Non-standard runtimes

Custom inference pipelines