Qwen
Deploy Qwen3 and QwQ models on Together AI. Hybrid reasoning, agentic coding, and OpenAI-compatible API — open source under Apache 2.0.
Why Qwen on Together AI?
Designed for production workloads that need consistent performance and operational control.
Drop-in OpenAI replacement
Same API format, hybrid thinking mode, and multilingual support. Migrate existing OpenAI integrations by changing only the base URL and API key.
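As a minimal sketch of what "drop-in" means here: Together AI serves an OpenAI-compatible chat completions endpoint at `https://api.together.xyz/v1`, so a request is just the familiar OpenAI JSON body sent to a different base URL. The model ID below is illustrative; check the model catalog for the exact identifier you want.

```python
import json
import os
import urllib.request

# OpenAI-compatible base URL for Together AI.
BASE_URL = "https://api.together.xyz/v1"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completions request against Together AI."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The key is read from the environment; set TOGETHER_API_KEY first.
            "Authorization": f"Bearer {os.environ.get('TOGETHER_API_KEY', '')}",
        },
        method="POST",
    )

# Example model ID for illustration only.
req = build_chat_request("Qwen/Qwen3-235B-A22B", "Explain hybrid thinking mode.")

# Sending the request requires a valid API key:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

An existing OpenAI SDK client works the same way: point `base_url` at the endpoint above and keep the rest of your code unchanged.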
From edge to frontier, one family
Models spanning sub-1B to 480B+ parameters with adaptive scaling for every use case and budget.
Open source, enterprise licensed
Apache 2.0 licensing gives you full commercial freedom. SOC 2 Type II certified, HIPAA compliant, US-based infrastructure.
Meet the Qwen family
Explore top-performing models across text, image, video, code, and voice.
Deployment options
Choose the deployment option that fits your latency requirements, traffic patterns, and desired level of infrastructure control.
Real-time
A fully managed inference API that automatically scales with request volume.
Batch
Process massive workloads of up to 30 billion tokens asynchronously, at up to 50% less cost.
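A hypothetical sketch of preparing a batch job: asynchronous batch APIs typically take a JSONL file with one request per line. The layout below follows the common OpenAI-style batch format (`custom_id`, `method`, `url`, `body`); the exact schema the Batch API expects may differ, so treat the field names as assumptions and consult the API docs.

```python
import json

def build_batch_lines(prompts, model="Qwen/Qwen3-235B-A22B"):
    """Serialize a list of prompts as JSONL batch requests.

    Each line is an independent chat completions request tagged with a
    custom_id so results can be matched back to inputs after the
    asynchronous job completes. Field names follow the OpenAI-style
    batch format and are assumptions, not a confirmed schema.
    """
    lines = []
    for i, prompt in enumerate(prompts):
        lines.append(json.dumps({
            "custom_id": f"req-{i}",
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }))
    return "\n".join(lines)

jsonl = build_batch_lines(["Summarize document A.", "Summarize document B."])
# Write `jsonl` to a file, upload it to create the batch job, then poll
# for completion and download the results.
```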
Dedicated Model Inference
An inference endpoint backed by reserved, isolated compute resources and the Together AI inference engine.
Dedicated Container Inference
Run inference with your own engine and model on fully-managed, scalable infrastructure.