Qwen 2.5 Coder 32B Instruct
Code
State-of-the-art code LLM with advanced code generation, reasoning, and repair, supporting contexts of up to 128K tokens.
Try our Qwen2.5 API

API Usage
Endpoint
Qwen/Qwen2.5-Coder-32B-Instruct
cURL
curl -X POST "https://api.together.xyz/v1/chat/completions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-Coder-32B-Instruct",
    "messages": [{"role": "user", "content": "What are some fun things to do in New York?"}]
  }'
Python
from together import Together

client = Together()
response = client.chat.completions.create(
    model="Qwen/Qwen2.5-Coder-32B-Instruct",
    messages=[{"role": "user", "content": "What are some fun things to do in New York?"}],
)
print(response.choices[0].message.content)
TypeScript
import Together from "together-ai";

const together = new Together();
const response = await together.chat.completions.create({
  messages: [{ role: "user", content: "What are some fun things to do in New York?" }],
  model: "Qwen/Qwen2.5-Coder-32B-Instruct",
});
console.log(response.choices[0].message.content);
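The endpoint follows the OpenAI-compatible chat-completions shape, so the reply is always read from choices[0].message.content, as in the SDK snippets above. A minimal sketch of building the request payload and extracting the reply; the response stub stands in for a live API call, and fields beyond "choices" are omitted for brevity (real responses also carry "id", "created", "usage", etc.):

```python
import json

# Chat-completion request payload for the OpenAI-compatible endpoint.
payload = {
    "model": "Qwen/Qwen2.5-Coder-32B-Instruct",
    "messages": [
        {"role": "user", "content": "Write a Python function that reverses a string."}
    ],
}

# This is the JSON body the cURL example sends via -d.
body = json.dumps(payload)

# Illustrative response stub (not a real API reply) showing the
# shape the examples above rely on.
response = {
    "choices": [
        {"message": {"role": "assistant", "content": "def reverse(s): return s[::-1]"}}
    ]
}

# Extract the assistant's reply exactly as the SDK examples do.
content = response["choices"][0]["message"]["content"]
print(content)
```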
Model Provider:
Qwen
Type:
Code
Variant:
Coder
Parameters:
32B
Deployment:
✔ Serverless
Quantization:
FP16
Context length:
32,768 tokens
Pricing:
$0.80 per 1M tokens
Run in playground
Deploy model
Quickstart docs
Looking for production scale? Deploy on a dedicated endpoint
Deploy Qwen 2.5 Coder 32B Instruct on a dedicated endpoint with custom hardware configuration, as many instances as you need, and auto-scaling.
