Qwen2.5 7B Instruct Turbo
LLM
Instruction-tuned Qwen2.5 causal language model with 7.61B parameters and a 131K-token context window, built on a transformer architecture with RoPE, SwiGLU, RMSNorm, and attention QKV bias.

API Usage
Endpoint
Qwen/Qwen2.5-7B-Instruct-Turbo
RUN INFERENCE (cURL)
curl -X POST "https://api.together.xyz/v1/chat/completions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen2.5-7B-Instruct-Turbo",
    "messages": [{"role": "user", "content": "What are some fun things to do in New York?"}]
  }'
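For readers who prefer not to install an SDK, the cURL request above can be mirrored with Python's standard library. The following is a minimal sketch, not an official client: the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the assumption that the raw JSON body exposes `choices[0].message.content` is inferred from the SDK access path used elsewhere on this page.

```python
import json
import urllib.request

API_URL = "https://api.together.xyz/v1/chat/completions"
MODEL = "Qwen/Qwen2.5-7B-Instruct-Turbo"


def build_chat_request(prompt, api_key):
    """Build (but do not send) the same POST request as the cURL example."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the reply text.

    Assumption: the JSON body mirrors the SDK objects, so the text lives at
    choices[0].message.content.
    """
    with urllib.request.urlopen(request) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Separating construction from sending keeps the payload easy to inspect or log before any network call is made.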
RUN INFERENCE (Python)
from together import Together

# The client reads the API key from the TOGETHER_API_KEY environment variable.
client = Together()

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct-Turbo",
    messages=[{"role": "user", "content": "What are some fun things to do in New York?"}],
)

# Print the text of the first completion choice.
print(response.choices[0].message.content)
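The example above sends a single user message. A minimal sketch of how the same call might be wrapped for system prompts and multi-turn history follows. The helpers `build_messages` and `ask` are hypothetical names introduced here for illustration, and the `system` and `assistant` roles are assumed from the common chat-completions message format rather than taken from this page.

```python
MODEL = "Qwen/Qwen2.5-7B-Instruct-Turbo"


def build_messages(user_prompt, system_prompt=None, history=None):
    """Assemble a chat message list: optional system prompt, prior turns, new user turn."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    # history is a list of earlier {"role": ..., "content": ...} dicts, oldest first.
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_prompt})
    return messages


def ask(client, user_prompt, system_prompt=None, history=None):
    """Send one turn through a Together-style client and return the reply text."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=build_messages(user_prompt, system_prompt, history),
    )
    return response.choices[0].message.content
```

With the SDK installed, `ask(Together(), "What are some fun things to do in New York?")` would make the same request as the original snippet.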
RUN INFERENCE (TypeScript)
import Together from "together-ai";

// The client reads the API key from the TOGETHER_API_KEY environment variable.
const together = new Together();

const response = await together.chat.completions.create({
  model: "Qwen/Qwen2.5-7B-Instruct-Turbo",
  messages: [{ role: "user", content: "What are some fun things to do in New York?" }],
});

// Print the text of the first completion choice.
console.log(response.choices[0].message.content);
Model Provider: Qwen
Type: Chat
Variant: Instruct
Parameters: 7B
Deployment: ✔ Serverless
Quantization: FP8
Context length: 131,072
Pricing: $0.30
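The context length listed above bounds how much text a single request can carry. The sketch below shows one way to sanity-check a request before sending it, assuming (as is typical for causal LLMs) that prompt tokens and generated tokens share the same window. The function names and the 4-characters-per-token ratio are illustrative assumptions, not properties of the Qwen tokenizer; an exact count would require the model's own tokenizer.

```python
CONTEXT_LENGTH = 131_072  # tokens, from the spec sheet above


def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate; chars_per_token is an assumed heuristic, not a measured value."""
    return int(len(text) / chars_per_token) + (1 if text else 0)


def fits_context(prompt, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Return True if the estimated prompt tokens plus the output budget fit in the window."""
    return estimate_tokens(prompt) + max_new_tokens <= context_length
```

A check like this is only a coarse guard; leaving generous headroom is advisable because the real token count can differ noticeably from the estimate.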
Looking for production scale? Deploy on a dedicated endpoint
Deploy Qwen2.5 7B Instruct Turbo on a dedicated endpoint with custom hardware configuration, as many instances as you need, and auto-scaling.
