Model Library
Build with leading open-source models
Explore top-performing models across chat, vision, code, and audio, all optimized for production deployment.
Featured models
Run any model on the fastest endpoints
Use our API to deploy any open-source model on the fastest inference stack available with optimal cost efficiency.
Scale to a dedicated deployment at any time, with a custom number of instances, to get optimal throughput.
RUN INFERENCE (cURL)
curl -X POST "https://api.together.xyz/v1/chat/completions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-Vision-Free",
    "messages": [{"role": "user", "content": "What are some fun things to do in New York?"}]
  }'
RUN INFERENCE (Python)
from together import Together

# The client reads the API key from the TOGETHER_API_KEY environment variable.
client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-Vision-Free",
    messages=[{"role": "user", "content": "What are some fun things to do in New York?"}],
)

# Print the text of the first completion choice.
print(response.choices[0].message.content)
RUN INFERENCE (TypeScript)
import Together from "together-ai";

// Top-level await requires an ES module context.
const together = new Together({ apiKey: process.env.TOGETHER_API_KEY });

const response = await together.chat.completions.create({
  model: "meta-llama/Llama-Vision-Free",
  messages: [{ role: "user", content: "What are some fun things to do in New York?" }],
});

// Print the text of the first completion choice.
console.log(response.choices[0].message.content);
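
If you would rather not install an SDK, the cURL request above can also be sketched with Python's standard library alone. This is a minimal illustration, not an official client: the helper names build_chat_request and send_chat_request are hypothetical, and the response parsing assumes the JSON body has the same choices[0].message.content shape that the SDK examples read.

```python
import json
import os
import urllib.request

# Endpoint taken from the cURL example.
API_URL = "https://api.together.xyz/v1/chat/completions"


def build_chat_request(prompt, model="meta-llama/Llama-Vision-Free", api_key=None):
    """Build (but do not send) the same POST request as the cURL example.

    Hypothetical helper for illustration; not part of any Together SDK.
    """
    # Fall back to the TOGETHER_API_KEY environment variable, as the cURL example does.
    key = api_key if api_key is not None else os.environ.get("TOGETHER_API_KEY", "")
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {key}", "Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the first choice's text.

    Assumes the JSON response mirrors the SDK object: choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

To use it, call send_chat_request(build_chat_request("What are some fun things to do in New York?")) with TOGETHER_API_KEY set. Keeping request construction separate from sending makes it possible to inspect the payload without making a network call.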