Today, Mistral released Mixtral 8x7B, a high-quality sparse mixture of experts model (SMoE) with open weights.
Mixtral-8x7b-32kseqlen and DiscoLM-mixtral-8x7b-v2 are now live on our inference platform! We have optimized the Together Inference Engine for Mixtral, and it runs at up to 100 tokens/s for $0.0006/1K tokens, which is to our knowledge the fastest performance at the lowest price!
Chat with it in our playground:
Or use this code snippet:
curl -X POST https://api.together.xyz/inference \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "DiscoResearch/DiscoLM-mixtral-8x7b-v2",
    "max_tokens": 512,
    "prompt": "<|im_start|>user\nTell me about San Francisco<|im_end|>\n<|im_start|>assistant",
    "temperature": 0.7,
    "top_p": 0.7,
    "top_k": 50,
    "repetition_penalty": 1,
    "stream_tokens": true,
    "stop": [
      "<|im_end|>",
      "<|im_start|>"
    ]
  }'
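The prompt field above uses the ChatML template (the `<|im_start|>` and `<|im_end|>` markers) that DiscoLM-mixtral-8x7b-v2 expects. Here is a minimal sketch of assembling such a prompt from a list of chat messages; the helper name `to_chatml` is ours, not part of any API:

```python
# Hypothetical helper: build a ChatML-style prompt string from chat messages.
# The <|im_start|>/<|im_end|> markers match the "prompt" field in the curl
# example above; this is a sketch, not an official template function.
def to_chatml(messages):
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # End with an open assistant turn so the model continues from there.
    parts.append("<|im_start|>assistant")
    return "\n".join(parts)

prompt = to_chatml([{"role": "user", "content": "Tell me about San Francisco"}])
print(prompt)
```

Running this reproduces exactly the `prompt` value used in the curl request above.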
More on Mixtral
Mixtral is licensed under Apache 2.0 and outperforms Llama 2 70B on most benchmarks. It is the strongest open-weight model with a permissive license and the best model overall in terms of cost/performance trade-offs. In particular, it matches or outperforms GPT-3.5 on most standard benchmarks.
Mixtral...
- Handles a context of 32k tokens.
- Handles English, French, Italian, German and Spanish.
- Shows strong performance in code generation.
- Can be finetuned into an instruction-following model that achieves a score of 8.3 on MT-Bench.
Transitioning from OpenAI?
Here’s how simple it is to switch from OpenAI to Together’s Mixtral serverless endpoint:
import os

import openai

# Point the OpenAI client at Together's API endpoint.
client = openai.OpenAI(
    api_key=os.environ.get("TOGETHER_API_KEY"),
    base_url="https://api.together.xyz",
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Tell me about San Francisco",
        }
    ],
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
)

print(chat_completion.choices[0].message.content)
Simply add your "TOGETHER_API_KEY" (which you can find here), change the base URL to https://api.together.xyz, set the model name to one of our 100+ open-source models, and you'll be off to the races!
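Both snippets also support streaming: the curl example above sets "stream_tokens": true, which makes the endpoint emit server-sent events. Below is a rough sketch of collecting the streamed text, assuming the common `data: {json}` / `data: [DONE]` SSE convention and a `choices[0].text` field per event; check the API docs for the exact schema:

```python
import json

def collect_stream(lines):
    """Concatenate the text of streamed events until the [DONE] sentinel."""
    chunks = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives and blank lines
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            break
        event = json.loads(payload)
        chunks.append(event["choices"][0]["text"])
    return "".join(chunks)

# Canned events standing in for a real streamed response.
sample = [
    'data: {"choices": [{"text": "San"}]}',
    'data: {"choices": [{"text": " Francisco"}]}',
    "data: [DONE]",
]
print(collect_stream(sample))  # prints "San Francisco"
```

In a real client you would iterate over the response body line by line instead of a canned list, appending each chunk to the output as it arrives.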