M2-BERT 80M 32K Retrieval
Embeddings
The 80M-parameter checkpoint of M2-BERT, pretrained with a sequence length of 32768 and fine-tuned for long-context retrieval.

API Usage
Endpoint
togethercomputer/m2-bert-80M-32k-retrieval
RUN INFERENCE (cURL)
curl -X POST "https://api.together.xyz/v1/embeddings" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "togethercomputer/m2-bert-80M-32k-retrieval",
    "input": "Our solar system orbits the Milky Way galaxy at about 515,000 mph"
  }'
JSON RESPONSE
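The response body is not shown on this page; the sketch below assumes the OpenAI-compatible embeddings format this endpoint follows, with truncated, illustrative values:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [0.0123, -0.0456, 0.0789, …],
      "index": 0
    }
  ],
  "model": "togethercomputer/m2-bert-80M-32k-retrieval"
}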
RUN INFERENCE (Python)
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

response = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-32k-retrieval",
    input="Our solar system orbits the Milky Way galaxy at about 515,000 mph",
)

# response.data[0].embedding is the embedding vector as a list of floats
print(response.data[0].embedding)
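Because the model is fine-tuned for retrieval, a typical pattern is to embed a query and a set of candidate documents, then rank the documents by cosine similarity. A minimal sketch; the query, documents, and helper functions below are illustrative, not part of the API:

import math
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment
MODEL = "togethercomputer/m2-bert-80M-32k-retrieval"

def embed(text: str) -> list[float]:
    # One embedding request per text; batching may also be supported.
    return client.embeddings.create(model=MODEL, input=text).data[0].embedding

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = "How fast does the solar system move through the galaxy?"
docs = [
    "Our solar system orbits the Milky Way galaxy at about 515,000 mph",
    "Bananas are rich in potassium and easy to digest",
]

q_vec = embed(query)
ranked = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
print(ranked[0])  # expected: the solar-system document ranks first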
RUN INFERENCE (TypeScript)
import Together from "together-ai";

const together = new Together(); // reads TOGETHER_API_KEY from the environment

const response = await together.embeddings.create({
  model: "togethercomputer/m2-bert-80M-32k-retrieval",
  input: "Our solar system orbits the Milky Way galaxy at about 515,000 mph",
});

// response.data[0].embedding is the embedding vector as an array of floats
console.log(response.data[0].embedding);
Model Provider: Together AI
Type: Embeddings
Parameters: 80M
Deployment: ✔ Serverless
Context length: 32768
Pricing: $0.01
Looking for production scale? Deploy on a dedicated endpoint
Deploy M2-BERT 80M 32K Retrieval on a dedicated endpoint with a custom hardware configuration, as many instances as you need, and auto-scaling.
