Llama 3.1 Nemotron 70B Instruct
LLM
Custom NVIDIA LLM optimized to enhance the helpfulness and relevance of generated responses to user queries.

API Usage
Endpoint
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
RUN INFERENCE (cURL)
curl -X POST "https://api.together.xyz/v1/chat/completions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",
    "messages": [{"role": "user", "content": "What are some fun things to do in New York?"}]
  }'
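The endpoint returns an OpenAI-compatible JSON body (the content behind the JSON RESPONSE tab on the live page). As a sketch, the assistant's text sits at `choices[0].message.content`; the sample payload below is illustrative only, not an actual model reply:

```python
import json

# Illustrative response shape (not real model output); the
# /v1/chat/completions endpoint returns an OpenAI-compatible body.
raw = '''{
  "id": "example-id",
  "model": "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Visit Central Park, catch a Broadway show, and walk the High Line."
      }
    }
  ]
}'''

reply = json.loads(raw)
# Extract the assistant message from the first choice.
print(reply["choices"][0]["message"]["content"])
```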
RUN INFERENCE (Python)
from together import Together

client = Together()  # reads TOGETHER_API_KEY from the environment

response = client.chat.completions.create(
    model="nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",
    messages=[{"role": "user", "content": "What are some fun things to do in New York?"}],
)
print(response.choices[0].message.content)
RUN INFERENCE (TypeScript)
import Together from "together-ai";

const together = new Together(); // reads TOGETHER_API_KEY from the environment

const response = await together.chat.completions.create({
  model: "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",
  messages: [{ role: "user", content: "What are some fun things to do in New York?" }],
});
console.log(response.choices[0].message.content);
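All three snippets above issue the same HTTP call. As a minimal sketch of what the SDK wrappers do under the hood, the request can be assembled with the Python standard library (built here but not sent; the URL and model name are from this page):

```python
import json
import os
import urllib.request

# Same JSON body the cURL, Python and TypeScript examples send.
payload = {
    "model": "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF",
    "messages": [
        {"role": "user", "content": "What are some fun things to do in New York?"}
    ],
}

# Build the POST request; urllib.request.urlopen(req) would send it.
req = urllib.request.Request(
    "https://api.together.xyz/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('TOGETHER_API_KEY', '')}",
        "Content-Type": "application/json",
    },
    method="POST",
)
print(req.get_method(), req.full_url)
```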
Model Provider:
NVIDIA
Type:
Chat
Variant:
Nemotron
Parameters:
70B
Deployment:
✔ Serverless
Quantization:
FP16
Context length:
128K
Pricing:
$0.88 per 1M tokens
Looking for production scale? Deploy on a dedicated endpoint
Deploy Llama 3.1 Nemotron 70B Instruct on a dedicated endpoint with custom hardware configuration, as many instances as you need, and auto-scaling.
