

Llama 4 Maverick API

SOTA 128-expert MoE powerhouse for multilingual image/text understanding, creative writing, and enterprise-scale applications.

Try our Llama 4 API

Together AI offers day-1 support for the new Llama 4 multilingual vision models, which can analyze multiple images and respond to queries about them.

Register for a Together AI account to get an API key. New accounts come with free credits to start. Then install the Together AI library for your preferred language.

Llama 4 Maverick API Usage

Endpoint

curl -X POST "https://api.together.xyz/v1/chat/completions" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    "messages": [
      {
        "role": "user",
        "content": "What are some fun things to do in New York?"
      }
    ]
}'
curl -X POST https://api.together.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe what you see in this image."},
        {"type": "image_url", "image_url": {"url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"}}
      ]
    }],
    "max_tokens": 512
  }'
curl -X POST https://api.together.xyz/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    "messages": [{
      "role": "user",
      "content": "Given two binary strings `a` and `b`, return their sum as a binary string"
    }]
  }'
curl -X POST https://api.together.xyz/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -d '{
    "model": "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    "prompt": "A horse is a horse",
    "max_tokens": 32,
    "temperature": 0.1
  }'
from together import Together

client = Together()

response = client.chat.completions.create(
  model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
  messages=[
    {
      "role": "user",
      "content": "What are some fun things to do in New York?"
    }
  ]
)
print(response.choices[0].message.content)
from together import Together

client = Together()

response = client.chat.completions.create(
    model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what you see in this image."},
            {"type": "image_url", "image_url": {"url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"}}
        ]
    }]
)
print(response.choices[0].message.content)
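The vision examples above reference a hosted image URL. If your image is stored locally, OpenAI-compatible chat APIs (Together's included) generally also accept a base64-encoded data URL in the same `image_url` field. The helper below is a minimal sketch of that encoding step; the file path shown in the comment is hypothetical:

```python
import base64
import mimetypes

def to_data_url(path: str) -> str:
    """Read a local image file and return it as a base64 data URL."""
    mime, _ = mimetypes.guess_type(path)
    mime = mime or "application/octet-stream"
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"

# Use the result in place of a hosted URL, e.g.:
# {"type": "image_url", "image_url": {"url": to_data_url("yosemite.png")}}
```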

from together import Together

client = Together()
response = client.chat.completions.create(
    model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages=[
        {
            "role": "user",
            "content": "Given two binary strings `a` and `b`, return their sum as a binary string"
        }
    ],
)

print(response.choices[0].message.content)
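The binary-string prompt above is a classic coding exercise, so a local reference implementation is handy for sanity-checking the model's answer. This helper is ours, not part of the API:

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary strings digit by digit, carrying as in grade-school addition."""
    result, carry = [], 0
    i, j = len(a) - 1, len(b) - 1
    while i >= 0 or j >= 0 or carry:
        total = carry
        if i >= 0:
            total += int(a[i])
            i -= 1
        if j >= 0:
            total += int(b[j])
            j -= 1
        result.append(str(total % 2))
        carry = total // 2
    return "".join(reversed(result))

print(add_binary("1010", "1011"))  # 10101 (10 + 11 = 21)
```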


from together import Together

client = Together()

response = client.completions.create(
  model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
  prompt="A horse is a horse",
  max_tokens=32,
  temperature=0.1,
)

print(response.choices[0].text)

import Together from 'together-ai';
const together = new Together();

const completion = await together.chat.completions.create({
  model: 'meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8',
  messages: [
    {
      role: 'user',
      content: 'What are some fun things to do in New York?'
     }
  ],
});

console.log(completion.choices[0].message.content);

import Together from "together-ai";

const together = new Together();
const imageUrl = "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png";

async function main() {
  const response = await together.chat.completions.create({
    model: "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages: [{
      role: "user",
      content: [
        { type: "text", text: "Describe what you see in this image." },
        { type: "image_url", image_url: { url: imageUrl } }
      ]
    }]
  });
  
  console.log(response.choices[0]?.message?.content);
}

main();

import Together from "together-ai";

const together = new Together();

async function main() {
  const response = await together.chat.completions.create({
    model: "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    messages: [{
      role: "user",
      content: "Given two binary strings `a` and `b`, return their sum as a binary string"
    }]
  });
  
  console.log(response.choices[0]?.message?.content);
}

main();


import Together from "together-ai";

const together = new Together();

async function main() {
  const response = await together.completions.create({
    model: "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
    prompt: "A horse is a horse",
    max_tokens: 32,
    temperature: 0.1
  });
  
  console.log(response.choices[0]?.text);
}

main();


How to use Llama 4 Maverick

Input

    
      from together import Together

      client = Together()  # API key via api_key param or TOGETHER_API_KEY env var

      # Query image with Llama 4 Maverick model
      response = client.chat.completions.create(
        model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "What can you see in this image?"},
                {"type": "image_url", "image_url": {"url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"}}
            ]
        }]
      )

      print(response.choices[0].message.content)
    

Output

    
      The image depicts a serene landscape of Yosemite National Park, featuring a river flowing through a valley surrounded by towering cliffs and lush greenery.

      *   **River:**
          *   The river is calm and peaceful, with clear water that reflects the surrounding scenery.
          *   It flows gently from the bottom-left corner to the center-right of the image.
          *   The riverbank is lined with rocks and grasses, adding to the natural beauty of the scene.
      *   **Cliffs:**
          *   The cliffs are massive and imposing, rising steeply from the valley floor.
          *   They are composed of light-colored rock, possibly granite, and feature vertical striations.
          *   The cliffs are covered in trees and shrubs, which adds to their rugged charm.
      *   **Trees and Vegetation:**
          *   The valley is densely forested, with tall trees growing along the riverbanks and on the cliffsides.
          *   The trees are a mix of evergreen and deciduous species, with some displaying vibrant green foliage.
          *   Grasses and shrubs grow in the foreground, adding texture and color to the scene.
      *   **Sky:**
          *   The sky is a brilliant blue, with only a few white clouds scattered across it.
          *   The sun appears to be shining from the right side of the image, casting a warm glow over the scene.

      In summary, the image presents a breathtaking view of Yosemite National Park, showcasing the natural beauty of the valley and its surroundings. The calm river, towering cliffs, and lush vegetation all contribute to a sense of serenity and wonder.

    

Function Calling

Input

    
      import os
      import json
      import openai

      client = openai.OpenAI(
          base_url = "https://api.together.xyz/v1",
          api_key = os.environ['TOGETHER_API_KEY'],
      )

      tools = [
        {
          "type": "function",
          "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                  "type": "string",
                  "enum": [
                    "celsius",
                    "fahrenheit"
                  ]
                }
              }
            }
          }
        }
      ]

      messages = [
          {"role": "system", "content": "You are a helpful assistant that can access external functions. The responses from these function calls will be appended to this dialogue. Please provide responses based on the information from these function calls."},
          {"role": "user", "content": "What is the current temperature of New York, San Francisco and Chicago?"}
      ]
          
      response = client.chat.completions.create(
          model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
          messages=messages,
          tools=tools,
          tool_choice="auto",
      )

      print(json.dumps(response.choices[0].message.model_dump()['tool_calls'], indent=2))
    

Output

    
      [
        {
          "id": "call_1p75qwks0etzfy1g6noxvsgs",
          "function": {
            "arguments": "{\"location\":\"New York, NY\",\"unit\":\"fahrenheit\"}",
            "name": "get_current_weather"
          },
          "type": "function"
        },
        {
          "id": "call_aqjfgn65d0c280fjd3pbzpc6",
          "function": {
            "arguments": "{\"location\":\"San Francisco, CA\",\"unit\":\"fahrenheit\"}",
            "name": "get_current_weather"
          },
          "type": "function"
        },
        {
          "id": "call_rsg8muko8hymb4brkycu3dm5",
          "function": {
            "arguments": "{\"location\":\"Chicago, IL\",\"unit\":\"fahrenheit\"}",
            "name": "get_current_weather"
          },
          "type": "function"
        }
      ]
    
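The response above only contains the model's requested calls; to complete the loop you execute each call yourself and append the results as `tool` messages before asking the model again. A minimal dispatch sketch, in which the `get_current_weather` stub and its return value are hypothetical:

```python
import json

def get_current_weather(location: str, unit: str = "fahrenheit") -> dict:
    # Hypothetical stub; a real implementation would call a weather service.
    return {"location": location, "temperature": 72, "unit": unit}

AVAILABLE_TOOLS = {"get_current_weather": get_current_weather}

def execute_tool_calls(tool_calls: list) -> list:
    """Run each requested tool and package the results as `tool` messages."""
    tool_messages = []
    for call in tool_calls:
        fn = AVAILABLE_TOOLS[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return tool_messages

# messages.extend(execute_tool_calls(tool_calls)), then call
# client.chat.completions.create(...) again to get the final answer.
```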

Query models with multiple images

Currently, this model supports up to 5 images as input.

Input

    
      # Multi-modal message with multiple images
      response = client.chat.completions.create(
          model="meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
          messages=[{
              "role": "user",
              "content": [
                  {
                      "type": "text",
                      "text": "Compare these two images."
                  },
                  {
                      "type": "image_url",
                      "image_url": {
                          "url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/yosemite.png"
                      }
                  },
                  {
                      "type": "image_url",
                      "image_url": {
                          "url": "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/slack.png"
                      }
                  }
              ]
          }]
      )
      print(response.choices[0].message.content)

    

Output

    
      The first image is a collage of multiple identical landscape photos showing a natural scene with rocks, trees, and a stream under a blue sky. The second image is a screenshot of a mobile app interface, specifically the navigation menu of the Canva app, which includes icons for Home, DMs (Direct Messages), Activity, Later, Canvases, and More.

      ### Comparison:
      1. **Content**:
        - The first image focuses on a natural landscape.
        - The second image shows a digital interface from an app.

      2. **Purpose**:
        - The first image could be used for showcasing nature, design elements in graphic work, or as a background.
        - The second image represents the functionality and layout of the Canva app's navigation system.

      3. **Visual Style**:
        - The first image has vibrant colors and realistic textures typical of outdoor photography.
        - The second image uses flat design icons with a simple color palette suited for user interface design.

      4. **Context**:
        - The first image is likely intended for artistic or environmental contexts.
        - The second image is relevant to digital design and app usability discussions.
    
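When comparing several images programmatically, it helps to build the content array from a list of URLs and enforce the five-image limit up front. A small helper along these lines (ours, not part of the SDK):

```python
def build_image_message(prompt: str, image_urls: list, max_images: int = 5) -> dict:
    """Build a multi-image chat message, enforcing the model's image limit."""
    if len(image_urls) > max_images:
        raise ValueError(f"Llama 4 Maverick currently accepts at most {max_images} images")
    content = [{"type": "text", "text": prompt}]
    content += [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]
    return {"role": "user", "content": content}

# Pass the result directly:
# messages=[build_image_message("Compare these two images.", urls)]
```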

Model details

  • Model String: meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
  • Specs:
    • 17B active parameters (400B total)
    • 128-expert MoE architecture
    • 524,288-token context length (planned increase to 1M)
    • Support for 12 languages: Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese
    • Multimodal capabilities (text + images)
    • Supports function calling
  • Best for: Enterprise applications, multilingual support, advanced document intelligence
  • Knowledge Cutoff: August 2024

Prompting Llama 4 Maverick

Applications & Use Cases

  • Multilingual customer support with visual context: Process and respond to customer inquiries with attached screenshots in 12 different languages, enabling support teams to quickly diagnose technical issues by understanding both the user's description and visual evidence simultaneously.
  • Generating marketing content from multimodal PDFs: Create compelling marketing materials by analyzing existing multimedia PDFs containing both text and visuals, extracting key themes, and generating new content that maintains brand consistency across formats.
  • Advanced document intelligence with text, diagrams, and tables: Extract structured information from complex documents containing a mix of text, diagrams, tables, and graphs, enabling automated analysis of technical manuals, financial reports, and research papers with unprecedented accuracy.

Looking for production scale? Deploy on a dedicated endpoint

Deploy Llama 4 Maverick on a dedicated endpoint with custom hardware configuration, as many instances as you need, and auto-scaling.

Get started