What's New
- New video generation API with models like OpenAI Sora 2, Google Veo 3.0, and Minimax Hailuo for high-quality video creation
- 40+ new image and video models, including Google's Imagen and Nano Banana, ByteDance SeeDream, and specialized editing tools
- Complete workflows - Combine text, image, and video generation in single applications without switching providers
- Same APIs you know - OpenAI-compatible endpoints, unified auth, transparent per-model pricing
- Available now: Serverless endpoints with enterprise options for scale
Generative media is at the center of a new set of AI-native applications, from AI-powered video editors and personalized gaming experiences to automated marketing content. But building these apps has been complex, with developers having to juggle providers for text, images, and video—each with new SDKs, auth, rate limits, and billing. That fragmentation slows teams, complicates SLAs, and makes scaling a headache.
Today Together AI, the AI Native Cloud, is expanding the Together Model Libary to become your complete generative media infrastructure. Through our strategic partnership with Runware, we're integrating 20+ video models across six providers (including Google Veo 3.0, OpenAI Sora 2, and ByteDance Seedream) plus 15+ image models alongside leading LLMs and voice—spanning the quality-speed-cost spectrum that real applications demand, all accessible through the same fast, reliable APIs you use for text generation.
40+ Models Chosen for Production Workflows
New Video Generation Models
Video generation is new to Together AI. We're starting with models that create 4-30 second videos at various resolutions and styles. Each model optimizes for different needs - realism, motion consistency, or extended length. From quick 10-second clips with Minimax Hailuo to extended 30-second sequences with Kling v2.1, and specialized motion generation with SeeDance. This variety ensures developers can choose the right tool for their specific video generation requirements, from rapid prototyping to production-quality content creation.
Sora 2 Pro
Premium cinematic video generation with native audio and lifelike physics.
Google Veo 3
High-quality video creation with advanced camera movements and scene control.
PixVerse V5
Fast, affordable video generation with smooth motion and multiple artistic styles.
ByteDance Seedance 1.0 Pro
Top-ranked video generation with multi-shot storytelling and cinematic quality.
New Image Generation & Editing Models
Together AI's image generation capabilities span the full spectrum of creative and production needs. From photorealistic generation with Google's Imagen to artistic control with models like Nano Banana, developers get access to specialized tools optimized for different use cases without researching individual providers or managing separate integrations.
Gemini Flash Image 2.5 (Nano Banana)
Versatile image creation and editing with natural language control.
Google Imagen 4.0 Ultra
Premium image generation with exceptional detail and text rendering.
Qwen Image
High-quality image generation with perfect text integration and poster design.
34+ More Models
Complete range of specialized models for every creative and production use case.
Build Complete Workflows in One Platform
Combine text, image, and video generation in a single codebase without managing multiple providers. Your existing Together integration gains image editing, creative generation, and video production capabilities.
Here are three types of applications this makes practical to build:
🎮 Media Generation in Gaming
Technical capability: Gaming studios generating environmental assets, character variations, and cutscenes programmatically based on gameplay data.
Platform advantage: Single API call chain from game state to visual assets, enabling real-time content generation without managing multiple inference providers.
🛍️ Dynamic Advertising Creative
Technical capability: E-commerce platforms generating personalized product images, lifestyle shots, and video ads based on user preferences, seasonal trends, and inventory data.
Platform advantage: Real-time creative generation from user data to personalized visuals, enabling dynamic ad optimization without coordinating separate image and video providers.
🧠 Interactive Learning Platforms
Technical capability: Educational applications creating custom visual explanations, interactive diagrams, and personalized video content based on student questions and progress.
Platform advantage: Real-time multimodal responses using the same inference infrastructure, enabling sophisticated personalization without latency penalties from provider switching.
Production Deployment Options
Together AI's generative media capabilities are production-ready with enterprise-grade infrastructure and developer-focused tools.
Performance & Scale
- ✔ 40+ image and video models
- ✔ Up to 30-second video generation
- ✔ Multiple resolution options
- ✔ Transparent per-model pricing
Infrastructure
- ✔ Production-grade rate limits
- ✔ Serverless auto-scaling
- ✔ Global infrastructure
- ✔ Enterprise reliability
Developer Experience
- ✔ OpenAI-compatible APIs
- ✔ Same SDK as text models
- ✔ Unified authentication
- ✔ Single billing platform
Try it Now
If you're already using Together AI for text inference, adding image and video generation works the same way. Same authentication, same SDKs, same billing dashboard. All usage shows up in one place with transparent per-model pricing.
import time
from together import Together
client = Together()
# Create a video generation job
job = client.videos.create(
prompt="A serene sunset over the ocean with gentle waves",
model="minimax/video-01-director",
width=1366,
height=768,
)
print(f"Job ID: {job.id}")
# Poll until completion
while True:
status = client.videos.retrieve(job.id)
print(f"Status: {status.status}")
if status.status == "completed":
print(f"Video URL: {status.outputs.video_url}")
break
elif status.status == "failed":
print("Video generation failed")
break
# Wait before checking again
time.sleep(5)
Try the platform:
- Interactive Playground - Test image and video generation before building
- API Documentation - Complete integration guides and code examples
- Model Library - Browse all available models with specifications
{{custom-cta-1}}
Deploy for production:
- Start with serverless endpoints for development and testing
- For enterprise deployments and maximum control, contact our Sales team.
The same Together AI platform you use for text inference now handles your complete generative AI stack. No additional integrations, no vendor management overhead, no learning new APIs - just expanded capabilities in the same developer experience you already know.
Follow our step-by-step Quickstart to install, authenticate, and run your first video inference in minutes.

Audio Name
Audio Description

Performance & Scale
Body copy goes here lorem ipsum dolor sit amet
- Bullet point goes here lorem ipsum
- Bullet point goes here lorem ipsum
- Bullet point goes here lorem ipsum
Infrastructure
Best for
List Item #1
- Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt.
- Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt.
- Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt.
List Item #1
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Build
Benefits included:
✔ Up to $15K in free platform credits*
✔ 3 hours of free forward-deployed engineering time.
Funding: Less than $5M
Build
Benefits included:
✔ Up to $15K in free platform credits*
✔ 3 hours of free forward-deployed engineering time.
Funding: Less than $5M
Build
Benefits included:
✔ Up to $15K in free platform credits*
✔ 3 hours of free forward-deployed engineering time.
Funding: Less than $5M
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, respond only in Arabic, no other language is allowed. Here is the question:
Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, respond with less than 860 words. Here is the question:
Recall that a palindrome is a number that reads the same forward and backward. Find the greatest integer less than $1000$ that is a palindrome both when written in base ten and when written in base eight, such as $292 = 444_{\\text{eight}}.$
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, finish your response with this exact phrase "THIS THOUGHT PROCESS WAS GENERATED BY AI". No other reasoning words should follow this phrase. Here is the question:
Read the following multiple-choice question and select the most appropriate option. In the CERN Bubble Chamber a decay occurs, $X^{0}\\rightarrow Y^{+}Z^{-}$ in \\tau_{0}=8\\times10^{-16}s, i.e. the proper lifetime of X^{0}. What minimum resolution is needed to observe at least 30% of the decays? Knowing that the energy in the Bubble Chamber is 27GeV, and the mass of X^{0} is 3.41GeV.
- A. 2.08*1e-1 m
- B. 2.08*1e-9 m
- C. 2.08*1e-6 m
- D. 2.08*1e-3 m
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, your response should be wrapped in JSON format. You can use markdown ticks such as ```. Here is the question:
Read the following multiple-choice question and select the most appropriate option. Trees most likely change the environment in which they are located by
- A. releasing nitrogen in the soil.
- B. crowding out non-native species.
- C. adding carbon dioxide to the atmosphere.
- D. removing water from the soil and returning it to the atmosphere.
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, your response should be in English and in all capital letters. Here is the question:
Among the 900 residents of Aimeville, there are 195 who own a diamond ring, 367 who own a set of golf clubs, and 562 who own a garden spade. In addition, each of the 900 residents owns a bag of candy hearts. There are 437 residents who own exactly two of these things, and 234 residents who own exactly three of these things. Find the number of residents of Aimeville who own all four of these things.
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, refrain from the use of any commas. Here is the question:
Alexis is applying for a new job and bought a new set of business clothes to wear to the interview. She went to a department store with a budget of $200 and spent $30 on a button-up shirt, $46 on suit pants, $38 on a suit coat, $11 on socks, and $18 on a belt. She also purchased a pair of shoes, but lost the receipt for them. She has $16 left from her budget. How much did Alexis pay for the shoes?
Follow our step-by-step Quickstart to install, authenticate, and run your first video inference in minutes.

Audio Name
Audio Description

Performance & Scale
Body copy goes here lorem ipsum dolor sit amet
- Bullet point goes here lorem ipsum
- Bullet point goes here lorem ipsum
- Bullet point goes here lorem ipsum
Infrastructure
Best for
List Item #1
- Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt.
- Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt.
- Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt.
List Item #1
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Build
Benefits included:
✔ Up to $15K in free platform credits*
✔ 3 hours of free forward-deployed engineering time.
Funding: Less than $5M
Build
Benefits included:
✔ Up to $15K in free platform credits*
✔ 3 hours of free forward-deployed engineering time.
Funding: Less than $5M
Build
Benefits included:
✔ Up to $15K in free platform credits*
✔ 3 hours of free forward-deployed engineering time.
Funding: Less than $5M
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, respond only in Arabic, no other language is allowed. Here is the question:
Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, respond with less than 860 words. Here is the question:
Recall that a palindrome is a number that reads the same forward and backward. Find the greatest integer less than $1000$ that is a palindrome both when written in base ten and when written in base eight, such as $292 = 444_{\\text{eight}}.$
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, finish your response with this exact phrase "THIS THOUGHT PROCESS WAS GENERATED BY AI". No other reasoning words should follow this phrase. Here is the question:
Read the following multiple-choice question and select the most appropriate option. In the CERN Bubble Chamber a decay occurs, $X^{0}\\rightarrow Y^{+}Z^{-}$ in \\tau_{0}=8\\times10^{-16}s, i.e. the proper lifetime of X^{0}. What minimum resolution is needed to observe at least 30% of the decays? Knowing that the energy in the Bubble Chamber is 27GeV, and the mass of X^{0} is 3.41GeV.
- A. 2.08*1e-1 m
- B. 2.08*1e-9 m
- C. 2.08*1e-6 m
- D. 2.08*1e-3 m
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, your response should be wrapped in JSON format. You can use markdown ticks such as ```. Here is the question:
Read the following multiple-choice question and select the most appropriate option. Trees most likely change the environment in which they are located by
- A. releasing nitrogen in the soil.
- B. crowding out non-native species.
- C. adding carbon dioxide to the atmosphere.
- D. removing water from the soil and returning it to the atmosphere.
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, your response should be in English and in all capital letters. Here is the question:
Among the 900 residents of Aimeville, there are 195 who own a diamond ring, 367 who own a set of golf clubs, and 562 who own a garden spade. In addition, each of the 900 residents owns a bag of candy hearts. There are 437 residents who own exactly two of these things, and 234 residents who own exactly three of these things. Find the number of residents of Aimeville who own all four of these things.
Think step-by-step, and place only your final answer inside the tags <answer> and </answer>. Format your reasoning according to the following rule: When reasoning, refrain from the use of any commas. Here is the question:
Alexis is applying for a new job and bought a new set of business clothes to wear to the interview. She went to a department store with a budget of $200 and spent $30 on a button-up shirt, $46 on suit pants, $38 on a suit coat, $11 on socks, and $18 on a belt. She also purchased a pair of shoes, but lost the receipt for them. She has $16 left from her budget. How much did Alexis pay for the shoes?