
Faster inference enables up to 5x price reduction on Together API

August 11, 2023 · Together

For the latest pricing, please visit our pricing page.

At Together AI, we are optimizers. We are constantly working to create the most efficient AI stack on the market. Our research team is behind innovations that are core to today’s fastest optimizations, from batching techniques like FlexGen to algorithms like FlashAttention-2.

In the weeks since launching Together API, our cloud platform for building and running the world’s leading open-source AI models, we’ve continued to optimize our inference stack. Over the coming months, we’ll release additional optimizations to speed up inference even further.

With faster inference, we can serve more requests per GPU, which lowers our cost per request. Today, we’re excited to pass those savings on with updated pricing that gives you more for less.

Inference pricing

We’ve simplified pricing for inference across the 50+ open-source models available on our platform, including RedPajama, Llama 2, Falcon, and more.

For these out-of-the-box models, you only pay for requests (per 1K tokens used). You still launch your own inference VMs for the models you use — ensuring the privacy of your data.

For models that you fine-tune and then host on our platform, you pay the same per-token rate for requests, plus an hourly hosting fee when you launch your inference VM.

Model size	Price per 1K tokens
Up to 3B	$0.0001
3.1B - 7B	$0.0002
7.1B - 20B	$0.0004
20.1B - 40B	$0.001
40.1B - 70B	$0.003

Chat, language, and code models

Model size	Price per 1K tokens	Price per hour hosting
Up to 3B	$0.0001	$0.52
3.1B - 7B	$0.0002	$0.52
7.1B - 20B	$0.0004	Coming soon
20.1B - 40B	$0.001	Coming soon
40.1B - 70B	$0.003	Coming soon

Your fine-tuned models

Model size	Price per 1K tokens	Price per hour hosting
Up to 3B	$0.0001	$0.52
3.1B - 7B	$0.0002	$0.52
7.1B - 20B	$0.0004	Coming soon
20.1B - 40B	$0.001	Coming soon
40.1B - 70B	$0.003	Coming soon
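
As a rough illustration of how these rates combine, here is a minimal Python sketch that estimates a bill from the tables above. The function names (`find_tier`, `estimate_cost`) are hypothetical and not part of Together API. The sketch assumes tokens are billed pro rata rather than rounded up to whole 1K blocks, and that a model whose size falls between two published ranges (for example 3.05B) is billed at the larger tier; neither assumption is stated in the pricing tables.

```python
from decimal import Decimal

# Tiers copied from the pricing tables above:
# (max model size in billions of parameters,
#  USD per 1K tokens,
#  USD per hour of hosting, or None where the table says "Coming soon").
TIERS = [
    (Decimal("3"), Decimal("0.0001"), Decimal("0.52")),
    (Decimal("7"), Decimal("0.0002"), Decimal("0.52")),
    (Decimal("20"), Decimal("0.0004"), None),
    (Decimal("40"), Decimal("0.001"), None),
    (Decimal("70"), Decimal("0.003"), None),
]


def find_tier(params_billions):
    """Return (price per 1K tokens, hourly hosting price) for a model size."""
    size = Decimal(str(params_billions))
    for max_size, token_price, hosting_price in TIERS:
        if size <= max_size:
            return token_price, hosting_price
    raise ValueError("no published tier for models above 70B parameters")


def estimate_cost(params_billions, tokens, hosting_hours=0):
    """Estimate cost in USD: token charges plus optional hosting charges.

    Hosting hours apply only to fine-tuned models hosted on the platform.
    Assumption: tokens are charged pro rata, not rounded up to 1K blocks.
    """
    token_price, hosting_price = find_tier(params_billions)
    cost = token_price * Decimal(tokens) / Decimal(1000)
    if hosting_hours:
        if hosting_price is None:
            raise ValueError("hosting price for this tier is listed as 'Coming soon'")
        cost += hosting_price * Decimal(str(hosting_hours))
    return cost
```

Under these assumptions, `estimate_cost(7, 1_000_000)` comes to $0.20 for one million tokens on a 7B model, and `estimate_cost(3, 500_000, hosting_hours=10)` comes to $5.25 ($0.05 for tokens plus $5.20 for ten hours of hosting).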

Image models

Pricing for image models remains the same.

Image size	25 steps	50 steps	75 steps	100 steps
Up to 300 kilopixels (512 x 512)	$0.001	$0.002	$0.0035	$0.005
Up to 1.1 megapixels (1024 x 1024)	$0.01	$0.02	$0.035	$0.05
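
The image table can be read the same way. The sketch below is a hypothetical lookup helper (`image_price` is not a Together API function) that assumes the listed prices are per generated image, that an image is placed in the smallest size tier whose pixel limit it fits under, and that only the four listed step counts are priced; the table does not say how other step counts are billed.

```python
from decimal import Decimal

# Prices copied from the image pricing table above:
# (max pixels in the tier, {diffusion steps: USD}).
IMAGE_PRICES = [
    (300_000, {25: Decimal("0.001"), 50: Decimal("0.002"),
               75: Decimal("0.0035"), 100: Decimal("0.005")}),
    (1_100_000, {25: Decimal("0.01"), 50: Decimal("0.02"),
                 75: Decimal("0.035"), 100: Decimal("0.05")}),
]


def image_price(width, height, steps):
    """Look up the listed price for one image of the given size and step count."""
    pixels = width * height
    for max_pixels, price_by_steps in IMAGE_PRICES:
        if pixels <= max_pixels:
            if steps not in price_by_steps:
                raise ValueError("only 25, 50, 75, and 100 steps are listed")
            return price_by_steps[steps]
    raise ValueError("no published price above 1.1 megapixels")
```

For instance, a 512 x 512 image (262,144 pixels) at 50 steps falls in the first tier at $0.002, while a 1024 x 1024 image (1,048,576 pixels) at 100 steps falls in the second tier at $0.05.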

Get started today!

Head to api.together.ai to start running more efficient inference with our Playgrounds and APIs! New users get $25 in free credits to get started. We’re excited to see what you build.



