Faster inference enables up to 5x price reduction on Together API

August 11, 2023

By Together

For the latest pricing please visit our pricing page.

At Together AI, we are optimizers. We are constantly working to create the most efficient AI stack on the market. Our research team is behind innovations that are core to today’s fastest optimizations, from batching techniques like FlexGen to algorithms like FlashAttention-2.

In the weeks since launching Together API — our cloud platform for building and running the world’s leading open-source AI models — we’ve continued to make strides to optimize our inference stack. And over the coming months, we’ll be releasing additional optimizations to speed up inference even more.

With faster performance, we can process a greater number of transactions per GPU, enabling better cost efficiency. Today, we’re excited to announce updated pricing to give you more for less.

Inference pricing

We’ve simplified pricing for inference across the 50+ open-source models available on our platform, including RedPajama, Llama 2, Falcon, and more.

For these out-of-the-box models, you only pay for requests (per 1K tokens used). You still launch your own inference VMs for the models you use — ensuring the privacy of your data.

For models that you fine-tune and then host on our platform, you pay the same amount for requests in addition to an hourly hosting fee when you launch your inference VM.

Chat, language, and code models

Model size Price per 1K tokens
Up to 3B $0.0001
3.1B - 7B $0.0002
7.1B - 20B $0.0004
20.1B - 40B $0.001
40.1B - 70B $0.003

Your fine-tuned models

Model size Price per 1K tokens Price per hour hosting
Up to 3B $0.0001 $0.52
3.1B - 7B $0.0002 $0.52
7.1B - 20B $0.0004 Coming soon
20.1B - 40B $0.001 Coming soon
40.1B - 70B $0.003 Coming soon
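
As a rough illustration of how these prices combine, here is a minimal Python sketch that estimates a bill from the published figures. The function names (`tier_for`, `request_cost`, `fine_tuned_cost`) are hypothetical and not part of any Together API. The sketch also assumes that usage is billed in proportion to the token count rather than rounded up to whole blocks of 1K tokens, and that a model falls into the first tier whose upper bound it does not exceed; neither detail is stated in the post.

```python
from decimal import Decimal

# Per-1K-token prices copied from the pricing tables, keyed by the upper
# bound of each model-size tier (in billions of parameters).
PRICE_PER_1K_TOKENS = [
    (Decimal("3"), Decimal("0.0001")),
    (Decimal("7"), Decimal("0.0002")),
    (Decimal("20"), Decimal("0.0004")),
    (Decimal("40"), Decimal("0.001")),
    (Decimal("70"), Decimal("0.003")),
]

# Hourly hosting fee for fine-tuned models. Tiers above 7B are listed as
# "Coming soon", so they are deliberately absent here.
HOSTING_PER_HOUR = {
    Decimal("3"): Decimal("0.52"),
    Decimal("7"): Decimal("0.52"),
}


def tier_for(params_b):
    """Return the upper bound of the tier for a model size in billions.

    Assumption: a model belongs to the first tier whose upper bound it
    does not exceed.
    """
    size = Decimal(str(params_b))
    for upper, _price in PRICE_PER_1K_TOKENS:
        if size <= upper:
            return upper
    raise ValueError(f"no published price for a {params_b}B model")


def request_cost(params_b, tokens):
    """Request cost in USD, assuming billing proportional to tokens used."""
    price = dict(PRICE_PER_1K_TOKENS)[tier_for(params_b)]
    return price * Decimal(tokens) / Decimal(1000)


def fine_tuned_cost(params_b, tokens, hours):
    """Request cost plus hourly hosting for a fine-tuned model."""
    upper = tier_for(params_b)
    if upper not in HOSTING_PER_HOUR:
        raise ValueError(f"hosting price for a {params_b}B model is not published yet")
    return request_cost(params_b, tokens) + HOSTING_PER_HOUR[upper] * Decimal(str(hours))
```

Under these assumptions, one million tokens on a 7B out-of-the-box model comes to `request_cost(7, 1_000_000)`, or $0.20, and the same traffic on a fine-tuned 7B model hosted for ten hours comes to `fine_tuned_cost(7, 1_000_000, 10)`, or $5.40. `Decimal` is used instead of floats so that small per-token prices do not pick up rounding noise.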

Image models

Pricing for image models remains the same.

Image size 25 steps 50 steps 75 steps 100 steps
Up to 300 kilopixels (512 x 512) $0.001 $0.002 $0.0035 $0.005
Up to 1.1 megapixels (1024 x 1024) $0.01 $0.02 $0.035 $0.05
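
The image table can be read as a lookup by size tier and step count. The following Python sketch shows one way to do that. The name `image_cost` is hypothetical, the pixel thresholds (300,000 and 1,100,000) are an assumed reading of "300 kilopixels" and "1.1 megapixels", and step counts other than the four listed are rejected rather than interpolated, since the post does not say how they would be priced.

```python
from decimal import Decimal

# Prices per image copied from the table: (max pixels, {steps: price in USD}).
# The pixel limits are an assumed interpretation of the tier labels.
IMAGE_PRICES = [
    (300_000, {25: Decimal("0.001"), 50: Decimal("0.002"),
               75: Decimal("0.0035"), 100: Decimal("0.005")}),
    (1_100_000, {25: Decimal("0.01"), 50: Decimal("0.02"),
                 75: Decimal("0.035"), 100: Decimal("0.05")}),
]


def image_cost(width, height, steps, count=1):
    """Cost in USD for `count` images of the given size and step count."""
    pixels = width * height
    for max_pixels, by_steps in IMAGE_PRICES:
        if pixels <= max_pixels:
            if steps not in by_steps:
                raise ValueError(f"no published price for {steps} steps")
            return by_steps[steps] * count
    raise ValueError(f"no published price for {width} x {height} images")
```

For example, 100 images at 512 x 512 with 50 steps would cost `image_cost(512, 512, 50, count=100)`, or $0.20, under this reading of the table.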

Get started today!

Head to api.together.ai to start running more efficient inference with our Playgrounds and APIs! New users get $25 in free credits to get started. We’re excited to see what you build.


