Announcing v1 of our Python SDK
The Together AI Python SDK is officially out of beta with the v1 release! It provides OpenAI-compatible APIs to:
- Run inference on chat, language, code, moderation, and image models
- Fine-tune models (including Llama 3) with your own data
- Generate embeddings from text for RAG applications
v1 comes with several improvements, including a new, more intuitive, fully OpenAI-compatible API, async support, messages support, more thorough tests, and better error handling. Upgrade to v1 by running pip install --upgrade together.
Chat Completions
To use any of the 60+ chat models we support, you can run the following code:
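Here is a minimal sketch of a chat completion call. The model name is illustrative, the client reads TOGETHER_API_KEY from the environment, and the API call is guarded so the snippet is a no-op when no key is configured:

```python
import os

# OpenAI-style message payload: a list of role/content turns.
messages = [{"role": "user", "content": "What are some fun things to do in New York?"}]

if os.environ.get("TOGETHER_API_KEY"):  # only call the API when a key is configured
    from together import Together

    client = Together()
    response = client.chat.completions.create(
        model="meta-llama/Llama-3-8b-chat-hf",  # illustrative; any supported chat model works
        messages=messages,
    )
    print(response.choices[0].message.content)
```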
Streaming
To stream back a response, simply specify stream=True.
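A sketch of the streaming variant (same assumptions as above: illustrative model name, API call skipped when TOGETHER_API_KEY is not set):

```python
import os

messages = [{"role": "user", "content": "Write a haiku about the ocean."}]

if os.environ.get("TOGETHER_API_KEY"):
    from together import Together

    client = Together()
    # stream=True yields incremental chunks instead of one final response
    stream = client.chat.completions.create(
        model="meta-llama/Llama-3-8b-chat-hf",  # illustrative model name
        messages=messages,
        stream=True,
    )
    for chunk in stream:
        # each chunk carries a delta with the next piece of the message
        print(chunk.choices[0].delta.content or "", end="", flush=True)
```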
Completions
To run completions on our code and language models, do the following:
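A sketch of a raw completion, which takes a prompt string rather than chat messages (model name illustrative; call guarded on TOGETHER_API_KEY):

```python
import os

# Code and language models continue raw text rather than chat turns.
prompt = "def fibonacci(n):"

if os.environ.get("TOGETHER_API_KEY"):
    from together import Together

    client = Together()
    response = client.completions.create(
        model="codellama/CodeLlama-70b-Python-hf",  # illustrative model name
        prompt=prompt,
        max_tokens=128,
    )
    print(response.choices[0].text)
```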
Image Models
To use our image models, run the following:
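A sketch of an image generation call (model name illustrative; call guarded on TOGETHER_API_KEY):

```python
import os

prompt = "An astronaut riding a horse on Mars, photorealistic"

if os.environ.get("TOGETHER_API_KEY"):
    from together import Together

    client = Together()
    response = client.images.generate(
        model="stabilityai/stable-diffusion-xl-base-1.0",  # illustrative model name
        prompt=prompt,
        n=1,  # number of images to generate
    )
    # each entry in response.data carries one generated image
    print(len(response.data))
```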
Embeddings
To generate embeddings with any of our embedding models, do the following:
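A sketch of embedding a batch of texts, e.g. for a RAG index (model name illustrative; call guarded on TOGETHER_API_KEY):

```python
import os

texts = [
    "Our solar system has eight planets.",
    "RAG retrieves relevant context before generation.",
]

if os.environ.get("TOGETHER_API_KEY"):
    from together import Together

    client = Together()
    response = client.embeddings.create(
        model="togethercomputer/m2-bert-80M-8k-retrieval",  # illustrative embedding model
        input=texts,
    )
    vectors = [item.embedding for item in response.data]
    print(len(vectors), len(vectors[0]))  # number of texts, embedding dimension
```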
Async Support
We now have async support! Here’s what that looks like for chat completions:
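A sketch of the async client, using AsyncTogether with the same call shape as the sync client (model name illustrative; the event loop only runs when TOGETHER_API_KEY is set):

```python
import asyncio
import os

messages = [{"role": "user", "content": "Name three uses of async IO."}]

async def main() -> None:
    from together import AsyncTogether

    async_client = AsyncTogether()
    # same method names as the sync client, but awaitable
    response = await async_client.chat.completions.create(
        model="meta-llama/Llama-3-8b-chat-hf",  # illustrative model name
        messages=messages,
    )
    print(response.choices[0].message.content)

if os.environ.get("TOGETHER_API_KEY"):
    asyncio.run(main())
```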
See this example for async support with completions.
Fine-tuning
We also provide the ability to fine-tune models through our SDK or CLI, including the newly released Llama 3 models. Simply upload a file in JSONL format and create a fine-tuning job as seen in the code below:
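A sketch of the two steps above: write training data as JSONL, upload it, and start a fine-tuning job. The record format, file name, and base model are illustrative, and the API calls are guarded so they only run when TOGETHER_API_KEY is set:

```python
import json
import os

# Step 1: write training data in JSONL format, one JSON record per line.
samples = [{"text": "<s>[INST] What is the capital of France? [/INST] Paris</s>"}]
with open("train.jsonl", "w") as f:
    for sample in samples:
        f.write(json.dumps(sample) + "\n")

if os.environ.get("TOGETHER_API_KEY"):
    from together import Together

    client = Together()
    # Step 2: upload the file, then create a fine-tuning job against it.
    uploaded = client.files.upload(file="train.jsonl")
    job = client.fine_tuning.create(
        training_file=uploaded.id,            # id of the uploaded file
        model="meta-llama/Meta-Llama-3-8B",   # illustrative base model
    )
    print(job.id)
```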
For more about fine-tuning, including data formats, check out our finetuning docs.
Learn more in our documentation and Python library on GitHub. We're also actively working on a similar TypeScript SDK that will be out in the coming weeks!
Q: Should I use the RedPajama-V2 Dataset out of the box?
RedPajama-V2 is conceptualized as a pool of data that serves as a foundation for creating high quality datasets. The dataset is thus not intended to be used out of the box and, depending on the application, data should be filtered out using the quality signals that accompany the data. With this dataset, we take the view that the optimal filtering of data is dependent on the intended use. Our goal is to provide all the signals and tooling that enables this.