Together AI Awarded ClusterMAX™ Gold Rating by SemiAnalysis

SemiAnalysis has introduced the ClusterMAX™ Rating System for GPU clouds – and we are honored that our Together GPU Clusters have been rated Gold.
Together AI was founded just a few years ago and has been on a remarkable trajectory ever since. This award highlights Together AI's rapid pace of innovation, exceptional support, and deep industry expertise in building and managing cutting-edge GPU infrastructure. In this post, we explore what the ClusterMAX™ Gold rating means for the millions of developers building AI applications today, and how Together AI stands apart in an increasingly competitive landscape.
“Together AI stood out in our evaluation for its technical rigor, research-driven approach, and focus on performance at scale. Their team has clear expertise in AI systems. As both an AI lab & neocloud, Together AI is a compelling choice, differentiated by its roots in cutting-edge research and expert level GPU performance support.”
— Dylan Patel, Chief Analyst, SemiAnalysis
SemiAnalysis ClusterMAX™
SemiAnalysis is well known for its meticulous assessment of GPU cloud providers. It has developed the world’s first GPU Cloud Rating System to evaluate performance, scalability, efficiency, and cost-effectiveness across infrastructure providers. We at Together AI recently provided Together GPU Clusters to the SemiAnalysis team for evaluation, with support from our Field Engineering, Research, and Site Reliability teams.
The ClusterMAX™ Rating System and content within the SemiAnalysis article were prepared independently by SemiAnalysis. No part of SemiAnalysis’s compensation by SemiAnalysis’s clients was, is, or will be directly or indirectly related to the specific tiering, ratings or comments expressed in the SemiAnalysis article.
Achieving ClusterMAX™ Gold
Together AI has achieved the Gold rating by demonstrating strong security practices, reliable infrastructure, competitive pricing and deep technical support.
As the SemiAnalysis report points out, the value of Together GPU Clusters cannot be replicated anywhere else in the market without replicating the team led by Tri Dao, our Chief Scientist. Our research team, led by Tri, specializes in optimizing GPU performance with our proprietary kernels, the Together Kernel Collection (TKC), and brings broad expertise in debugging, optimizing, and troubleshooting training and inference. TKC has delivered a 90% improvement on NVIDIA HGX B200 hardware over NVIDIA H100 for training workloads. The team at Together AI also regularly works with customers to improve training speeds in their implementations by 30% or more.
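To put percentage figures like these in perspective, here is a minimal sketch of how a speed improvement translates into wall-clock time saved. It assumes the percentage refers to higher training throughput (for example, tokens per second); the post does not define the metric, and the function name `time_saved_fraction` is purely illustrative.

```python
def time_saved_fraction(speedup_pct: float) -> float:
    """Fraction of wall-clock time saved when throughput rises by speedup_pct percent.

    Assumption: the same amount of work is done, so time scales as 1 / throughput.
    """
    if speedup_pct <= -100:
        raise ValueError("speedup_pct must be greater than -100")
    return 1.0 - 1.0 / (1.0 + speedup_pct / 100.0)


if __name__ == "__main__":
    # Illustrative inputs taken from the percentages quoted in the text.
    for pct in (30, 90):
        saved = time_saved_fraction(pct)
        print(f"{pct}% higher throughput -> about {saved:.1%} less training time")
```

Under this reading, a 30% throughput gain shortens a fixed training job by roughly 23%, not 30%, because time scales with the inverse of throughput.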
Together AI also stood out in many other areas of the overall evaluation of GPU providers:
- Infrastructure and Security
  - Strong, one-click Slurm and Kubernetes support
  - AI-native storage options, such as VAST and WEKA
  - Industry-leading reliability commitments to customers
- Technical Expertise and Support
  - Deep research expertise on GPU performance
  - Strong technical partnership with NVIDIA
- Business Model
  - Leading Price Performance with Together Kernel Collection
  - Flexible Consumption Models, including new self-service Instant GPU Clusters
  - GPU Availability across current and next-gen hardware needs
Together AI’s Field Engineering team was also pleased to collaborate with SemiAnalysis, just as it does with our entire global customer base, providing a redefined support experience with rapid response times to deliver the best reliability and performance in the industry.
At Together AI, we believe strongly in delivering a flexible ecosystem for developers, and this was also clear in SemiAnalysis’ assessment. In just a few clicks, users can deploy managed Slurm or Kubernetes solutions. As an NVIDIA Cloud Partner (NCP), Together AI delivers the latest hardware (such as NVIDIA HGX B200 and GB200 NVL72) along with direct lines to the NVIDIA team for further optimizations where needed.
Continuous Improvement of our AI Acceleration Cloud
At Together AI, we are rapidly working to improve our passive health checks, our weekly scheduled active health checks, and our overall monitoring and alerting capabilities, both internal and external. The state we are moving towards is one in which no customer notices an issue with a GPU before we do, and in which we then resolve it proactively. We’re also exploring enabling SHARP (Scalable Hierarchical Aggregation and Reduction Protocol) in-network reduction on our InfiniBand fabric, in order to accelerate collective operations in distributed training workloads. In addition, we look forward to diversifying our fleet in the coming year with our own GPU clusters to further improve time to resolution.
Finally, being one of the first to market with the NVIDIA HGX B200 and GB200 NVL72 has given us a significant advantage over the competition. We have quickly developed expertise in managing this infrastructure and in getting the most performance out of it for training and inference workloads. As we continue to build and optimize kernels for the NVIDIA Blackwell platform, we believe we’re on a path towards Platinum in the next ClusterMAX™ evaluation.
Partnering with Together AI: NVIDIA Blackwell GPUs and Instant GPU Clusters now available
We owe the utmost gratitude to the SemiAnalysis team for their collaboration and for showcasing our innovation. We look forward to taking this feedback and continuing to innovate to deliver the most price-performant, reliable GPU cloud on the market.
If you are interested in a free test drive of Together GPU Clusters accelerated by the NVIDIA Blackwell platform, please contact us. And if you’d like to try our new Instant GPU Clusters, with self-service provisioning, please request access at together.ai/instant.