Manufacture intelligence at industrial scale
Forge the AI frontier: trillion-parameter models, trillion-token inference, and efficient orchestration of 1K → 100K+ GPUs.

Industrial-Grade AI Infrastructure. Custom-Built for Your AI Projects.
The Together Frontier AI Factory is your private GPU cloud, built at data center scale. Thousands of interconnected NVIDIA GPUs, liquid-cooled racks, and our optimized software stack deliver performance, efficiency, and control—tailored to your needs.
NVIDIA Blackwell GPUs, at Scale
NVIDIA's latest GPUs, including GB200 NVL72 and HGX B200, tuned for peak performance across both training and inference.
High-Speed Interconnects
InfiniBand and NVLink ensure fast communication between GPUs, eliminating bottlenecks and enabling rapid processing of large datasets.
Advanced Cooling Systems
Liquid-cooled racks maximize thermal efficiency and GPU density, ensuring peak performance and reliability at scale.
Accelerated Software Stack
The Together Kernel Collection includes custom CUDA kernels, reducing training times and costs with superior throughput.
Massive Scale
Deploy 1K → 100K+ GPUs across global locations, adapting to evolving workload demands for resilient, enterprise-ready deployments.
Robust Management Tools
Slurm and Kubernetes orchestrate dynamic AI workloads, optimizing training and inference seamlessly.


“Training our omnimodal Character-3 model required infrastructure designed for large-scale AI. The Together Frontier AI Factory delivered the performance we needed to push the boundaries of multimodal video generation. Together AI understands what builders need — and that made all the difference.”
— Michael Lingelbach, CEO, Hedra
AI Data Centers and Power across the US
Data Center Portfolio
2GW+ in the portfolio, with 600MW of near-term capacity.

Expansion Capability in Europe and Beyond
Data Center Portfolio
150MW+ available in Europe, including the UK, Spain, France, Portugal, and Iceland.

Next Frontiers – Asia and the Middle East
Data Center Portfolio
Options available in Asia and the Middle East, sized to the scale of your project.

The latest NVIDIA GPUs
As an NVIDIA partner, we have massive clusters ready for you right now, and we can also work with you to build GPU clusters tailored to your project needs.

NVIDIA GB200 NVL72: an exascale computer, powered by a 72-GPU NVIDIA NVLink-connected system in a liquid-cooled rack that acts as a single massive GPU, delivering 1.4 exaFLOPS of AI performance and 30TB of fast memory.

NVIDIA B200: delivering up to 15X faster real-time inference and 3X faster training for trillion-parameter language models, compared to the NVIDIA Hopper architecture generation.

NVIDIA H200: featuring 141GB of HBM3e memory with 4.8TB/s bandwidth, nearly doubling the capacity and offering 1.4 times more memory bandwidth than its predecessor, the H100, to accelerate generative AI workloads.

NVIDIA H100: delivering exceptional performance, scalability, and security for every workload.

“Delivering competitive pricing, strong reliability and a properly set up cluster is the bulk of the value differentiation for most AI clouds. The only differentiated value we have seen outside this set is from a Neocloud called Together AI where the inventor of FlashAttention, Tri Dao, works. We don't believe the value created by Together can be replicated elsewhere without cloning Tri.”
- Dylan Patel, Founder, SemiAnalysis

Expert AI Advisory for Custom Model Training
We combine powerful infrastructure with expert guidance to help you build and deploy state-of-the-art custom AI models, tailored to your unique needs.
Custom Data Design: Leverage advanced tools like DSIR and DoReMi to select and optimize high-quality data slices, incorporating insights from datasets like RedPajama-v2 (see the data-selection sketch after this list).
Optimized Training: Collaborate with our experts to design architectures and training recipes for specialized use cases like instruction-tuning or conversational AI.
Accelerated Training and Fine-Tuning: Achieve up to 9x faster training and 75% cost savings with our optimized training stack, including FlashAttention-3 (see the attention sketch after this list).
Comprehensive Model Evaluation: We help you benchmark your model on public datasets or custom metrics to ensure exceptional performance and quality.
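To make the data-selection step concrete, here is a minimal sketch of the idea behind DSIR (hashed n-gram importance resampling): score raw documents by how target-like their n-gram profile is, then resample toward the target. This is an illustration of the technique, not the API of any specific library; the bucket count, function names, and bigram features are illustrative choices.

```python
# Illustrative DSIR-style data selection: hashed n-gram importance resampling.
# Not a library API -- just the underlying idea, for intuition.
import hashlib
import numpy as np

BUCKETS = 10_000  # size of the hashed n-gram feature space (illustrative)

def ngram_counts(text, n=2):
    """Hash unigrams and n-grams into a fixed-size count vector."""
    counts = np.zeros(BUCKETS)
    toks = text.lower().split()
    grams = toks + [" ".join(toks[i:i + n]) for i in range(len(toks) - n + 1)]
    for g in grams:
        bucket = int(hashlib.md5(g.encode()).hexdigest(), 16) % BUCKETS
        counts[bucket] += 1
    return counts

def fit_distribution(texts):
    """Smoothed bucket distribution for a corpus (target or raw)."""
    total = np.ones(BUCKETS)  # add-one smoothing avoids log(0)
    for t in texts:
        total += ngram_counts(t)
    return total / total.sum()

def select(raw_texts, target_texts, k):
    """Resample k raw documents, weighted toward the target distribution."""
    log_p_target = np.log(fit_distribution(target_texts))
    log_p_raw = np.log(fit_distribution(raw_texts))
    # log importance weight of a document under the bag-of-buckets model
    log_w = np.array([ngram_counts(t) @ (log_p_target - log_p_raw)
                      for t in raw_texts])
    # Gumbel top-k == sampling without replacement proportional to exp(log_w)
    keys = log_w + np.random.gumbel(size=len(raw_texts))
    return [raw_texts[i] for i in np.argsort(-keys)[:k]]
```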
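On the training side, a fused attention kernel is typically a drop-in replacement for a model's attention call. The sketch below uses the public `flash_attn` package's `flash_attn_func` (the FlashAttention-2 interface; FlashAttention-3 exposes a similar Hopper-specific interface). The shapes, dtypes, and GPU requirement shown are the usual constraints of that package, not anything Together-specific.

```python
# Minimal sketch: fused, memory-efficient attention via the flash_attn package.
# Requires a CUDA GPU and fp16/bf16 tensors shaped (batch, seqlen, heads, head_dim).
import torch
from flash_attn import flash_attn_func

batch, seqlen, heads, head_dim = 2, 4096, 16, 128
q, k, v = (torch.randn(batch, seqlen, heads, head_dim,
                       dtype=torch.bfloat16, device="cuda")
           for _ in range(3))

# One call replaces the softmax(QK^T)V sequence with a single fused kernel.
out = flash_attn_func(q, k, v, causal=True)  # (batch, seqlen, heads, head_dim)
```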

"At Krea, we're building a next-generation creative suite that brings AI-powered visual creation to everyone. Together AI provides the performance and reliability we need for real-time, high-quality image and video generation at scale. We value that Together AI is much more than an infrastructure provider – they’re a true innovation partner, enabling us to push creative boundaries without compromise."
- Victor Perez, Co-Founder, Krea
Built for Frontier AI
Massive Scale, Interconnected Compute
Frontier AI Factories are designed to scale from under 100 GPUs to frontier-class clusters with 1K → 100K+ GPUs. Interconnected via NVLink and InfiniBand, our clusters of GPUs work together as one.
AI-Native Storage Solutions
Frontier AI Factories integrate AI-native storage systems like VAST Data and WEKA alongside NVMe SSDs to ensure rapid read/write speeds. These solutions reduce latency for large datasets, improving training and inference efficiency.
Advanced Orchestration, Right Out of the Box
Frontier AI Factories use Slurm and Kubernetes for efficient workload orchestration. Slurm handles job scheduling for distributed training, while Kubernetes manages containerized inference, ensuring that GPU resources are optimally utilized.
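To make the Slurm-to-PyTorch hand-off concrete, here is a minimal sketch of initializing torch.distributed from the environment variables a Slurm launch provides. The one-task-per-GPU layout and the exported MASTER_ADDR/MASTER_PORT are assumptions about the job script, not a description of the Frontier AI Factory stack itself.

```python
# Minimal sketch: wiring a Slurm-launched job into torch.distributed,
# assuming one task per GPU (e.g. `srun --ntasks-per-node=8 python train.py`)
# and MASTER_ADDR / MASTER_PORT exported by the job script.
import os
import torch
import torch.distributed as dist

def init_from_slurm():
    rank = int(os.environ["SLURM_PROCID"])         # global rank of this task
    world_size = int(os.environ["SLURM_NTASKS"])   # total tasks (GPUs) across nodes
    local_rank = int(os.environ["SLURM_LOCALID"])  # GPU index on this node

    torch.cuda.set_device(local_rank)
    dist.init_process_group(
        backend="nccl",        # NVLink/InfiniBand-aware collectives
        init_method="env://",  # reads MASTER_ADDR / MASTER_PORT from the environment
        rank=rank,
        world_size=world_size,
    )
    return rank, world_size, local_rank

if __name__ == "__main__":
    rank, world_size, local_rank = init_from_slurm()
    if rank == 0:
        print(f"initialized {world_size} ranks over NCCL")
    dist.barrier()
    dist.destroy_process_group()
```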
Powering Frontier AI, Together
Frontier AI Factories are purpose-built from the ground up — combining NVIDIA’s latest AI systems, Dell’s high-density server platforms, 5C’s power-dense data centers, and Together AI’s software-optimized orchestration stack.