Together Research
Foundational research for production AI
Our research areas
Inference
Design and optimization of production inference systems, spanning scheduling, batching, and hardware–software co-design for reliable high throughput.
Kernels
Development of high-performance GPU kernels for training and inference, optimizing memory, attention, and custom operators at production scale.
Model Shaping
Advancement of post-training methods like fine-tuning, distillation, and quantization to shape efficient, controllable model behavior.
Agents
Studies of long-horizon reasoning and decision-making, focusing on tool use, multi-step planning, and reinforcement learning for reliable agentic systems.
Recognized research
Papers accepted at top conferences
Key open-source projects
FlashAttention
IO-aware exact attention, universally adopted
Flash Decoding
8× faster long-context token generation
Mixture of Agents
Open models, working together, beat GPT-4o
Dragonfly
Tiny 8B model beats Med-Gemini on every benchmark
RedPajama Datasets
100T+ tokens powering 500+ models
DeepCoder
First open model to match o3-mini on code
Open Deep Research
Open-source multi-model deep research agent
Open Data Scientist Agent
Autonomous agent tops Adyen's real-world benchmark
Research blogs
In the spotlight

At Slush 2025, Together AI VP of Kernels Dan Fu dives into building, using, and managing AI agents.
Research team
Researchers and engineers pushing the boundaries of AI

Ce Zhang
Founder & CTO

Chris Ré
Founder

Tri Dao
Founder & Chief Scientist

Percy Liang
Founder

Ben Athiwaratkun
Core ML

Dan Fu
Kernels

James Zou
Frontier Agents

Leon Song
Core ML

Max Ryabinin
Model Shaping

Simran Arora
Kernels

Yineng Zhang
Inference




