LLM
Chat
Reasoning

DeepSeek-V3.1

Advanced reasoning model with hybrid thinking capabilities

About model

Hybrid Thinking Model:
DeepSeek-V3.1 is a hybrid model that can switch between thinking and non-thinking modes, delivering strong performance in reasoning, coding, and agent tasks. Built on a 671B-parameter Mixture-of-Experts (MoE) architecture with 37B parameters activated per token, it lets developers choose between fast responses and deeper analytical reasoning.

Performance benchmarks

[Benchmark chart: DeepSeek-V3.1 vs. related open-source models and competitor closed-source models (Claude Opus 4.6, OpenAI o3, OpenAI o1, GPT-4o) on AIME 2025, GPQA Diamond, HLE, LiveCodeBench, MATH500, and SWE-bench Verified.]

  • API usage

    • cURL
    • Python
    • TypeScript

    Endpoint:

    deepseek-ai/DeepSeek-V3.1

    curl -X POST "https://api.together.xyz/v1/chat/completions" \
      -H "Authorization: Bearer $TOGETHER_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "deepseek-ai/DeepSeek-V3.1",
        "messages": [
          {
            "role": "user",
            "content": "What are some fun things to do in New York?"
          }
        ]
    }'
    
    from together import Together
    
    client = Together()
    
    response = client.chat.completions.create(
      model="deepseek-ai/DeepSeek-V3.1",
      messages=[
        {
          "role": "user",
          "content": "What are some fun things to do in New York?"
        }
      ]
    )
    print(response.choices[0].message.content)
    
    import Together from 'together-ai';
    const together = new Together();
    
    const completion = await together.chat.completions.create({
      model: 'deepseek-ai/DeepSeek-V3.1',
      messages: [
        {
          role: 'user',
          content: 'What are some fun things to do in New York?'
        }
      ],
    });
    
    console.log(completion.choices[0].message.content);
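
In thinking mode, models in the DeepSeek family typically emit their chain of thought inside `<think>...</think>` delimiters ahead of the final answer. Exact delimiter handling varies by serving stack, so treat the tag format below as an assumption; a minimal helper for separating reasoning from the answer:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a completion into (reasoning, answer).

    Assumes the serving stack wraps chain-of-thought in <think>...</think>
    tags, as DeepSeek reasoning models commonly do. If no tags are present,
    the whole text is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

raw = "<think>The user wants sights.</think>Visit Central Park and the Met."
thought, answer = split_reasoning(raw)
```

If the serving layer already returns reasoning in a separate response field, this post-processing step is unnecessary.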
    
  • Model card

    Architecture Overview:
    • Mixture-of-Experts (MoE) architecture with 671B total parameters and 37B activated parameters
    • Built upon DeepSeek-V3.1-Base with a two-phase long-context extension approach
    • Extended training with 630B tokens for 32K phase and 209B tokens for 128K phase
    • Compatible with UE8M0 FP8 scale data format for microscaling optimization
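
The 671B-total / 37B-activated split comes from Mixture-of-Experts routing: a gating network picks a small top-k subset of experts per token, so only a fraction of the weights participate in any forward pass. A toy sketch of the idea with tiny random weights (illustrative only; not DeepSeek's actual router, which also uses shared experts and load balancing):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; DeepSeek-V3.1 uses far more experts and larger dimensions.
n_experts, top_k, d = 8, 2, 4
gate_w = rng.normal(size=(d, n_experts))      # gating (router) weights
experts = rng.normal(size=(n_experts, d, d))  # one weight matrix per expert

def moe_forward(x):
    """Route one token vector through only its top-k experts."""
    logits = x @ gate_w
    top = np.argsort(logits)[-top_k:]                    # k best-scoring experts
    w = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen
    # Only top_k of the n_experts weight matrices participate in this pass.
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

y = moe_forward(rng.normal(size=d))
activated_fraction = top_k / n_experts  # here 25%; for V3.1, 37B / 671B ≈ 5.5%
```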

    Training Methodology:
    • Post-trained on expanded dataset with additional long documents
    • 10× more training tokens in the 32K extension phase
    • 3.3× more training tokens in the 128K extension phase
    • Advanced post-training optimization for tool usage and agent tasks

    Performance Characteristics:
    • Hybrid mode supporting both thinking and non-thinking operations
    • Superior performance on MMLU-Redux (91.8% non-thinking, 93.7% thinking)
    • Exceptional coding capabilities with LiveCodeBench Pass@1 of 56.4% (non-thinking) and 74.8% (thinking)
    • Advanced math reasoning with AIME 2024 Pass@1 of 66.3% (non-thinking) and 93.1% (thinking)

  • Applications & use cases

    Advanced Reasoning & Analysis:
    • Complex mathematical problem solving with step-by-step reasoning
    • Scientific research and analysis with transparent thought processes
    • Academic writing and research with comprehensive literature review capabilities
    • Strategic planning and decision-making with multi-factor analysis

    Software Development & Engineering:
    • Full-stack application development with multi-language support
    • Code review and optimization with detailed explanations
    • Architecture design and system planning
    • Debugging and troubleshooting with systematic approaches

    Agent & Automation Tasks:
    • Autonomous code agents for software development workflows
    • Search agents for information gathering and analysis
    • Multi-step task automation with tool integration
    • Workflow orchestration and process optimization
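
Agent-style tool use over Together's OpenAI-compatible API passes a `tools` schema with the request and executes the model's returned tool calls locally. The `get_weather` tool and its dispatcher below are hypothetical, illustrative stand-ins:

```python
import json

# Hypothetical tool exposed to the model (illustrative, not a real API).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    return f"Sunny in {city}"          # stub implementation

REGISTRY = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Execute one tool call of the OpenAI-compatible shape."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return REGISTRY[name](**args)

# Shape of a tool call as returned in response.choices[0].message.tool_calls:
example_call = {"function": {"name": "get_weather",
                             "arguments": '{"city": "New York"}'}}
result = dispatch(example_call)
```

The result would then be appended to the conversation as a `tool` role message before re-querying the model.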

    Enterprise & Business Applications:
    • Data analysis and reporting with comprehensive insights
    • Technical documentation and knowledge management
    • Customer support and query resolution
    • Process automation and workflow optimization

Model details
  • Model provider
    DeepSeek
  • Type
    LLM
    Chat
    Reasoning
  • Main use cases
    Chat
  • Fine tuning
    Supported
  • Speed
    High
  • Intelligence
    Very High
  • Deployment
    Serverless
    On-Demand Dedicated
    Monthly Reserved
  • Parameters
    671B
  • Activated parameters
    37B
  • Context length
    128K
  • Input price

    $0.60 / 1M tokens

  • Output price

    $1.70 / 1M tokens

  • Input modalities
    Text
  • Output modalities
    Text
  • Released
    August 20, 2025
  • Last updated
    August 26, 2025
  • Quantization level
    FP4
  • Category
    Chat
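
Given the prices above ($0.60 per 1M input tokens, $1.70 per 1M output tokens), per-request cost is simple arithmetic:

```python
INPUT_PER_M = 0.60    # USD per 1M input tokens (from the pricing above)
OUTPUT_PER_M = 1.70   # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000

cost = request_cost(2_000, 1_000)   # e.g. a 2K-token prompt, 1K-token completion
```

A 2K-token prompt with a 1K-token completion works out to about $0.0029.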