
GLM-4.7

Advanced agentic coding and reasoning with instant serverless access on Together AI

About model

Frontier Agentic Coding:
GLM-4.7 is Z.AI's latest flagship model, engineered for task-oriented development and complex agent workflows. The #1 open-source model on LMArena Code Arena and scoring 73.8% on SWE-bench Verified, it delivers enhanced agentic coding, superior frontend aesthetics, and stable multi-step reasoning through advanced interleaved thinking. Access it instantly via Together AI's serverless APIs for rapid prototyping, evaluation, and production deployment, with a 200K context window and 128K maximum output tokens.
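Serverless access goes through Together AI's OpenAI-compatible chat-completions endpoint. The sketch below only builds the request payload; the model identifier `zai-org/GLM-4.7` is an assumption here, so check the model page for the exact serverless string before sending a real request.

```python
import json

# Hypothetical model identifier -- verify the exact serverless string
# on the Together AI model page before using it.
MODEL_ID = "zai-org/GLM-4.7"

def build_chat_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Build an OpenAI-compatible chat-completions payload for the
    Together AI serverless endpoint (https://api.together.xyz/v1)."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Write a function that merges two sorted lists.")
print(json.dumps(payload, indent=2))

# Send with any OpenAI-compatible client, e.g.:
#   client = openai.OpenAI(base_url="https://api.together.xyz/v1", api_key=...)
#   client.chat.completions.create(**payload)
```

Because the endpoint is OpenAI-compatible, existing tooling (SDKs, proxies, eval harnesses) works by swapping the base URL and model string.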

Performance benchmarks

Scores for competitor closed-source models are shown for comparison; "—" marks cells with no score reported on this page.

| Model | AIME 2025 | GPQA Diamond | HLE | LiveCodeBench | MATH500 | SWE-bench Verified |
|---|---|---|---|---|---|---|
| GLM-4.7 | 95.7% | 83.3% | — | — | — | — |
| Claude Opus 4.6 | 90.5% | — | 34.2% | — | — | 78.7% |
| OpenAI o3 | — | 83.3% | 24.9% | — | 99.2% | 62.3% |
| OpenAI o1 | 76.8% | — | — | — | 96.4% | 48.9% |
| GPT-4o | — | 49.2% | 2.7% | 32.3% | 89.3% | 31.0% |

  • Model card

    Core Coding Capabilities:
    • #1 open-source model on LMArena Code Arena (outperforming GPT-5.2)
    • 73.8% on SWE-bench Verified (+5.8% over GLM-4.6)
    • 66.7% on SWE-bench Multilingual (+12.9% improvement)
    • 84.9% on LiveCodeBench-v6 (surpassing Claude Sonnet 4.5 at 64%)
    • 41% on Terminal Bench 2.0 (+16.5% improvement)

    Advanced Reasoning:
    • AIME 2025: 95.7% (open-source SOTA)
    • GPQA-Diamond: 85.7%
    • HLE with Tools: 42.8% (+12.4% over GLM-4.6)
    • HMMT Feb 2025: 97.1%
    • MMLU-Pro: 84.3%

    Agent & Tool Capabilities:
    • τ²-Bench: 87.4% (approaching Claude Sonnet 4.5 at 87.2%)
    • BrowseComp: 52% (67.5% with context management)
    • Significantly improved tool-calling and web browsing performance

    Interleaved Thinking System:
    • Interleaved Thinking: Thinks before every response and tool call for improved instruction following
    • Preserved Thinking: Automatically retains thinking blocks across multi-turn conversations, reducing information loss
    • Turn-level Thinking: Per-turn control over reasoning—disable for lightweight requests, enable for complex tasks
    • Optimal for long-horizon, complex agentic workflows
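The per-turn control described above can be sketched as a request-level toggle. The `thinking` field name follows Z.AI's published GLM API convention and is an assumption for the Together serverless endpoint, as is the model identifier:

```python
def chat_payload(messages: list, think: bool) -> dict:
    """Per-turn reasoning toggle: disable thinking for lightweight
    requests, enable it for complex multi-step tasks. The "thinking"
    field follows Z.AI's GLM API convention (assumption -- field name
    may differ on other endpoints)."""
    return {
        "model": "zai-org/GLM-4.7",  # hypothetical model id
        "messages": messages,
        "thinking": {"type": "enabled" if think else "disabled"},
    }

# Lightweight request: skip the reasoning phase for lower latency.
light = chat_payload([{"role": "user", "content": "What is 2 + 2?"}], think=False)

# Complex agentic task: keep interleaved thinking on.
heavy = chat_payload([{"role": "user", "content": "Refactor this module."}], think=True)
```

Since thinking blocks are preserved across turns automatically, the toggle only needs to be set per request, not re-negotiated for the whole conversation.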

    Vibe Coding:
    • Enhanced UI quality with cleaner, more modern webpages
    • Better-looking slides with accurate layout and sizing
    • Improved frontend aesthetic generation for low-code platforms and rapid prototyping

    Architecture:
    • 200K context window with 128K maximum output tokens
    • MIT licensed, open weights
    • Supports vLLM, SGLang, and Transformers inference frameworks
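For self-hosting the MIT-licensed open weights, launch commands for the supported inference frameworks look roughly like the following. The Hugging Face repo id `zai-org/GLM-4.7` and the tensor-parallel degree of 8 are assumptions; check the official model card for the published id and recommended hardware layout.

```shell
# vLLM: serves an OpenAI-compatible API on port 8000 by default.
vllm serve zai-org/GLM-4.7 --tensor-parallel-size 8

# SGLang: equivalent OpenAI-compatible server.
python -m sglang.launch_server --model-path zai-org/GLM-4.7 --tp 8
```

Either server exposes the same chat-completions interface as the Together serverless endpoint, so client code can move between hosted and self-hosted deployments by changing the base URL.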

  • Applications & use cases

    Agentic Coding & Development:
    • End-to-end software engineering from requirement comprehension to executable code
    • Multilingual agentic coding across multiple programming languages
    • Terminal-based development tasks with stable multi-step execution

    Frontend & UI Generation:
    • Production-quality web UI generation with enhanced aesthetics
    • Modern webpage creation with clean layouts and accurate sizing
    • Slide and poster generation with professional design consistency
    • Low-code platforms and AI frontend generation tools

    Complex Agent Workflows:
    • Long-horizon tasks requiring preserved reasoning across turns
    • Web browsing and information retrieval with context management
    • Tool-calling and function execution for enterprise automation
    • Real-world interaction scenarios (τ²-Bench: retail, telecom, airline)
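Tool-calling workflows like those above use the OpenAI-compatible `tools` format. The sketch below builds a function-calling request; the `get_order_status` tool is purely illustrative (a retail scenario in the spirit of τ²-Bench), and the model id is an assumption:

```python
import json

# Illustrative tool definition in the OpenAI-compatible "tools" format.
# The tool name and schema are hypothetical examples, not part of any API.
tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the shipping status of a retail order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

request = {
    "model": "zai-org/GLM-4.7",  # hypothetical model id
    "messages": [{"role": "user", "content": "Where is order A-1042?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
print(json.dumps(request, indent=2))
```

With interleaved thinking enabled, the model reasons before each tool call, which is what stabilizes long chains of function executions in multi-step workflows.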

    Enterprise Applications:
    • Development support and solution discussions with context-aware responses
    • Decision-making assistance with advanced reasoning capabilities
    • Mathematical problem-solving and scientific reasoning
    • Creative writing, role-play, and conversational AI with improved quality

    Research & Prototyping:
    • Rapid prototyping with superior UI aesthetics and accurate implementation
    • Complex demos and proof-of-concept development
    • Educational tools and interactive learning applications with 200K context support

  • Model provider
    ZAI
  • Type
    Chat
    Reasoning
    LLM
  • Main use cases
    Chat
  • Fine tuning
    Supported
  • Deployment
    Monthly Reserved
    On-Demand Dedicated
  • Context length
    202K
  • Input price

    $0.45 / 1M tokens

  • Output price

    $2.00 / 1M tokens

  • Input modalities
    Text
  • Output modalities
    Text
  • Released
    December 21, 2025
  • Last updated
    January 8, 2026
  • Quantization level
    FP8
  • Category
    Chat
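The per-request cost under the serverless pricing above ($0.45 per 1M input tokens, $2.00 per 1M output tokens) is straightforward to estimate:

```python
INPUT_PER_M = 0.45   # USD per 1M input tokens (from the pricing table)
OUTPUT_PER_M = 2.00  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Serverless cost of one request in USD."""
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# Example: a 150K-token context with a 4K-token completion.
cost = request_cost(150_000, 4_000)
print(f"${cost:.4f}")  # prints $0.0755
```

Output tokens dominate cost at this price ratio, so capping `max_tokens` (up to the 128K maximum) is the main lever for budgeting long agentic runs.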