How much does GLM 5.2 cost?

GLM 5.2 costs $1.89/1M tokens for input and $5.94/1M tokens for output on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use GLM 5.2 via API?

Send a POST request to /v1/chat/completions with model "glm-5.2" and your API key. CallMissed uses the OpenAI-compatible format, so you only need to change the base URL and the model field.

What is the context window of GLM 5.2?

GLM 5.2 supports a 262K token context window with up to 8K output tokens.
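As a rough illustration of what these limits mean in practice, the sketch below checks whether a prompt is likely to fit once room for the reply is reserved. It rests on several assumptions that are not stated on this page: "8K" is read as 8,192 tokens, output tokens are assumed to count against the same window, and the 4-characters-per-token ratio is a crude heuristic rather than the GLM tokenizer. The function names are hypothetical.

```python
CONTEXT_WINDOW = 262_144   # context window listed for GLM 5.2
MAX_OUTPUT = 8_192         # assumption: "8K output tokens" read as 8,192
CHARS_PER_TOKEN = 4        # assumption: rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(prompt: str, reserved_output: int = MAX_OUTPUT) -> bool:
    """Return True if the estimated prompt plus the reserved output fits the window."""
    return estimate_tokens(prompt) + reserved_output <= CONTEXT_WINDOW
```

For real budgeting, a count from the provider's tokenizer or from the usage figures returned with a response would be more reliable than this estimate.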

LLM Chat · reasoning · tools

GLM 5.2

by Z.ai · Released 2026

Zhipu AI's (Z.ai) flagship agentic coding model from the GLM-5 family. A 262K-context model purpose-built for long-horizon software engineering — multi-turn tool calling, native reasoning, and reliable structured output across large codebases.

Context Window: 262K
Parameters: MoE
Max Output: 8K

Overview

GLM 5.2 is Zhipu AI's (Z.ai) flagship agentic coding model, the most capable entry in the GLM-5 family. It pairs a very large 262,144-token context window with native reasoning and robust multi-turn function calling, making it well-suited for autonomous coding agents that plan changes across many files, call tools to read and edit code, run tests, and iterate — all while keeping the full project context in a single window.

The model is tuned for agentic coding workflows: it follows tool-calling instructions precisely, emits reliable structured output for tool payloads, and uses a `reasoning_effort` thinking toggle (low/medium/high) to trade latency for depth on harder problems. Its bilingual Chinese/English heritage from the GLM family carries through, so it remains strong on multilingual technical content.
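A minimal sketch of a request body that sets the reasoning toggle and offers one tool, assuming the gateway accepts the OpenAI-style `tools` schema alongside `reasoning_effort`. The `read_file` tool and the `build_payload` helper are hypothetical and exist only for illustration.

```python
import json

VALID_EFFORTS = ("low", "medium", "high")

# Hypothetical tool definition in the OpenAI function-calling schema.
READ_FILE_TOOL = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Return the contents of a file in the repository.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}


def build_payload(prompt: str, effort: str = "medium", tools=None) -> dict:
    """Build a chat-completions body with a validated reasoning_effort value."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    payload = {
        "model": "glm-5.2",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }
    if tools:
        payload["tools"] = tools
    return payload


if __name__ == "__main__":
    body = build_payload("Find the failing test in utils.py", "high", [READ_FILE_TOOL])
    print(json.dumps(body, indent=2))
```

Validating the effort value on the client side keeps a typo from turning into a rejected request or a silently ignored setting.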

On CallMissed, GLM 5.2 is fully OpenAI-compatible on `/v1/chat/completions` with streaming, tool calling, and the reasoning toggle. The 262K context handles large repositories, long design documents, and extended agent transcripts in one pass — pair it with a planner/executor loop for repository-scale refactors, or use it directly for complex single-shot coding and analysis tasks.
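For streaming, a client typically reads server-sent events line by line. The sketch below assumes the gateway emits OpenAI-style chunks (`data: {...}` lines with `choices[].delta.content`, ending with `data: [DONE]`); this page does not show the wire format, so treat that as an assumption. `iter_content` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines until the [DONE] marker."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


if __name__ == "__main__":
    sample = [
        'data: {"choices":[{"delta":{"role":"assistant"}}]}',
        "",
        'data: {"choices":[{"delta":{"content":"Hel"}}]}',
        'data: {"choices":[{"delta":{"content":"lo"}}]}',
        "data: [DONE]",
    ]
    print("".join(iter_content(sample)))
```

The same generator can be fed decoded lines from any HTTP client that exposes the response body as a line iterator.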

Pricing

Metric	Price
Input /1M tokens	₹189.0000
Output /1M tokens	₹594.0000

1 credit = ₹1 = $0.01 USD. Prices are based on the provider's rates, which CallMissed passes through with a markup of roughly 35%.
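To make the rates concrete, here is a small worked calculation. It assumes the table rates are the amounts actually charged per million tokens and that billing is linear in token count; the function name is hypothetical.

```python
INPUT_RATE = 189.0      # credits per 1M input tokens (from the table)
OUTPUT_RATE = 594.0     # credits per 1M output tokens (from the table)
USD_PER_CREDIT = 0.01   # 1 credit = $0.01


def request_cost(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return the estimated cost of one request as (credits, usd)."""
    credits = (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000
    return credits, credits * USD_PER_CREDIT


if __name__ == "__main__":
    credits, usd = request_cost(200_000, 8_000)
    print(f"{credits:.3f} credits, about ${usd:.2f}")
```

Under these assumptions, a request with 200,000 input tokens and 8,000 output tokens comes to 42.552 credits, or roughly $0.43.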

Key Highlights

Flagship agentic coding model — built for long-horizon engineering
262K context for repository-scale tasks in a single pass
Native reasoning with a low/medium/high effort toggle
Reliable multi-turn tool calling and structured output

Benchmarks

Benchmark	Score	Notes
Context	262K	Full-repo / long-transcript window
Tool Calling	Yes	Multi-turn, parallel function calls
Reasoning	Yes	reasoning_effort low/medium/high

Technical Details

Architecture: General Language Model (GLM) mixture-of-experts
Context window: 262,144 tokens
Native reasoning with reasoning_effort control
Multi-turn + parallel tool/function calling
OpenAI-compatible on the CallMissed gateway with streaming
Bilingual Chinese/English strength from the GLM family
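When the model returns several tool calls in one assistant turn, the client runs each one and sends the results back as `tool` messages. The sketch below assumes the OpenAI-style response shape (`tool_calls` entries with `id` and `function.name` / `function.arguments`); the two tools and the dispatcher are hypothetical stand-ins.

```python
import json


def read_file(path: str) -> str:
    """Hypothetical tool: a real agent would read from disk here."""
    return f"<contents of {path}>"


def run_tests(target: str) -> str:
    """Hypothetical tool: a real agent would invoke a test runner here."""
    return f"ran tests for {target}"


TOOLS = {"read_file": read_file, "run_tests": run_tests}


def dispatch_tool_calls(assistant_message: dict) -> list[dict]:
    """Run every tool call in an assistant message and build the tool replies."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = call["function"]
        try:
            handler = TOOLS.get(fn["name"])
            if handler is None:
                raise ValueError(f"unknown tool: {fn['name']}")
            args = json.loads(fn.get("arguments") or "{}")
            content = handler(**args)
        except Exception as exc:  # report failures to the model instead of crashing
            content = f"error: {exc}"
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": content}
        )
    return results
```

Appending the assistant message and these replies to `messages`, then calling the endpoint again, is one way to continue the loop until the model stops requesting tools.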

Strengths

Purpose-built for agentic coding and long-horizon tool use
Very large 262K context for whole-codebase reasoning
Reasoning toggle balances latency vs depth per request
Reliable structured output keeps tool-call loops stable

Limitations

Premium pricing relative to the fast GLM 4.7 Flash tier
Reasoning mode increases time-to-first-token on hard prompts
Coding-optimized; a cheaper model may be a better fit for general chat

Use Cases

Agentic coding
Repository-scale refactors
Long-context analysis
Tool-using agents

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "glm-5.2", "messages": [{"role": "user", "content": "Refactor this module and add tests"}]}'

Endpoint: POST /v1/chat/completions · Model ID: glm-5.2
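The same call can be sketched in Python using only the standard library. The base URL, key prefix, and model ID are taken from this page; the `build_request` and `send` helpers are hypothetical, and the JSON response is assumed to follow the OpenAI chat-completions shape.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"


def build_request(api_key: str, prompt: str, model: str = "glm-5.2") -> urllib.request.Request:
    """Build (but do not send) a POST request for /chat/completions."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)


if __name__ == "__main__":
    req = build_request("cm_YOUR_KEY", "Refactor this module and add tests")
    print(req.get_method(), req.full_url)
    # reply = send(req)  # uncomment with a real key
    # print(reply["choices"][0]["message"]["content"])  # assumes OpenAI-style response
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without spending credits.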

Try GLM 5.2 now

Get 1000 free API credits on signup. No credit card required.
