How much does GPT-5.4 Pro cost?

GPT-5.4 Pro costs $30/1M tokens for input and $180/1M tokens for output on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use GPT-5.4 Pro via API?

Send a POST request to /v1/chat/completions with the model set to "openai/gpt-5.4-pro" and your API key in the Authorization header. CallMissed uses the OpenAI-compatible format, so you only need to change the base URL and the model field.

What is the context window of GPT-5.4 Pro?

GPT-5.4 Pro supports a 1M token context window with up to 128K output tokens.

LLM Chat · flagship · reasoning

GPT-5.4 Pro

by OpenAI · Released March 2026

OpenAI's most capable model. GPT-5.4 Pro features a 1M token context window, native computer use, tool search, and sets new records on professional benchmarks. Optimized for deep reasoning, complex coding, and long-horizon agentic workflows.

Context Window: 1M tokens
Parameters: Undisclosed
Max Output: 128K tokens

Overview

GPT-5.4 Pro is OpenAI's most capable model and the flagship of the GPT-5.4 family, which unifies frontier reasoning, coding, and computer use in a single system. Its context window is 272K tokens in standard use and extends to 1 million tokens in Codex experimental mode, with up to 128K output tokens, enough to generate an entire codebase in one pass. Native computer use lets the model interact with desktops through screenshots, control the mouse and keyboard, and write Playwright code for browser automation. Tool search, an agentic capability, loads tool definitions on demand instead of all at once, saving tens of thousands of tokens per request.

On professional benchmarks, GPT-5.4 Pro tops the GDPval-AA leaderboard at 1667 Elo, ahead of Claude Sonnet 4.6 at 1633 and Opus 4.6 at 1606. It achieves 87.3% on spreadsheet modeling (vs 68.4% for GPT-5.2), and human raters preferred GPT-5.4 presentations over those from GPT-5.2 68% of the time. On Humanity's Last Exam it scored 52.1%, making it the first model to break the 50% threshold. FrontierMath reached 47.6% (vs 40.3% for GPT-5.2), ARC-AGI-1 hit 93.7%, and ARC-AGI-2 reached 83.3% in Pro mode (vs 73.3% for standard GPT-5.4).

The computer use and agentic benchmarks are where GPT-5.4 Pro stands apart. On OSWorld-Verified it scores 75.0%, above the human baseline of 72.4% and well ahead of the previous top model, Kimi K2.5, at 63.3%. This is the first time any AI model has surpassed human-level performance on this benchmark. Web browsing results are similarly strong: 89.3% on BrowseComp (Pro variant; standard GPT-5.4 scores 82.7%), 67.3% on WebArena-Verified, and 92.8% on Online-Mind2Web. On Toolathlon it scored 54.6%, demonstrating strong autonomous tool use.

The model delivers a 33% reduction in false claims and 18% fewer responses containing any errors compared with its predecessors. Steerability is also improved: it outlines a plan before continuing and accepts mid-response adjustments, giving users more control over the direction of the output. Tool search saves tens of thousands of tokens per request by loading tool definitions on demand rather than including every definition in each context window.
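The on-demand loading idea behind tool search can be illustrated with a toy registry. The sketch below is not OpenAI's implementation: the registry, the tool names (`get_weather`, `search_tickets`), the helper functions, and the character-count size proxy are all hypothetical, chosen only to show why sending a short index up front and fetching full definitions later can shrink a request.

```python
import json
from typing import Dict, List

# Hypothetical full tool definitions (JSON-schema style). In a real agent
# each of these could run to hundreds or thousands of tokens.
TOOL_DEFINITIONS: Dict[str, dict] = {
    "get_weather": {
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
    "search_tickets": {
        "description": "Search support tickets by keyword.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "limit": {"type": "integer"},
            },
            "required": ["query"],
        },
    },
}


def tool_index() -> List[dict]:
    """Lightweight index sent up front: names and descriptions only."""
    return [
        {"name": name, "description": spec["description"]}
        for name, spec in TOOL_DEFINITIONS.items()
    ]


def load_tools(names: List[str]) -> List[dict]:
    """Full definitions, fetched only for the tools actually requested."""
    return [{"name": n, **TOOL_DEFINITIONS[n]} for n in names]


def approx_size(obj) -> int:
    """Crude size proxy: characters of serialized JSON, not real tokens."""
    return len(json.dumps(obj))
```

With only two small tools the difference is modest; the savings the page describes would come from registries with many large definitions, where most tools are never needed in a given request.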

Native computer use operates via screenshots, mouse and keyboard control, and Playwright browser automation, enabling the model to interact with desktop software, fill out web forms, navigate multi-tab workflows, and execute complex GUI-based tasks autonomously. This makes GPT-5.4 Pro uniquely suited for enterprise automation scenarios that require interacting with legacy software, web applications, and desktop tools.

Safety evaluations include a chain-of-thought controllability study showing that models cannot effectively hide their reasoning, with controllability rates between 0.1% and 15.4%. OpenAI expanded its cyber safety stack and reduced refusals compared to GPT-5.2, making the model more helpful on legitimate edge-case queries without compromising on genuinely harmful requests.

At $30/M input and $180/M output, GPT-5.4 Pro is priced for teams that need the absolute best performance on complex reasoning, long-horizon agentic workflows, professional-grade analysis, and tasks where exceeding human-level performance on computer use and web browsing is critical. For most production workloads, the standard GPT-5.4 at $2.50/$15 per 1M input/output tokens offers the same architecture at a fraction of the cost.

Pricing

Metric	Price
Input /1M tokens	₹3,000
Output /1M tokens	₹18,000

1 credit = ₹1 = $0.01 USD. Prices are based on the provider's rates; CallMissed passes them through with a markup of roughly 35%.
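To make the pricing concrete, here is a minimal cost estimator. It assumes the listed rates ($30 and $180 per 1M tokens) are the final billed rates and that credits convert at the stated 1 credit = $0.01; the function name and return shape are illustrative, not part of any CallMissed SDK.

```python
USD_PER_M_INPUT = 30.0    # listed input price per 1M tokens
USD_PER_M_OUTPUT = 180.0  # listed output price per 1M tokens
USD_PER_CREDIT = 0.01     # 1 credit = ₹1 = $0.01, as stated on this page


def estimate_cost(input_tokens: int, output_tokens: int) -> dict:
    """Estimate the cost of one request in USD and in credits."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    usd = (
        input_tokens / 1_000_000 * USD_PER_M_INPUT
        + output_tokens / 1_000_000 * USD_PER_M_OUTPUT
    )
    return {"usd": round(usd, 4), "credits": round(usd / USD_PER_CREDIT, 2)}
```

For example, a request with 200,000 input tokens and 10,000 output tokens comes to $6.00 + $1.80 = $7.80, or 780 credits, under these assumptions.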

Key Highlights

1M token context window for massive codebases and documents
Native computer use — can operate desktop software
Tool search for finding and using the right tools autonomously
Top scores on professional benchmarks (GDPval, OSWorld-Verified, Humanity's Last Exam)

Benchmarks

Benchmark	Score	Notes
GDPval	83%	1667 Elo — #1, ahead of Sonnet 4.6 (1633) and Opus 4.6 (1606)
Spreadsheet Modeling	87.3%	vs 68.4% for GPT-5.2
SWE-bench Pro	57.7%	Slightly above GPT-5.3-Codex (56.8%)
Terminal-Bench 2.0	75.0%	vs 77.3% GPT-5.3-Codex, 62.2% GPT-5.2
OSWorld-Verified	75.0%	Exceeds human performance (72.4%)
BrowseComp	89.3%	Pro variant; standard GPT-5.4 at 82.7%
WebArena-Verified	67.3%	Web browsing task automation
Online-Mind2Web	92.8%	Web interaction benchmark
FrontierMath	47.6%	vs 40.3% GPT-5.2
Humanity's Last Exam	52.1%	Broke the 50% threshold
ARC-AGI-1	93.7%	Abstract reasoning
ARC-AGI-2	83.3%	Pro variant novel reasoning
Toolathlon	54.6%	Tool use benchmark

Technical Details

Context window: 1,000,000 tokens (272K standard, 1M in Codex experimental)
Max output: 128K tokens for generating entire codebases in one pass
Native computer use: interacts with desktop via screenshots, controls mouse/keyboard, writes Playwright code for browser automation
Tool search: loads tool definitions on demand, saving tens of thousands of tokens per request
33% fewer false claims and 18% fewer responses with any errors vs predecessors
Steerability: outlines plan before continuing, allows mid-response adjustments
Proprietary Transformer architecture with undisclosed parameter count
Post-trained with RLHF and extensive red-teaming for safety
Supports structured outputs, function calling, and JSON mode
Available via OpenAI API and through CallMissed unified gateway
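The context and output limits above interact when sizing a request. The helper below is a sketch under two assumptions that this page does not confirm: that input and output tokens share the same context window, and that "128K" and "1M" mean 128,000 and 1,000,000 tokens exactly. The function name is hypothetical.

```python
CONTEXT_WINDOW = 1_000_000  # 1M (Codex experimental); use 272_000 for standard
MAX_OUTPUT = 128_000        # assumed exact value of the 128K output limit


def output_budget(
    prompt_tokens: int,
    context_window: int = CONTEXT_WINDOW,
    max_output: int = MAX_OUTPUT,
) -> int:
    """Largest output token count that still fits alongside the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    remaining = context_window - prompt_tokens
    # Cap at the model's output limit; never return a negative budget.
    return max(0, min(max_output, remaining))
```

A 950,000-token prompt would leave room for 50,000 output tokens in the 1M window, while a 200,000-token prompt in the 272K standard window would leave 72,000.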

Strengths

Most capable model from OpenAI — #1 on GDPval at 1667 Elo
OSWorld-Verified 75.0% exceeds human performance (72.4%)
Native computer use enables GUI automation and desktop software operation
1M context window handles massive codebases and document collections
33% fewer hallucinations and 18% fewer error-containing responses
Tool search enables fully autonomous agentic workflows

Limitations

Premium pricing at $30/$180 per 1M tokens — designed for high-value tasks
Higher latency due to model size — not ideal for real-time chat
Proprietary and closed-source — no self-hosting option
Overkill for simple tasks where smaller models suffice

Use Cases

Complex reasoning tasks
Large codebase analysis
Agentic workflows
Research and analysis

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-5.4-pro", "messages": [{"role": "user", "content": "Analyze this codebase and suggest architectural improvements"}]}'

Endpoint: POST /v1/chat/completions · Model ID: openai/gpt-5.4-pro
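The same request can be sketched in Python using only the standard library. The base URL, endpoint, model ID, and `cm_` key prefix come from the curl example; the helper names are hypothetical, and the response shape read in `send_chat_request` (`choices[0].message.content`) is an assumption based on the OpenAI-compatible format, so check it against a real response.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"


def build_chat_request(
    api_key: str, prompt: str, model: str = "openai/gpt-5.4-pro"
) -> urllib.request.Request:
    """Build (but do not send) a chat completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(req: urllib.request.Request) -> str:
    """Send the request and return the first reply's text (assumed shape)."""
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Usage (performs a network call and consumes credits):
# req = build_chat_request("cm_YOUR_KEY", "Summarize this design document")
# print(send_chat_request(req))
```

Separating request construction from sending keeps the payload easy to inspect or log before any credits are spent.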

Try GPT-5.4 Pro now

Get 1000 free API credits on signup. No credit card required.
