How much does GPT-5.4 Mini cost?

GPT-5.4 Mini costs $1/1M tokens for input and $6/1M tokens for output on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use GPT-5.4 Mini via API?

Send a POST request to /v1/chat/completions with the model set to "openai/gpt-5.4-mini" and your API key in the Authorization header. CallMissed uses the OpenAI-compatible format, so an existing OpenAI integration only needs a new base URL and model field.

What is the context window of GPT-5.4 Mini?

GPT-5.4 Mini supports a 1M token context window with up to 16K output tokens.
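
As a rough illustration of these limits, the sketch below checks whether a request fits before it is sent. It assumes that "1M" means 1,000,000 tokens and "16K" means 16,000 tokens, and that requested output tokens count against the context window; neither assumption is confirmed on this page, and the function name is hypothetical.

```python
# Limits as stated on this page; the exact numeric values are assumptions.
CONTEXT_WINDOW = 1_000_000  # "1M token context window"
MAX_OUTPUT = 16_000         # "up to 16K output tokens"


def fits_context(prompt_tokens: int, max_output_tokens: int = MAX_OUTPUT) -> bool:
    """Return True if the prompt plus the requested output fits the limits.

    Assumes output tokens share the context window with the prompt.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if max_output_tokens > MAX_OUTPUT:
        return False  # asks for more output than the model allows
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOW
```

A prompt that fails this check would need to be shortened or split before calling the API.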


LLM Chat · fast · affordable

GPT-5.4 Mini

by OpenAI · Released March 2026

A smaller, faster, and more affordable variant of GPT-5.4. Retains the 1M context window and most capabilities at a fraction of the cost. Ideal for high-volume applications where speed and cost matter.

Context Window	1M tokens
Parameters	Undisclosed
Max Output	16K tokens

Overview

GPT-5.4 Mini is a distilled variant of GPT-5.4, designed for high-volume production workloads where speed and cost are critical. Despite being significantly smaller, it retains the 1M token context window, so it can process very large documents and codebases at a fraction of the cost of the larger GPT-5.4 models.

The model is optimized for fast inference, making it suitable for real-time chat applications, content summarization, classification tasks, and any workflow where low latency matters. At $1.00 per 1M input tokens and $6.00 per 1M output tokens, its output tokens are 6x cheaper than GPT-5.4's, making it the go-to choice for cost-sensitive deployments that still need strong general capabilities.

GPT-5.4 Mini maintains good performance on standard benchmarks while trading some capability on the most complex reasoning tasks. It excels at straightforward tasks like summarization, extraction, classification, and conversational AI where the full power of GPT-5.4 or Pro is unnecessary.

Pricing

Metric	Price
Input /1M tokens	₹100.0000
Output /1M tokens	₹600.0000

1 credit = ₹1 = $0.01 USD. Prices shown are from the provider; CallMissed passes them through with a ~35% markup.
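
To make the pricing concrete, here is a minimal sketch that estimates the cost of a single request from the table above. It uses the listed rates as-is (100 credits per 1M input tokens, 600 credits per 1M output tokens) and the stated conversion of 1 credit = $0.01; whether the ~35% markup is already included in those rates is not clear from this page, so treat the result as an estimate. The function name is hypothetical.

```python
from decimal import Decimal

# Rates from the pricing table on this page (credits per 1M tokens).
INPUT_CREDITS_PER_M = Decimal("100")
OUTPUT_CREDITS_PER_M = Decimal("600")
USD_PER_CREDIT = Decimal("0.01")  # 1 credit = ₹1 = $0.01 USD
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> tuple[Decimal, Decimal]:
    """Return (credits, usd) for one request at the listed rates."""
    credits = (
        Decimal(input_tokens) * INPUT_CREDITS_PER_M
        + Decimal(output_tokens) * OUTPUT_CREDITS_PER_M
    ) / TOKENS_PER_M
    return credits, credits * USD_PER_CREDIT


# Example: a 200,000-token document summarized into 5,000 tokens.
credits, usd = estimate_cost(200_000, 5_000)
print(f"{credits} credits (${usd})")
```

In this example the input accounts for 20 credits and the output for 3, so for long-document workloads the input price dominates even though output tokens cost more per token.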

Key Highlights

6x cheaper than GPT-5.4 on output tokens
1M token context window retained
Fast inference for real-time applications
Strong performance on standard benchmarks

Benchmarks

Benchmark	Score	Notes
MMLU-Pro	80.1%	Professional knowledge
HumanEval	88.5%	Code generation
MATH-500	88.7%	Competition mathematics
GPQA Diamond	68.2%	Graduate-level science
SWE-bench Verified	58.3%	Software engineering

Technical Details

Distilled from GPT-5.4 — retains core capabilities at smaller size
Context window: 1,000,000 tokens retained from full GPT-5.4
Optimized for fast inference and low latency
6x cheaper output tokens compared to GPT-5.4
Supports structured outputs, function calling, and JSON mode
Post-trained with RLHF for instruction following
Available via OpenAI API and CallMissed unified gateway
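
The JSON mode mentioned in the list above is requested in the OpenAI chat completions format through the `response_format` field. The sketch below builds such a request body for a classification task. It assumes CallMissed forwards `response_format` unchanged, which this page does not confirm; the helper name and the prompt wording are illustrative only.

```python
import json

MODEL_ID = "openai/gpt-5.4-mini"  # model ID from this page


def build_json_mode_payload(text: str) -> dict:
    """Build a chat completions body that asks for a JSON object reply."""
    return {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "system",
                # JSON mode expects the prompt itself to mention JSON.
                "content": 'Classify the sentiment. Reply in JSON with a single key "label".',
            },
            {"role": "user", "content": text},
        ],
        # OpenAI-format JSON mode; gateway support is an assumption.
        "response_format": {"type": "json_object"},
    }


# Serialized body, ready to send as the POST data.
body = json.dumps(build_json_mode_payload("The update fixed every issue I had."))
```

Even with JSON mode enabled, it is safer to parse the reply with `json.loads` inside a try/except rather than trusting its shape.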

Strengths

6x cheaper output tokens than GPT-5.4 while retaining the 1M context window
Fast inference optimized for real-time and high-volume workloads
Strong general-purpose performance for straightforward tasks
Good balance of cost, speed, and capability for production deployments

Limitations

Reduced performance on complex reasoning compared to GPT-5.4 and Pro
Less capable at multi-step agentic tasks requiring deep planning
Proprietary — no self-hosting or fine-tuning options

Use Cases

High-volume chat
Content summarization
Classification tasks
Real-time applications

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-5.4-mini", "messages": [{"role": "user", "content": "Summarize this article"}]}'

Endpoint: POST /v1/chat/completions · Model ID: openai/gpt-5.4-mini
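
For callers who prefer Python, the same request can be sketched with the standard library alone. The base URL, endpoint, model ID, and Bearer header come from this page; the helper names, the `CALLMISSED_API_KEY` environment variable, and the response shape (`choices[0].message.content`, as in the OpenAI format) are assumptions.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"  # from the curl example on this page
MODEL_ID = "openai/gpt-5.4-mini"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completions POST request."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Usage (needs a real key; the variable name is hypothetical):
#   import os
#   reply = send(build_request("Summarize this article", os.environ["CALLMISSED_API_KEY"]))
#   print(reply["choices"][0]["message"]["content"])  # assumed OpenAI-style response
```

Keeping request construction separate from sending makes the payload easy to inspect or log without making a network call.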

Try GPT-5.4 Mini now

Get 1000 free API credits on signup. No credit card required.
