How much does GPT-5.4 Nano cost?

GPT-5.4 Nano costs $0.27 per 1M input tokens and $1.70 per 1M output tokens on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use GPT-5.4 Nano via API?

Send a POST request to /v1/chat/completions with the model set to "openai/gpt-5.4-nano" and your API key in the Authorization header. CallMissed uses the OpenAI-compatible format, so existing OpenAI client code only needs a different base URL and model field.

What is the context window of GPT-5.4 Nano?

GPT-5.4 Nano supports a 1M token context window with up to 16K output tokens.
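A rough pre-flight check for these limits might look like the sketch below. It assumes "1M" means 1,000,000 tokens and "16K" means 16,000 tokens, and that prompt and output tokens share the same context window; the page does not confirm either point, and the function name is hypothetical.

```python
# Limits as quoted on this page; the exact numeric values are assumptions.
CONTEXT_WINDOW_TOKENS = 1_000_000  # "1M token context window"
MAX_OUTPUT_TOKENS = 16_000         # "16K" read as 16,000; the real cap may differ


def fits_in_context(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if a request stays within both the output cap and the window.

    Assumes prompt and output tokens count against the same context window.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens + requested_output_tokens <= CONTEXT_WINDOW_TOKENS
```

Token counts have to come from a tokenizer that matches the model; this check only compares numbers you already have.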

LLM Chat · fast · affordable

GPT-5.4 Nano

by OpenAI · Released March 2026

The smallest and most affordable model in the GPT-5.4 family. Designed for ultra-high-volume, latency-sensitive workloads. Retains the 1M context window with the lowest per-token cost.

Context Window: 1M

Parameters: Undisclosed

Max Output: 16K

Overview

GPT-5.4 Nano is the smallest and most cost-effective model in the GPT-5.4 family, designed for ultra-high-volume, latency-sensitive workloads where every millisecond and every fraction of a cent matters. At $0.27/M input and $1.70/M output, it is the lowest-priced model in the GPT-5.4 family and still retains the 1M token context window.

Despite its small size, GPT-5.4 Nano performs well on tasks like entity extraction, text classification, routing, and lightweight conversational AI. It is built for embedding directly into products at massive scale: millions of API calls per day for features like auto-complete, content moderation, or intent detection.

The model trades complex reasoning and deep analysis capability for raw speed and cost efficiency. It is not the right choice for multi-step coding tasks or nuanced research, but for the vast majority of production AI features that need fast, reliable, and cheap inference, GPT-5.4 Nano is the optimal pick in the OpenAI lineup.
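One way to act on this trade-off is a small routing table that sends only lightweight tasks to GPT-5.4 Nano and flags everything else for a larger model. The sketch below is illustrative only: the task labels and the `pick_model` helper are hypothetical names, and the choice of fallback model is left to the caller because this page does not list other model IDs.

```python
from typing import Optional

NANO_MODEL_ID = "openai/gpt-5.4-nano"  # model ID from this page

# Hypothetical task labels, based on the workloads this page says Nano suits.
LIGHTWEIGHT_TASKS = {
    "entity_extraction",
    "classification",
    "routing",
    "intent_detection",
    "content_moderation",
    "autocomplete",
    "lightweight_chat",
}


def pick_model(task: str) -> Optional[str]:
    """Return the Nano model ID for lightweight tasks, else None.

    None means "escalate to a larger model". Unknown tasks are escalated too,
    so heavy work such as multi-step coding or research never lands on Nano.
    """
    if task in LIGHTWEIGHT_TASKS:
        return NANO_MODEL_ID
    return None
```

Defaulting unknown tasks to escalation is a deliberate choice here: a wrong escalation costs money, while a wrong downgrade costs answer quality.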

Pricing

Metric	Price
Input /1M tokens	₹27.0000
Output /1M tokens	₹170.0000

1 credit = ₹1 = $0.01 USD. Prices are based on provider rates; CallMissed passes them through with a ~35% markup.
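To see what these rates mean per request, the arithmetic can be sketched as below. It uses the listed figures as they appear (27 credits per 1M input tokens, 170 credits per 1M output tokens, $0.01 per credit) and does not apply any additional markup; the function name is hypothetical.

```python
from decimal import Decimal

# Rates from the pricing table above, in credits per 1M tokens.
INPUT_CREDITS_PER_MILLION = Decimal("27")
OUTPUT_CREDITS_PER_MILLION = Decimal("170")
USD_PER_CREDIT = Decimal("0.01")  # 1 credit = ₹1 = $0.01 USD
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int):
    """Return (credits, usd) for one request at the listed rates."""
    credits = (
        Decimal(input_tokens) * INPUT_CREDITS_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_CREDITS_PER_MILLION
    ) / TOKENS_PER_MILLION
    return credits, credits * USD_PER_CREDIT


# Example: a request with 2,000 input tokens and 500 output tokens.
credits, usd = estimate_cost(2_000, 500)
print(credits, usd)  # 0.139 credits, 0.00139 USD
```

At that size, one million such requests would come to about 139,000 credits, which is why output length matters more than input length at these rates.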

Key Highlights

Lowest cost in the GPT-5.4 family
1M token context window
Ultra-low latency for real-time use
Ideal for embedding in products at scale

Benchmarks

Benchmark	Score	Notes
MMLU-Pro	72.4%	Professional knowledge
HumanEval	82.1%	Code generation
MATH-500	80.5%	Competition mathematics
GPQA Diamond	58.7%	Graduate-level science

Technical Details

Smallest model in the GPT-5.4 family — optimized for cost and speed
Context window: 1,000,000 tokens retained despite small model size
Ultra-low latency inference for real-time applications
Pricing: $0.27/M input, $1.70/M output (lowest in the GPT-5.4 family)
Supports structured outputs, function calling, and JSON mode
Distilled from larger GPT-5.4 models
Ideal for embedding in high-volume product features
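As an illustration of JSON mode, the sketch below builds a request body for entity extraction. It assumes CallMissed forwards the OpenAI-style `response_format` field unchanged, which this page implies through its OpenAI-compatible format but does not state outright; the helper name and prompt wording are hypothetical.

```python
import json


def build_extraction_payload(text: str) -> dict:
    """Build a chat-completions body that asks for a JSON object in reply."""
    return {
        "model": "openai/gpt-5.4-nano",
        # OpenAI-style JSON mode; support on CallMissed is assumed, not confirmed.
        "response_format": {"type": "json_object"},
        "messages": [
            {
                "role": "system",
                "content": (
                    "Extract the key entities from the user's text. "
                    'Reply with a JSON object of the form {"entities": [...]}.'
                ),
            },
            {"role": "user", "content": text},
        ],
    }


payload = build_extraction_payload("Asha flew from Pune to Berlin on Monday.")
print(json.dumps(payload, indent=2))
```

The system message mentions JSON explicitly because OpenAI's JSON mode expects the prompt to do so.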

Strengths

Cheapest model in the GPT-5.4 family at $0.27/M input tokens
Ultra-low latency makes it ideal for real-time product features
Retains 1M context window despite minimal model size
Excellent for high-volume tasks like classification, extraction, and routing

Limitations

Significantly reduced reasoning capability compared to GPT-5.4 and Pro
Not suitable for complex coding, research, or multi-step planning tasks
Proprietary — no self-hosting or fine-tuning options
May produce lower quality outputs on nuanced or ambiguous prompts

Use Cases

Entity extraction
Routing and classification
Lightweight chat
Edge deployment

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-5.4-nano", "messages": [{"role": "user", "content": "Extract the key entities from this text"}]}'

Endpoint: POST /v1/chat/completions · Model ID: openai/gpt-5.4-nano
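The same call can be sketched in Python using only the standard library. The base URL, endpoint, model ID, and `cm_` key prefix come from this page; the helper names are hypothetical, and the response is assumed to follow the OpenAI chat-completions shape.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"  # from the curl example above


def build_chat_request(api_key: str, user_text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /chat/completions."""
    body = json.dumps(
        {
            "model": "openai/gpt-5.4-nano",
            "messages": [{"role": "user", "content": user_text}],
        }
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and parse the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With a real key, `send(build_chat_request("cm_YOUR_KEY", "Extract the key entities from this text"))` would return the parsed response; if the format matches OpenAI's, the reply text should be under `choices[0]["message"]["content"]`.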

Try GPT-5.4 Nano now

Get 1000 free API credits on signup. No credit card required.
