LLM Chat · reasoning · azure

GPT-5 Mini (Azure)

by Azure OpenAI · Released 2025

OpenAI GPT-5 Mini on Azure — a fast, affordable reasoning model with a 400K context window. Text-only, tuned for speed and cost.

Powered by Azure OpenAI · Reasoning transformer (OpenAI GPT-5 Mini), hosted on Azure OpenAI Service

Context Window: 400K
Parameters: Not disclosed
Max Output: 128K
Category: LLM Chat
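
The context window and max output limits interact when sizing a request. The following is a minimal sketch, assuming that prompt and completion tokens share the 400K window (the page does not state this) and that "K" means 1,000 tokens. `fits_budget` is a hypothetical helper name, not part of any API.

```python
CONTEXT_WINDOW = 400_000  # "400K" from the spec, read as 400,000 tokens (assumption)
MAX_OUTPUT = 128_000      # "128K" from the spec, read as 128,000 tokens (assumption)


def fits_budget(prompt_tokens: int, max_completion_tokens: int) -> bool:
    """Return True if a request stays inside both listed limits.

    Assumption: prompt and completion tokens share the context window.
    """
    if prompt_tokens < 0 or max_completion_tokens <= 0:
        raise ValueError("prompt_tokens must be >= 0 and max_completion_tokens > 0")
    if max_completion_tokens > MAX_OUTPUT:
        return False
    return prompt_tokens + max_completion_tokens <= CONTEXT_WINDOW
```

Under these assumptions, a request that reserves the full 128K output leaves at most 272,000 tokens for the prompt.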

Overview

GPT-5 Mini is the small, cost-efficient member of OpenAI's GPT-5 reasoning family, served here through Microsoft Azure OpenAI Service. It delivers genuine chain-of-thought reasoning at a fraction of the price and latency of the full GPT-5 tier, with a large 400K-token context window, making it a strong default for high-volume reasoning workloads where cost matters.

As a reasoning model, it controls sampling internally: temperature and top-p are fixed. It is text-only (no image input). Through CallMissed it is addressed as `azure/gpt-5-mini` on the OpenAI-compatible `/v1/chat/completions` endpoint, with streaming and tool calling supported. Running on Azure provides first-party metered billing, data residency in the Sweden Central region, and Azure SLAs, which suits agents, classification, and reasoning at scale where cost and compliance matter.
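
Since the endpoint is described as OpenAI-compatible with tool calling, a request body can presumably follow the OpenAI Chat Completions `tools` layout. The sketch below only builds the JSON body. The `get_weather` tool and the `build_tool_request` helper are hypothetical, and whether CallMissed accepts every field of the OpenAI schema is an assumption.

```python
import json


def build_tool_request(question: str) -> dict:
    """Build a chat request body that offers one function tool to the model."""
    return {
        "model": "azure/gpt-5-mini",
        "messages": [{"role": "user", "content": question}],
        # Tool definition in the OpenAI Chat Completions format.
        # "get_weather" is a made-up example tool.
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Look up the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        # No temperature or top_p: the page says these are fixed for this model.
    }


body = build_tool_request("What is the weather in Stockholm?")
payload = json.dumps(body)  # string to send as the POST body
```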

Pricing

Metric               Price
Input / 1M tokens    ₹25.0000
Output / 1M tokens   ₹200.0000

1 credit = ₹1 = $0.01 USD. Prices shown are the provider's prices; CallMissed passes them through with a markup of about 35%.
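
As a worked example of the pricing arithmetic, the sketch below estimates the cost of a single request in rupees. It assumes the listed rates are pre-markup and that the ~35% markup is applied on top as a flat multiplier; the page does not spell this out. It also assumes any reasoning tokens are counted as output tokens. `estimate_cost_inr` is a hypothetical helper.

```python
INPUT_INR_PER_M = 25    # ₹ per 1M input tokens (from the pricing table)
OUTPUT_INR_PER_M = 200  # ₹ per 1M output tokens (from the pricing table)
MARKUP = 0.35           # "~35%" markup, treated as exact here (assumption)


def estimate_cost_inr(input_tokens: int, output_tokens: int,
                      apply_markup: bool = False) -> float:
    """Estimate request cost in rupees (equal to credits at 1 credit = ₹1)."""
    base = (input_tokens * INPUT_INR_PER_M
            + output_tokens * OUTPUT_INR_PER_M) / 1_000_000
    return base * (1 + MARKUP) if apply_markup else base


# 10,000 input tokens and 2,000 output tokens:
print(f"{estimate_cost_inr(10_000, 2_000):.4f}")        # prints 0.6500
print(f"{estimate_cost_inr(10_000, 2_000, True):.4f}")  # prints 0.8775
```

Output tokens cost eight times as much as input tokens at these rates, so capping completion length has the larger effect on the bill.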

Key Highlights

  • Affordable GPT-5-family reasoning
  • Large 400K-token context window
  • Low latency for high-volume workloads
  • Azure-hosted with enterprise SLAs + data residency

Benchmarks

Benchmark   Score
AIME        0.85
GPQA        0.71
MMLU        0.86

Technical Details

  • Served via Azure OpenAI Service (api-version 2024-10-21)
  • Deployment: gpt-5-mini (2025-08-07), GlobalStandard SKU, Sweden Central
  • Reasoning model: temperature/top-p fixed; output length is capped with max_completion_tokens rather than max_tokens
  • Text-only (no image input); supports streaming + tools
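
Code written for non-reasoning chat models often sends `temperature`, `top_p`, and `max_tokens`. A minimal sketch of adapting such parameters to the constraints listed above follows. Whether the endpoint rejects or silently ignores the fixed sampling parameters is not stated on this page, so dropping them is a conservative assumption; `adapt_params` is a hypothetical helper.

```python
FIXED_SAMPLING_PARAMS = {"temperature", "top_p"}


def adapt_params(params: dict) -> dict:
    """Return a copy of params suited to a reasoning-model deployment.

    - drops sampling parameters the page lists as fixed
    - renames max_tokens to max_completion_tokens (an existing
      max_completion_tokens value takes precedence)
    """
    out = {k: v for k, v in params.items() if k not in FIXED_SAMPLING_PARAMS}
    if "max_tokens" in out:
        legacy_limit = out.pop("max_tokens")
        out.setdefault("max_completion_tokens", legacy_limit)
    return out


legacy = {
    "model": "azure/gpt-5-mini",
    "temperature": 0.2,
    "top_p": 0.9,
    "max_tokens": 512,
}
print(adapt_params(legacy))
# prints {'model': 'azure/gpt-5-mini', 'max_completion_tokens': 512}
```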

Strengths

  • Cheap, genuine reasoning
  • Large 400K context
  • Low latency

Limitations

  • Text-only — no image input
  • Fixed sampling params (reasoning model)
  • Below full GPT-5 on the hardest tasks

Use Cases

  • High-volume reasoning
  • Agents and planning
  • Classification at scale
  • Cost-sensitive apps

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "azure/gpt-5-mini",
    "messages": [{"role": "user", "content": "Solve this step by step: ..."}]
  }'

Endpoint: POST /v1/chat/completions · Model ID: azure/gpt-5-mini
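
The same request can be sketched in Python with only the standard library, including a small parser for streamed responses. The streaming format is assumed to follow OpenAI-style server-sent events (`data: {...}` lines ending with `data: [DONE]`), which the page implies by calling the endpoint OpenAI-compatible but does not document. The function names are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"


def build_request(api_key: str, prompt: str, stream: bool = False) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "model": "azure/gpt-5-mini",
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def iter_stream_text(lines):
    """Yield text deltas from OpenAI-style server-sent event lines (assumed format)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


def stream_completion(api_key: str, prompt: str):
    """Send a streaming request and yield text pieces as they arrive."""
    request = build_request(api_key, prompt, stream=True)
    with urllib.request.urlopen(request) as response:
        yield from iter_stream_text(response)
```

Calling `"".join(stream_completion("cm_YOUR_KEY", "Solve this step by step: ..."))` would collect the full reply; the parser is kept separate from the network call so it can be exercised against recorded lines.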
