gpt-5-mini
By OpenAI · Released 2025
OpenAI GPT-5 mini: a fast, affordable reasoning model. 400K context, text-only.
gpt-5-mini
Powered by OpenAI · Reasoning transformer
Context window
400K
Parameters
Not disclosed
Max output
128K
Category
LLM chat
Overview
GPT-5 mini is OpenAI's cost-efficient reasoning model. It is explicitly documented as part of the "reasoning" tier, with high internal deliberation, a 400,000-token context window, and up to 128,000 output tokens (platform.openai.com/docs/models/gpt-5-mini). The API model id is `gpt-5-mini`. On CallMissed, `/v1/chat/completions` uses the same JSON schema, but read the OpenAI reasoning guide first: temperature and other sampling parameters are constrained, and reasoning tokens count toward output billing.
GPT-5 mini suits well-defined, high-volume tasks that need chain-of-thought quality without full GPT-5 flagship prices: nuanced classification, multi-step planning, structured extraction, moderation with explanation, and agent routing. It accepts text and image input and produces text-only output. According to the model card, fine-tuning is unsupported, and the Responses API image generation tools are unavailable for this family.
CallMissed pricing is $0.25/M input tokens and $2.00/M output tokens, with cached input at $0.025/M, which makes it one of the cheapest ways to get genuine reasoning on the platform. It is attractive for batch evaluation pipelines, guardrail models, background planners, and internal copilots where latency of hundreds of milliseconds is acceptable. Be mindful of `max_tokens`/`max_completion_tokens`: reasoning models spend part of the output budget on internal thinking, so a low limit can produce incomplete responses.
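To make the pricing concrete, here is a minimal cost-estimate sketch using the per-million rates quoted above. The function name is hypothetical, and it assumes that cached tokens are reported as a subset of the input tokens and that the output count already includes reasoning tokens.

```python
# USD per 1M tokens, copied from the pricing paragraph above.
PRICE_INPUT = 0.25
PRICE_CACHED_INPUT = 0.025
PRICE_OUTPUT = 2.00


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumptions (not confirmed by the source): cached_input_tokens is the
    part of input_tokens served from cache, and output_tokens already
    includes any internal reasoning tokens.
    """
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    fresh_input = input_tokens - cached_input_tokens
    total = (fresh_input * PRICE_INPUT
             + cached_input_tokens * PRICE_CACHED_INPUT
             + output_tokens * PRICE_OUTPUT)
    return total / 1_000_000
```

Running this over a sample of real `usage` records before a batch job gives a rough spend forecast; treat the result as an estimate, not an invoice.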
Versus GPT-4.1: you trade explicit sampling control and the ultra-long 1M context for stronger deliberation on tricky instructions. Versus GPT-4o: GPT-5 mini is text-first and reasoning-native; choose GPT-4o for vision-heavy interactive chat at moderate cost, and GPT-5 mini where step-by-step logic dominates. In production systems, GPT-5 mini often serves as a "second opinion" model or as the brain of a tool loop, with a smaller model handling formatting.
Integration: give explanation tasks a generous output cap and log finish reasons. Constrain the format (JSON, bullets) in the system prompt; reasoning models follow instructions well but may over-explain when not guided. Streaming is supported, so wire it into the UX even though total latency exceeds that of non-reasoning models. On CallMissed the model is hosted on Azure OpenAI; use the clean id `gpt-5-mini`, with no `azure/` prefix.
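The advice to log finish reasons can be sketched as a small response check. This assumes the response follows the OpenAI-style chat completions shape (`choices[0].finish_reason`, `choices[0].message.content`); the helper name is hypothetical.

```python
import logging

logger = logging.getLogger("gpt5mini")


def check_completion(response: dict) -> str:
    """Return the message text, or raise if the output was cut off.

    Assumes an OpenAI-style chat completions response body.
    """
    choice = response["choices"][0]
    reason = choice.get("finish_reason")
    content = choice["message"].get("content") or ""
    logger.info("finish_reason=%s content_chars=%d", reason, len(content))
    if reason == "length":
        # The output cap was hit, possibly while the model was still thinking.
        raise RuntimeError("response truncated: raise the output token cap")
    return content
```

Raising on `length` rather than returning partial text keeps truncated answers from silently flowing into downstream steps.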
Limitations: reserve the hardest, absolute-frontier research tasks for larger GPT-5 variants; chat completions output is text-only; and sampling behavior differs from classic GPT-4, so retest temperature-sensitive prompts when migrating. For audio or realtime speech, use `gpt-realtime` or the STT/TTS models.
Reasoning token economics: internal chain-of-thought is billed as output usage. Even when a user requests "500 completion tokens", usage breakdowns can include additional reasoning tokens. Size agent budgets accordingly: cap per-step spend in the pipeline and monitor `usage`.
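A per-step budget of this kind might look like the sketch below. It assumes the `usage` object follows the OpenAI layout, where `completion_tokens` includes reasoning tokens and `completion_tokens_details.reasoning_tokens` breaks them out; `StepBudget` and `reasoning_share` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class StepBudget:
    """Tracks output-token spend for one pipeline step (illustrative only)."""
    max_output_tokens: int
    spent: int = 0

    def record(self, usage: dict) -> int:
        # completion_tokens is assumed to already include reasoning tokens.
        self.spent += usage.get("completion_tokens", 0)
        return self.remaining

    @property
    def remaining(self) -> int:
        return max(self.max_output_tokens - self.spent, 0)

    @property
    def exceeded(self) -> bool:
        return self.spent > self.max_output_tokens


def reasoning_share(usage: dict) -> float:
    """Fraction of billed output tokens that were internal reasoning."""
    completion = usage.get("completion_tokens", 0)
    details = usage.get("completion_tokens_details") or {}
    reasoning = details.get("reasoning_tokens", 0)
    return reasoning / completion if completion else 0.0
```

Checking `exceeded` after each call lets an agent loop stop or downgrade before a runaway step consumes the whole budget.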
Evaluation playbook: before promoting to production, compare against GPT-4.1 on a held-out set using the exact same prompts. Reasoning models often win on ambiguous policy interpretation, multi-constraint scheduling, and fraud review; on simple templated tasks they may tie or lose, because mini-tier reasoning is overkill there.
Structured output: native JSON mode is not available on every snapshot, but a system instruction such as "Return only valid JSON matching schema" works well. Validate on the server with pydantic or zod; never trust raw JSON enough to execute it as SQL without parameterization.
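The validate-then-parameterize flow can be sketched with the standard library alone. The paragraph recommends pydantic or zod; plain `json` plus a dataclass is swapped in here only to keep the example dependency-free, and the `Ticket` schema and `tickets` table are invented for illustration.

```python
import json
import sqlite3
from dataclasses import dataclass

ALLOWED_PRIORITIES = {"low", "medium", "high"}


@dataclass
class Ticket:
    """Hypothetical schema the model is asked to fill in."""
    customer_id: int
    priority: str


def parse_ticket(raw: str) -> Ticket:
    """Parse and validate model output; raises ValueError on any mismatch."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    customer_id = data.get("customer_id")
    priority = data.get("priority")
    if not isinstance(customer_id, int) or isinstance(customer_id, bool):
        raise ValueError("customer_id must be an integer")
    if not isinstance(priority, str) or priority not in ALLOWED_PRIORITIES:
        raise ValueError("priority must be low, medium, or high")
    return Ticket(customer_id=customer_id, priority=priority)


def save_ticket(conn: sqlite3.Connection, ticket: Ticket) -> None:
    # Placeholders keep model-produced values out of the SQL text itself.
    conn.execute(
        "INSERT INTO tickets (customer_id, priority) VALUES (?, ?)",
        (ticket.customer_id, ticket.priority),
    )
```

The same shape applies with pydantic: validate into a typed object first, then pass only its fields as bound parameters.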
Azure hosting: GPT-5 mini on Foundry tracks OpenAI snapshot naming (`gpt-5-mini-2025-08-07`). The unversioned CallMissed id maps to the current production deployment. Snapshot bumps are rarely breaking; pin integration tests to behavioral assertions, not exact wording.
Multi-model orchestration: a common pattern pairs GPT-5 mini as the planner (choosing the tool sequence) with a GPT-4o mini-class model or deterministic code as the formatter. Another pattern escalates to GPT-5 mini only after a cheaper model fails a confidence check.
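The escalation pattern can be sketched as follows. The two model callables and the confidence score are placeholders: how a cheaper model produces a confidence value is left open by the source, so the `(answer, confidence)` return shape is an assumption.

```python
from typing import Callable, Tuple

# Hypothetical interface: a model call returns (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def answer_with_escalation(prompt: str, cheap: Model, strong: Model,
                           min_confidence: float = 0.8) -> Tuple[str, str]:
    """Try the cheap model first; escalate only when confidence is too low.

    Returns (answer, tier) so callers can log how often escalation happens.
    """
    answer, confidence = cheap(prompt)
    if confidence >= min_confidence:
        return answer, "cheap"
    answer, _ = strong(prompt)
    return answer, "strong"
```

Logging the returned tier makes it easy to see what fraction of traffic actually reaches GPT-5 mini and to tune the threshold against cost.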
Latency: time to first token exceeds that of non-reasoning models; use streaming and communicate "thinking" states in the UI. Batch concurrency limits apply, so stagger large nightly runs.
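One way to stagger a large run is a semaphore-bounded batch, sketched below. `call_model` stands in for whatever async client function you use, and the concurrency and stagger values are arbitrary; the source does not state the actual limits.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_batch(prompts: List[str],
                    call_model: Callable[[str], Awaitable[str]],
                    max_concurrency: int = 4,
                    stagger_seconds: float = 0.0) -> List[str]:
    """Run prompts with bounded concurrency and an optional start delay.

    Results come back in the same order as the prompts.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(index: int, prompt: str) -> str:
        # Spread request start times so the run does not begin in one burst.
        await asyncio.sleep(index * stagger_seconds)
        async with semaphore:
            return await call_model(prompt)

    tasks = (worker(i, p) for i, p in enumerate(prompts))
    return await asyncio.gather(*tasks)
```

From synchronous code this would be driven with `asyncio.run(run_batch(prompts, call_model))`.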
Safety: reasoning models can over-analyze jailbreak attempts, so input filtering and output moderation still apply to user-generated content.
Snapshot pinning: in internal runbooks, document the CallMissed deployment snapshot you validated against (`gpt-5-mini` tracks the unversioned production alias). When the catalog changelog announces a snapshot bump, re-run eval suites. In support tickets, include the request id, model id, and approximate prompt token count; output caps that truncate mid-thought cause subtle failures.
Pricing
| Metric | Price |
|---|---|
| Input / 1M tokens | ₹25.0000 |
| Output / 1M tokens | ₹200.0000 |
1 credit = ₹1 = $0.01 USD. Prices are shown as quoted by the provider; CallMissed passes them through with a ~35% markup.
Highlights
- Affordable reasoning
- 400K context
- Streaming + tools
Benchmarks
| Benchmark | Score |
|---|---|
| GPQA | 0.71 |
Technical details
- Model id: gpt-5-mini
- Reasoning model — fixed temperature/top_p
Strengths
- Low cost
- Large context
Limitations
- Text-only
- Fixed sampling params
Use cases
API example
curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-5-mini", "messages": [{"role": "user", "content": "Plan this task"}]}'
Endpoint: POST /v1/chat/completions · Model ID: gpt-5-mini
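The same call can be issued from Python using only the standard library. This is a minimal sketch mirroring the curl example: the URL, key format, and payload come from it, while the function names and the timeout are assumptions.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"


def build_request(api_key: str, prompt: str,
                  model: str = "gpt-5-mini") -> urllib.request.Request:
    """Build the POST request without sending it."""
    body = {"model": model,
            "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `send(build_request("cm_YOUR_KEY", "Plan this task"))`; keeping construction separate from sending makes the payload easy to inspect in tests.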
Try gpt-5-mini now
Get 1000 free API credits on signup. No credit card required.