gpt-realtime-2
by OpenAI · Released 2026
OpenAI's newest realtime speech-to-speech model. Same audio rates as gpt-realtime with stronger text reasoning for tool-driven flows.
Powered by OpenAI · Realtime multimodal
| Spec | Value |
|---|---|
| Context Window | 128K |
| Parameters | Not disclosed |
| Max Output | N/A |
| Category | LLM Chat |
Overview
`gpt-realtime-2` is OpenAI's newest realtime speech-to-speech foundation model. It is the same unified audio-in / audio-out product as `gpt-realtime`, with significantly better text reasoning for tool-driven voice agents. It has a 128K text context and is available only to voice agents over WebSocket.
Audio input and output rates match gpt-realtime ($32 and $64 per 1M tokens, respectively); text output is $24 per 1M tokens (vs. $16 on the 1.0). CallMissed bills it against active call minutes, at roughly $0.375/min, so the per-minute cost of a typical 50/50 conversation is identical to gpt-realtime.
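As a rough illustration of the per-minute billing described above, the sketch below estimates the cost of a call from its active duration. The $0.375/min figure comes from this page and is approximate; the function name and the pro-rata (per-second) billing granularity are assumptions, since the page does not say how partial minutes are rounded.

```python
# Approximate per-minute rate quoted on this page (USD per active call minute).
PER_MINUTE_USD = 0.375


def estimate_call_cost_usd(active_seconds: float) -> float:
    """Estimate call cost in USD from active call time.

    Assumption: partial minutes are billed pro rata. The actual rounding
    rule is not stated on this page.
    """
    if active_seconds < 0:
        raise ValueError("active_seconds must be non-negative")
    return (active_seconds / 60.0) * PER_MINUTE_USD


# A 4-minute call at ~$0.375/min.
print(f"${estimate_call_cost_usd(4 * 60):.2f}")  # prints "$1.50"
```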
Use it when you need top-of-the-line realtime quality plus heavy tool/function calling.
Pricing
| Metric | Price |
|---|---|
| Input (per 1M tokens) | ₹400 |
| Output (per 1M tokens) | ₹2400 |
1 credit = ₹1 = $0.01 USD. Prices shown come from the provider; CallMissed passes them through with a ~35% markup.
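To make the credit arithmetic concrete, here is a minimal sketch that converts token usage into credits and USD using the table rates and the 1 credit = ₹1 = $0.01 conversion stated above. The function name and the sample token counts are hypothetical. The sketch does not apply the ~35% markup, because the page does not say whether the listed prices already include it, and it does not settle whether voice calls are charged by tokens or only by active minutes.

```python
# Rates from the pricing table (₹ per 1M tokens) and the stated credit conversion.
INPUT_INR_PER_1M = 400.0
OUTPUT_INR_PER_1M = 2400.0
USD_PER_CREDIT = 0.01  # 1 credit = ₹1 = $0.01


def token_cost_credits(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in credits (1 credit = ₹1) for the given token counts."""
    total_inr = (
        input_tokens * INPUT_INR_PER_1M + output_tokens * OUTPUT_INR_PER_1M
    ) / 1_000_000
    return total_inr


# Hypothetical usage: 50,000 input tokens and 10,000 output tokens.
credits = token_cost_credits(50_000, 10_000)
print(f"{credits:.2f} credits")               # prints "44.00 credits"
print(f"${credits * USD_PER_CREDIT:.2f} USD")  # prints "$0.44 USD"
```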
Key Highlights
- Newest realtime model
- Strong tool calling
- $0.375/min
Technical Details
- Model id: gpt-realtime-2
- Voice-agent WebSocket only
- ~$0.375 per active call minute
Strengths
- Tool-call accuracy
- 128K context
Limitations
- Not available on chat completions
Use Cases
API Example
# Create a voice session with llm_model=gpt-realtime-2 via POST /v1/voice/sessions
Endpoint: WebSocket /v1/voice/sessions · Model ID: gpt-realtime-2
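The session-creation step above might be sketched as follows. Only the path `/v1/voice/sessions`, the `llm_model` field, and the model id come from this page; the base URL, the bearer-token header, and the shape of the JSON body are placeholders assumed for illustration, so check them against the CallMissed API reference. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Placeholder host: the real API base URL is not given on this page.
BASE_URL = "https://api.example.com"


def build_session_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to create a voice session."""
    # `llm_model` and the model id are from this page; sending them as a
    # JSON body is an assumption.
    body = json.dumps({"llm_model": "gpt-realtime-2"}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/voice/sessions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


req = build_session_request("YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` and then opening the WebSocket connection are left out, since the response format and handshake details are not described here.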