LLM Chat · realtime · voice · flagship

gpt-realtime-2

by OpenAI · Released 2026

OpenAI's newest realtime speech-to-speech model. Same audio rates as gpt-realtime with stronger text reasoning for tool-driven flows.

Powered by OpenAI · Realtime multimodal

Context Window: 128K
Parameters: Not disclosed
Max Output: N/A
Category: LLM Chat

Overview

`gpt-realtime-2` is OpenAI's newest realtime speech-to-speech foundation model. It is the same unified audio-in/audio-out product as gpt-realtime, with significantly better text reasoning for tool-driven voice agents. It has a 128K text context and is available only through the voice-agent WebSocket interface.

Audio input and output rates match gpt-realtime ($32 and $64 per 1M tokens, respectively); text output is $24 per 1M tokens (vs. $16 on gpt-realtime). On CallMissed it is billed against active call minutes, at roughly $0.375/min, so the per-minute cost of a typical 50/50 conversation is identical to gpt-realtime.
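As a rough illustration of per-minute billing, the sketch below estimates what a call might cost. It takes the approximate $0.375 per active minute quoted above and the credit conversion from the Pricing section ($0.01 per credit). The function name `estimate_call_cost` is hypothetical, and whether the per-minute figure already includes any markup is not stated on this page, so treat the result as an estimate only.

```python
from decimal import Decimal

# Approximate rate quoted on this page; the actual billed rate may differ.
USD_PER_ACTIVE_MINUTE = Decimal("0.375")
# Conversion stated in the Pricing section: 1 credit = $0.01 USD.
USD_PER_CREDIT = Decimal("0.01")


def estimate_call_cost(active_minutes):
    """Return (usd, credits) for a call of the given active duration.

    Hypothetical helper: assumes billing is linear in active minutes.
    """
    minutes = Decimal(str(active_minutes))
    usd = minutes * USD_PER_ACTIVE_MINUTE
    credits = usd / USD_PER_CREDIT
    return usd, credits


usd, credits = estimate_call_cost(10)
print(f"10 active minutes: about ${usd} ({credits} credits)")
```

Using `Decimal` rather than floats keeps the currency arithmetic exact, which matters when per-minute amounts are summed over many calls.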

Use it when you need top-of-the-line realtime quality plus heavy tool/function calling.

Pricing

Metric | Price
Input /1M tokens | ₹400.0000
Output /1M tokens | ₹2400.0000

1 credit = ₹1 = $0.01 USD. Prices shown are the provider's; CallMissed passes them through with a markup of about 35%.
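To make the table concrete, here is a minimal sketch that converts token counts into credits at the listed rates. It assumes the rates apply to text tokens and that 1 credit = ₹1 as stated above. The function name `text_token_credits` is hypothetical, and the calculation deliberately leaves out the roughly 35% markup, since this page does not say exactly how it is applied.

```python
from decimal import Decimal

# Rates from the pricing table, in credits (= rupees) per 1M tokens.
INPUT_CREDITS_PER_1M = Decimal("400")
OUTPUT_CREDITS_PER_1M = Decimal("2400")
TOKENS_PER_UNIT = Decimal("1000000")


def text_token_credits(input_tokens, output_tokens):
    """Estimate credits for a token count at the listed rates.

    Hypothetical helper: excludes any markup and any audio charges.
    """
    total = (
        Decimal(input_tokens) * INPUT_CREDITS_PER_1M
        + Decimal(output_tokens) * OUTPUT_CREDITS_PER_1M
    )
    return total / TOKENS_PER_UNIT


# Example: 50,000 input tokens and 10,000 output tokens.
print(text_token_credits(50_000, 10_000))
```

Output tokens cost six times as much as input tokens at these rates, so response length is likely to dominate the text portion of the bill.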

Key Highlights

  • Newest realtime
  • Strong tool calling
  • $0.375/min

Technical Details

  • Model id: gpt-realtime-2
  • Voice-agent WebSocket only
  • ~$0.375 per active call minute

Strengths

  • Tool-call accuracy
  • 128K context

Limitations

  • Not available on chat completions

Use Cases

  • Voice agents with heavy tool calls
  • Premium phone bots

API Example

# Create a voice session with llm_model=gpt-realtime-2 via POST /v1/voice/sessions

Endpoint: WebSocket /v1/voice/sessions · Model ID: gpt-realtime-2
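The snippet below sketches how the session-creation request described above might be assembled in Python. Only the `/v1/voice/sessions` path, the `llm_model` parameter, and the model ID come from this page. The base URL, the bearer-token header, and the JSON body shape are assumptions for illustration, so check the provider's API reference before relying on them. The request is built but not sent, and the WebSocket connection mentioned above is not covered.

```python
import json
import urllib.request

# Hypothetical placeholder host; substitute the real API base URL.
BASE_URL = "https://api.example.com"
MODEL_ID = "gpt-realtime-2"


def build_session_request(api_key, base_url=BASE_URL):
    """Build (but do not send) a POST request for a voice session.

    Assumed, not confirmed by this page: bearer-token auth and a JSON
    body carrying the llm_model field.
    """
    body = json.dumps({"llm_model": MODEL_ID}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/voice/sessions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


req = build_session_request("YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect and test before any credits are spent.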

Get 1000 free API credits on signup. No credit card required.