LLM Chat · realtime · voice · multilingual

Nova 2 Sonic

by Amazon · Released 2026

Amazon Nova 2 Sonic — flagship speech-to-speech voice model (voice agent only). 16 expressive voices across 8 languages including Hindi and Indian English, with natural turn-taking and function calling. The default voice-agent model.

Powered by Amazon · Realtime speech-to-speech

Context Window: 32K
Parameters: Not disclosed
Max Output: N/A
Category: LLM Chat

Overview

Amazon Nova 2 Sonic is a native speech-to-speech foundation model: a single model that understands speech and generates speech directly, rather than bolting text-to-speech onto a separate language model. One connection listens, reasons, and speaks with low enough latency for natural live conversation, including human-like turn-taking (the model detects when the caller has finished a thought) and graceful handling of interruptions without dropping context. On CallMissed it is the default voice model — create a session via `/v1/voice/sessions` with `llm_model` set to `nova-sonic-2`, or leave it unset since it is the default. It is not available on `/v1/chat/completions`; it is voice-agent only over WebSocket.

Nova 2 Sonic ships 16 expressive voices across eight languages: English (US, UK, India, and Australia), Hindi, Spanish, French, Italian, German, and Portuguese. Two of the voices (Tiffany and Matthew) are polyglot — a single voice persona that can switch languages mid-conversation without sounding like a different speaker, which is ideal for multilingual support lines where a caller code-switches between, say, Hindi and English. The model is robust to background noise and to a range of accents, and supports asynchronous function calling so tools can run while the assistant keeps talking.
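
The asynchronous function calling described above can be pictured with a small, self-contained sketch. It is not the Nova 2 Sonic or CallMissed API: `lookup_slot` and `handle_turn` are hypothetical names, and the sleep merely stands in for a slow backend. The point is only the shape of the interaction: the tool starts in the background, the assistant keeps speaking, and the result is voiced once it arrives.

```python
import asyncio


async def lookup_slot(day: str) -> str:
    """Hypothetical tool: pretend to query a booking backend."""
    await asyncio.sleep(0.05)  # stands in for network latency
    return f"10:30 on {day}"


async def handle_turn() -> list[str]:
    """Return the utterances spoken during one turn, in order."""
    spoken: list[str] = []
    # Start the tool without waiting for it to finish.
    pending = asyncio.create_task(lookup_slot("Friday"))
    # The assistant keeps talking while the tool runs.
    spoken.append("Let me check the calendar for you.")
    # Only now wait for the tool result and voice it.
    slot = await pending
    spoken.append(f"I have {slot} available.")
    return spoken


if __name__ == "__main__":
    for line in asyncio.run(handle_turn()):
        print(line)
```

In a real voice agent the filler speech and the tool call would both be driven by the model; the sketch only illustrates why a non-blocking tool call avoids dead air on the line.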

Pricing on CallMissed is $4.00 per million input tokens and $15.00 per million output tokens (speech). That is dramatically cheaper than the older gpt-realtime class while delivering native speech-to-speech quality, which is why Nova 2 Sonic is the platform default for voice agents. Budget for continuous audio: minutes of conversation accumulate tokens faster than text-only chat, so pilot with recorded calls to estimate monthly spend before enabling toll-free numbers.
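
To turn a pilot into a budget, the per-million-token rates above can be applied to token counts measured on recorded calls. The sketch below assumes you already have average input and output token counts per call from such a pilot; the function names are illustrative, and any traffic figures you pass in are your own estimates, not platform numbers.

```python
INPUT_USD_PER_MILLION = 4.00    # input rate quoted above
OUTPUT_USD_PER_MILLION = 15.00  # output (speech) rate quoted above


def call_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one call in USD at the listed per-million-token rates."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


def monthly_spend_usd(calls_per_month: int,
                      avg_input_tokens: int,
                      avg_output_tokens: int) -> float:
    """Scale the average per-call cost to a month of traffic."""
    per_call = call_cost_usd(avg_input_tokens, avg_output_tokens)
    return calls_per_month * per_call
```

As a sanity check, one million input tokens plus one million output tokens gives `call_cost_usd(1_000_000, 1_000_000) == 19.0`, which is simply $4.00 + $15.00.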

Use Nova 2 Sonic for phone bots, voice assistants, appointment booking, customer support automation, and any hands-free workflow where a single unified model is simpler than chaining separate STT, LLM, and TTS providers. You trade some flexibility (mixing your favorite STT + text LLM + TTS) for operational simplicity and lower latency. For Indian-language telephony specifically, the en-IN and Hindi voices plus polyglot code-switching make it a strong default.

Limitations: voice-pipeline only (no text chat completions endpoint), and like all realtime models it depends on client-side audio capture quality. For batch transcription after the fact, use a dedicated STT model instead. CallMissed runs Nova 2 Sonic through AWS Bedrock; if AWS credentials are not configured in a region, the platform automatically falls back to the standard STT→LLM→TTS pipeline so calls still connect.

Pricing

Metric               Price
Input /1M tokens     ₹400.0000
Output /1M tokens    ₹1500.0000

1 credit = ₹1 = $0.01 USD. Prices shown from provider; CallMissed passes through with ~35% markup.
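
The rupee figures in the table and the dollar figures in the overview line up under the page's own credit conversion (1 credit = ₹1 = $0.01). A minimal sketch of that arithmetic follows; the function name is illustrative, and the conversion is the platform's stated credit rule, not a market exchange rate.

```python
CREDITS_PER_USD = 100  # 1 credit = $0.01, as stated above
RUPEES_PER_CREDIT = 1  # 1 credit = ₹1, as stated above


def rupee_price_to_usd(rupees: float) -> float:
    """Convert a listed ₹ price to USD via the platform's credit unit."""
    credit_count = rupees / RUPEES_PER_CREDIT
    return credit_count / CREDITS_PER_USD


if __name__ == "__main__":
    print(rupee_price_to_usd(400.0))   # input price per 1M tokens
    print(rupee_price_to_usd(1500.0))  # output price per 1M tokens
```

With the table values, ₹400 maps to $4.00 and ₹1500 maps to $15.00, matching the rates quoted in the overview.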

Key Highlights

  • Native speech-to-speech
  • 16 voices · 8 languages
  • Hindi + Indian English
  • Polyglot voices
  • Default voice model

Technical Details

  • Model id: nova-sonic-2
  • Voice-agent WebSocket only
  • Natural turn-taking + barge-in

Strengths

  • Native speech-to-speech
  • Multilingual incl. Hindi
  • Low latency
  • Cost-efficient

Limitations

  • Not available on chat completions
  • Voice-only surface

Use Cases

  • Voice agents
  • Phone bots
  • Multilingual support lines
  • Appointment booking

API Example

# Create a voice session with llm_model=nova-sonic-2 via POST /v1/voice/sessions

Endpoint: WebSocket /v1/voice/sessions · Model ID: nova-sonic-2
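
A hedged sketch of the session-creation step, using only the Python standard library. The path `/v1/voice/sessions` and the `llm_model` field come from this page; everything else is an assumption made for illustration: the base URL is a placeholder, the bearer-token header is a guess at the auth scheme, and the request body may need further fields. The page mentions both a POST and a WebSocket for this endpoint, and only the POST step is sketched here; how the audio WebSocket is opened afterwards is not shown.

```python
import json
import urllib.request

# Placeholder, not a documented value: substitute the real API host.
BASE_URL = "https://api.example.com"


def build_session_request(api_key: str,
                          llm_model: str = "nova-sonic-2") -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a voice session."""
    payload = {"llm_model": llm_model}  # field name taken from this page
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/voice/sessions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumed auth scheme; check the platform docs before use.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Because `nova-sonic-2` is described as the default, omitting `llm_model` from the payload should select the same model.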
