Nova Sonic
by Amazon · Released 2025
Amazon Nova Sonic — first-generation speech-to-speech voice model (voice agent only). 11 voices across English, Spanish, French, Italian, and German with low-latency bidirectional streaming.
Powered by Amazon · Realtime speech-to-speech
Context Window
32K
Parameters
Not disclosed
Max Output
N/A
Category
LLM Chat
Overview
Amazon Nova Sonic (version 1) is the first generation of Amazon's native speech-to-speech foundation models: a single model that handles speech understanding, reasoning, and speech generation over a low-latency bidirectional streaming connection. On CallMissed, you select it with `llm_model=nova-sonic` when creating a voice session; like all realtime models, it is voice-agent only and does not appear on `/v1/chat/completions`.
Nova Sonic 1.0 ships 11 voices across five languages — English (US and UK), Spanish, French, Italian, and German — with both feminine- and masculine-sounding options. It supports function calling and adaptive speech response that adjusts delivery based on the prosody of the input speech, plus graceful handling of user interruptions. For most new builds, prefer Nova 2 Sonic, which adds more voices, more languages (including Hindi and Indian English), polyglot code-switching, and lower pricing; Nova Sonic 1.0 remains available for compatibility and for workloads already tuned to its voice set.
Pricing on CallMissed is $4.50 per million input tokens and $17.00 per million output tokens (speech). Use it for English and European-language phone bots, voice assistants, and live conversation where a single unified model is preferable to chaining separate STT, LLM, and TTS providers.
Limitations: voice-pipeline only (no text chat completions endpoint), audio-only modality (Nova 2 Sonic adds text input), and a smaller voice/language set than the newer generation. CallMissed falls back to the standard STT→LLM→TTS pipeline automatically if the speech-to-speech model is unavailable.
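The fallback described above happens automatically on the CallMissed side, so no client code is needed for it. Purely as an illustration of the pattern, here is a minimal sketch of "try speech-to-speech first, then fall back to a chained STT→LLM→TTS pipeline". Every name in it (`ModelUnavailable`, `respond`, and the four callables) is hypothetical and is not part of any CallMissed or Amazon API.

```python
from typing import Callable


class ModelUnavailable(Exception):
    """Hypothetical error signalling that the speech-to-speech model is unreachable."""


def respond(
    audio_in: bytes,
    speech_to_speech: Callable[[bytes], bytes],
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> tuple[bytes, str]:
    """Return (reply_audio, path_used) for one turn of conversation."""
    try:
        # Preferred path: one model handles understanding and speech generation.
        return speech_to_speech(audio_in), "speech-to-speech"
    except ModelUnavailable:
        # Fallback path: transcribe, generate a text reply, then synthesize it.
        transcript = stt(audio_in)
        reply_text = llm(transcript)
        return tts(reply_text), "stt-llm-tts"
```

The second element of the returned tuple records which path served the turn, which can be useful for logging how often the fallback is used.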
Pricing
| Metric | Price |
|---|---|
| Input /1M tokens | ₹450.0000 |
| Output /1M tokens | ₹1700.0000 |
1 credit = ₹1 = $0.01 USD. Prices shown are from the provider; CallMissed passes them through with a markup of about 35%.
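As a rough aid for budgeting, the table prices and the credit conversion above can be combined into a small estimator. This is a sketch that uses the table figures as-is; the function and constant names are made up for illustration, and it makes no assumption about whether the markup is already reflected in those figures.

```python
# Prices from the table above, in credits per 1M tokens.
INPUT_CREDITS_PER_MILLION = 450.0
OUTPUT_CREDITS_PER_MILLION = 1700.0
# From the note above: 1 credit = $0.01 USD, i.e. 100 credits per dollar.
CREDITS_PER_USD = 100.0


def estimate_cost(input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (credits, usd) for the given token counts at the listed prices."""
    credits = (
        input_tokens * INPUT_CREDITS_PER_MILLION
        + output_tokens * OUTPUT_CREDITS_PER_MILLION
    ) / 1_000_000
    return credits, credits / CREDITS_PER_USD


if __name__ == "__main__":
    credits, usd = estimate_cost(input_tokens=200_000, output_tokens=50_000)
    print(f"{credits:.2f} credits (${usd:.2f})")
```

For example, 200,000 input tokens and 50,000 output tokens come to 90 + 85 = 175 credits, or $1.75.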
Key Highlights
- Native speech-to-speech
- 11 voices · 5 languages
- Low latency
Technical Details
- Model id: nova-sonic
- Voice-agent WebSocket only
Strengths
- Native speech-to-speech
- Low latency
Limitations
- Not available on chat completions
- Voice-only surface
- Superseded by Nova 2 Sonic
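Use Cases
- English and European-language phone bots
- Voice assistants
- Live conversation where a single unified model is preferable to chaining separate STT, LLM, and TTS providers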
API Example
# Create a voice session with llm_model=nova-sonic via POST /v1/voice/sessions
Endpoint: WebSocket /v1/voice/sessions · Model ID: nova-sonic
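A minimal sketch of what the session-creation request might look like, using only the Python standard library. Only the path `/v1/voice/sessions` and the value `llm_model=nova-sonic` come from this page; the host, the bearer-token header, and the JSON body shape are assumptions made for illustration, so check the CallMissed API reference for the real request format. The sketch builds the request without sending it.

```python
import json
import urllib.request

# Placeholder host: an assumption, not the real CallMissed base URL.
BASE_URL = "https://api.example.com"


def build_session_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request that selects Nova Sonic."""
    # Only llm_model is documented on this page; a real session may need more fields.
    payload = {"llm_model": "nova-sonic"}
    return urllib.request.Request(
        url=f"{base_url}/v1/voice/sessions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Bearer auth is an assumed scheme, labelled here as hypothetical.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_session_request("YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Once the real host and authentication details are filled in, the request could be sent with `urllib.request.urlopen(req)`; the audio itself then flows over the WebSocket connection named above.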