Text to Speech · english · low-latency · voice-agents

Aura 2 English

by Deepgram · Released 2025

Deepgram Aura 2 — natural, conversational English TTS with 39 voices. Designed for low-latency voice agents and IVR. Streaming MP3 output.

Powered by Deepgram · Proprietary low-latency neural TTS

Context Window: N/A
Parameters: Undisclosed
Max Output: N/A
Category: Text to Speech

Overview

Aura 2 is Deepgram's second-generation TTS model, built specifically for conversational voice applications where latency and naturalness matter equally. It offers 39 distinct English voices spanning a range of genders, ages, and styles — including warm conversational voices (luna, athena, iris), confident professional voices (apollo, atlas, hera), and characterful storytelling voices (orion, hyperion, jupiter).

Deployed via Cloudflare Workers AI, it returns MP3-encoded audio as a streaming HTTP response, so a voice agent can begin playback as soon as the first audio chunk arrives rather than waiting for the whole clip. Compared to Sarvam Bulbul (Indian languages) and ElevenLabs (English with cloning), Aura 2 sits at the production-quality midpoint and has the lowest latency of the three.

At $0.40 per 10K characters, it is roughly 25% cheaper than Bulbul for English-only workloads and significantly cheaper than ElevenLabs.

Pricing

Price per 10K characters: ₹40.00

1 credit = ₹1 = $0.01. Prices shown are the provider's; CallMissed passes them through with a markup of about 35%.
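
The arithmetic behind these figures can be sketched in shell. This is a rough estimate only: it assumes billing is linear in character count at the listed 40 credits per 10K characters, and the page does not say whether the listed price already includes the markup, so the markup is an optional argument that defaults to 0. The function name `estimate_cost` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate the cost of synthesizing N characters.
# Assumptions: linear billing at 40 credits per 10,000 characters and the
# stated conversion 1 credit = $0.01. The second argument is an optional
# markup percentage (default 0); pass 35 to model the case where the
# listed price does not yet include the ~35% markup.
estimate_cost() {
  local chars="$1" markup_pct="${2:-0}"
  awk -v c="$chars" -v m="$markup_pct" 'BEGIN {
    credits = c * 40 / 10000 * (1 + m / 100)
    printf "%.2f credits (~$%.2f USD)\n", credits, credits * 0.01
  }'
}

estimate_cost 2500   # prints: 10.00 credits (~$0.10 USD)
```

Calling `estimate_cost 10000 35` gives the figure for a full 10K-character block with the markup applied on top.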

Key Highlights

  • 39 natural English voices
  • Streaming MP3 output for real-time voice agents
  • Encoding options: mp3, opus, linear16, mulaw, alaw, flac, aac
  • Configurable sample rate and bitrate

Benchmarks

Voices: 39
Latency: ~200 ms
Cost: $0.40 per 10K characters
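
The page does not say how the latency figure was measured. One way to get a comparable number from your own network is to time the first response byte with curl's `--write-out` variable `time_starttransfer`. The sketch below assumes the endpoint and request body from the API example on this page, an API key in a `CM_API_KEY` environment variable, and input text without double quotes or backslashes. The function name `measure_ttfb` is hypothetical, and the result includes your network round trip, so it will not match a server-side figure exactly.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the seconds elapsed until the first response
# byte arrives. The audio itself is discarded.
# Assumes CM_API_KEY is set and the text needs no JSON escaping.
measure_ttfb() {
  local text="${1:-Hello, how can I help you today?}"
  curl --silent --output /dev/null \
    --write-out '%{time_starttransfer}\n' \
    https://api.callmissed.com/v1/audio/speech \
    -H "Authorization: Bearer ${CM_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"aura-2-en\", \"input\": \"${text}\", \"voice\": \"luna\"}"
}

# Usage: measure_ttfb "Short test sentence."
```

Running it several times and comparing the results gives a better picture than a single call, since the first request may include connection setup.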

Technical Details

  • Runs on Cloudflare Workers AI (`@cf/deepgram/aura-2-en`)
  • Returns ReadableStream of MP3 audio
  • Voices: amalthea, andromeda, apollo, arcas, aries, asteria, athena, atlas, aurora, callista, cora, cordelia, delia, draco, electra, harmonia, helena, hera, hermes, hyperion, iris, janus, juno, jupiter, luna (default), mars, minerva, neptune, odysseus, ophelia, orion, orpheus, pandora, phoebe, pluto, saturn, thalia, theia, vesta, zeus
  • Encoding options: mp3 (default), opus, linear16, mulaw, alaw, flac, aac
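
Since voice and encoding names are fixed strings, a client can check them before sending a request. The sketch below copies both lists from the bullets above; the helper names `in_list`, `is_valid_voice`, and `is_valid_encoding` are hypothetical, and treating names as case-sensitive is an assumption.

```shell
#!/usr/bin/env bash
# Voice and encoding names copied from the Technical Details list.
AURA2_VOICES="amalthea andromeda apollo arcas aries asteria athena atlas \
aurora callista cora cordelia delia draco electra harmonia helena hera \
hermes hyperion iris janus juno jupiter luna mars minerva neptune \
odysseus ophelia orion orpheus pandora phoebe pluto saturn thalia theia \
vesta zeus"
AURA2_ENCODINGS="mp3 opus linear16 mulaw alaw flac aac"

# Hypothetical helper: succeed if $1 is a whole word in the list $2.
in_list() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

is_valid_voice()    { in_list "$1" "$AURA2_VOICES"; }
is_valid_encoding() { in_list "$1" "$AURA2_ENCODINGS"; }

is_valid_voice luna && echo "luna: ok"
is_valid_voice Luna || echo "Luna: not in list (matching is case-sensitive here)"
is_valid_encoding wav || echo "wav: not a listed encoding"
```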

Strengths

  • 39 voices — widest English selection on the platform
  • Low first-audio latency for real-time voice agents
  • Streaming output, MP3 by default
  • 25% cheaper than Bulbul for English-only

Limitations

  • English only (use aura-2-es for Spanish)
  • No voice cloning or custom voice training
  • No SSML — limited prosody control vs Bulbul

Use Cases

  • English voice agents
  • IVR systems
  • Audiobook generation
  • Accessibility readers

API Example

curl https://api.callmissed.com/v1/audio/speech \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "aura-2-en", "input": "Hello, how can I help you today?", "voice": "luna"}' \
  --output speech.mp3

Endpoint: POST /v1/audio/speech · Model ID: aura-2-en
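
For scripting, the same request can be wrapped in a small function. This is a minimal sketch, not an official client: the function name `speak` and the `CM_API_KEY` environment variable are hypothetical, the request fields are the three shown in the example above, and only backslashes and double quotes in the text are escaped, so single-line input is assumed.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the request shown in the API example.
# Arguments: text, voice (default luna), output file (default speech.mp3).
# Assumes CM_API_KEY holds the API key and the text is a single line.
speak() {
  local text="$1" voice="${2:-luna}" out="${3:-speech.mp3}"
  local escaped
  # Escape backslashes first, then double quotes, for embedding in JSON.
  escaped=$(printf '%s' "$text" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  # --fail makes curl exit non-zero on an HTTP error instead of saving
  # the error body into the audio file.
  curl --fail --silent --show-error \
    https://api.callmissed.com/v1/audio/speech \
    -H "Authorization: Bearer ${CM_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"aura-2-en\", \"input\": \"${escaped}\", \"voice\": \"${voice}\"}" \
    --output "$out" || { echo "speech request failed" >&2; return 1; }
}

# Usage: speak "Hello, how can I help you today?" luna hello.mp3
```

Checking curl's exit status matters here because without `--fail` an error response would be written to the output file and only surface later as an unplayable clip.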

Try Aura 2 English now

Get 1000 free API credits on signup. No credit card required.