How much does Bulbul v3 cost?

Bulbul v3 costs $0.53 per 10K characters (53 credits) on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use Bulbul v3 via API?

Send a POST request to /v1/audio/speech with the model set to "bulbul:v3" and your API key in the Authorization header. CallMissed uses the OpenAI-compatible format, so you only need to change the base URL and the model field.

What is the context window of Bulbul v3?

Not applicable. Bulbul v3 is a text-to-speech model, so both the context window and the maximum output tokens are listed as N/A.

Text to Speech · indian-languages

Bulbul v3

by Sarvam AI · Released February 5, 2026

Sarvam AI's natural text-to-speech model. 37 voices across 11 Indian languages with production-ready quality. Supports SSML for fine-grained control over speed, pitch, pauses, and emphasis. Handles code-mixed text and number normalization out of the box.

Context Window: N/A
Parameters: Undisclosed
Max Output: N/A

Overview

Bulbul v3, released February 5, 2026, is Sarvam AI's production-ready text-to-speech model offering 37 natural-sounding voices across 11 Indian languages. The voices are designed to sound natural and conversational rather than robotic, making them suitable for customer-facing applications like IVR systems, voice agents, and telephony platforms.

The model supports SSML (Speech Synthesis Markup Language) for fine-grained control over prosody — developers can adjust speed, pitch, volume, add pauses, and emphasize specific words. It handles code-mixed text natively, correctly pronouncing Hindi-English mixed sentences without requiring language tags. Number normalization, date formatting, and currency reading are handled automatically.
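As a rough illustration of the prosody controls described above, the sketch below assembles a small SSML document with Python's standard library. The element names (`speak`, `prosody`, `break`, `emphasis`) come from the W3C SSML specification; which elements and attribute values Bulbul v3 actually accepts, and whether SSML is passed in the same `input` field as plain text, is not stated on this page and is assumed here. The helper name `build_ssml` is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_ssml(greeting: str, key_phrase: str, closing: str) -> str:
    """Build a small SSML document from standard W3C SSML elements.

    Assumption: the tags and attribute values below are taken from the
    W3C SSML spec, not from Bulbul v3 documentation.
    """
    speak = ET.Element("speak")

    # Speak the greeting slowly and at a slightly higher pitch.
    prosody = ET.SubElement(speak, "prosody", {"rate": "slow", "pitch": "+2st"})
    prosody.text = greeting

    # Insert a half-second pause before the key phrase.
    ET.SubElement(speak, "break", {"time": "500ms"})

    # Stress the key phrase, then continue with the rest of the sentence.
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = key_phrase
    emphasis.tail = " " + closing

    # ElementTree escapes characters such as '&' and '<' in the text.
    return ET.tostring(speak, encoding="unicode")


print(build_ssml("Namaste.", "Aapka order", "confirm ho gaya hai."))
```

Building the markup through an XML library rather than string concatenation keeps the document well-formed even when the spoken text contains characters like `&` or `<`.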

Bulbul v3 is production-ready for telephony and call center deployments, with consistent quality across all 37 voices and 11 languages. The voices cover a range of genders, ages, and regional accents, allowing applications to match the voice to their target audience. At $0.53 per 10K characters, it is competitively priced for high-volume TTS workloads.

Pricing

Metric	Price
Price per 10K chars	₹53.00

1 credit = ₹1 = $0.01 USD. Prices are based on the provider's rates; CallMissed passes them through with a markup of roughly 35%.
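To make the arithmetic concrete, here is a minimal cost estimator using the listed rate of ₹53 (53 credits) per 10K characters and the stated conversion of 1 credit = $0.01. It assumes billing is linear in character count, with no minimum charge and no rounding of partial blocks; the page does not say how partial blocks are billed. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Rate from the pricing table: 53 credits (₹53) per 10,000 characters.
CREDITS_PER_10K_CHARS = Decimal("53")
# Conversion stated on the page: 1 credit = ₹1 = $0.01 USD.
USD_PER_CREDIT = Decimal("0.01")


def estimate_cost(num_chars: int) -> tuple[Decimal, Decimal]:
    """Return (credits, usd) for synthesizing num_chars characters.

    Assumption: cost scales linearly with character count, with no
    minimum charge or rounding.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    credits = CREDITS_PER_10K_CHARS * num_chars / Decimal(10_000)
    return credits, credits * USD_PER_CREDIT


credits, usd = estimate_cost(250_000)
print(f"{credits} credits, about ${usd:.2f}")
```

Under these assumptions, 250,000 characters works out to 25 blocks of 10K characters at 53 credits each.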

Key Highlights

37 natural voices across 11 Indian languages
SSML support for prosody, breaks, emphasis
Code-mixed text handling (Hinglish, etc.)
Production-ready for call centers and telephony

Benchmarks

Benchmark	Score	Notes
MOS Score	4.2/5	Mean Opinion Score for naturalness
Voices	37	Across 11 Indian languages
Languages	11	Major Indian languages
SSML Support	Full	Prosody, breaks, emphasis, phonemes

Technical Details

37 natural-sounding voices across 11 Indian languages
SSML support: speed, pitch, volume, pauses, emphasis, phonemes
Native code-mixed text handling (Hinglish, Tanglish, etc.)
Automatic number normalization, date formatting, currency reading
Production-ready for telephony and call center deployments
Consistent quality across all voices and languages

Strengths

37 natural voices — widest selection for Indian languages
Full SSML support for fine-grained prosody control
Native code-mixed text handling without language tags
Production-ready quality for telephony and call centers

Limitations

Limited to 11 Indian languages — no global language coverage
Voice cloning and custom voice creation not yet supported
Audio output quality may vary with very long text inputs
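One common way to work around quality drift on very long inputs is to split the text at sentence boundaries and synthesize each chunk separately. The sketch below shows that approach; the 500-character limit is a hypothetical value chosen for illustration, since this page does not state a maximum input length, and the helper name `chunk_text` is also an assumption.

```python
import re

# Hypothetical per-request size; the page does not state an actual limit.
MAX_CHARS = 500


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends.

    Sentence ends are '.', '!', '?' or the Devanagari danda followed by
    whitespace. A single sentence longer than max_chars is hard-split.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    sentences = [s for s in re.split(r"(?<=[.!?।])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any sentence that cannot fit in one chunk on its own.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


for part in chunk_text("Namaste. Aapka order confirm ho gaya hai. Dhanyavaad!", 30):
    print(part)
```

Each chunk can then be sent as its own request and the resulting audio files concatenated in order.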

Use Cases

Voice agents
IVR systems
Audiobook generation
Accessibility applications

API Example

curl https://api.callmissed.com/v1/audio/speech \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "bulbul:v3", "input": "Namaste, aapka order confirm ho gaya hai.", "voice": "meera"}' \
  --output speech.mp3

Endpoint: POST /v1/audio/speech · Model ID: bulbul:v3
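For callers who prefer Python to curl, the same request can be sketched with the standard library alone. The URL, model ID, voice name, and payload fields are taken from the curl example; the `Content-Type: application/json` header and the assumption that the response body is raw audio bytes (inferred from `--output speech.mp3`) are not confirmed by this page. The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com"  # from the curl example


def build_speech_request(api_key: str, text: str, voice: str = "meera") -> urllib.request.Request:
    """Build (but do not send) a POST request to /v1/audio/speech."""
    payload = {"model": "bulbul:v3", "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/audio/speech",
        # Keep non-ASCII text (e.g. Devanagari) as UTF-8 rather than escapes.
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed to be required
        },
        method="POST",
    )


def synthesize(api_key: str, text: str, out_path: str = "speech.mp3") -> None:
    """Send the request and write the response body to out_path.

    Assumption: the endpoint returns raw audio bytes on success.
    """
    request = build_speech_request(api_key, text)
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

Calling `synthesize("cm_YOUR_KEY", "Namaste, aapka order confirm ho gaya hai.")` would perform the network call; `build_speech_request` is split out so the request can be inspected without sending it.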

Try Bulbul v3 now

Get 1000 free API credits on signup. No credit card required.
