How much does Saaras v3 cost?

Saaras v3 costs ₹53 per hour on CallMissed, which is 53 credits or $0.53 (1 credit = ₹1 = $0.01).

How do I use Saaras v3 via API?

Send a POST request to /v1/audio/transcriptions with the model field set to "saaras:v3" and your API key in the Authorization header. CallMissed uses the OpenAI-compatible format, so you only need to change the base URL and the model field.

What is the context window of Saaras v3?

Not applicable. Saaras v3 is a speech-to-text model, so its context window and maximum output tokens are both listed as N/A.

Speech to Text · indian-languages

Saaras v3

by Sarvam AI · Released 2025

Sarvam AI's flagship speech-to-text model. Industry-leading accuracy for 22 Indian languages plus English. Handles code-mixed speech (e.g. switching between Hindi and English mid-sentence) natively. Supports real-time streaming via WebSocket and batch transcription via REST.

Specification	Value
Context Window	N/A
Parameters	Undisclosed
Max Output	N/A

Overview

Saaras v3 is Sarvam AI's flagship speech-to-text model, delivering industry-leading accuracy for 22 Indian languages plus English. It is specifically designed to handle the linguistic complexity of India — where speakers routinely switch between languages mid-sentence (code-mixing), use regional accents, and speak in noisy environments like call centers and public spaces.

The model supports two deployment modes: real-time streaming via WebSocket for live transcription (voice agents, live captioning, meeting transcription) and batch transcription via REST API for processing recorded audio files. Both modes deliver high accuracy across all 22 supported Indian languages, with particularly strong performance on code-mixed speech like Hinglish (Hindi-English) and Tanglish (Tamil-English).
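For the streaming mode, a client usually feeds audio in small, fixed-duration chunks rather than as one file. The sketch below only shows that chunking step, using Python's standard wave module. This page does not document the WebSocket URL or message format, so the transport is deliberately left out; the 100 ms chunk size, the 16 kHz mono 16-bit test audio, and the helper names are all assumptions made for illustration.

```python
import io
import wave
from typing import Iterator


def make_silent_wav(seconds: float, rate: int = 16000) -> bytes:
    """Build a mono 16-bit silent WAV in memory (demo input only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


def iter_pcm_chunks(wav_bytes: bytes, chunk_ms: int = 100) -> Iterator[bytes]:
    """Yield raw PCM from a WAV file in chunks of about chunk_ms milliseconds.

    chunk_ms is an assumed value, not a documented Saaras v3 requirement.
    """
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        frames_per_chunk = max(1, wav.getframerate() * chunk_ms // 1000)
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk


if __name__ == "__main__":
    chunks = list(iter_pcm_chunks(make_silent_wav(1.0)))
    print(len(chunks), "chunks of", len(chunks[0]), "bytes")
```

Each yielded chunk would then be handed to whatever streaming client the integration uses; for batch transcription the whole file is uploaded in one request instead.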

Saaras v3 is production-ready for enterprise deployments, with robust handling of telephony audio quality, background noise, and multiple speakers. It is the go-to choice for Indian market applications that need accurate, real-time speech recognition across the country's diverse linguistic landscape.

Pricing

Metric	Price
Price per hour	₹53.00

1 credit = ₹1 = $0.01. Prices are based on the provider's rates, which CallMissed passes through with a markup of roughly 35%.
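As a rough illustration of the arithmetic, the sketch below converts an audio duration into credits, rupees, and dollars using the ₹53 per hour rate and the 1 credit = ₹1 = $0.01 conversion quoted on this page. It assumes cost scales linearly with duration; the page does not state billing granularity or rounding, so treat the results as estimates. The function names are hypothetical.

```python
from decimal import Decimal

CREDITS_PER_HOUR = Decimal("53")   # ₹53 per hour, at 1 credit = ₹1
INR_PER_CREDIT = Decimal("1")
USD_PER_CREDIT = Decimal("0.01")   # conversion quoted on this page


def estimate_cost(duration_seconds: float) -> dict:
    """Estimate the cost of transcribing duration_seconds of audio.

    Assumes linear proration; actual billing increments are not documented here.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    credits = CREDITS_PER_HOUR * Decimal(str(duration_seconds)) / Decimal(3600)
    return {
        "credits": credits,
        "inr": credits * INR_PER_CREDIT,
        "usd": credits * USD_PER_CREDIT,
    }


def hours_covered(credits: float) -> Decimal:
    """Hours of audio that a credit balance would cover at the listed rate."""
    return Decimal(str(credits)) / CREDITS_PER_HOUR


if __name__ == "__main__":
    print(estimate_cost(90 * 60))   # a 90-minute recording
    print(hours_covered(1000))      # e.g. a 1000-credit balance
```

Under these assumptions, a 90-minute recording comes to 79.5 credits, and a 1000-credit balance covers a little under 19 hours of audio.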

Key Highlights

22 Indian languages + English
Native code-mixed speech handling (Hinglish, etc.)
Real-time streaming via WebSocket
Batch transcription via REST API

Benchmarks

Benchmark	Score	Notes
Hindi WER	<8%	Word Error Rate on Hindi speech
Code-Mixed WER	<12%	Hinglish and other code-mixed speech
English WER	<6%	Indian-accented English
Languages	23	22 Indian languages + English
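The first three rows are word error rates (WER). For readers who want to measure WER on their own audio, the sketch below shows the conventional definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. It is a minimal illustration, not the evaluation procedure behind the figures above, which this page does not describe; real evaluations usually also normalize case, punctuation, and script before comparing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Code-mixed (Hinglish) example with one substituted word out of five.
    print(word_error_rate("kal meeting hai office mein",
                          "kal meeting hai office me"))
```

A score of 0.08 on this scale corresponds to the 8% figure used in the table.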

Technical Details

Supports 22 Indian languages + English with native code-mixed handling
Real-time streaming via WebSocket for live transcription
Batch transcription via REST API for recorded audio
Handles telephony audio quality, background noise, and multiple speakers
Optimized for Indian accents and regional pronunciation variations
Production-ready for call center and enterprise deployments

Strengths

Industry-leading accuracy for 22 Indian languages
Native code-mixed speech handling, a unique capability for the Indian market
Real-time WebSocket streaming for live applications
Robust handling of telephony audio and noisy environments

Limitations

Focused on Indian languages; not a general-purpose multilingual STT model
Accuracy may vary across less common Indian languages
WebSocket streaming requires persistent connection management
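One common way to handle the connection-management point is to reconnect with exponential backoff and jitter when a streaming connection drops. The sketch below only computes the delay schedule; the base delay, cap, and attempt count are illustrative assumptions rather than values recommended by Sarvam AI or CallMissed, and the actual reconnect call depends on the WebSocket client in use.

```python
import random
from typing import Iterator, Optional


def backoff_delays(
    max_attempts: int = 5,
    base: float = 0.5,
    cap: float = 30.0,
    jitter: bool = True,
    rng: Optional[random.Random] = None,
) -> Iterator[float]:
    """Yield one wait time in seconds per reconnect attempt.

    The ceiling doubles each attempt (base, 2*base, 4*base, ...) up to cap.
    With jitter enabled, the delay is drawn uniformly from [0, ceiling] so
    that many clients do not reconnect at the same moment.
    """
    rng = rng or random.Random()
    for attempt in range(max_attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng.uniform(0, ceiling) if jitter else ceiling


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_delays(), start=1):
        print(f"attempt {attempt}: wait {delay:.2f}s before reconnecting")
```

A client would sleep for each yielded delay before retrying and stop iterating once the connection is re-established.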

Use Cases

Call center transcription
Voice agent backends
Meeting transcription
Multilingual dictation

API Example

curl https://api.callmissed.com/v1/audio/transcriptions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -F file=@audio.wav \
  -F model=saaras:v3 \
  -F language=hi

Endpoint: POST /v1/audio/transcriptions · Model ID: saaras:v3
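The same request can be sketched in Python using only the standard library. The URL, form fields, and Bearer header mirror the curl example above; the helper names, the 60-second timeout, and the assumption that the response body is JSON are illustrative and not confirmed by this page.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "https://api.callmissed.com/v1"  # taken from the curl example


def build_transcription_request(
    audio: bytes,
    filename: str,
    api_key: str,
    model: str = "saaras:v3",
    language: str = "hi",
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data transcription request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text form fields, matching `-F model=...` and `-F language=...`.
    for name, value in (("model", model), ("language", language)):
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # File field, matching `-F file=@audio.wav`.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        'Content-Type: application/octet-stream\r\n\r\n'.encode("utf-8")
        + audio
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        f"{BASE_URL}/audio/transcriptions",
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(path: str, api_key: str) -> dict:
    """Upload an audio file and return the decoded response.

    Assumes the endpoint replies with JSON; the response schema is not
    documented on this page.
    """
    with open(path, "rb") as f:
        request = build_transcription_request(
            f.read(), os.path.basename(path), api_key
        )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(transcribe("audio.wav", "cm_YOUR_KEY"))
```

Building the request separately from sending it makes the payload easy to inspect or unit-test without network access.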

Try Saaras v3 now

Get 1000 free API credits on signup. No credit card required.
