How much does Whisper Large v3 Turbo cost?

Whisper Large v3 Turbo costs $0.06/hour on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use Whisper Large v3 Turbo via API?

Send a POST request to /v1/audio/transcriptions with model "whisper-large-v3-turbo" and your API key. CallMissed uses the OpenAI-compatible format, so existing OpenAI-style client code only needs its base URL and model field changed.

What is the context window of Whisper Large v3 Turbo?

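Token context windows and output-token limits do not apply to Whisper Large v3 Turbo, which is a speech-to-text model; both are listed as N/A. The limit that matters is audio length: about 30 minutes per request.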


Speech to Text · multilingual · budget

Whisper Large v3 Turbo

by OpenAI · Released 2024

OpenAI Whisper Large v3 Turbo — 99-language ASR with auto-detect. Supports both transcription and translation modes. Best accuracy/cost ratio for global multilingual speech.

Spec	Value
Type	Speech to Text
Context Window	N/A
Parameters	809M
Max Output	N/A

Overview

Whisper Large v3 Turbo is OpenAI's open-weight ASR model optimized for fast inference while retaining the multilingual breadth of the Large v3 family. It supports 99 languages out of the box, automatically detecting the spoken language when none is specified, and can either transcribe (output text in the source language) or translate (output English regardless of input). It is the most cost-efficient way to add global multilingual speech recognition to a product.

Deployed on the CallMissed gateway, it accepts base64-encoded audio in standard formats (MP3, WAV, FLAC) and returns structured JSON with the transcription text plus optional VTT-formatted segments for subtitle workflows. The Turbo variant uses a smaller decoder than Large v3, achieving ~8× faster inference with only minor accuracy loss on most languages.
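A minimal sketch of turning that JSON into a subtitle file. The field names transcription_info.text and segments[].vtt are the ones listed under Technical Details on this page; the assumption that each segments[].vtt value is a complete WebVTT cue (timing line plus text) is not confirmed by the page, and the helper names and sample payload are purely illustrative.

```python
def extract_text(response: dict) -> str:
    """Return the full transcript from a parsed response."""
    return response["transcription_info"]["text"]


def segments_to_webvtt(response: dict) -> str:
    """Join per-segment cues into one WebVTT document.

    Assumes each segment's "vtt" value is a ready-made cue
    (timing line followed by text). Segments without one are skipped.
    """
    cues = [
        seg["vtt"].strip()
        for seg in response.get("segments", [])
        if seg.get("vtt")
    ]
    body = "\n\n".join(cues)
    # A WebVTT file starts with the "WEBVTT" header and a blank line.
    return ("WEBVTT\n\n" + body + "\n") if body else "WEBVTT\n"


# Hypothetical payload shaped like the fields named on this page.
sample = {
    "transcription_info": {"text": "Hello world"},
    "segments": [
        {"vtt": "00:00:00.000 --> 00:00:02.000\nHello"},
        {"vtt": "00:00:02.000 --> 00:00:04.000\nworld"},
    ],
}

print(extract_text(sample))
print(segments_to_webvtt(sample))
```

If the real segments carry separate start and end timestamps rather than pre-formatted cues, the timing line would have to be built from those fields instead.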

At $0.06 per audio hour, it is roughly 9× cheaper than Sarvam Saaras for use cases that don't need Indian-language code-mixing. Pair it with one of our LLMs for end-to-end speech-to-insight workflows: meeting summarization, podcast indexing, accessibility captions, or compliance archiving.

Pricing

Metric	Price
Price per audio hour	₹6.00

1 credit = ₹1 = $0.01 USD. Prices shown are the provider's; CallMissed passes them through with a markup of about 35%.
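As a worked example of the arithmetic, the sketch below estimates credits and USD for a given audio duration. The rate (6 credits per hour) and the conversion (1 credit = $0.01) are taken from this page; linear per-second billing with no rounding or minimum charge is an assumption, since the page does not describe billing granularity.

```python
from decimal import Decimal

CREDITS_PER_HOUR = Decimal("6")    # ₹6 per audio hour, from the pricing table
USD_PER_CREDIT = Decimal("0.01")   # the page's stated conversion


def estimate_cost(audio_seconds: int) -> tuple[Decimal, Decimal]:
    """Return (credits, usd) for a duration in seconds.

    Assumes cost scales linearly with duration; real billing may
    round up or apply a minimum.
    """
    hours = Decimal(audio_seconds) / Decimal(3600)
    credits = CREDITS_PER_HOUR * hours
    return credits, credits * USD_PER_CREDIT


credits, usd = estimate_cost(90 * 60)  # a 90-minute recording
print(f"{credits} credits, about ${usd}")
```

Under these assumptions a 90-minute recording comes to 9 credits, or about $0.09.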

Key Highlights

99 languages with automatic language detection
Transcribe + translate modes
VTT subtitle output for downstream tooling
8× faster inference than Whisper Large v3

Benchmarks

Benchmark	Score	Notes
Languages	99	with auto-detect
Speed	8×	vs Whisper Large v3
Hourly cost	$0.06	best in class

Technical Details

Runs on the CallMissed gateway
Accepts base64 MP3/WAV/FLAC; max ~30 min per request
Returns transcription_info.text + segments[].vtt
task=transcribe (default) or task=translate
Optional: vad_filter, initial_prompt, beam_size, hallucination_silence_threshold
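Because each request is capped at roughly 30 minutes, longer recordings have to be split before upload. The sketch below only plans the time windows; cutting the audio itself is left to an external tool. The 30-minute default mirrors the approximate limit stated above, while the optional overlap between windows and the function name plan_chunks are illustrative assumptions, not part of the CallMissed API.

```python
def plan_chunks(total_seconds, max_chunk_seconds=30 * 60, overlap_seconds=0.0):
    """Return a list of (start, end) windows, in seconds, covering the recording.

    Each window is at most max_chunk_seconds long. A small overlap can help
    avoid cutting a word in half at a boundary (assumption: duplicated words
    are merged by the caller afterwards).
    """
    if not 0 <= overlap_seconds < max_chunk_seconds:
        raise ValueError("overlap must be >= 0 and shorter than a chunk")
    if total_seconds <= 0:
        return []

    windows = []
    start = 0.0
    while True:
        end = min(start + max_chunk_seconds, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break
        # The next window begins slightly before the previous one ended.
        start = end - overlap_seconds
    return windows


for start, end in plan_chunks(75 * 60):
    print(f"{start:.0f}s to {end:.0f}s")
```

Since the stated limit is approximate, choosing a window somewhat below 30 minutes leaves a safety margin.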

Strengths

Best multilingual coverage (99 languages)
Auto language detection — no need for ISO tags
Built-in translation to English
~9× cheaper than Sarvam Saaras for non-Indian languages

Limitations

Less accurate than Saaras on Indian languages and code-mixed speech
Batch only on this surface — for streaming use Nova-3 or Flux
Hallucinations on long silences without vad_filter
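To guard against the silence-related hallucinations noted above, a request can enable vad_filter. The sketch below builds the form fields for such a request. The parameter names (task, vad_filter, initial_prompt, beam_size, hallucination_silence_threshold) come from the Technical Details on this page; sending them as multipart form fields alongside the file, and serializing booleans as lowercase strings, are assumptions modeled on the curl example rather than documented behavior. build_fields is a hypothetical helper.

```python
MODEL_ID = "whisper-large-v3-turbo"
ALLOWED_TASKS = {"transcribe", "translate"}
OPTIONAL_PARAMS = {
    "vad_filter",
    "initial_prompt",
    "beam_size",
    "hallucination_silence_threshold",
}


def build_fields(task="transcribe", language=None, **options):
    """Build the text form fields for a transcription request.

    Rejects tasks and option names that this page does not list, so typos
    fail locally instead of being silently ignored by the server.
    """
    if task not in ALLOWED_TASKS:
        raise ValueError(f"unknown task: {task!r}")
    unknown = set(options) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")

    fields = {"model": MODEL_ID, "task": task}
    if language is not None:
        # Omitting language leaves detection to the model.
        fields["language"] = language
    for name, value in options.items():
        # Assumption: booleans travel as "true"/"false" strings.
        fields[name] = str(value).lower() if isinstance(value, bool) else str(value)
    return fields


print(build_fields(vad_filter=True))
print(build_fields(task="translate", beam_size=5))
```

The accepted value ranges for these options are not stated on this page, so the helper passes values through unchecked.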

Use Cases

Meeting transcription
Podcast indexing
Subtitle generation
Multilingual voice search

API Example

curl https://api.callmissed.com/v1/audio/transcriptions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -F file=@audio.mp3 \
  -F model=whisper-large-v3-turbo \
  -F language=en

Endpoint: POST /v1/audio/transcriptions · Model ID: whisper-large-v3-turbo
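For reference, here is a sketch of the same request as the curl call above, using only the Python standard library. The URL, header, model ID, and form fields mirror the curl example; the hand-rolled multipart encoder and the function names (encode_multipart, build_request, transcribe) are illustrative, and the code assumes the endpoint answers with a JSON body.

```python
import json
import urllib.request
import uuid

API_URL = "https://api.callmissed.com/v1/audio/transcriptions"
MODEL_ID = "whisper-large-v3-turbo"


def encode_multipart(fields, filename, file_bytes):
    """Encode text fields plus one file part as multipart/form-data.

    Returns (body, content_type). Field names and the filename are assumed
    not to contain quotes or line breaks.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(api_key, filename, audio_bytes, language="en"):
    """Build (but do not send) the POST request."""
    body, content_type = encode_multipart(
        {"model": MODEL_ID, "language": language}, filename, audio_bytes
    )
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )


def transcribe(api_key, path, language="en"):
    """Upload the file at `path` and return the parsed JSON response."""
    with open(path, "rb") as f:
        audio = f.read()
    request = build_request(api_key, path.rsplit("/", 1)[-1], audio, language)
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


# Usage (performs a network call, so it is left commented out):
# result = transcribe("cm_YOUR_KEY", "audio.mp3")
```

Separating build_request from transcribe lets the request be inspected or unit-tested without touching the network.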

Try Whisper Large v3 Turbo now

Get 1000 free API credits on signup. No credit card required.
