Gnani Prisma v2.5

by Gnani · Released 2025

Gnani.ai's India-first speech-to-text model. Telephony-grade accuracy across 10 Indian languages with native code-switching, sub-4% WER on Indian English. Built for contact centers and real-time voice agents with WebSocket streaming and batch transcription.

Powered by Gnani · Gnani Prisma v2.5 ASR (trained on 14M+ hours of telephonic audio)

Context Window: N/A
Parameters: 5B
Max Output: N/A
Category: Speech to Text

Overview

Gnani Prisma v2.5 is Gnani.ai's India-first speech-to-text model, engineered for the realities of Indian telephony — noisy lines, regional accents, and speakers who switch between languages mid-sentence. It covers 10 Indian languages (Bengali, English, Gujarati, Hindi, Kannada, Malayalam, Marathi, Punjabi, Tamil, and Telugu) with native code-switching, so Hinglish and other mixed-language speech transcribe cleanly without language tags.

The model is trained on a large corpus of telephonic audio and tuned for contact-center conditions, delivering sub-4% Word Error Rate on Indian-accented English. It supports two deployment modes: real-time streaming over WebSocket for live transcription (voice agents, live captioning, agent-assist) and batch transcription over REST for processing recorded calls and audio files.

Gnani Prisma v2.5 is a strong fit for Indian enterprises running high-volume call operations where telephony robustness and code-switching matter more than broad global language coverage. At ₹27 per audio hour (about $0.27 at the platform's credit rate of ₹1 = $0.01), it is competitively priced for production transcription workloads on the CallMissed platform.

Pricing

Metric | Price
Price per hour | ₹27.0000

1 credit = ₹1 = $0.01. Prices shown are sourced from the provider; CallMissed passes them through with a markup of roughly 35%.
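
To make the pricing concrete, here is a minimal sketch that converts audio duration into credits, rupees, and dollars using the figures above (₹27 per audio hour, 1 credit = ₹1 = $0.01). The function names are hypothetical. The sketch assumes the listed price is the amount billed and that billing is proportional to duration; this page does not say whether the markup is already included or how partial hours are rounded.

```python
from decimal import Decimal

# Figures taken from the pricing table and credit note on this page.
PRICE_INR_PER_HOUR = Decimal("27")
INR_PER_CREDIT = Decimal("1")
USD_PER_CREDIT = Decimal("0.01")


def estimate_cost(audio_seconds):
    """Return (credits, rupees, dollars) for a given audio duration.

    Assumes billing proportional to duration, which the page does not confirm.
    """
    hours = Decimal(audio_seconds) / Decimal(3600)
    rupees = hours * PRICE_INR_PER_HOUR
    credits = rupees / INR_PER_CREDIT
    dollars = credits * USD_PER_CREDIT
    return credits, rupees, dollars


def hours_for_credits(credits):
    """Return how many audio hours a credit balance covers at the listed price."""
    return Decimal(credits) * INR_PER_CREDIT / PRICE_INR_PER_HOUR
```

Under these assumptions, a one-hour recording costs 27 credits ($0.27), and a balance of 1000 credits covers roughly 37 hours of audio.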

Key Highlights

  • 10 Indian languages with native code-switching
  • Telephony-grade — sub-4% WER on Indian English
  • Real-time streaming via WebSocket
  • Batch transcription via REST API

Benchmarks

Benchmark | Score
Indian English WER | <4%
Languages | 10
Audio Profile | Telephony
Deployment | Realtime + Batch
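
For readers unfamiliar with the metric, Word Error Rate is conventionally computed as (substitutions + deletions + insertions) divided by the number of reference words, using a word-level edit distance. The sketch below is a generic illustration of that definition, not Gnani's evaluation code. This page does not describe the test set or the text normalization behind the <4% figure, and the function name is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[len(hyp)] / len(ref)
```

By this definition, a WER of 4% corresponds to about one word error per 25 reference words.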

Technical Details

  • Supports 10 Indian languages (bn, en, gu, hi, kn, ml, mr, pa, ta, te) with native code-switching
  • Telephony-grade accuracy — sub-4% WER on Indian-accented English
  • Real-time streaming via WebSocket for live transcription
  • Batch transcription via REST API for recorded audio
  • Trained on large-scale telephonic audio for contact-center conditions
  • Production-ready for high-volume Indian call operations
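
The two-letter codes in the first point match the ISO 639-1 codes for the ten languages named in the Overview. A client might validate a code before sending a request, as in this sketch. The helper name is hypothetical, and only `hi` appears in this page's API example, so treating every code here as an accepted `language` value is an assumption.

```python
# Codes and names as listed on this page.
SUPPORTED_LANGUAGES = {
    "bn": "Bengali",
    "en": "English",
    "gu": "Gujarati",
    "hi": "Hindi",
    "kn": "Kannada",
    "ml": "Malayalam",
    "mr": "Marathi",
    "pa": "Punjabi",
    "ta": "Tamil",
    "te": "Telugu",
}


def normalize_language(code):
    """Return a lowercase supported code, or raise ValueError if unknown."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(
            f"unsupported language {code!r}; expected one of: {supported}"
        )
    return normalized
```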

Strengths

  • Telephony-grade accuracy tuned for Indian contact centers
  • Native code-switching across 10 Indian languages
  • Sub-4% WER on Indian-accented English
  • Real-time WebSocket streaming plus batch transcription

Limitations

  • Focused on Indian languages — not a general-purpose multilingual STT
  • Accuracy may vary across less common Indian languages
  • WebSocket streaming requires persistent connection management
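
As one way to handle the connection management mentioned in the last point, a client can retry dropped connections with capped exponential backoff and jitter. The sketch below is transport-agnostic: `connect` is a hypothetical callable standing in for whatever opens the streaming session, since this page does not document the WebSocket URL or message format. The default delays are illustrative values, not platform recommendations.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Delay ceiling in seconds for a 0-based retry attempt: base * 2^attempt, capped."""
    return min(cap, base * (2 ** attempt))


def connect_with_retry(connect, max_attempts=5, sleep=time.sleep, rng=random.random):
    """Call `connect` until it succeeds or the attempts run out.

    `connect` is a placeholder for the real session opener and is expected
    to raise ConnectionError on failure.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(max_attempts):
        try:
            return connect()
        except ConnectionError as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                # Full jitter: wait a random fraction of the capped delay.
                sleep(backoff_delay(attempt) * rng())
    raise last_error
```

Injecting `sleep` and `rng` keeps the retry logic testable without real waiting or randomness.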

Use Cases

  • Contact center transcription
  • Voice agent backends
  • Agent-assist and live captioning
  • Multilingual call analytics

API Example

curl https://api.callmissed.com/v1/audio/transcriptions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -F file=@audio.wav \
  -F model=gnani-prisma-v2.5 \
  -F language=hi

Endpoint: POST /v1/audio/transcriptions · Model ID: gnani-prisma-v2.5
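
For callers who prefer Python over curl, the same request can be sketched with only the standard library. The URL, header, model ID, and form fields come from the curl example above. The helper names are hypothetical, and the code assumes the endpoint returns a JSON body, whose schema this page does not document.

```python
import json
import os
import urllib.request
import uuid

API_URL = "https://api.callmissed.com/v1/audio/transcriptions"
MODEL_ID = "gnani-prisma-v2.5"


def build_multipart(fields, file_field, file_name, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        chunks.append(part.encode("utf-8"))
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    chunks.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def transcribe(api_key, audio_path, language="hi"):
    """POST an audio file for batch transcription and return the parsed reply."""
    with open(audio_path, "rb") as audio_file:
        audio_bytes = audio_file.read()
    body, content_type = build_multipart(
        {"model": MODEL_ID, "language": language},
        "file",
        os.path.basename(audio_path),
        audio_bytes,
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
    )
    # Assumption: the response body is JSON; its fields are not documented here.
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `transcribe("cm_YOUR_KEY", "audio.wav")` would return the parsed response. Error handling, for example catching `urllib.error.HTTPError` on non-2xx responses, is left out for brevity.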
