Sarvam 105B
by Sarvam AI · Released 2025
Sarvam AI's flagship 105B MoE model with 128K context. The largest Indian-language-optimized LLM, offering superior reasoning and generation quality across 11 Indian languages while maintaining strong English performance.
Powered by Sarvam AI · Mixture-of-Experts (MoE)
| Spec | Value |
|---|---|
| Context Window | 128K |
| Parameters | 105B (MoE) |
| Max Output | 8K |
| Category | LLM Chat |
Overview
Sarvam 105B is the flagship model in Sarvam AI's lineup. It applies the post-training pipeline proven on Sarvam-M (30B) to a much larger 105-billion-parameter Mixture-of-Experts architecture. With a 128K-token context window, it can ingest entire legal documents, financial reports, and codebases in a single pass, while maintaining best-in-class Indian language understanding across 11 major languages in both native script and romanized form.
The training methodology mirrors the three-stage pipeline of its smaller sibling: supervised fine-tuning with quality-scored, culturally curated prompts; reinforcement learning with verifiable rewards (RLVR) using the GRPO algorithm across instruction-following, math, and programming curricula; and inference optimization with quantization. The MoE architecture activates only a subset of experts per token, keeping inference costs manageable despite the large total parameter count.
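The RLVR stage can be illustrated with a minimal sketch of the group-relative advantage that gives GRPO its name: several completions are sampled for one prompt, each is scored by a verifiable check, and each reward is normalized against the group's mean and standard deviation. The exact-match verifier, the sample completions, and the use of the population standard deviation are assumptions for illustration, not details confirmed for Sarvam's pipeline.

```python
from statistics import mean, pstdev


def verifiable_reward(completion: str, expected: str) -> float:
    """Hypothetical verifier: 1.0 if the answer matches exactly, else 0.0."""
    return 1.0 if completion.strip() == expected.strip() else 0.0


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize each reward against its own sampling group (the GRPO-style baseline)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, implementations vary
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for one math prompt whose verified answer is "42".
completions = ["42", "41", "42", "7"]
rewards = [verifiable_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Correct completions receive a positive advantage and incorrect ones a negative advantage, so no separate learned value model is needed to provide the baseline.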
As the largest Indian-language-optimized LLM available, Sarvam 105B delivers the highest quality outputs for enterprise use cases such as complex document analysis, long-form content generation in regional languages, and government/public-sector AI deployments where accuracy and cultural sensitivity are paramount. The model is deployed on SOC 2 Type II compliant, ISO-certified infrastructure.
Pricing
| Metric | Price |
|---|---|
| Input /1M tokens | ₹35.0000 |
| Output /1M tokens | ₹35.0000 |
1 credit = ₹1 = $0.01 USD. Prices shown are the provider's; CallMissed passes them through with a markup of roughly 35%.
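As a rough illustration of how the per-token rates translate into a bill, the sketch below estimates the cost of a single request from the table above. The function name and the `markup` parameter are hypothetical; the page does not state whether the listed ₹35 rate already includes the CallMissed markup, so the default leaves it at zero.

```python
INPUT_PRICE_INR_PER_M = 35.0   # from the pricing table, per 1M input tokens
OUTPUT_PRICE_INR_PER_M = 35.0  # from the pricing table, per 1M output tokens


def estimate_cost_inr(input_tokens: int, output_tokens: int, markup: float = 0.0) -> float:
    """Estimate request cost in rupees (1 credit = ₹1); markup is an assumed fraction."""
    base = (
        input_tokens * INPUT_PRICE_INR_PER_M
        + output_tokens * OUTPUT_PRICE_INR_PER_M
    ) / 1_000_000
    return base * (1.0 + markup)


# A long-document request: 100,000 input tokens and 8,000 output tokens.
cost = estimate_cost_inr(100_000, 8_000)
print(f"Estimated cost: ₹{cost:.2f}")
```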
Key Highlights
- Flagship model — best quality for Indian languages
- 128K context window for long documents
- Superior reasoning across Hindi, Tamil, Telugu, Bengali, and seven other Indian languages
- ISO certified, SOC 2 Type II compliant infrastructure
Benchmarks
| Benchmark | Score |
|---|---|
| MMLU | 0.89 |
| MMLU-IN | 0.83 |
| MMLU-IN-R | 0.71 |
| HumanEval | 0.90 |
| GSM-8K | 0.96 |
| GSM-8K-IN-R | 0.87 |
| MTBench | 8.45 |
| AlpacaEval | 65.3 |
Technical Details
- Architecture: Mixture-of-Experts (MoE) with 105B total parameters
- Training pipeline: SFT → RLVR (GRPO) → Inference optimization (same as Sarvam-M)
- Context window: 128K tokens for long-document processing
- Languages: Hindi, Tamil, Telugu, Bengali, Marathi, Gujarati, Kannada, Malayalam, Odia, Punjabi, Assamese + English
- Supports native script and romanized input for all 11 Indian languages
- MoE routing activates subset of experts per token for efficient inference
- Deployed on SOC 2 Type II compliant, ISO-certified infrastructure
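The routing idea in the list above can be sketched with a toy top-k router: a gating score is computed per expert, only the k highest-scoring experts run, and their outputs are mixed with renormalized weights. The expert count, the value of k, the scalar "experts", and the choice to apply softmax over only the selected logits are illustrative assumptions; the page does not describe Sarvam 105B's actual router.

```python
import math


def top_k_route(router_logits: list[float], k: int) -> list[tuple[int, float]]:
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    order = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x: float, router_logits: list[float], experts, k: int = 2) -> float:
    """Run only the routed experts and combine their outputs by routing weight."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Four toy experts, each scaling its input by a different factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
logits = [0.1, 2.0, -1.0, 2.0]  # hypothetical router scores for one token
routes = top_k_route(logits, k=2)
y = moe_forward(1.0, logits, experts, k=2)
print(routes, y)
```

Only two of the four experts are evaluated for this token, which is why compute per token tracks the active parameter count rather than the total.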
Strengths
- Highest quality Indian language model available — best reasoning and generation across 11 languages
- Handles code-mixed text (Hinglish, Tanglish) natively with superior accuracy
- 128K context enables full-document analysis for legal, financial, and government use cases
- Same affordable pricing as the 30B variant despite significantly higher capability
- Enterprise-grade infrastructure with SOC 2 Type II and ISO compliance
Limitations
- MoE architecture requires more memory at deployment than a dense model of similar active parameter count, because all experts must be held in memory even though only a subset runs per token
- Primarily optimized for 11 Indian languages — less coverage than models supporting 100+ languages
- Higher latency than the smaller Sarvam 30B due to larger model size
Use Cases
API Example
curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "sarvam-105b", "messages": [{"role": "user", "content": "Explain quantum computing in Hindi"}]}'

Endpoint: POST /v1/chat/completions · Model ID: sarvam-105b
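For callers who prefer Python, the same request can be sketched with the standard library alone. The helper name `build_request` is hypothetical, and the JSON `Content-Type` header is an assumption about what the endpoint expects. The response format is not documented on this page, so the sketch stops at building the request.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"


def build_request(api_key: str, prompt: str, model: str = "sarvam-105b") -> urllib.request.Request:
    """Build (but do not send) a chat completion request mirroring the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed to be required for JSON bodies
        },
        method="POST",
    )


req = build_request("cm_YOUR_KEY", "Explain quantum computing in Hindi")
# To send it, pass the request to urllib.request.urlopen(req) with a real key.
print(req.get_method(), req.full_url)
```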