Gemma 4 26B A4B
by Google · Released April 2, 2026
Google DeepMind's open-weight MoE model from the Gemma 4 family. It has 26B total parameters with only 4B active per forward pass, so per-token compute is close to that of a 4B dense model while quality is closer to that of much larger models. Multimodal (text + image), 256K context, Apache 2.0 license.
Powered by Google · Mixture-of-Experts (26B total / 4B active)
Context Window
256K
Parameters
26B total / 4B active (MoE)
Max Output
8K
Category
LLM Chat
Overview
Gemma 4 26B A4B, released April 2, 2026 by Google DeepMind, is an open-weight Mixture-of-Experts (MoE) model with 26B total parameters, of which only 4B are active per forward pass. Per-token compute is therefore close to that of a 4B dense model, while output quality is comparable to that of much larger models, making it one of the most efficient open models available.
The model is multimodal, supporting both text and image input (audio input is available only on smaller variants), and has a 256K-token context window. It supports 140+ languages, making it one of the most linguistically diverse open models. It is released under the Apache 2.0 license, which permits commercial use, modification, and distribution, subject to the license's attribution and notice conditions.
Gemma 4 26B A4B ranks #3 among open-source models on key benchmarks, a strong result for a model with only 4B active parameters. It is well suited to cost-sensitive deployments, edge-oriented scenarios, and applications that need multimodal input, multilingual support, and permissive licensing together.
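To make the "26B total / 4B active" idea concrete, here is a minimal sketch of top-k expert routing, the general mechanism behind Mixture-of-Experts layers. It is a toy illustration only: the expert count, the value of k, the scalar "experts", and the function names `top_k_route` and `moe_forward` are all assumptions made for this example and do not describe Gemma 4's actual router or layer layout.

```python
import math
import random


def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the k selected experts' outputs; the rest are never called."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Toy setup (hypothetical sizes): 8 scalar "experts", 2 active per token.
experts = [lambda x, s=s: s * x for s in range(1, 9)]

if __name__ == "__main__":
    random.seed(0)
    logits = [random.gauss(0, 1) for _ in experts]
    print("selected experts and weights:", top_k_route(logits, k=2))
    print("layer output for x=1.0:", moe_forward(1.0, experts, logits, k=2))
    # Share of the model's parameters used per forward pass, from the page's figures.
    print(f"active share: {4 / 26:.1%}")
```

Only the selected experts run for a given token, which is why per-token compute tracks the active parameter count rather than the total.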
Pricing
| Metric | Price |
|---|---|
| Input /1M tokens | ₹40.0000 |
| Output /1M tokens | ₹160.0000 |
1 credit = ₹1 = $0.01 USD. Prices shown are the provider's prices; CallMissed passes them through with a markup of roughly 35%.
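As a rough worked example of the pricing above, the sketch below estimates the cost of a single request from its token counts. The per-million rates and the credit conversion come from this page. The `markup` parameter is an assumption: the page does not say whether the listed prices already include the ~35% markup, so it defaults to 0 (listed prices treated as final) and can be set to 0.35 to model the other reading. The function names are hypothetical.

```python
INPUT_INR_PER_MILLION = 40.0    # from the pricing table
OUTPUT_INR_PER_MILLION = 160.0  # from the pricing table
USD_PER_CREDIT = 0.01           # 1 credit = Rs 1 = $0.01, per the page


def request_cost_inr(input_tokens, output_tokens, markup=0.0):
    """Estimated cost in rupees (= credits) for one request.

    markup is an ASSUMPTION knob: 0.0 treats the listed prices as final,
    0.35 applies the ~35% markup on top of them.
    """
    base = (input_tokens / 1_000_000 * INPUT_INR_PER_MILLION
            + output_tokens / 1_000_000 * OUTPUT_INR_PER_MILLION)
    return base * (1 + markup)


def inr_to_usd(amount_inr):
    """Convert credits/rupees to USD using the page's fixed conversion."""
    return amount_inr * USD_PER_CREDIT


if __name__ == "__main__":
    cost = request_cost_inr(input_tokens=10_000, output_tokens=2_000)
    print(f"listed price: {cost:.2f} credits (${inr_to_usd(cost):.4f})")
    with_markup = request_cost_inr(10_000, 2_000, markup=0.35)
    print(f"with 35% markup: {with_markup:.3f} credits")
```

For 10,000 input tokens and 2,000 output tokens this works out to 0.40 + 0.32 = 0.72 credits at the listed rates.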
Key Highlights
- Apache 2.0 license — full commercial freedom
- 26B total params, only 4B active (fast inference)
- Multimodal: text and image input
- 140+ language support
Benchmarks
| Benchmark | Score |
|---|---|
| Open-Source Ranking | #3 |
| MMLU-Pro | 72.8% |
| HumanEval | 80.5% |
| MATH-500 | 78.3% |
| GPQA Diamond | 55.2% |
Technical Details
- Architecture: MoE with 26B total / 4B active parameters per forward pass
- Runs nearly as fast as a 4B model with much higher quality
- Multimodal: text and image input (audio on smaller variants)
- 256K native context window
- 140+ language support — one of the most linguistically diverse open models
- Apache 2.0 license: commercial use, modification, and distribution permitted
- #3 open-source model on key benchmarks
- Available via Google AI API and CallMissed unified gateway
Strengths
- Apache 2.0: a permissive license that allows commercial use
- Only 4B active params: low per-token compute, which helps on consumer hardware and edge devices (all 26B weights must still be stored)
- Multimodal text+image with 140+ language support
- #3 open-source model — punches well above its weight class
- Affordable at $0.40 input / $1.60 output per 1M tokens
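For self-hosting, the hardware point above depends on memory as well as compute: an MoE model needs all of its expert weights available even though only some are used per token. The sketch below is back-of-the-envelope arithmetic for weight storage at a few common precisions. It assumes 1 GB = 10^9 bytes and ignores the KV cache, activations, and runtime overhead, so real requirements will be higher. The precisions shown are illustrative, not a statement about which quantized builds exist.

```python
TOTAL_PARAMS = 26e9   # total parameters, from the page
ACTIVE_PARAMS = 4e9   # parameters active per forward pass, from the page


def weight_memory_gb(num_params, bits_per_param):
    """Storage for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for bits in (16, 8, 4):
        stored = weight_memory_gb(TOTAL_PARAMS, bits)
        used = weight_memory_gb(ACTIVE_PARAMS, bits)
        print(f"{bits:>2}-bit: {stored:.0f} GB stored, "
              f"about {used:.0f} GB of weights used per token")
```

At 16-bit precision the weights alone take about 52 GB, and about 13 GB at 4-bit, so the 4B active figure describes speed rather than memory footprint.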
Limitations
- Lower absolute capability than larger models (GPT-OSS-120B, Kimi K2.5)
- 4B active parameters limit complex reasoning depth
- Image understanding is less capable than dedicated vision models
Use Cases
API Example
curl https://api.callmissed.com/v1/chat/completions \
-H "Authorization: Bearer cm_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "gemma-4-26b-a4b-it", "messages": [{"role": "user", "content": "Describe what you see in this image"}]}'
Endpoint: POST /v1/chat/completions · Model ID: gemma-4-26b-a4b-it
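The same request can be sketched in Python with only the standard library. The URL, model ID, and bearer-token header come from the example above. Everything about image input is an assumption: this page does not document how images are attached, so the sketch uses OpenAI-style `image_url` content parts purely as a guess based on the `/v1/chat/completions` path. The helper names `build_request` and `send` and the example image URL are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"
MODEL_ID = "gemma-4-26b-a4b-it"


def build_request(api_key, prompt, image_url=None):
    """Build (but do not send) a chat completion request."""
    if image_url is None:
        content = prompt
    else:
        # ASSUMPTION: OpenAI-style content parts; not confirmed by this page.
        content = [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ]
    body = {"model": MODEL_ID,
            "messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_request("cm_YOUR_KEY",
                        "Describe what you see in this image",
                        image_url="https://example.com/photo.png")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # reply = send(req)  # uncomment once a real key is in place
```

Keeping request construction separate from sending makes the payload easy to inspect before any credits are spent.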