Gemini 3.1 Flash Lite
by Google · Released 2026
Most affordable Gemini 3.x model from Google. 1M token context, optimized for high-volume low-latency tasks. Pass-through Google AI Studio pricing.
Powered by Google · Transformer (proprietary, Gemini 3.x family)
| Spec | Value |
|---|---|
| Context Window | 1M |
| Parameters | Undisclosed |
| Max Output | 16K |
| Category | LLM Chat |
Overview
Gemini 3.1 Flash Lite is Google's most affordable Gemini 3.x model, designed for high-volume, low-latency tasks where cost is the primary constraint. It keeps the full 1M token context window that defines the Gemini 3.x family, but trades some output quality and reasoning depth for the lowest pricing in the lineup ($0.25 input / $1.50 output per 1M tokens).
It is ideal for production workloads where each individual request is simple but volume is high: classification, summarization, content moderation, intent routing, retrieval-augmented chat, multilingual content tagging, and similar tasks. The 1M context window also makes it a strong choice for processing long documents at low cost.
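For long-document workloads, a quick pre-flight check can indicate whether a document plausibly fits in the window before a request is sent. The following is a minimal sketch, not an official tokenizer: it assumes roughly 4 characters per token (a common heuristic for English text, not the Gemini tokenizer), reads the "16K" max output as 16,384 tokens, and assumes the output budget is reserved inside the same window. The function names are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 1_048_576  # context window listed on this page
MAX_OUTPUT_TOKENS = 16_384         # assumption: "16K" read as 16 * 1024
CHARS_PER_TOKEN = 4                # assumption: rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate via ceiling division of the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_output: int = MAX_OUTPUT_TOKENS) -> bool:
    """True if the estimated input plus the reserved output stays within the window."""
    return estimate_tokens(text) + reserved_output <= CONTEXT_WINDOW_TOKENS
```

For real budgeting, the provider's own token counter should replace the character heuristic, since token counts vary considerably by language and content type.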
Like the other Gemini 3.x models on CallMissed, it routes directly to Google AI Studio — there is no OpenRouter hop, no markup, and pricing matches Google's published rate verbatim.
Pricing
| Metric | Price |
|---|---|
| Input /1M tokens | ₹25.0000 |
| Output /1M tokens | ₹150.0000 |
1 credit = ₹1 = $0.01 USD, so ₹25 and ₹150 correspond to $0.25 and $1.50. Prices shown are the provider's published rates, which CallMissed passes through.
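Because the target workloads are high-volume, per-request cost is easiest to reason about with a small calculator. This sketch uses only the per-million-token rates and the credit conversion listed on this page; the function names are hypothetical, and any platform-level adjustments are not modeled.

```python
from decimal import Decimal

INPUT_USD_PER_MTOK = Decimal("0.25")   # input rate from this page
OUTPUT_USD_PER_MTOK = Decimal("1.50")  # output rate from this page
CREDITS_PER_USD = 100                  # from this page: 1 credit = $0.01 USD


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in USD for one request at the listed per-million-token rates."""
    million = Decimal(1_000_000)
    return (
        Decimal(input_tokens) * INPUT_USD_PER_MTOK
        + Decimal(output_tokens) * OUTPUT_USD_PER_MTOK
    ) / million


def usd_to_credits(usd: Decimal) -> Decimal:
    """Convert USD to platform credits using the conversion stated on this page."""
    return usd * CREDITS_PER_USD
```

As a worked example, a classification request with 500 input tokens and 20 output tokens comes to $0.000155, so one million such requests would cost about $155 at these rates.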
Key Highlights
- Cheapest Gemini 3.x model — $0.25/$1.50 per 1M tokens
- 1M token context window (same as Pro)
- Optimized for high-volume, low-latency tasks
- Routed direct to Google AI Studio — no markup
Benchmarks
| Metric | Value |
|---|---|
| Price (input) | $0.25 |
| Price (output) | $1.50 |
| Context window | 1M |
Technical Details
- Model ID: google/gemini-3.1-flash-lite
- Routed directly to Google AI Studio — no third-party hops
- Context window: 1,048,576 tokens (same as 3.1 Pro)
- Pass-through pricing — $0.25 input / $1.50 output per 1M tokens
- Supports streaming, tool calling, and structured outputs
- OpenAI- and Anthropic-compatible — works via /v1/chat/completions and /v1/messages
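The page lists two compatible endpoints but does not show how the request bodies differ. The sketch below assumes CallMissed follows the public OpenAI Chat Completions and Anthropic Messages conventions (a `stream` flag on the former, a required `max_tokens` field on the latter); neither detail is confirmed by this page, and the helper names are hypothetical.

```python
MODEL_ID = "google/gemini-3.1-flash-lite"  # model ID from this page


def chat_completions_body(prompt: str, stream: bool = False) -> dict:
    """Body for POST /v1/chat/completions (OpenAI-style; `stream` is an assumed flag)."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }


def messages_body(prompt: str, max_tokens: int = 1024) -> dict:
    """Body for POST /v1/messages (Anthropic-style, where max_tokens is required)."""
    return {
        "model": MODEL_ID,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
```

Authentication headers for `/v1/messages` are left out because the page does not specify them.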
✓ Strengths
- Lowest cost in the Gemini 3.x family — by a wide margin
- Retains the full 1M context window of Pro
- Direct Google routing — fast, no markup
⚠ Limitations
- Reduced reasoning depth vs Gemini 3 Flash and 3.1 Pro
- Preview model — may change before general availability
- Not ideal for complex multi-step agentic workflows
Use Cases
API Example
curl https://api.callmissed.com/v1/chat/completions \
-H "Authorization: Bearer cm_YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "google/gemini-3.1-flash-lite", "messages": [{"role": "user", "content": "Classify this customer message"}]}'
Endpoint: POST /v1/chat/completions · Model ID: google/gemini-3.1-flash-lite
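The same call can be made from Python using only the standard library. This is a sketch under assumptions: the URL, model ID, and bearer-token header come from the curl example, while the JSON content type and the idea that the response body is JSON are assumed. `build_request` and `classify` are hypothetical helper names.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"
MODEL_ID = "google/gemini-3.1-flash-lite"


def build_request(api_key: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumption: JSON body expected
        },
        method="POST",
    )


def classify(api_key: str, user_message: str) -> dict:
    """Send the request and return the parsed response (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(api_key, user_message), timeout=30) as resp:
        return json.load(resp)
```

Separating request construction from sending keeps the payload easy to inspect and unit-test without network access.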
Try Gemini 3.1 Flash Lite now
Get 1000 free API credits on signup. No credit card required.