How much does Gemini 3.1 Flash Lite cost?

Gemini 3.1 Flash Lite costs $0.25 per 1M input tokens and $1.50 per 1M output tokens on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use Gemini 3.1 Flash Lite via API?

Send a POST request to /v1/chat/completions with the model set to "google/gemini-3.1-flash-lite" and your API key in the Authorization header. CallMissed uses the OpenAI-compatible format, so existing OpenAI client code only needs its base URL and model field changed.

What is the context window of Gemini 3.1 Flash Lite?

Gemini 3.1 Flash Lite supports a 1M token context window with up to 16K output tokens.

LLM Chat · fast · cheapest · pass-through pricing

Gemini 3.1 Flash Lite

by Google · Released 2026

Most affordable Gemini 3.x model from Google. 1M token context, optimized for high-volume low-latency tasks. Pass-through pricing.

LLM Chat

Context Window: 1M
Parameters: Undisclosed
Max Output: 16K

Overview

Gemini 3.1 Flash Lite is Google's most affordable Gemini 3.x model — designed for high-volume, low-latency tasks where cost is the primary constraint. It keeps the full 1M token context window that defines the Gemini 3.x family, but trades some output quality and reasoning depth for the lowest input/output pricing in the lineup ($0.25/$1.50 per 1M tokens).

It is ideal for production workloads where each individual request is simple but volume is high: classification, summarization, content moderation, intent routing, retrieval-augmented chat, multilingual content tagging, and similar tasks. The 1M context window also makes it a strong choice for processing long documents at low cost.

Like the other Gemini 3.x models on CallMissed, it routes directly with no markup — pricing matches the published rate verbatim.

Pricing

Metric	Price
Input /1M tokens	₹25.0000
Output /1M tokens	₹150.0000

1 credit = ₹1 = $0.01 USD. Prices shown are the provider's published rates, which CallMissed passes through without markup.
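
As a rough worked example of the rates quoted on this page, the sketch below estimates what one request would cost in USD and in credits. The helper name `estimate_cost` and the token counts are illustrative assumptions; only the per-token rates and the credit conversion come from this page.

```python
from decimal import Decimal

# Rates quoted on this page (USD per 1M tokens).
INPUT_USD_PER_M = Decimal("0.25")
OUTPUT_USD_PER_M = Decimal("1.50")
# Conversion quoted on this page: 1 credit = $0.01 USD.
USD_PER_CREDIT = Decimal("0.01")


def estimate_cost(input_tokens: int, output_tokens: int) -> tuple[Decimal, Decimal]:
    """Return (usd, credits) for one request.

    Hypothetical helper for illustration; not part of any CallMissed SDK.
    """
    million = Decimal(1_000_000)
    usd = (Decimal(input_tokens) / million) * INPUT_USD_PER_M
    usd += (Decimal(output_tokens) / million) * OUTPUT_USD_PER_M
    return usd, usd / USD_PER_CREDIT


if __name__ == "__main__":
    # Example: summarizing a 200K-token document into a 2K-token summary.
    usd, credits = estimate_cost(input_tokens=200_000, output_tokens=2_000)
    print(f"${usd:.4f} = {credits:.2f} credits")
```

Under these assumptions, a 200K-token document with a 2K-token summary comes to about $0.053, or 5.3 credits. `Decimal` is used rather than floats so that small per-request amounts do not pick up rounding noise when summed over many calls.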

Key Highlights

Cheapest Gemini 3.x model — $0.25/$1.50 per 1M tokens
1M token context window (same as Pro)
Optimized for high-volume, low-latency tasks
Routed direct — no markup

Benchmarks

Benchmark	Score	Notes
Price (input)	$0.25	per 1M tokens — cheapest 3.x
Price (output)	$1.50	per 1M tokens
Context window	1M	Full Gemini 3.x context

Technical Details

Model ID: google/gemini-3.1-flash-lite
Routed directly — no third-party hops
Context window: 1,048,576 tokens (same as 3.1 Pro)
Pass-through pricing — $0.25 input / $1.50 output per 1M tokens
Supports streaming, tool calling, and structured outputs
OpenAI- and Anthropic-compatible — works via /v1/chat/completions and /v1/messages
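
Because the endpoint is described as OpenAI-compatible with streaming support, a streamed reply would presumably arrive as server-sent events in the OpenAI chat-completions chunk format. The sketch below shows how such a stream could be reassembled into text. The chunk shape (`choices[0].delta.content`, terminated by a `data: [DONE]` line) is the OpenAI convention and is an assumption here, not something this page confirms; `collect_stream_text` is a hypothetical helper.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate content deltas from OpenAI-style SSE lines (assumed format)."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI convention
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                parts.append(piece)
    return "".join(parts)


if __name__ == "__main__":
    # Hand-written sample lines standing in for a real response body.
    sample = [
        'data: {"choices": [{"delta": {"role": "assistant"}}]}',
        "",
        'data: {"choices": [{"delta": {"content": "bill"}}]}',
        'data: {"choices": [{"delta": {"content": "ing"}}]}',
        "data: [DONE]",
    ]
    print(collect_stream_text(sample))
```

In the OpenAI convention, streaming is requested by adding `"stream": true` to the request body; whether CallMissed accepts exactly that flag should be checked against its documentation.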

Strengths

Lowest cost in the Gemini 3.x family — by a wide margin
Retains the full 1M context window of Pro
Direct Google routing — fast, no markup

Limitations

Reduced reasoning depth vs Gemini 3 Flash and 3.1 Pro
Preview model — may change before general availability
Not ideal for complex multi-step agentic workflows

Use Cases

Intent classification
High-volume routing
Content moderation
Long-document summarization
Multilingual tagging
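
To make the intent-classification use case concrete, here is a minimal sketch of how a request body could be built and how the model's reply could be mapped back onto a fixed label set. The label names, the prompt wording, and both helper functions are hypothetical; only the model ID comes from this page, and the message layout assumes the OpenAI chat format the page describes.

```python
MODEL_ID = "google/gemini-3.1-flash-lite"  # model ID from this page
LABELS = ("billing", "technical_support", "sales", "other")  # hypothetical labels


def build_classification_payload(message: str, labels=LABELS) -> dict:
    """Build an OpenAI-style chat payload asking for exactly one label."""
    system = (
        "Classify the customer message into exactly one of these labels: "
        + ", ".join(labels)
        + ". Reply with the label only."
    )
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": message},
        ],
    }


def parse_label(reply: str, labels=LABELS, fallback: str = "other") -> str:
    """Normalize a raw model reply to a known label, or return the fallback."""
    cleaned = reply.strip().strip(".\"'`").lower().replace(" ", "_")
    return cleaned if cleaned in labels else fallback


if __name__ == "__main__":
    payload = build_classification_payload("I was charged twice this month.")
    print(payload["messages"][0]["content"])
    print(parse_label(" Billing. "))
```

Validating the reply against the known labels, rather than trusting it verbatim, keeps a high-volume router from propagating an unexpected string downstream.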

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "google/gemini-3.1-flash-lite", "messages": [{"role": "user", "content": "Classify this customer message"}]}'

Endpoint: POST /v1/chat/completions · Model ID: google/gemini-3.1-flash-lite
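
For readers calling the endpoint from code rather than the shell, the same request can be sketched in Python using only the standard library. The base URL, model ID, and key prefix are taken from the curl example; the helper names are hypothetical, the JSON content type is stated explicitly, and the response shape read in `send` (`choices[0].message.content`) is the OpenAI convention, assumed here rather than confirmed by this page.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"  # from the curl example on this page
MODEL_ID = "google/gemini-3.1-flash-lite"


def build_chat_request(api_key: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request."""
    body = json.dumps({
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_message}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the first choice's text (assumed response shape)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]


if __name__ == "__main__":
    req = build_chat_request("cm_YOUR_KEY", "Classify this customer message")
    print(req.get_method(), req.full_url)
    # print(send(req))  # uncomment with a real key; this performs a network call
```

Splitting request construction from sending makes the payload easy to inspect or unit-test without touching the network.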

Try Gemini 3.1 Flash Lite now

Get 1000 free API credits on signup. No credit card required.
