LLM Chat · fast · cheapest · pass-through pricing

Gemini 3.1 Flash Lite

by Google · Released 2026

Most affordable Gemini 3.x model from Google. 1M token context, optimized for high-volume low-latency tasks. Pass-through Google AI Studio pricing.

Powered by Google · Transformer (proprietary, Gemini 3.x family)

Context Window: 1M tokens
Parameters: Undisclosed
Max Output: 16K tokens
Category: LLM Chat

Overview

Gemini 3.1 Flash Lite is Google's most affordable Gemini 3.x model — designed for high-volume, low-latency tasks where cost is the primary constraint. It keeps the full 1M token context window that defines the Gemini 3.x family, but trades some output quality and reasoning depth for the lowest input/output pricing in the lineup ($0.25/$1.50 per 1M tokens).

It is ideal for production workloads where each individual request is simple but volume is high: classification, summarization, content moderation, intent routing, retrieval-augmented chat, multilingual content tagging, and similar tasks. The 1M context window also makes it a strong choice for processing long documents at low cost.
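As a rough illustration of the intent-routing use case, the sketch below builds a chat request body that constrains the model to a fixed label set, then normalizes whatever text comes back. The label names, the prompt wording, and the helper names (`build_intent_request`, `parse_intent`) are hypothetical; only the model ID comes from this page, and the message layout assumes the OpenAI-style chat format.

```python
import json

MODEL_ID = "google/gemini-3.1-flash-lite"  # model ID as listed on this page

# Hypothetical label set for illustration; replace with your own intents.
INTENTS = ("billing", "technical_support", "sales", "other")


def build_intent_request(message: str) -> dict:
    """Build an OpenAI-style chat body asking for exactly one intent label."""
    system_prompt = (
        "Classify the customer message into exactly one of these intents: "
        + ", ".join(INTENTS)
        + ". Reply with the intent name only."
    )
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": message},
        ],
    }


def parse_intent(reply_text: str) -> str:
    """Map the model's raw reply onto a known label, falling back to 'other'."""
    cleaned = reply_text.strip().strip(".\"'").lower().replace(" ", "_")
    return cleaned if cleaned in INTENTS else "other"


body = build_intent_request("I was charged twice this month.")
print(json.dumps(body, indent=2))
print(parse_intent(" Billing. "))
```

Validating the reply against a closed label set keeps a high-volume router predictable even when the model adds punctuation or an unexpected phrase.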

Like the other Gemini 3.x models on CallMissed, it routes directly to Google AI Studio — there is no OpenRouter hop, no markup, and pricing matches Google's published rate verbatim.

Pricing

Metric                Price
Input / 1M tokens     ₹25.0000
Output / 1M tokens    ₹150.0000

1 credit = ₹1 = $0.01 USD. Prices shown from provider; CallMissed passes through with ~35% markup.
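To make the per-token arithmetic concrete, here is a minimal sketch that estimates spend from token counts, using the listed rates as-is and the stated conversion of 1 credit = ₹1 = $0.01. The function names and the example workload (10,000 requests of 2,000 input and 50 output tokens each) are hypothetical, and the sketch does not model any charges beyond the listed per-token rates.

```python
# Listed rates from the pricing table, in credits (1 credit = ₹1) per 1M tokens.
INPUT_RATE_PER_M = 25.0
OUTPUT_RATE_PER_M = 150.0
CREDIT_TO_USD = 0.01  # from "1 credit = ₹1 = $0.01 USD"


def estimate_credits(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in credits for the given token counts."""
    total = input_tokens * INPUT_RATE_PER_M + output_tokens * OUTPUT_RATE_PER_M
    return total / 1_000_000


def estimate_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the same estimate converted to USD at the stated credit rate."""
    return estimate_credits(input_tokens, output_tokens) * CREDIT_TO_USD


# Hypothetical daily workload: 10,000 short classification requests.
requests_per_day = 10_000
daily_input = requests_per_day * 2_000
daily_output = requests_per_day * 50

print(f"credits/day: {estimate_credits(daily_input, daily_output):.2f}")
print(f"USD/day:     {estimate_usd(daily_input, daily_output):.2f}")
```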

Key Highlights

  • Cheapest Gemini 3.x model — $0.25/$1.50 per 1M tokens
  • 1M token context window (same as Pro)
  • Optimized for high-volume, low-latency tasks
  • Routed directly to Google AI Studio, with no markup

Benchmarks

Benchmark         Score
Price (input)     $0.25 / 1M tokens
Price (output)    $1.50 / 1M tokens
Context window    1M tokens

Technical Details

  • Model ID: google/gemini-3.1-flash-lite
  • Routed directly to Google AI Studio — no third-party hops
  • Context window: 1,048,576 tokens (same as 3.1 Pro)
  • Pass-through pricing — $0.25 input / $1.50 output per 1M tokens
  • Supports streaming, tool calling, and structured outputs
  • OpenAI- and Anthropic-compatible — works via /v1/chat/completions and /v1/messages

✓ Strengths

  • Lowest cost in the Gemini 3.x family — by a wide margin
  • Retains the full 1M context window of Pro
  • Direct Google routing — fast, no markup

⚠ Limitations

  • Reduced reasoning depth vs Gemini 3 Flash and 3.1 Pro
  • Preview model — may change before general availability
  • Not ideal for complex multi-step agentic workflows

Use Cases

  • Intent classification
  • High-volume routing
  • Content moderation
  • Long-document summarization
  • Multilingual tagging

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "google/gemini-3.1-flash-lite", "messages": [{"role": "user", "content": "Classify this customer message"}]}'

Endpoint: POST /v1/chat/completions · Model ID: google/gemini-3.1-flash-lite
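For callers working from Python rather than the shell, the same request can be sketched with the standard library alone. The URL, key placeholder, and model ID are taken from the curl example; the helper names `build_request` and `send` are hypothetical, and the response parsing assumes an OpenAI-shaped body (`choices[0].message.content`), which this page implies but does not show.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"
MODEL_ID = "google/gemini-3.1-flash-lite"


def build_request(api_key: str, user_message: str) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the reply text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.loads(response.read().decode("utf-8"))
    # Assumption: the response follows the OpenAI chat-completions shape.
    return result["choices"][0]["message"]["content"]


request = build_request("cm_YOUR_KEY", "Classify this customer message")
print(request.get_method(), request.full_url)
# Uncomment to send for real (needs a valid key and network access):
# print(send(request))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test before any credits are spent.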
