Gemini 3.5 Flash
by Google · Released 2026
Google's latest fast frontier model. A 1M-token-context multimodal model (text, image, video, audio) with native reasoning, tool calling, and prompt caching — the speed-tier upgrade to the Gemini 3 Flash line.
Powered by Google · Transformer (proprietary)
| Spec | Value |
|---|---|
| Context Window | 1M |
| Parameters | Undisclosed |
| Max Output | 65K |
| Category | LLM Chat |
Overview
Gemini 3.5 Flash is Google's newest speed-optimized frontier model, succeeding Gemini 3 Flash. It keeps the full 1M-token input context window (65K output) and adds stronger reasoning ("thinking" is on by default), native multimodal input across text, image, video, and audio, reliable function calling, and prompt caching for repeated context.
It targets high-volume production workloads that need frontier-class quality at flash-tier latency: real-time chat, document and meeting summarization, multimodal understanding, agentic tool loops, and retrieval-augmented generation over very long contexts. The thinking budget lets you trade latency for depth on harder prompts while staying fast on simple ones.
On CallMissed it is fully OpenAI-compatible on `/v1/chat/completions`, with streaming, tools, and caching. Pricing is passed through from Google's published Global-tier rates: $1.50 per 1M input tokens, $9.00 per 1M output tokens, and $0.15 per 1M cached input tokens, so you pay the maker's rate with no markup.
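As a rough illustration of the OpenAI-compatible request shape described above, the sketch below builds a `/v1/chat/completions` payload that carries one tool definition. It assumes the endpoint accepts the standard OpenAI `tools` schema. The `get_weather` tool and the `build_tool_request` helper are hypothetical names used only for illustration, and nothing is sent over the network.

```python
import json


def build_tool_request(user_message: str, stream: bool = False) -> dict:
    """Build a chat-completions payload with one (hypothetical) tool attached."""
    return {
        "model": "google/gemini-3.5-flash",
        "stream": stream,
        "messages": [{"role": "user", "content": user_message}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool, for illustration only
                    "description": "Look up the current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }


# Print the JSON body that would be POSTed to /v1/chat/completions.
payload = build_tool_request("What's the weather in Pune?")
print(json.dumps(payload, indent=2))
```

In an agentic loop, the application would typically run whichever tool the model asks for and send the result back as a follow-up message; that part is omitted here.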
Pricing
| Metric | Price |
|---|---|
| Input /1M tokens | ₹150.0000 |
| Output /1M tokens | ₹900.0000 |
1 credit = ₹1 = $0.01 USD. At that rate, the prices above equal Google's published $1.50 input and $9.00 output per 1M tokens.
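To make the arithmetic concrete, here is a minimal cost estimator built on the rates in the pricing table. It is a sketch under stated assumptions: the cached-input rate of 15 credits per 1M tokens is derived from the $0.15 figure in the overview, the function name `estimate_credits` is hypothetical, and how reasoning tokens are billed is not modelled.

```python
# Rates from the pricing table, in credits (1 credit = ₹1) per 1M tokens.
INPUT_RATE = 150.0
OUTPUT_RATE = 900.0
# Assumption: cached input at $0.15 per 1M tokens, i.e. 15 credits.
CACHED_INPUT_RATE = 15.0


def estimate_credits(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the credit cost of one request.

    cached_tokens is the portion of input_tokens assumed to be served from cache.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_RATE
        + cached_tokens * CACHED_INPUT_RATE
        + output_tokens * OUTPUT_RATE
    )
    return total / 1_000_000


# A 200K-token document summarized into 2K tokens, without and with caching.
print(estimate_credits(200_000, 2_000))
print(estimate_credits(200_000, 2_000, cached_tokens=150_000))
```

Under these assumptions the uncached request comes to about 31.8 credits, and caching 150K of the input tokens brings it to about 11.55 credits.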
Key Highlights
- Latest Gemini Flash — frontier quality at flash latency
- 1M token context window (65K output)
- Native multimodal: text, image, video, audio
- Reasoning on by default + prompt caching
- Routed direct — pass-through Google pricing
Benchmarks
| Capability | Value |
|---|---|
| Context | 1M |
| Reasoning | Yes |
| Multimodal | Yes |
Technical Details
- Context window: 1,048,576 input / 65,536 output tokens
- Native multimodal input: text, image, video, audio
- Function calling, structured outputs, and prompt caching
- Reasoning ("thinking") enabled by default
- Pass-through pricing — $1.50 / $9.00 per 1M tokens (Global)
- Routed direct (no markup)
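The token limits listed above can be checked before a request is sent. The sketch below is one possible pre-flight check. It assumes a crude 4-characters-per-token heuristic rather than the real tokenizer, so treat the result as a rough guide only; the helper names are hypothetical.

```python
MAX_INPUT_TOKENS = 1_048_576   # input context window from the list above
MAX_OUTPUT_TOKENS = 65_536     # output limit from the list above
CHARS_PER_TOKEN = 4            # assumption: crude heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(prompt: str, max_output_tokens: int = 8_192) -> bool:
    """Return True if the estimated prompt size and requested output fit the limits."""
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return estimate_tokens(prompt) <= MAX_INPUT_TOKENS


print(fits_in_context("Summarize this 200-page report"))
```

For prompts near the limit, an exact token count from the provider would be a safer basis than this estimate.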
Strengths
- Frontier quality with flash-tier speed and 1M context
- Multimodal across text, image, video, and audio
- Reasoning + caching for complex, repeated-context workloads
- Pass-through pricing — pay Google's rate, no markup
Limitations
- Pricier than Gemini 3 Flash Preview / 3.1 Flash Lite
- Reasoning mode raises time-to-first-token on hard prompts
- Flash tier — the Pro line still leads on the hardest reasoning
API Example
curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "google/gemini-3.5-flash", "messages": [{"role": "user", "content": "Summarize this 200-page report"}]}'
Endpoint: POST /v1/chat/completions · Model ID: google/gemini-3.5-flash
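The same request can be issued from Python with streaming turned on. The sketch below uses only the standard library and assumes the endpoint streams OpenAI-style server-sent events: `data:` lines carrying JSON chunks, ended by a `data: [DONE]` sentinel. The helper names `parse_sse_line` and `stream_chat` are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"


def parse_sse_line(line: str):
    """Return the text delta carried by one SSE line, or None if there is none."""
    line = line.strip()
    if not line.startswith("data:"):
        return None  # blank lines, comments, and other SSE fields
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None  # assumed end-of-stream sentinel
    chunk = json.loads(data)
    choices = chunk.get("choices") or []
    if not choices:
        return None
    return (choices[0].get("delta") or {}).get("content")


def stream_chat(api_key: str, prompt: str):
    """Yield text fragments from a streamed chat completion."""
    body = json.dumps({
        "model": "google/gemini-3.5-flash",
        "stream": True,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        for raw in response:  # the response object iterates line by line
            text = parse_sse_line(raw.decode("utf-8"))
            if text:
                yield text
```

A caller would loop over `stream_chat("cm_YOUR_KEY", "Summarize this 200-page report")` and print each fragment as it arrives. No request is made until the generator is iterated.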