How much does Gemini 3.5 Flash cost?

Gemini 3.5 Flash costs $1.5/1M tokens for input and $9/1M tokens for output on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use Gemini 3.5 Flash via API?

Send a POST request to POST /v1/chat/completions with model "google/gemini-3.5-flash" and your API key. CallMissed uses the OpenAI-compatible format — just change the base URL and model field.

What is the context window of Gemini 3.5 Flash?

Gemini 3.5 Flash supports a 1M token context window with up to 65K output tokens.

Back to all models

LLM Chatreasoningmultimodaltoolspass-through pricing

Gemini 3.5 Flash

by Google · Released 2026

Google's latest fast frontier model. A 1M-token-context multimodal model (text, image, video, audio) with native reasoning, tool calling, and prompt caching — the speed-tier upgrade to the Gemini 3 Flash line.

LLM Chat

Gemini 3.5 Flash

Context Window

Parameters

Undisclosed

Max Output

65K

Overview

Gemini 3.5 Flash is Google's newest speed-optimized frontier model, succeeding Gemini 3 Flash. It keeps the full 1M-token input context window (65K output) and adds stronger reasoning ("thinking" is on by default), native multimodal input across text, image, video, and audio, reliable function calling, and prompt caching for repeated context.

It targets high-volume production workloads that need frontier-class quality at flash-tier latency: real-time chat, document and meeting summarization, multimodal understanding, agentic tool loops, and retrieval-augmented generation over very long contexts. The thinking budget lets you trade latency for depth on harder prompts while staying fast on simple ones.

On CallMissed it is fully OpenAI-compatible on `/v1/chat/completions` with streaming, tools, and caching. Pricing is pass-through from Google's published Global-tier rate — $1.50 per 1M input tokens and $9.00 per 1M output tokens, with cached input at $0.15 — so you pay the maker's rate with no markup.

Pricing

Metric	Price
Input /1M tokens	₹150.0000
Output /1M tokens	₹900.0000

1 credit = ₹1 = $0.01 USD. Prices shown from provider; CallMissed passes through with ~35% markup.

Key Highlights

Latest Gemini Flash — frontier quality at flash latency
1M token context window (65K output)
Native multimodal: text, image, video, audio
Reasoning on by default + prompt caching
Routed direct — pass-through Google pricing

Benchmarks

Benchmark	Score	Notes
Context	1M	Input token window
Reasoning	Yes	Thinking enabled by default
Multimodal	Yes	Text + image + video + audio

Technical Details

Context window: 1,048,576 input / 65,536 output tokens
Native multimodal input: text, image, video, audio
Function calling, structured outputs, and prompt caching
Reasoning ("thinking") enabled by default
Pass-through pricing — $1.50 / $9.00 per 1M tokens (Global)
Routed direct (no markup)

Strengths

Frontier quality with flash-tier speed and 1M context
Multimodal across text, image, video, and audio
Reasoning + caching for complex, repeated-context workloads
Pass-through pricing — pay Google's rate, no markup

Limitations

Pricier than Gemini 3 Flash Preview / 3.1 Flash Lite
Reasoning mode raises time-to-first-token on hard prompts
Flash tier — the Pro line still leads on the hardest reasoning

Use Cases

Real-time chatLong-context analysisMultimodal understandingAgentic tool loops

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -d '{"model": "google/gemini-3.5-flash", "messages": [{"role": "user", "content": "Summarize this 200-page report"}]}'

Endpoint: POST /v1/chat/completions · Model ID: google/gemini-3.5-flash

Try Gemini 3.5 Flash now

Get 1000 free API credits on signup. No credit card required.

Start free Book a Demo Read docs