How much does Gemini 3 Flash Preview cost?

Gemini 3 Flash Preview costs $0.5/1M tokens for input and $3/1M tokens for output on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use Gemini 3 Flash Preview via API?

Send a POST request to POST /v1/chat/completions with model "google/gemini-3-flash-preview" and your API key. CallMissed uses the OpenAI-compatible format — just change the base URL and model field.

What is the context window of Gemini 3 Flash Preview?

Gemini 3 Flash Preview supports a 1M token context window with up to 16K output tokens.

Back to all models

LLM Chatfastaffordablepass-through pricing

Gemini 3 Flash Preview

by Google · Released 2026

Google's fast and affordable model. Lightning-fast inference with a 1M token context window. Optimized for speed and high-volume developer use cases while maintaining strong reasoning capabilities.

LLM Chat

Context Window

Parameters

Undisclosed

Max Output

16K

Overview

Gemini 3 Flash Preview is Google's speed-optimized model, designed for developers who need fast inference at scale without sacrificing the 1M token context window. It delivers lightning-fast responses while maintaining strong reasoning capabilities across text, image, and other modalities.

The model is built for high-volume production workloads where latency and cost are primary concerns. At $0.70/M input and $4.00/M output, it offers an excellent balance of capability and affordability — significantly cheaper than Gemini 3.1 Pro while retaining multimodal input support and the full 1M context window. This makes it ideal for real-time chat applications, quick summarization, and cost-sensitive deployments.

As a Flash-tier model, it trades some depth of reasoning for raw speed. It handles straightforward tasks like summarization, classification, translation, and conversational AI with high quality, while more complex multi-step reasoning tasks may benefit from the Pro variant.

Pricing

Metric	Price
Input /1M tokens	₹50.0000
Output /1M tokens	₹300.0000

1 credit = ₹1 = $0.01 USD. Prices shown from provider; CallMissed passes through with ~35% markup.

Key Highlights

Lightning-fast inference speed
1M token context window
Highly capable for its price point
Multimodal input support
Routed direct — pass-through pricing

Benchmarks

Benchmark	Score	Notes
MMLU-Pro	78.5%	Professional knowledge
HumanEval	86.2%	Code generation
MATH-500	85.4%	Competition mathematics
GPQA Diamond	62.1%	Graduate-level science

Technical Details

Context window: 1,000,000 tokens — same as Gemini 3.1 Pro
Optimized for lightning-fast inference and low latency
Native multimodal: text, image, and other input types
Supports function calling, structured outputs, and search grounding
Pass-through pricing — $0.50/$3 per 1M tokens
Routed direct (no markup)

Strengths

Lightning-fast inference — among the fastest frontier-tier models
1M context window at a fraction of Pro pricing
Multimodal input support for text, images, and more
Excellent cost-to-capability ratio for production workloads

Limitations

Reduced reasoning depth compared to Gemini 3.1 Pro
Preview model — may change before general availability
Less capable on complex multi-step agentic tasks

Use Cases

Real-time chatQuick summarizationHigh-volume processingCost-sensitive applications

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -d '{"model": "google/gemini-3-flash-preview", "messages": [{"role": "user", "content": "Quickly summarize these meeting notes"}]}'

Endpoint: POST /v1/chat/completions · Model ID: google/gemini-3-flash-preview

Try Gemini 3 Flash Preview now

Get 1000 free API credits on signup. No credit card required.

Start free Read docs