LLM Chat · flagship · pass-through pricing

Gemini 3.1 Pro Preview

by Google · Released February 19, 2026

Google's frontier reasoning model. Gemini 3.1 Pro scored 77.1% on ARC-AGI-2, a benchmark of novel reasoning. It features a 1M-token context window, 65K output tokens, multimodal inputs (text, images, audio, video), and adaptive thinking.

Powered by Google · Transformer (proprietary, multimodal)

Context Window: 1M
Parameters: Undisclosed
Max Output: 65K
Category: LLM Chat

Overview

Gemini 3.1 Pro Preview, released February 2026, is Google's frontier reasoning model and is positioned as one of the more capable multimodal AI systems available. It has a 1M-token input context (1,048,576 tokens exactly), a 65K-token maximum output, and native multimodal support for text, image, video, audio, and PDF inputs, so a single model can handle tasks that mix these formats. It is available in the Gemini app, Google AI Studio, Vertex AI, and GitHub Copilot.

The model scored 77.1% on ARC-AGI-2, roughly 2.5x Gemini 3 Pro's 31.1% (77.1 / 31.1 ≈ 2.48), a large gain on a benchmark designed to test novel reasoning. On GPQA Diamond it reaches 94.3% (vs. Claude Opus 4.6 at 91.3% and GPT-5.2 at 93.2%), and at launch it ranked first on both SWE-Bench Verified (80.6%) and Terminal-Bench 2.0 (68.5%). Its GDPval-AA rating is 1317 Elo, below the Claude models but still competitive.

Gemini 3.1 Pro offers three thinking levels (Low, Medium, and High) for controlling reasoning depth and compute cost. Pricing is tiered: $2 per 1M input tokens and $12 per 1M output tokens for requests under 200K tokens, rising to $4 and $18 respectively for 200K-1M token requests. Context caching can deliver up to 75% cost savings on repeated context. A specialized gemini-3.1-pro-preview-customtools variant is available for agentic workflows.
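The tiered rates above translate into a simple per-request cost estimate. The sketch below is an illustration only, not Google's billing logic. It assumes the tier is selected by the prompt (input) token count, that a prompt of exactly 200,000 tokens falls in the higher tier, and that the higher rate then applies to the whole request. The function name `estimate_cost_usd` is hypothetical, and context-caching discounts are not modelled.

```python
# Rates in USD per 1M tokens, taken from the pricing described above.
LOW_TIER = (2.00, 12.00)    # (input, output) for prompts under 200K tokens
HIGH_TIER = (4.00, 18.00)   # (input, output) for 200K-1M token prompts
TIER_THRESHOLD = 200_000    # assumption: tier is chosen by input token count
MAX_INPUT_TOKENS = 1_048_576


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under the two-tier pricing."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > MAX_INPUT_TOKENS:
        raise ValueError("prompt exceeds the 1,048,576-token context window")
    # Assumption: the higher rate applies to the whole request, not only
    # to the tokens beyond the threshold.
    in_rate, out_rate = LOW_TIER if input_tokens < TIER_THRESHOLD else HIGH_TIER
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    print(f"${estimate_cost_usd(50_000, 2_000):.4f}")
    print(f"${estimate_cost_usd(500_000, 2_000):.4f}")
```

Under these assumptions, a 50,000-token prompt with a 2,000-token reply costs about $0.124, while a 500,000-token prompt with the same reply costs about $2.036.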

The multimodal capabilities enable workflows that combine code review with screenshot analysis, document processing with image understanding, and video analysis with text generation, all in a single model. Its combination of strong reasoning benchmarks, native multimodality, competitive pricing with context caching, and broad platform availability makes it a strong contender for research analysis, long document processing, and agentic workflows.

Pricing

Metric | Price
Input /1M tokens | ₹200.0000
Output /1M tokens | ₹1200.0000

1 credit = ₹1 = $0.01 USD. Prices shown are the provider's rates; CallMissed passes them through with a markup of about 35%.
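The credit conversion can be worked through in a few lines. This is a minimal sketch that uses the listed rates (200 and 1200 credits per 1M tokens) and the stated 1 credit = $0.01 conversion. The function names are hypothetical, and the roughly 35% markup is deliberately left out because the page does not say how it is applied to a bill.

```python
from decimal import Decimal

# Listed prices, in credits (1 credit = ₹1) per 1M tokens.
CREDITS_PER_M_INPUT = Decimal("200")
CREDITS_PER_M_OUTPUT = Decimal("1200")
USD_PER_CREDIT = Decimal("0.01")


def credits_for_request(input_tokens: int, output_tokens: int) -> Decimal:
    """Credits consumed by one request at the listed rates (markup excluded)."""
    total = input_tokens * CREDITS_PER_M_INPUT + output_tokens * CREDITS_PER_M_OUTPUT
    return total / Decimal(1_000_000)


def credits_to_usd(credits: Decimal) -> Decimal:
    """Convert credits to USD using the stated 1 credit = $0.01 rate."""
    return credits * USD_PER_CREDIT
```

For example, a 10,000-token prompt with a 1,000-token reply comes to 3.2 credits, or $0.032, before any markup.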

Key Highlights

  • 77.1% on ARC-AGI-2 — genuine novel reasoning
  • 1M token context (1,048,576 tokens)
  • Multimodal: text, images, audio, video, files
  • Adaptive thinking for complex agentic challenges
  • Routed directly to Google AI Studio, with pass-through pricing

Benchmarks

Benchmark | Score
ARC-AGI-2 | 77.1%
GPQA Diamond | 94.3%
SWE-bench Verified | 80.6%
Terminal-Bench 2.0 | 68.5%
GDPval-AA | 1317 Elo
MATH-500 | 95.1%
MMLU-Pro | 87.3%

Technical Details

  • Context window: 1,048,576 tokens (1M input) with 65K max output
  • Native multimodal: text, image, video, audio, and PDF input
  • Three thinking levels: Low, Medium, High for reasoning depth control
  • Pricing tiers: $2/$12 per 1M tokens (under 200K); $4/$18 (200K-1M)
  • Context caching: up to 75% cost savings on repeated context
  • gemini-3.1-pro-preview-customtools variant for agentic workflows
  • Available in Gemini app, Google AI Studio, Vertex AI, GitHub Copilot
  • Function calling, search grounding, structured outputs, code execution

Strengths

  • 77.1% on ARC-AGI-2 — among the best for genuine novel reasoning
  • True multimodal: processes text, images, video, audio, and PDFs natively
  • 1M context with context caching for cost-efficient repeated use
  • Rich feature set: thinking, function calling, search grounding, code execution
  • Pass-through Google AI Studio pricing — $2/$12 per 1M tokens for a frontier model

Limitations

  • Preview model — may have stability or availability changes before GA
  • 65K max output is lower than Opus 4.6's 128K
  • Proprietary — no self-hosting or open-weight option
  • Video and audio processing adds to token count and cost

Use Cases

  • Research analysis
  • Multimodal reasoning
  • Long document processing
  • Agentic workflows

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "google/gemini-3.1-pro-preview", "messages": [{"role": "user", "content": "Analyze this research paper and extract key findings"}]}'

Endpoint: POST /v1/chat/completions · Model ID: google/gemini-3.1-pro-preview
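The same call can be made from Python with only the standard library. The sketch below mirrors the curl request (same endpoint, model ID, and bearer-token header) and sets a JSON Content-Type header explicitly. The helper names `build_request` and `send` are hypothetical, and the response is returned as a plain dictionary because its exact shape is not documented on this page.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"
MODEL_ID = "google/gemini-3.1-pro-preview"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build a POST request equivalent to the curl example, without sending it."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A typical call would be `send(build_request("cm_YOUR_KEY", "Analyze this research paper and extract key findings"))`. Keeping request construction separate from sending makes it possible to inspect the payload without spending credits.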
