How much does Claude Opus 4.6 cost?

Claude Opus 4.6 costs $7/1M tokens for input and $35/1M tokens for output on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use Claude Opus 4.6 via API?

Send a POST request to /v1/chat/completions with the model set to "anthropic/claude-opus-4.6" and your API key in the Authorization header. CallMissed uses the OpenAI-compatible format, so an existing OpenAI-style client only needs a new base URL and model field.

What is the context window of Claude Opus 4.6?

Claude Opus 4.6 supports a 1M token context window with up to 128K output tokens.

LLM Chat · flagship · reasoning

Claude Opus 4.6

by Anthropic · Released February 5, 2026

Anthropic's most capable model. Claude Opus 4.6 features a 1M token context window, 128K max output tokens, extended thinking, and a 14.5-hour task completion horizon. Excels at financial analysis, complex code debugging, multi-step planning, and autonomous task execution.

Context Window: 1M tokens
Parameters: Undisclosed
Max Output: 128K

Overview

Claude Opus 4.6 is Anthropic's most capable model and the first Opus-class system to feature a 1-million-token context window (available in beta). It doubles the maximum output to 128K tokens (up from 64K) and introduces premium pricing for contexts exceeding 200K tokens ($10 input / $37.50 output per million tokens). The model plans more carefully, sustains agentic tasks longer, operates more reliably in larger codebases, and delivers significantly better code review and debugging than its predecessors.

Opus 4.6 thinks more deeply than previous Claude models, revisiting its reasoning before settling on an answer. This deeper deliberation can add cost and latency on simpler tasks, so Anthropic recommends dialing the effort level to medium for routine queries. The model supports adaptive thinking with four configurable effort levels — low, medium, high (default), and max — and picks up contextual clues about how much reasoning a given prompt requires. Context compaction (in beta) automatically summarizes older context when approaching the token threshold, keeping conversations coherent without manual truncation.
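As a rough illustration of dialing effort down for routine queries, the sketch below validates an effort level and attaches it to a request payload. The four level names and the default come from the paragraph above. The field name `reasoning_effort` is an assumption borrowed from OpenAI-style APIs and is not confirmed for CallMissed, and `with_effort` is a hypothetical helper, so check the gateway documentation for the actual parameter.

```python
from typing import Any

EFFORT_LEVELS = ("low", "medium", "high", "max")  # levels named in the overview
DEFAULT_EFFORT = "high"                            # default stated in the overview


def with_effort(payload: dict[str, Any], effort: str = DEFAULT_EFFORT) -> dict[str, Any]:
    """Return a copy of the payload with an effort hint attached.

    ASSUMPTION: "reasoning_effort" is a placeholder field name; the real
    parameter exposed by the gateway may be named differently.
    """
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
    # Build a new dict so the caller's payload is left untouched.
    return {**payload, "reasoning_effort": effort}
```

For example, `with_effort({"model": "anthropic/claude-opus-4.6", "messages": []}, "medium")` would mark a routine query for medium effort, while an unknown level raises `ValueError` before any request is sent.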

Agent teams in Claude Code (research preview) allow multiple agents to work in parallel and coordinate autonomously, unlocking complex multi-repo workflows. In one demonstration, Opus 4.6 autonomously closed 13 issues and assigned 12 to the right team members in a single day, managing an approximately 50-person organization across 6 repositories. Partners described the model as handling a multi-million-line codebase migration "like a senior engineer."

Benchmark results point to a marked step up in capability. On MRCR v2 8-needle at 1M context, Opus 4.6 scores 76% compared to 18.5% for Sonnet 4.5, a large improvement in long-context retrieval. It achieves the highest score on Terminal-Bench 2.0 and leads all frontier models on Humanity's Last Exam. On GDPval-AA, it outperforms GPT-5.2 by approximately 144 Elo points and Opus 4.5 by 190 points; the 144-point gap translates to winning roughly 70% of head-to-head comparisons against GPT-5.2. On BrowseComp, it is the best of any model at locating hard-to-find information online, with a multi-agent harness pushing accuracy to 86.8%.
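The link between an Elo gap and a head-to-head win rate can be checked with the standard Elo expected-score formula. The sketch below assumes GDPval-AA ratings follow the conventional logistic model with a 400-point scale, which the page does not state explicitly; `elo_win_probability` is an illustrative helper name.

```python
def elo_win_probability(rating_gap: float) -> float:
    """Expected score of the higher-rated side under the standard Elo model.

    ASSUMPTION: conventional 400-point logistic scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / 400.0))


# Gaps quoted in the benchmark paragraph above.
for opponent, gap in [("GPT-5.2", 144), ("Opus 4.5", 190)]:
    print(f"{opponent}: gap {gap} -> {elo_win_probability(gap):.1%}")
```

Under that assumption, a 144-point gap works out to about a 69.6% expected win rate, in line with the "roughly 70%" figure, and a 190-point gap to about 74.9%.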

In legal and cybersecurity domains, Opus 4.6 scores 90.2% on BigLaw Bench with 40% perfect scores and 84% of responses scoring above 0.8. For cybersecurity, 38 out of 40 investigations produced the best results in a blind ranking against Claude 4.5 models, with each model running up to 9 subagents and over 100 tool calls per investigation.

Safety is a core focus. Opus 4.6 has the lowest rate of over-refusals of any recent Claude model and underwent the most comprehensive safety evaluations Anthropic has ever conducted, including 6 new cybersecurity probes. Misaligned behavior rates remain low across all tested scenarios. Fewer over-refusals mean the model is more helpful on legitimate edge-case queries while still declining genuinely harmful requests.

Partner adoption has been strong. Teams at Notion, Devin, Cognition, Windsurf, Lovable, Box, Figma, and v0 have integrated Opus 4.6 into their products, citing its sustained agentic performance and reliability in production. Claude in Excel has received improvements, and Claude in PowerPoint is available as a research preview. US-only inference is offered at 1.1x standard pricing for organizations with data residency requirements.

At standard pricing of $7/$35 per million tokens (with premium rates above 200K context), Opus 4.6 is positioned for enterprise teams that need the deepest reasoning, longest autonomous task horizons, and most reliable agentic performance available — particularly for financial analysis, legal review, complex code debugging, and multi-step autonomous workflows.

Pricing

Metric	Price
Input /1M tokens	₹700.0000
Output /1M tokens	₹3500.0000

1 credit = ₹1 = $0.01 USD. Prices are based on the provider's rates, which CallMissed passes through with a markup of roughly 35%.
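To turn the table into a per-request estimate, the following sketch multiplies token counts by the listed rates and converts credits to USD using the page's stated conversion. It covers standard rates only and ignores the premium tier for contexts above 200K tokens; the function names are illustrative, not part of any CallMissed SDK.

```python
INPUT_CREDITS_PER_MILLION = 700    # ₹700 (= $7) per 1M input tokens, from the table
OUTPUT_CREDITS_PER_MILLION = 3500  # ₹3500 (= $35) per 1M output tokens, from the table
USD_PER_CREDIT = 0.01              # page's stated conversion: 1 credit = $0.01


def estimate_credits(input_tokens: int, output_tokens: int) -> float:
    """Credits for one request at the standard (non-premium) rates."""
    return (
        input_tokens * INPUT_CREDITS_PER_MILLION
        + output_tokens * OUTPUT_CREDITS_PER_MILLION
    ) / 1_000_000


def credits_to_usd(credits: float) -> float:
    """Convert credits to USD using the page's conversion rate."""
    return credits * USD_PER_CREDIT
```

A request with 10,000 input tokens and 2,000 output tokens comes to 14 credits, or about $0.14: the input and output sides each contribute 7 credits, because output tokens cost five times as much per token.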

Key Highlights

1M token context window with 128K max output
14.5-hour autonomous task completion horizon
60.1% on Finance Agent v1.1 (second to Sonnet 4.6)
Extended thinking for deep reasoning chains

Benchmarks

Benchmark	Score	Notes
SWE-bench Verified	80.8%	Real-world software engineering
OSWorld-Verified	72.7%	Operating system task automation
Terminal-Bench 2.0	65.4%	#1 at launch
Humanity's Last Exam	#1	Hardest human-curated exam
BigLaw Bench	90.2%	Legal reasoning and analysis
MRCR (1M)	76%	Multi-round context recall at 1M tokens
Finance Agent v1.1	60.1%	Second to Sonnet 4.6 (63.3%)
GDPval-AA	1606 Elo	Professional-level comparisons

Technical Details

Context window: 1,000,000 tokens with 128K max output (doubled from 64K)
Adaptive thinking: 4 configurable effort levels for reasoning depth control
Interleaved thinking: reasons between tool calls for better agentic performance
Context compaction: auto-summarizes long conversations to stay within limits
14.5-hour autonomous task completion horizon for long-running workflows
Post-trained with Constitutional AI (CAI) and RLHF
Supports tool use, structured outputs, and computer use
Available via Anthropic API and CallMissed unified gateway
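The context compaction item above describes a feature that runs automatically, so no client code is needed to use it. Purely to illustrate the idea, the sketch below shows a client-side analogue: once a rough token estimate crosses a threshold, older messages are replaced by a single summary. Everything here is an assumption for illustration, including the four-characters-per-token estimate, the `compact` helper, and the caller-supplied `summarize` function; it is not how the model or the gateway implements compaction.

```python
from typing import Callable

Message = dict[str, str]


def rough_token_count(messages: list[Message]) -> int:
    """Crude estimate (~4 characters per token); not a real tokenizer."""
    return sum(len(m["content"]) for m in messages) // 4


def compact(
    messages: list[Message],
    token_threshold: int,
    keep_recent: int,
    summarize: Callable[[list[Message]], str],
) -> list[Message]:
    """Replace older messages with one summary once the threshold is crossed."""
    if rough_token_count(messages) <= token_threshold or len(messages) <= keep_recent:
        return messages  # nothing to compact yet
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    summary = {
        "role": "user",
        "content": "Summary of earlier context: " + summarize(older),
    }
    return [summary] + recent
```

In practice `summarize` would itself be a model call; passing it in as a parameter keeps the sketch self-contained and easy to test with a stub.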

Strengths

#1 on Terminal-Bench 2.0 (at launch) and Humanity's Last Exam; 60.1% on Finance Agent v1.1, second to Sonnet 4.6
14.5-hour task horizon enables truly autonomous long-running workflows
Adaptive thinking lets developers control reasoning depth vs. cost
Interleaved thinking between tool calls dramatically improves agentic accuracy
128K max output for generating complete codebases and detailed reports

Limitations

Premium pricing at $7/$35 per 1M tokens — expensive for high-volume use
Higher latency with extended thinking enabled, especially at max effort
Proprietary and closed-source — no self-hosting option
1M context with heavy tool use can lead to high per-request costs
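The last limitation is easier to see with numbers. In an agentic loop, each call typically resends the whole conversation so far, so billed input grows with every tool call. The sketch below totals that cost at the standard $7/$35 rates, under stated assumptions: no prompt caching, a fixed number of tokens added per call, and no premium pricing above 200K context. `agent_loop_cost_usd` and its parameters are illustrative names.

```python
INPUT_USD_PER_MILLION = 7.0    # standard input rate quoted on this page
OUTPUT_USD_PER_MILLION = 35.0  # standard output rate quoted on this page


def agent_loop_cost_usd(
    base_context: int,
    tool_tokens_per_call: int,
    output_per_call: int,
    calls: int,
) -> float:
    """Cost of sequential calls that each resend the growing context."""
    total_input = 0
    total_output = 0
    context = base_context
    for _ in range(calls):
        total_input += context           # the full context is billed as input
        total_output += output_per_call
        # The tool result and the model's reply are carried into the next call.
        context += tool_tokens_per_call + output_per_call
    return (
        total_input * INPUT_USD_PER_MILLION
        + total_output * OUTPUT_USD_PER_MILLION
    ) / 1_000_000
```

With a 50,000-token starting context, 1,000 tool-result tokens and 500 output tokens per call, 100 calls bill about 12.4M input tokens in total and cost roughly $88.73, even though the context never exceeds 200K tokens.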

Use Cases

Financial analysis
Complex code debugging
Long-horizon autonomous tasks
Research synthesis

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "anthropic/claude-opus-4.6", "messages": [{"role": "user", "content": "Analyze this financial report and identify risks"}]}'

Endpoint: POST /v1/chat/completions · Model ID: anthropic/claude-opus-4.6
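The same request can be sketched in Python using only the standard library. The base URL, model ID, and bearer-token header mirror the curl example. The response parsing assumes the usual OpenAI-style shape (`choices[0].message.content`), which the page implies but does not show, and the helper names are illustrative.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"  # taken from the curl example
MODEL_ID = "anthropic/claude-opus-4.6"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completions request."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the first choice's text.

    ASSUMPTION: OpenAI-style response body with choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling `send_chat_request(build_chat_request("cm_YOUR_KEY", "Analyze this financial report and identify risks"))` would perform the request. Building and sending are kept separate so the payload can be inspected or logged without spending credits.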

Try Claude Opus 4.6 now

Get 1000 free API credits on signup. No credit card required.
