How much does Qwen 3.5 Flash cost?

Qwen 3.5 Flash costs $0.09/1M tokens for input and $0.35/1M tokens for output on CallMissed. 1 credit = ₹1 = $0.01 USD.

How do I use Qwen 3.5 Flash via API?

Send a POST request to /v1/chat/completions with the model set to "qwen/qwen3.5-flash" and your API key in the Authorization header. CallMissed uses the OpenAI-compatible format, so an existing OpenAI-style client only needs a different base URL and model field.
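
As a rough illustration, the request can be assembled in Python with only the standard library. This is a sketch, not official client code: the base URL and the cm_ key placeholder are copied from the curl example on this page, the helper name build_chat_request is hypothetical, and the body assumes the standard OpenAI chat-completions shape.

```python
import json
import urllib.request

BASE_URL = "https://api.callmissed.com/v1"  # taken from the curl example on this page
MODEL_ID = "qwen/qwen3.5-flash"


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /chat/completions.

    Hypothetical helper: the body follows the OpenAI chat-completions
    shape, which this page says CallMissed accepts.
    """
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request("cm_YOUR_KEY", "Translate this to Japanese")
    print(req.get_method(), req.full_url)
    # To actually send it (needs a valid key and network access):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```

Building the request separately from sending it keeps the example runnable without a key; the commented-out lines show where the network call would go.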

What is the context window of Qwen 3.5 Flash?

Qwen 3.5 Flash is listed with a 128K-token context window and up to 8K output tokens. The model's native context window is 262K, extendable to about 1M.
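
A small sketch of how these limits might be checked before sending a request. It rests on several assumptions: "128K" and "8K" are read as 128,000 and 8,000 tokens (the exact values could be 131,072 and 8,192), prompt and output tokens are assumed to share the same window, and the function name output_budget is hypothetical. Token counts have to come from a tokenizer or from the API's usage data, which this example does not compute.

```python
CONTEXT_WINDOW = 128_000  # "128K" from this page; exact value is an assumption
MAX_OUTPUT = 8_000        # "8K" from this page; exact value is an assumption


def output_budget(prompt_tokens: int, requested_output: int = MAX_OUTPUT) -> int:
    """Return how many output tokens can be requested for a given prompt size.

    Assumes prompt and output tokens share one context window.
    Raises ValueError if the prompt alone fills the window.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt does not fit in the context window")
    # The budget is capped by the request, the output limit, and the space left.
    return min(requested_output, MAX_OUTPUT, remaining)


if __name__ == "__main__":
    print(output_budget(1_000))    # short prompt: full output limit available
    print(output_budget(125_000))  # long prompt: only the remaining space
```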

LLM Chat · fast · affordable

Qwen 3.5 Flash

by Qwen · Released February 2026

The fastest and most affordable model in the Qwen 3.5 family. Features native multimodality, 262K native context window (extendable to ~1M), and support for 201 languages. Optimized for speed and high-volume use cases.

Context Window	128K
Parameters	Undisclosed (MoE)
Max Output	8K
Overview

Qwen 3.5 Flash is the fastest and most affordable model in the Qwen 3.5 family, designed for high-volume production workloads where cost and speed are paramount. At just $0.09/M input tokens, it is one of the cheapest capable models available — making it ideal for applications that process millions of requests per day.

The model features a 262K native context window (extendable to approximately 1M with techniques like YaRN), native multimodal support across text, image, video, and audio, and coverage of 201 languages. Despite its focus on speed and cost, it scores 74.2% on MMLU-Pro, 83.5% on HumanEval, and 82.1% on MATH-500, making it suitable for a wide range of production tasks.

Qwen 3.5 Flash excels at high-volume translation, quick classification, lightweight conversational AI, and any cost-sensitive deployment where the full power of Qwen 3.5 Plus is unnecessary. Its 201-language support makes it particularly valuable for global applications that need to handle diverse linguistic inputs at scale.

Pricing

Metric	Price
Input /1M tokens	₹9.0000
Output /1M tokens	₹35.0000

1 credit = ₹1 = $0.01 USD. Prices are based on the provider's rates; CallMissed passes them through with a markup of about 35%.
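
To make the arithmetic concrete, here is a minimal cost estimator using the rates listed on this page ($0.09 and $0.35 per 1M tokens, 1 credit = $0.01). The function name estimate_cost is hypothetical, and the listed rates are used as-is, with no further adjustment for markup. For example, 2,000,000 input tokens and 500,000 output tokens come to $0.18 + $0.175 = $0.355, or 35.5 credits.

```python
from decimal import Decimal

INPUT_USD_PER_M = Decimal("0.09")   # listed input price per 1M tokens
OUTPUT_USD_PER_M = Decimal("0.35")  # listed output price per 1M tokens
USD_PER_CREDIT = Decimal("0.01")    # 1 credit = $0.01, per this page
ONE_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> tuple[Decimal, Decimal]:
    """Return (cost in USD, cost in credits) for the given token counts."""
    usd = (
        Decimal(input_tokens) * INPUT_USD_PER_M
        + Decimal(output_tokens) * OUTPUT_USD_PER_M
    ) / ONE_MILLION
    credits = usd / USD_PER_CREDIT
    return usd, credits


if __name__ == "__main__":
    usd, credits = estimate_cost(2_000_000, 500_000)
    print(f"${usd} = {credits} credits")
```

Decimal is used instead of float so that small per-token prices do not pick up binary rounding error when summed over many requests.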

Key Highlights

Ultra-low cost: $0.09/1M input tokens
Native multimodal support
201 language coverage
Fast inference optimized for high volume

Benchmarks

Benchmark	Score	Notes
MMLU-Pro	74.2%	Professional knowledge
HumanEval	83.5%	Code generation
MATH-500	82.1%	Competition mathematics
GPQA Diamond	56.8%	Graduate-level science

Technical Details

Fastest model in the Qwen 3.5 family — optimized for speed
262K native context window (extendable to ~1M)
Native multimodal: text, image, video, and audio input
201 language support — same linguistic coverage as Qwen 3.5 Plus
Ultra-low pricing: $0.09/M input, $0.35/M output
MoE architecture for efficient inference
Available via Alibaba Cloud API and CallMissed unified gateway

Strengths

Ultra-low cost at $0.09/M input — among the cheapest capable models
201 language support for global applications
Native multimodal capabilities despite low price point
Fast inference optimized for high-volume production workloads

Limitations

Reduced reasoning depth compared to Qwen 3.5 Plus
Not suitable for complex multi-step reasoning or deep analysis
Less established ecosystem compared to OpenAI and Anthropic models

Use Cases

High-volume translation
Quick classification
Lightweight chat
Cost-sensitive deployments

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen/qwen3.5-flash", "messages": [{"role": "user", "content": "Translate this to Japanese"}]}'

Endpoint: POST /v1/chat/completions · Model ID: qwen/qwen3.5-flash
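
Once a response comes back, the reply text and token usage can be pulled out of the JSON. The sketch below assumes the response follows the usual OpenAI chat-completions layout (choices[0].message.content plus a usage object), which is an inference from the "OpenAI-compatible" claim on this page and not confirmed here. The helper extract_reply is hypothetical, and the sample payload is hand-written for illustration; it is not real API output.

```python
import json


def extract_reply(response: dict) -> tuple[str, int, int]:
    """Return (text, prompt_tokens, completion_tokens) from an OpenAI-style response.

    Missing usage fields default to 0, since not every gateway reports them.
    """
    text = response["choices"][0]["message"]["content"]
    usage = response.get("usage", {})
    return text, usage.get("prompt_tokens", 0), usage.get("completion_tokens", 0)


# Hand-written sample in the assumed OpenAI-style shape (not real API output).
SAMPLE_RESPONSE = json.loads("""
{
  "choices": [
    {"message": {"role": "assistant", "content": "こんにちは"}}
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 5}
}
""")

if __name__ == "__main__":
    text, prompt_tokens, completion_tokens = extract_reply(SAMPLE_RESPONSE)
    print(text, prompt_tokens, completion_tokens)
```

The token counts from the usage object are what the per-token prices on this page would apply to.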

Try Qwen 3.5 Flash now

Get 1000 free API credits on signup. No credit card required.