LLM Chat

Qwen 3.5 Plus

by Qwen · Released February 2026

Alibaba's premium Qwen 3.5 variant, advertised with a 1M token context window (the specifications on this page list 128K). Part of the Qwen 3.5 family that features native multimodality, hybrid thinking modes, and support for 201 languages. The flagship 397B MoE model beat Claude 4.5 Opus on the HMMT math benchmark.

Powered by Qwen · Sparse Mixture-of-Experts (397B total / 17B active)

Context Window: 128K
Parameters: 397B total / 17B active (MoE)
Max Output: 16K
Category: LLM Chat
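As a rough illustration of how the listed context window and max output interact, the sketch below checks whether a prompt leaves room for a requested completion. It assumes that "128K" and "16K" mean 128,000 and 16,000 tokens and that output tokens count against the context window; this page confirms neither, and `fits_in_context` is a hypothetical helper rather than part of any API.

```python
# Limits taken from the listing. The exact token counts are assumptions:
# "128K" could also mean 131,072 on some providers.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT = 16_000


def fits_in_context(prompt_tokens: int, requested_output: int) -> bool:
    """Return True if the prompt plus the requested completion fits the limits."""
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output > MAX_OUTPUT:
        # The completion alone would exceed the listed max output.
        return False
    return prompt_tokens + requested_output <= CONTEXT_WINDOW
```

Under these assumptions, a 100,000-token prompt with a full 16,000-token completion fits, while a 120,000-token prompt does not.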

Overview

Qwen 3.5 Plus is Alibaba's flagship model in the Qwen 3.5 family, featuring a 397B total parameter Mixture-of-Experts architecture with 17B active parameters per token. It represents a major leap in multilingual AI, supporting 201 languages natively — far more than any other model on the platform — with native multimodal capabilities across text, image, video, and audio.

The model introduces hybrid thinking modes that let developers toggle between thinking (chain-of-thought reasoning) and non-thinking (fast, direct response) modes. On the HMMT math benchmark, Qwen 3.5 Plus beat Claude 4.5 Opus, demonstrating frontier-level mathematical reasoning. The MoE architecture keeps inference costs low despite the massive total parameter count, with only 17B parameters active per token.
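A minimal sketch of how the thinking toggle might appear in a request body. The field name `enable_thinking` is an assumption: it is the switch name used by Qwen3's chat template, and this page does not say what the CallMissed gateway calls it. `build_chat_payload` is a hypothetical helper.

```python
import json


def build_chat_payload(prompt: str, thinking: bool) -> dict:
    """Build a chat request body with a thinking-mode flag."""
    return {
        "model": "qwen/qwen3.5-plus",
        "messages": [{"role": "user", "content": prompt}],
        # ASSUMPTION: the gateway may use a different parameter name
        # (or none at all) for the thinking / non-thinking switch.
        "enable_thinking": thinking,
    }


# Thinking mode for a reasoning-heavy prompt, direct mode for a quick lookup.
slow = build_chat_payload("Prove that the sum of two odd numbers is even.", True)
fast = build_chat_payload("What is the capital of France?", False)
print(json.dumps(fast, indent=2))
```

Keeping the flag per request, rather than per client, lets one application use fast responses for simple prompts and chain-of-thought only where it is worth the extra output tokens.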

Qwen 3.5 Plus is particularly strong for multilingual applications, mathematical reasoning, and long-document analysis. Its 201-language support makes it the most linguistically diverse model on the platform, covering not just major world languages but also many low-resource languages. The native multimodal capabilities enable workflows that combine text, image, video, and audio processing in a single model.

Pricing

Metric               Price
Input /1M tokens     ₹35.0000
Output /1M tokens    ₹210.0000

1 credit = ₹1 = $0.01 USD. Prices shown are the provider's; CallMissed passes them through with a markup of roughly 35%.
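The listed rates can be turned into a per-request estimate. The sketch below uses the table prices as given and the stated 1 credit = $0.01 conversion; it does not apply the ~35% markup, because the page does not say whether the listed prices already include it. The function names are hypothetical.

```python
INPUT_PRICE_PER_M = 35.0    # credits (₹) per 1M input tokens, from the table
OUTPUT_PRICE_PER_M = 210.0  # credits (₹) per 1M output tokens, from the table
USD_PER_CREDIT = 0.01       # conversion stated on the page


def estimate_credits(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in credits at the listed rates."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def credits_to_usd(credits: float) -> float:
    """Convert credits to USD using the page's stated rate."""
    return credits * USD_PER_CREDIT
```

For example, a request with 200,000 input tokens and 50,000 output tokens comes to 7.0 + 10.5 = 17.5 credits, or about $0.175. Output tokens cost six times as much as input tokens, so long completions dominate the bill.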

Key Highlights

  • Native multimodal: text, image, video, audio
  • 201 language support
  • Hybrid thinking modes (thinking/non-thinking toggles)
  • Beat Claude 4.5 Opus on HMMT math benchmark

Benchmarks

Benchmark       Score
HMMT            Beat Claude 4.5 Opus
MATH-500        94.8%
MMLU-Pro        83.5%
HumanEval       90.2%
GPQA Diamond    72.1%

Technical Details

  • Architecture: Sparse MoE with 397B total / 17B active parameters per token
  • Native multimodal: text, image, video, and audio input
  • 201 language support — most linguistically diverse model available
  • Hybrid thinking modes: toggle between chain-of-thought and direct response
  • Beat Claude 4.5 Opus on HMMT math benchmark
  • MoE routing keeps inference cost low despite massive total parameter count
  • Available via Alibaba Cloud API and CallMissed unified gateway
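To put the 397B total / 17B active figures from the list above in perspective, the short calculation below computes the share of parameters used per token. Treating that share as a proxy for inference cost is a simplification: memory footprint still scales with the total parameter count, and this page gives no routing details. `active_fraction` is a hypothetical helper.

```python
TOTAL_PARAMS_B = 397   # total parameters in billions, from the listing
ACTIVE_PARAMS_B = 17   # active parameters per token in billions, from the listing


def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of parameters that participate in processing one token."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


share = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
```

Here `share` is about 0.043, meaning roughly 4.3% of the parameters are active for any given token.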

Strengths

  • Beat Claude 4.5 Opus on HMMT — frontier-level mathematical reasoning
  • 201 language support — by far the most linguistically diverse model
  • Native multimodal across text, image, video, and audio
  • Hybrid thinking modes for flexible reasoning depth control
  • Affordable at $0.35 input / $2.10 output per 1M tokens for a 397B model

Limitations

  • 128K context is smaller than 1M-context competitors
  • Less established in Western markets compared to OpenAI and Anthropic
  • Multimodal capabilities may vary in quality across modalities

Use Cases

  • Multilingual applications
  • Math and reasoning
  • Long document analysis
  • Multimodal tasks

API Example

curl https://api.callmissed.com/v1/chat/completions \
  -H "Authorization: Bearer cm_YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen/qwen3.5-plus", "messages": [{"role": "user", "content": "Solve this math problem step by step"}]}'

Endpoint: POST /v1/chat/completions · Model ID: qwen/qwen3.5-plus
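For callers who prefer Python over curl, the same request can be sketched with the standard library alone. The URL, model ID, and bearer-token header come from the example above; the JSON `Content-Type` header and the assumption that the response body is JSON are not stated on this page. `build_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

API_URL = "https://api.callmissed.com/v1/chat/completions"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Prepare (but do not send) a chat completion request."""
    body = json.dumps({
        "model": "qwen/qwen3.5-plus",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            # ASSUMPTION: the gateway expects a JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_request("cm_YOUR_KEY", "Solve this math problem step by step"))` would issue the request. Separating construction from sending makes the payload easy to inspect or log before any credits are spent.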
