Model Release · April 6, 2026 · 7 min read

Qwen3.6-Plus: $0.28 per million input tokens, and the benchmark comparison Alibaba chose not to lead with

Released April 2, 2026. At $0.276 per million input tokens globally, it costs roughly 1/18th of Claude Opus 4.6. On multimodal tasks it genuinely outperforms Claude 4.5 Opus. On agentic coding, the current Claude Opus 4.6 still wins - which is why Alibaba's benchmark chart compared against the older 4.5.

Qwen3.6-Plus model release benchmark chart showing performance across agentic coding and multimodal tasks

Image source: Qwen Blog

  • Released April 2. $0.276/M input globally (Global tier, under 256K). 18x less than Claude Opus 4.6.
  • Genuinely better than Claude 4.5 Opus on documents, images, and video. Claude 4.6 Opus still wins on agentic coding.
  • Alibaba's benchmark chart used Claude 4.5, not 4.6. Alibaba collects your prompt data on the OpenRouter route.

Pricing by region and context band

The API pricing depends on where you call from and how much context you use per request. Alibaba bills differently for requests under 256K tokens versus longer ones.

Region                     Context band       Input / 1M   Output / 1M
Global (US/EU)             0 - 256K tokens    $0.276       $1.651
Global (US/EU)             256K - 1M tokens   $1.101       $6.602
International (Singapore)  0 - 256K tokens    $0.50        $3.00
International (Singapore)  256K - 1M tokens   $2.00        $6.00

For context: Claude Opus 4.6 costs $5.00/M input and $25.00/M output with no context band tiering. GPT-5.4 runs $2.50/M input and $15.00/M output. Qwen3.6-Plus on the Global tier, for requests under 256K, is the cheapest of the three by a wide margin. For 1M-context requests, the $1.101 Global input rate still undercuts Opus at $5.00 - you're paying $1.10 versus $5.00 per million tokens.
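The band logic above is easy to get wrong when budgeting, so here is a minimal cost estimator built from the rates in the table. It assumes the band is chosen by the request's input size, per the "requests under 256K tokens" wording in this post; the rates are a snapshot, not a live price sheet.

```python
# Tiered per-request cost estimator for Qwen3.6-Plus, using the
# rates quoted in this post. Assumption: the pricing band is
# selected by input-token count per request.

RATES = {  # (input $/1M tokens, output $/1M tokens)
    ("global", "short"): (0.276, 1.651),   # 0 - 256K context
    ("global", "long"):  (1.101, 6.602),   # 256K - 1M context
    ("intl",   "short"): (0.50, 3.00),
    ("intl",   "long"):  (2.00, 6.00),
}

def request_cost(input_tokens: int, output_tokens: int, region: str = "global") -> float:
    """Cost in USD for one request, picking the band from input size."""
    band = "short" if input_tokens < 256_000 else "long"
    in_rate, out_rate = RATES[(region, band)]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A 200K-token request stays in the cheap band; 300K jumps tiers.
print(round(request_cost(200_000, 4_000), 4))   # 0.0618
print(round(request_cost(300_000, 4_000), 4))   # 0.3567
```

Note how a 50% increase in input tokens produces nearly a 6x jump in request cost once you cross the 256K boundary.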

New accounts on Alibaba Cloud get 1 million tokens free per modality for 90 days on the International tier. The model ID is qwen3.6-plus or qwen3.6-plus-2026-04-02.

What the model is

Qwen3.6-Plus uses a hybrid architecture: linear attention layers combined with sparse mixture-of-experts routing. Parameter count is not disclosed. Context window is 1,000,000 tokens. The model handles text, images, and video in a single call - multimodal is native, not bolted on. It also has a hybrid thinking mode enabled by default, meaning it can do explicit chain-of-thought reasoning or skip it depending on task complexity.

Unlike the Qwen3.5 series, this one is proprietary. No weights, no self-hosting. Alibaba has moved the flagship line to API-only deployment. The model is integrated into their enterprise product Wukong and the consumer Qwen app.

Available on Alibaba Cloud (DashScope) and OpenRouter. The OpenRouter listing notes that Alibaba collects prompt and completion data on that route - worth knowing if you're sending sensitive content.

The benchmarks, with context

Alibaba's official chart compares Qwen3.6-Plus against Claude 4.5 Opus, not Claude 4.6 Opus. That matters because on Terminal-Bench 2.0, Claude 4.6 Opus scores 65.4 - above Qwen3.6-Plus at 61.6. The chart makes the terminal benchmark look like a Qwen win (61.6 vs 59.3) when the current model would flip that result.

The multimodal results are a different story. On MMMU, RealWorldQA, OmniDocBench, and Video-MME, Qwen3.6-Plus leads Claude 4.5 Opus by meaningful margins. Claude 4.6 Opus scores are not available for those multimodal benchmarks yet.

Qwen3.6-Plus benchmark chart vs Claude 4.5 Opus: Terminal-Bench, SWE-bench, MMMU, and multimodal tasks
Benchmark               Category              Qwen3.6-Plus  Claude 4.5 Opus  Winner
Terminal-Bench 2.0      Agentic coding        61.6          59.3             Qwen (4.5 only)*
SWE-bench Verified      Agentic coding        78.8          80.9             Claude
SWE-bench Pro           Agentic coding        56.6          57.1             Claude
SWE-bench Multilingual  Agentic coding        73.8          77.5             Claude
NL2Repo                 Long-horizon coding   37.9          43.2             Claude
MMMU                    Multimodal reasoning  86.0          80.7             Qwen
RealWorldQA             Image reasoning       85.4          77.6             Qwen
OmniDocBench v1.5       Document recognition  91.2          87.7             Qwen
Video-MME               Video reasoning       87.8          77.6             Qwen

* Terminal-Bench 2.0: Qwen3.6-Plus scores 61.6 vs Claude 4.5 Opus 59.3, but Claude 4.6 Opus scores 65.4 - which would reverse this result. Alibaba's chart does not include Claude 4.6 Opus. QwenClawBench and QwenWebBench are Alibaba's own proprietary benchmarks and are not shown here. See The Decoder's coverage for the reproduced benchmark chart.

On multimodal tasks, Qwen3.6-Plus competes with Claude Opus at a fraction of the cost. On pure agentic coding, Claude 4.6 Opus is still ahead. Heavy on document processing, image analysis, or video? The price difference makes this worth a real eval.

Cost at scale

Three monthly scenarios using Global tier pricing for Qwen3.6-Plus (requests under 256K tokens). All models use identical token counts.

Scenario               Volume              Qwen3.6-Plus  GPT-5.4  Claude Opus 4.6
Code review            50M in / 15M out    $39           $350     $625
Document intelligence  200M in / 30M out   $105          $950     $1,750
Agentic workflow       500M in / 100M out  $303          $2,750   $5,000

Qwen3.6-Plus: $0.276/M input + $1.651/M output (Global tier, under 256K). GPT-5.4: $2.50/M input + $15/M output. Claude Opus 4.6: $5/M input + $25/M output. Use the TokenCost calculator for your exact numbers.
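The scenario table can be reproduced directly from those per-million rates, which makes it easy to swap in your own token volumes:

```python
# Reproduces the monthly-scenario table above from the per-million
# rates listed in this post (Qwen on the Global <256K tier).

PRICES = {  # $ per 1M tokens: (input, output)
    "Qwen3.6-Plus":    (0.276, 1.651),
    "GPT-5.4":         (2.50, 15.00),
    "Claude Opus 4.6": (5.00, 25.00),
}

def monthly_cost(model: str, m_in: float, m_out: float) -> float:
    """Monthly USD cost for m_in / m_out million tokens."""
    in_rate, out_rate = PRICES[model]
    return m_in * in_rate + m_out * out_rate

# Agentic workflow scenario: 500M input / 100M output per month.
for model in PRICES:
    print(model, round(monthly_cost(model, 500, 100)))
# Qwen3.6-Plus 303, GPT-5.4 2750, Claude Opus 4.6 5000
```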

For the agentic workflow scenario, Qwen3.6-Plus saves roughly $4,700/month versus Claude Opus 4.6 and $2,450/month versus GPT-5.4. At $5,000/month Claude spend, switching to Qwen3.6-Plus for tasks where quality holds up pays back quickly.

Where it makes sense

The clearest wins are multimodal at scale. Document OCR and analysis, image-heavy pipelines, video understanding - these are the workloads where Qwen3.6-Plus leads on benchmarks and where input tokens cost about 95% less. If you're running Claude Opus on these tasks today, the quality gap either doesn't exist or runs the other way.

The 1M context window at $0.276/M makes it interesting for long document summarization, as long as you stay under 256K per request to hold the lower rate. Go over 256K and you jump to $1.101/M - still cheaper than Opus, but less dramatic.
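To see what that tier jump means in dollars, here is a quick comparison of one 500K-token summarization call versus splitting the same document into two ~250K chunks that each stay in the cheaper band. This is pure rate arithmetic from the table above; it ignores the overlap tokens you would add between chunks and any quality cost of chunking.

```python
# One long-band call vs. two short-band calls for the same 500K-token
# document (Global-tier rates from this post; 8K output total).

SHORT_IN, LONG_IN = 0.276, 1.101    # $ per 1M input tokens
SHORT_OUT, LONG_OUT = 1.651, 6.602  # $ per 1M output tokens

one_shot = (500_000 * LONG_IN + 8_000 * LONG_OUT) / 1e6
chunked = 2 * (250_000 * SHORT_IN + 4_000 * SHORT_OUT) / 1e6

print(round(one_shot, 3))  # 0.603
print(round(chunked, 3))   # 0.151
```

Staying under the 256K boundary cuts the cost of this request roughly 4x, which is why the band matters for long-document pipelines.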

For pure agentic coding - the kind where the model navigates a repo, patches files, runs tests - Claude 4.6 Opus still has an edge on independent benchmarks. The gap on SWE-bench Verified is 80.9 vs 78.8, which might matter or not depending on the task. But on Terminal-Bench 2.0, Claude 4.6 Opus is meaningfully ahead (65.4 vs 61.6). Run evals on your actual workload before deciding.

Our read

Qwen3.6-Plus is the most cost-effective option at the frontier for multimodal workloads, and it's not close. At $0.276 versus $5.00 per million input tokens, the budget for one Claude Opus call covers roughly 18 Qwen3.6-Plus calls. For document processing, image analysis, and video tasks, the benchmarks support switching.

We'd be more cautious on agentic coding. The SWE-bench gap is small, but Alibaba presented their benchmark results against Claude 4.5 Opus specifically, and that choice is telling. The current Claude 4.6 Opus outperforms Qwen3.6-Plus on Terminal-Bench 2.0 and NL2Repo, which are better proxies for real-world coding agent work.

Privacy is the other variable. If you route through OpenRouter, Alibaba collects your prompts. For anything sensitive, route through Alibaba Cloud directly and check their data processing terms.

Sources