TokenCost
Model Release · April 13, 2026 · 8 min read

DeepSeek V4: $0.30 per million tokens for a 1 trillion parameter model

DeepSeek V4 is not out yet. A grayscale test interface appeared April 8, Reuters confirmed a late-April window on April 3, and every major pricing tracker has landed on $0.30 per million input tokens. Here's what the leaked specs mean for anyone running production workloads.


Reported specs at a glance

Status: Not launched (late April 2026 window per Reuters, April 3)
Architecture: MoE — ~1T total parameters, ~37B active per token
Context: 1M tokens via 'Engram' memory (V3.2: 128K)
Chips: Huawei Ascend 910B training, Ascend 950PR inference. Zero Nvidia.
Price (projected): $0.30/M input, $0.50/M output — tracker consensus, not official
Benchmarks: SWE-bench 80-85%, HumanEval 90%, GPQA 59.1% — all internal/unverified
License: Apache 2.0 planned (unconfirmed until release)

Why the $0.30 figure is worth paying attention to

DeepSeek V3.2 already sits at $0.28/M input. So V4 is not a dramatic price cut - it's roughly the same cost as the current model, but with 1 million tokens of context instead of 128K and a significantly larger expert pool. If the leaked benchmark numbers hold up anywhere near frontier-level performance, you're getting that capability at prices most developers associate with mid-tier models.

The comparison that stops me: Claude Opus 4.6 charges $5.00/M input. GPT-5.4 is $2.50/M. DeepSeek V4 is projected at $0.30/M. If V4's real-world results on your actual workloads come close to either of those models, the math at scale is hard to ignore.

There are real caveats - all benchmarks are internal leaks, the Huawei chip dependency introduces enterprise procurement friction, and the model is not live yet. But the signals are consistent enough to be worth understanding before it drops.

Timeline of V4 signals

| Date | Signal | Source |
| --- | --- | --- |
| Feb 2026 | Original Lunar New Year launch window passes; hardware engineering challenges cited. | Multiple reports |
| March 9 | V4 Lite (~200B params) briefly appears on DeepSeek website, then disappears. | Dataconomy |
| March 16 | April launch target reported. | Dataconomy |
| April 3 | Reuters: launch 'within weeks.' Confirms Huawei Ascend chip use. | Reuters / The Information |
| April 7 | TrendForce confirms Huawei Ascend 950PR as inference chip. | TrendForce |
| April 8 | TechNode publishes grayscale screenshot: 'Fast,' 'Expert,' and 'Vision' mode tabs in test interface. | TechNode |
| April 13 | Official DeepSeek API docs still show V3.2 only; no V4 listing. | api-docs.deepseek.com |

The test interface screenshot is the most concrete signal. Three distinct modes in a single interface suggest the product lineup is finalized. DeepSeek does not do long soft launches - V3 and R1 both went public on short notice with minimal pre-announcement.

Why $0.30 is plausible for a 1T parameter model

The number that matters for inference cost is not total parameters - it's active parameters per token. V4's MoE architecture routes each token through roughly 37 billion parameters, the same as V3.2. Total model size went from 671B to ~1T, but per-token compute stayed flat because V4 just has more experts to choose from - not more experts active simultaneously.
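The active-vs-total distinction can be made concrete with a rough FLOPs estimate. A common back-of-envelope rule is ~2 FLOPs per active parameter per generated token; the parameter counts below are the article's reported specs, not measured values:

```python
# Rough per-token inference cost: ~2 FLOPs per active parameter
# (one multiply + one add per weight in the forward pass).
# Parameter counts are reported specs, not measured values.

def flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs for one generated token."""
    return 2 * active_params

dense_1t = flops_per_token(1.0e12)  # hypothetical dense 1T model
v4_moe = flops_per_token(37e9)      # V4 (reported): ~37B active, same as V3.2

print(f"dense 1T: {dense_1t:.1e} FLOPs/token")
print(f"V4 MoE:   {v4_moe:.1e} FLOPs/token")
print(f"ratio:    {dense_1t / v4_moe:.0f}x")  # ~27x cheaper per token than dense
```

This is why total parameter count tripling barely moves the inference bill: the router picks the same ~37B-parameter slice per token, just from a larger pool.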

The 1M context window is the harder engineering problem. DeepSeek built two new components to make it cost-viable: Engram handles conditional long-context memory, and DeepSeek Sparse Attention (DSA) cuts the compute cost of attending over long sequences. Without something like DSA, serving 1M context at $0.30/M tokens would lose money on every call - attention computation scales quadratically with sequence length.
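To see why something like DSA is needed, compare the relative cost of dense attention at the two context lengths. A minimal sketch of the quadratic scaling (relative operation counts only, ignoring constants and layers):

```python
# Dense attention cost grows quadratically with sequence length: every
# token attends to every other token. Going from 128K to 1M context
# multiplies attention compute by ~61x unless sparsity cuts it down.

def dense_attention_ops(seq_len: int) -> int:
    """Relative cost of full attention over a sequence."""
    return seq_len * seq_len

ctx_v32 = 128_000    # V3.2 context
ctx_v4 = 1_000_000   # reported V4 context

ratio = dense_attention_ops(ctx_v4) / dense_attention_ops(ctx_v32)
print(f"1M vs 128K dense attention cost: {ratio:.0f}x")  # prints "61x"
```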

Whether these architectural claims hold under independent load is unknown. DeepSeek's V3 and R1 claims did hold up when we and others tested them - R1 matched o1 on several benchmarks at roughly 3% of the cost. V4 is making larger claims from a more complex architecture, so I would not assume the same track record carries over automatically.

Where V4 pricing sits against the current market

Using the $0.30/$0.50 projected figures against current production prices:

| Model | Input / 1M | Output / 1M | Context | vs V4 input |
| --- | --- | --- | --- | --- |
| DeepSeek V4 (projected) | $0.30 | $0.50 | 1M | baseline |
| DeepSeek V3.2 (current) | $0.28 | $0.42 | 128K | 1.1x cheaper |
| DeepSeek R1 | $0.55 | $2.19 | 64K | 1.8x more |
| Gemini 3.1 Pro | $2.00 | $12.00 | 1M | 6.7x more |
| GPT-5.4 | $2.50 | $15.00 | 272K / 1M | 8.3x more |
| Claude Opus 4.6 | $5.00 | $25.00 | 200K | 17x more |

V4 pricing from tracker consensus (devtk.ai, NxCode). Not official DeepSeek pricing. All other prices current as of April 2026.

On output tokens, V4 pulls further ahead. At $0.50/M vs Claude Opus 4.6's $25/M, that is a 50x difference on the token type that actually drives costs in most applications. A workload generating 100M output tokens per month costs $50 on V4 vs $2,500 on Opus. At that scale, even a meaningful quality gap might not justify staying put.
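The arithmetic generalizes to any monthly volume. A small sketch using the projected and current prices from the table above (the V4 figures remain tracker consensus, not official; the example volumes are hypothetical):

```python
# Per-million-token prices from the comparison table.
# V4 figures are projected (tracker consensus), not official pricing.
PRICES = {  # model: (input $/M, output $/M)
    "deepseek-v4 (projected)": (0.30, 0.50),
    "deepseek-v3.2": (0.28, 0.42),
    "gpt-5.4": (2.50, 15.00),
    "claude-opus-4.6": (5.00, 25.00),
}

def monthly_cost(model: str, input_m: float, output_m: float) -> float:
    """Dollar cost for input_m / output_m million tokens per month."""
    inp, out = PRICES[model]
    return input_m * inp + output_m * out

# Example workload: 300M input + 100M output tokens per month.
for model in PRICES:
    print(f"{model:26s} ${monthly_cost(model, 300, 100):>9,.2f}")
```

At that example volume the spread is $140/month on projected V4 pricing versus $4,000/month on Opus 4.6, which is the gap the article's 50x output figure translates into.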

Worth saying plainly: DeepSeek V3.2 at $0.28 input is already cheap and performs well on coding tasks. V4 is marginally more expensive on input while adding 1M context - a good trade if you actually need long context, irrelevant if your workloads run sub-32K. You can check V3.2's current live pricing on OpenRouter if you want to compare against third-party providers.

The Huawei chip dependency

DeepSeek V3 ran on Nvidia H800s - the export-restricted version of the H100 that could still reach China. V4 reportedly moves entirely to Huawei Ascend 910B for training and Ascend 950PR for inference. If that holds, it is the clearest evidence yet that US export controls did not achieve their stated goal of slowing Chinese frontier AI development. DeepSeek did not hit a hardware ceiling - it built around it.

For US developers, the practical question is API accessibility. DeepSeek's API has been internationally accessible through V3 and V3.2 without restrictions from the DeepSeek side. Congressional reports have flagged data security concerns, and some governments have restricted DeepSeek apps at the device level, but those restrictions have generally not extended to API access in most jurisdictions.

Enterprise procurement teams at larger US companies will need legal review before routing production traffic through DeepSeek regardless of the per-token price. That friction is real and does not appear in the pricing table. For startups and individual developers, it is much less of a blocker.

All benchmark numbers are from leaks

SWE-bench 80-85%, HumanEval 90%, MMLU 88.5 - these come from internal DeepSeek testing and secondhand reporting. No independent evaluation has happened. The pattern with V3 and R1 was that leaked claims held up well on independent replication, but V4 is making larger claims from a more complex architecture.

One number worth watching specifically: GPQA Diamond. The leaked V4 claim is 59.1%, which would be below Claude Opus 4.6 (80.8%) and well below Gemini 3.1 Pro (94.3%). If that holds, V4 is a strong coding and instruction-following model, not a scientific reasoning model. That changes which workloads it makes sense for.

Most production LLM workloads are not running GPQA Diamond problems. Document processing, code generation, summarization, structured extraction - for these, V4's coding and instruction-following numbers are the relevant signal.

Before V4 launches

If you are running DeepSeek V3.2 now, the upgrade calculus is straightforward: similar price, 8x more context. The main risk is whether the new architecture introduces regressions on tasks V3.2 handles reliably. Run your eval suite against V4 on day one before migrating any production traffic.
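That day-one eval run reduces to a simple regression gate. A minimal sketch, assuming your harness produces per-task pass/fail maps; the task names and result dicts here are hypothetical stand-ins:

```python
# Regression gate for a model migration: flag every task the current
# model (baseline) passes that the candidate model fails.
# Result dicts are stand-ins for your eval harness's real output.

def regressions(baseline: dict[str, bool], candidate: dict[str, bool]) -> list[str]:
    """Tasks the baseline passes but the candidate fails (or lacks)."""
    return sorted(t for t, ok in baseline.items()
                  if ok and not candidate.get(t, False))

v32_results = {"extract_invoice": True, "fix_unit_test": True, "summarize_doc": True}
v4_results = {"extract_invoice": True, "fix_unit_test": False, "summarize_doc": True}

print(regressions(v32_results, v4_results))  # prints "['fix_unit_test']"
```

Any non-empty list means holding production traffic on V3.2 until the regressed tasks are understood, regardless of how good the headline benchmarks look.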

If you are on GPT-5.4 or Claude Opus 4.6 for production workloads and have not seriously tested DeepSeek, V4's launch is a reasonable moment to benchmark it. The price differential is large enough that even a 10-15% quality gap on your specific tasks might be worth accepting for high-volume workloads.

We'll add V4 to the pricing comparison page the day it launches. The cost calculator lets you model what $0.30 input would cost at your current token volumes - worth running before the model drops so you have a number ready when it is time to evaluate.

Sources

  • Reuters (via The Information): "DeepSeek V4 to launch within weeks, running on Huawei chips," April 3, 2026
  • TechNode: "DeepSeek V4 may launch this month, test interface suggests Vision and Expert modes," April 8, 2026
  • TrendForce: "Decoding DeepSeek V4: how Huawei's Ascend 950PR is powering China's push to break CUDA dependence," April 7, 2026
  • Dataconomy: "DeepSeek V4 and Tencent's new Hunyuan model to launch in April," March 16, 2026
  • HuaweiCentral: "DeepSeek V4 model will run entirely on Huawei AI chips," 2026
  • NxCode: DeepSeek V4 release specs and benchmark report, April 2026
  • DevTk.ai: DeepSeek V4 model pricing tracker
  • DeepSeek API docs (api-docs.deepseek.com): current V3.2 pricing ($0.28/$0.42 per million tokens)