DeepSeek V4 (Flash & Pro)

The largest open-weights model of 2026 — MIT-licensed, frontier-class on coding, and a fraction of the cost of GPT-5.5.

DeepSeek V4 is an open-weights AI model released on 24 April 2026 in two tiers: V4-Pro (1.6 trillion parameters) and V4-Flash (284 billion). Both are MIT-licensed with downloadable weights, ship a 1 million token context window, and undercut Western flagships heavily on price.

DeepSeek V4 specs at a glance

Maker: DeepSeek (China)
Released: 24 April 2026
Tiers: DeepSeek V4-Pro · DeepSeek V4-Flash
Architecture: Mixture of Experts (MoE)
V4-Pro size: 1.6 trillion total parameters · 49B active
V4-Flash size: 284 billion parameters · 13B active · 160 GB download
Context window: 1,000,000 tokens (both tiers)
Licence: MIT — open weights on Hugging Face & ModelScope
Best at: Coding, competitive programming, low-cost agentic work
Highlights: 1.6T parameters (V4-Pro, largest open model) · 80.6% SWE-bench Verified · $0.14 per 1M input tokens (V4-Flash)

Flash vs Pro — which tier is which

DeepSeek V4 ships in two tiers built for different jobs. Both use a Mixture-of-Experts architecture, where only a fraction of total parameters activate per token — which is why a 1.6-trillion-parameter model can run at a sane cost.
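The routing idea behind that sparsity can be sketched with a toy top-k router. This is illustrative only: the expert count and k below are made-up values, not DeepSeek V4's actual configuration.

```python
# Toy top-k expert routing, the core trick in Mixture-of-Experts layers.
# NUM_EXPERTS and TOP_K are illustrative, not DeepSeek V4's real config.
NUM_EXPERTS = 64
TOP_K = 4

def route(scores, k=TOP_K):
    """Indices of the k highest-scoring experts for one token."""
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]

# One token's router scores (normally produced by a learned gating network).
scores = [0.1, 0.9, 0.3, 0.7] + [0.0] * (NUM_EXPERTS - 4)
active = route(scores)
print(active)  # the k experts this token is dispatched to
print(f"{TOP_K / NUM_EXPERTS:.1%} of experts run per token")
```

By the article's own numbers, the real ratios are in the same ballpark: roughly 4.6% of V4-Flash's parameters are active per token (13B of 284B) and about 3% of V4-Pro's (49B of 1.6T).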

| | V4-Flash | V4-Pro |
|---|---|---|
| Total parameters | 284 billion | 1.6 trillion |
| Active parameters | 13 billion | 49 billion |
| Download size | ~160 GB | Larger (full weights) |
| Built for | Speed, high volume, low cost | Top-tier capability |
| Input / output per 1M tokens | $0.14 / $0.28 | $1.74 / $3.48 |

DeepSeek V4 benchmarks

V4-Pro lands squarely in frontier territory, especially on engineering and competitive-programming tasks.

| Benchmark | V4-Pro | What it measures |
|---|---|---|
| SWE-bench Verified | 80.6% | Real-world software engineering — within 0.2 pts of Claude Opus 4.6 |
| Codeforces rating | 3,206 | Competitive programming — highest of any model at release |

Key takeaway

V4-Pro's Codeforces rating of 3,206 beat GPT-5.4's 3,168 — the strongest competitive-programming result of any model at the time of release, and it comes with open, MIT-licensed weights.

Pricing

Cost is DeepSeek's sharpest weapon. V4-Flash is priced for high-volume work at $0.14 / $0.28 per 1M input / output tokens; V4-Pro at $1.74 / $3.48. For context, GPT-5.5 runs $5 / $30 and Claude Opus 4.7 is $5 / $25 — so V4-Pro delivers near-frontier coding scores at roughly a tenth of GPT-5.5's output cost.

Open weights and licence

Both V4 tiers are released under the MIT License — one of the most permissive licences available — with weights published on Hugging Face and ModelScope. That means you can download, fine-tune, self-host and deploy DeepSeek V4 commercially without per-token API fees or usage restrictions. For teams with data-residency requirements or a need to avoid vendor lock-in, that is the model's biggest structural advantage over GPT-5.5 and Gemini 3.1, which are closed and API-only.

Who should use DeepSeek V4

DeepSeek V4 is a strong fit for teams that want frontier-class coding at a fraction of flagship prices, for high-volume agentic workloads on V4-Flash, and for anyone who needs self-hostable, MIT-licensed weights for data residency or to avoid vendor lock-in. It is a weaker fit if you need native video/audio multimodality (Gemini 3.1 leads there) or the deepest agentic-coding reliability inside a managed environment (GPT-5.5's Codex integration).



Frequently asked questions

When was DeepSeek V4 released?

DeepSeek V4 was released on 24 April 2026 in two versions: DeepSeek V4-Pro and DeepSeek V4-Flash.

How much does DeepSeek V4 cost?

DeepSeek V4-Flash costs $0.14 per million input tokens and $0.28 per million output tokens. DeepSeek V4-Pro costs $1.74 per million input tokens and $3.48 per million output tokens.

Is DeepSeek V4 open source?

Yes. Both DeepSeek V4-Pro and V4-Flash are released under the MIT License, with weights downloadable from Hugging Face and ModelScope.

How big is DeepSeek V4-Pro?

DeepSeek V4-Pro has 1.6 trillion total parameters with 49 billion active, using a Mixture-of-Experts architecture. It is the largest open-weights model currently available. V4-Flash has 284 billion parameters with 13 billion active.

How good is DeepSeek V4 at coding?

DeepSeek V4-Pro scores 80.6% on SWE-bench Verified — within 0.2 points of Claude Opus 4.6 — and reaches a Codeforces rating of 3,206, the highest competitive-programming score of any model at its release.

What is DeepSeek V4's context window?

Both DeepSeek V4-Pro and V4-Flash ship with a 1 million token context window by default.
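To get a feel for how much text 1 million tokens holds, here is a back-of-envelope estimate. The ~4 characters per token figure is a common rough heuristic for English text, not a DeepSeek-published number; real tokenizers vary by language and content.

```python
# Back-of-envelope: how much text fits in a 1,000,000-token window?
# CHARS_PER_TOKEN = 4 is a rough English-text heuristic, not a
# DeepSeek-published figure.
CONTEXT_TOKENS = 1_000_000
CHARS_PER_TOKEN = 4

chars = CONTEXT_TOKENS * CHARS_PER_TOKEN
pages = chars // 3_000  # assuming ~3,000 characters per printed page
print(f"~{chars:,} characters, roughly {pages:,} pages of text")
```

That is on the order of a thousand pages, enough to hold a large codebase or several books in a single prompt.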