DeepSeek V4 (Flash & Pro)
The largest open-weights model of 2026 — MIT-licensed, frontier-class on coding, and a fraction of the cost of GPT-5.5.
DeepSeek V4 specs at a glance
| Spec | Detail |
|---|---|
| Maker | DeepSeek (China) |
| Released | 24 April 2026 |
| Tiers | DeepSeek V4-Pro · DeepSeek V4-Flash |
| Architecture | Mixture of Experts (MoE) |
| V4-Pro size | 1.6T total parameters · 49B active |
| V4-Flash size | 284B total parameters · 13B active · ~160 GB download |
| Context window | 1,000,000 tokens (both tiers) |
| Licence | MIT — open weights on Hugging Face & ModelScope |
| Best at | Coding, competitive programming, low-cost agentic work |
Flash vs Pro — which tier is which
DeepSeek V4 ships in two tiers built for different jobs. Both use a Mixture-of-Experts architecture, where only a fraction of total parameters activate per token — which is why a 1.6-trillion-parameter model can run at a sane cost.
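The active-parameter arithmetic can be sketched in a few lines. This is a toy top-k router, not DeepSeek's actual architecture; every name and shape below is illustrative. The point it demonstrates is that per-token compute scales with the k experts selected, not with the total expert count:

```python
import numpy as np

def topk_moe_forward(x, experts, gate_w, k=2):
    """Toy top-k MoE layer: route one token to k of n experts.

    x: (d,) token embedding; experts: list of n (d, d) weight matrices;
    gate_w: (n, d) router weights. Only k experts run per token, so the
    active compute is roughly k/n of the layer's total parameters.
    """
    logits = gate_w @ x                       # one router score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the chosen k only
    # Weighted sum of the selected experts' outputs; the other n-k experts
    # contribute nothing and cost nothing this token.
    return sum(w * (experts[i] @ x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 16
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
gate_w = rng.standard_normal((n_experts, d))
out = topk_moe_forward(rng.standard_normal(d), experts, gate_w, k=2)
print(out.shape)  # (8,) — only 2 of the 16 experts did any work
```

The same ratio is why V4-Pro's 1.6T total parameters translate to only 49B active per token: routing, not dense computation, sets the per-token cost.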
| | V4-Flash | V4-Pro |
|---|---|---|
| Total parameters | 284 billion | 1.6 trillion |
| Active parameters | 13 billion | 49 billion |
| Download size | ~160 GB | Larger (full weights) |
| Built for | Speed, high volume, low cost | Top-tier capability |
| Input / output per 1M | $0.14 / $0.28 | $1.74 / $3.48 |
DeepSeek V4 benchmarks
V4-Pro lands squarely in frontier territory, especially on engineering and competitive-programming tasks.
| Benchmark | V4-Pro | What it measures |
|---|---|---|
| SWE-bench Verified | 80.6% | Real-world software engineering — within 0.2 pts of Claude Opus 4.6 |
| Codeforces rating | 3,206 | Competitive programming — highest of any model at release |
Key takeaway
V4-Pro's Codeforces rating of 3,206 beat GPT-5.4's 3,168 — the strongest competitive-programming result of any model at the time of release, and it comes with open, MIT-licensed weights.
Pricing
Cost is DeepSeek's sharpest weapon. V4-Flash is priced for high-volume work at $0.14 / $0.28 per 1M input / output tokens; V4-Pro at $1.74 / $3.48. For context, GPT-5.5 runs $5 / $30 and Claude Opus 4.7 is $5 / $25 — so V4-Pro delivers near-frontier coding scores at roughly a tenth of GPT-5.5's output cost.
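The comparison is easy to make concrete with a small cost helper. The per-1M-token prices are the ones quoted above; the 100k-in / 10k-out request shape is just an illustrative workload, not a vendor figure:

```python
# Published per-1M-token prices from the article: ($ per 1M input, $ per 1M output)
PRICES = {
    "v4-flash": (0.14, 0.28),
    "v4-pro": (1.74, 3.48),
    "gpt-5.5": (5.00, 30.00),
    "claude-opus-4.7": (5.00, 25.00),
}

def request_cost(model, input_tokens, output_tokens):
    """Dollar cost of one request at the listed per-1M-token rates."""
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

# Example shape: a long-context coding request, 100k tokens in, 10k out
for model in PRICES:
    print(f"{model}: ${request_cost(model, 100_000, 10_000):.4f}")
```

On that workload the gap is stark: V4-Pro comes to about $0.21 per request versus $0.80 for GPT-5.5 at its quoted rates, and V4-Flash under two cents.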
Open weights and licence
Both V4 tiers are released under the MIT License — one of the most permissive licences available — with weights published on Hugging Face and ModelScope. That means you can download, fine-tune, self-host and deploy DeepSeek V4 commercially without per-token API fees or usage restrictions. For teams with data-residency requirements or a need to avoid vendor lock-in, that is the model's biggest structural advantage over GPT-5.5 and Gemini 3.1, which are closed and API-only.
Who should use DeepSeek V4
- Cost-sensitive teams — V4-Flash at $0.14/$0.28 is among the cheapest capable models available.
- Anyone who needs to self-host — open MIT weights mean full control over deployment and data.
- Coding and competitive-programming workloads — V4-Pro's SWE-bench and Codeforces scores are frontier-grade.
- Developers avoiding vendor lock-in — no API dependency, no per-token billing if you run it yourself.
It is a weaker fit if you need native video/audio multimodality (Gemini 3.1 leads there) or the deepest agentic-coding reliability inside a managed environment (GPT-5.5's Codex integration).
How DeepSeek V4 compares
- DeepSeek V4 vs GPT-5.5 — open weights vs flagship →
- GPT-5.5 ("Spud") overview →
- Gemini 3.1 Ultra overview →
- GPT-5.5 vs Gemini 3.1 Ultra →
Frequently asked questions
When was DeepSeek V4 released?
DeepSeek V4 was released on 24 April 2026 in two versions: DeepSeek V4-Pro and DeepSeek V4-Flash.
How much does DeepSeek V4 cost?
DeepSeek V4-Flash costs $0.14 per million input tokens and $0.28 per million output tokens. DeepSeek V4-Pro costs $1.74 per million input tokens and $3.48 per million output tokens.
Is DeepSeek V4 open source?
Yes. Both DeepSeek V4-Pro and V4-Flash are released under the MIT License, with weights downloadable from Hugging Face and ModelScope.
How big is DeepSeek V4-Pro?
DeepSeek V4-Pro has 1.6 trillion total parameters with 49 billion active, using a Mixture-of-Experts architecture. It is the largest open-weights model currently available. V4-Flash has 284 billion parameters with 13 billion active.
How good is DeepSeek V4 at coding?
DeepSeek V4-Pro scores 80.6% on SWE-bench Verified — within 0.2 points of Claude Opus 4.6 — and reaches a Codeforces rating of 3,206, the highest competitive-programming score of any model at its release.
What is DeepSeek V4's context window?
Both DeepSeek V4-Pro and V4-Flash ship with a 1-million-token context window by default.