DeepSeek V4 Pro at 75% Off – Pricing, Specs, and What You Need to Know
DeepSeek just updated their pricing page, and the headline is hard to miss: the V4 Pro model is 75% off until May 31, 2026. If you've been on the fence about trying their API, this is a strong signal to jump in.
The Models
Two models are available: deepseek-v4-flash and deepseek-v4-pro. Both support 1 million token context windows with a maximum output of 384K tokens. That's massive – enough to process entire codebases or long documents in a single pass.
Both models support JSON output, tool calls, and chat prefix completion (beta). The Flash model also supports FIM (Fill-in-the-Middle) completion in non-thinking mode – useful for code completion tasks.
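Since the article doesn't show the FIM request shape, here's a hedged sketch of what a Fill-in-the-Middle payload might look like, assuming the OpenAI-style completions format with a `suffix` field (as in DeepSeek's earlier FIM beta). The exact endpoint and fields for V4 Flash may differ.

```python
# Hypothetical FIM payload sketch -- assumes the OpenAI /completions shape
# with a `suffix` parameter, as in DeepSeek's earlier FIM beta.
def fim_payload(prefix: str, suffix: str) -> dict:
    return {
        "model": "deepseek-v4-flash",  # FIM is Flash-only, non-thinking mode
        "prompt": prefix,    # code before the cursor
        "suffix": suffix,    # code after the cursor; the model fills the gap
        "max_tokens": 128,
    }

payload = fim_payload("def add(a, b):\n    return ", "\n\nprint(add(1, 2))")
```

The model completes the gap between `prompt` and `suffix`, which is what makes this mode a fit for in-editor code completion.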
Pricing Breakdown
DeepSeek V4 Flash
- Input (cache hit): $0.0028 per million tokens
- Input (cache miss): $0.14 per million tokens
- Output: $0.28 per million tokens
DeepSeek V4 Pro (75% off until May 31, 2026)
- Input (cache hit): $0.003625 per million tokens (was $0.0145)
- Input (cache miss): $0.435 per million tokens (was $1.74)
- Output: $0.87 per million tokens (was $3.48)
The cache hit price for all models has been reduced to 1/10 of the launch price as of April 26, 2026. This means if your prompts hit the cache (common for repetitive tasks), you pay a fraction of the already low price.
Thinking Mode
Both models support thinking mode and non-thinking mode, and both default to thinking mode. You can switch modes via the API – useful for choosing between deeper reasoning and faster responses.
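The pricing page doesn't document how the mode switch is expressed in a request. Since the API also speaks Anthropic format (see below), one plausible sketch is Anthropic's standard `thinking` parameter – this is an assumption, not something the announcement confirms:

```python
# Sketch: toggling thinking mode via an Anthropic-format request body.
# ASSUMPTION: DeepSeek's Anthropic-compatible endpoint honors Anthropic's
# standard `thinking` parameter; the pricing page does not confirm this.
def thinking_request(prompt: str, think: bool) -> dict:
    payload = {
        "model": "deepseek-v4-pro",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    if think:
        # Anthropic-style extended-thinking block with a reasoning token budget
        payload["thinking"] = {"type": "enabled", "budget_tokens": 512}
    return payload
```

With `think=False` the field is simply omitted, which in Anthropic's format means no extended thinking.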
API Compatibility
The DeepSeek API is compatible with OpenAI format at https://api.deepseek.com and Anthropic format at https://api.deepseek.com/anthropic. This means you can drop it into existing projects with minimal code changes.
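As a concrete sketch of the "minimal code changes" claim, here's the OpenAI Python SDK pointed at DeepSeek's base URL. The model name comes from the article; the API key is assumed to live in the `DEEPSEEK_API_KEY` environment variable, and the live call is gated so the payload-building part works standalone.

```python
import os

MODEL = "deepseek-v4-flash"  # model name from the pricing page

def build_request(prompt: str) -> dict:
    # Standard OpenAI chat-completions payload; nothing DeepSeek-specific here.
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    }

# Live call, only if a key is configured (assumes env var DEEPSEEK_API_KEY).
if os.environ.get("DEEPSEEK_API_KEY"):
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["DEEPSEEK_API_KEY"],
        base_url="https://api.deepseek.com",  # OpenAI-compatible endpoint
    )
    resp = client.chat.completions.create(**build_request("Say hello."))
    print(resp.choices[0].message.content)
```

Swapping an existing OpenAI integration over really is just the `base_url`, the key, and the model name – the request and response shapes stay the same.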
Deprecation Warning
Note that the model names deepseek-chat and deepseek-reasoner will be deprecated. They currently map to non-thinking and thinking modes of V4 Flash, respectively. Update your code to use deepseek-v4-flash explicitly.
What This Means for Developers
If you're building AI-powered features that require large context windows or tool calling, the V4 Pro at 75% off is a bargain. The cache hit pricing is absurdly low – $0.0036 per million tokens for Pro. That's competitive with even the cheapest alternatives.
Next Steps
- Check your current DeepSeek usage and see if you can benefit from the Pro model's thinking capabilities.
- Migrate from deprecated model names (deepseek-chat, deepseek-reasoner) to deepseek-v4-flash or deepseek-v4-pro.
- Optimize your prompts to maximize cache hits – structure inputs to reuse common prefixes.
- Act before May 31, 2026, to lock in the 75% discount on Pro.
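The cache-hit optimization above boils down to one habit: keep the long, static part of the prompt byte-identical across calls and put the variable part last. A minimal sketch (the system prompt and helper are illustrative, not DeepSeek-specific):

```python
# Sketch: a stable shared prefix maximizes prompt-cache hits.
# The long, unchanging instructions go first; only the variable part changes.
SYSTEM_PROMPT = (
    "You are a code reviewer. Flag bugs, style issues, and missing tests. "
    "Answer with a numbered list of findings."
)

def make_messages(diff: str) -> list[dict]:
    # The system message is identical on every call, so its tokens can be
    # served from cache at the cache-hit rate; only the diff is new input.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Review this diff:\n{diff}"},
    ]

a = make_messages("diff A")
b = make_messages("diff B")
assert a[0] == b[0]  # the cacheable prefix is byte-identical across requests
```

Anything that varies per request (timestamps, request IDs, the user's content) belongs after the stable prefix, or it breaks the cache for everything that follows it.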
The pricing page is live at api-docs.deepseek.com/quick_start/pricing. Go update your configs.