DeepSeek Makes 75% V4-Pro Price Cut Permanent

If your business is running AI agents in production, or seriously evaluating which AI providers to put in your stack, today is a date worth noting. DeepSeek has made its 75% price reduction for the V4-Pro model permanent, effective May 31, 2026. What began as a promotional discount is now the standard pricing.

The numbers are significant. DeepSeek-V4-Pro now sits at $0.435 per million uncached input tokens and $0.87 per million output tokens, down from the original $1.74 and $3.48. That’s not a rounding error. For businesses running agentic workflows at any meaningful scale, these figures change the cost conversation entirely.
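To see what those rates mean for a budget, the arithmetic can be sketched in a few lines of Python. The prices are the ones quoted above. The monthly token volumes are a hypothetical workload chosen only for illustration, and the function names are not part of any DeepSeek API. Cache hits are ignored here.

```python
# USD per million tokens, as quoted in the article (uncached input, output).
OLD_PRICES = {"input": 1.74, "output": 3.48}
NEW_PRICES = {"input": 0.435, "output": 0.87}


def reduction(old_price: float, new_price: float) -> float:
    """Fractional price cut, e.g. 0.75 for a 75% reduction."""
    return 1 - new_price / old_price


def monthly_cost(prices: dict, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a token volume, assuming every input token is uncached."""
    return (
        input_tokens / 1_000_000 * prices["input"]
        + output_tokens / 1_000_000 * prices["output"]
    )


# Hypothetical workload: 2 billion input and 500 million output tokens a month.
INPUT_TOKENS = 2_000_000_000
OUTPUT_TOKENS = 500_000_000

if __name__ == "__main__":
    before = monthly_cost(OLD_PRICES, INPUT_TOKENS, OUTPUT_TOKENS)
    after = monthly_cost(NEW_PRICES, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"Input cut:  {reduction(OLD_PRICES['input'], NEW_PRICES['input']):.0%}")
    print(f"Output cut: {reduction(OLD_PRICES['output'], NEW_PRICES['output']):.0%}")
    print(f"Monthly cost before: ${before:,.2f}, after: ${after:,.2f}")
```

Swapping in your own token volumes gives a first-pass estimate. Real bills will also depend on how many input tokens are served from cache.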

And that’s before accounting for the cache hit change. Since April 26, DeepSeek has also cut input cache hit prices across its entire API suite to one-tenth of their original levels. For production agentic applications, where repeated or structurally similar requests are the norm rather than the exception, this is arguably the bigger deal.

What Actually Changed Today

The 75% discount on V4-Pro was initially framed as a promotional period running until May 31, 2026. Instead of letting it expire, DeepSeek has locked the reduced rate in as permanent pricing. What was a temporary incentive to try the model is now just the price.

This is not a minor adjustment. DeepSeek is explicitly positioning V4-Pro as a cost-competitive alternative to frontier models from OpenAI, Anthropic, and Google. At these prices, the cost calculation for enterprise AI deployments has shifted substantially.

Why the AI Pricing War Matters for Business Leaders

Twelve months ago, the question for most business leaders was whether to invest in AI at all. Today, the question is which AI infrastructure to build on, and cost is now a real factor in that conversation, not just a concern for developers.

The pricing war between Western frontier labs and Chinese AI providers has been accelerating throughout 2026. DeepSeek’s aggressive pricing forces a response from competitors. OpenAI, Anthropic, and Google all feel the pressure to justify their pricing with capability, speed, or enterprise features that DeepSeek cannot match.

For businesses, this creates genuine choices:

High-stakes tasks requiring the best reasoning? Frontier models still earn their premium.
High-volume, repeating workflows? Cost-optimised models are now dramatically cheaper.
Hybrid approaches using different models for different task types? Now table stakes for any serious AI deployment.

The savviest organisations are already running tiered model strategies, using capable and cheaper models for routine agent steps and routing only the complex decisions to premium providers. DeepSeek’s permanent pricing makes this calculus easier to justify.
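A tiered strategy of this kind can be sketched as a small routing function. This is a minimal illustration, not a recommended policy: the `Task` fields, the two tier names, and the routing rule are all assumptions made for the example, and a real deployment would likely use richer signals such as confidence scores or task type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of one agent step."""
    name: str
    high_stakes: bool           # e.g. financial, legal, or customer-impacting
    needs_deep_reasoning: bool  # e.g. multi-step analysis or planning


def choose_tier(task: Task) -> str:
    """Route complex or high-stakes steps to the premium tier, the rest to budget."""
    if task.high_stakes or task.needs_deep_reasoning:
        return "premium"
    return "budget"


if __name__ == "__main__":
    steps = [
        Task("classify incoming ticket", high_stakes=False, needs_deep_reasoning=False),
        Task("draft refund decision", high_stakes=True, needs_deep_reasoning=False),
        Task("plan multi-step data migration", high_stakes=False, needs_deep_reasoning=True),
    ]
    for step in steps:
        print(f"{step.name}: {choose_tier(step)}")
```

Keeping the rule in one function makes it easy to revisit as prices and model capabilities change.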

The Cache Effect on Agentic Workloads

The cut to cache hit pricing deserves more attention than it’s getting. In a typical agentic workflow, the system prompt, accumulated conversation context, and tool definitions are sent with every API call. Much of that content is identical across calls, which is exactly what caching is designed to handle.

At one-tenth the original cache hit price, the effective cost of running repeated agentic tasks has dropped sharply. A customer service agent that handles 10,000 similar enquiries per day, or a data pipeline that runs similar analytical queries on fresh inputs, will see cost structures that look very different from last month.

This matters for any business evaluating whether AI agents can run profitably at the volume their operations require.
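The effect of the cache change can be approximated with a blended input price: the share of input tokens served from cache is billed at the cache hit rate, and the remainder at the uncached rate. In the sketch below, the uncached price is the V4-Pro figure quoted earlier. The original cache hit price is a placeholder, because this article does not state DeepSeek's actual cache hit rates, and the 80% hit ratio is likewise an assumption.

```python
def blended_input_price(miss_price: float, hit_price: float, hit_ratio: float) -> float:
    """Average USD per million input tokens for a given cache hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * hit_price + (1.0 - hit_ratio) * miss_price


# Uncached input price from the article (USD per million tokens).
MISS_PRICE = 0.435
# PLACEHOLDER: not DeepSeek's real cache hit price, chosen only for illustration.
ORIGINAL_HIT_PRICE = 0.10
# The article states cache hit prices fell to one-tenth of their original level.
NEW_HIT_PRICE = ORIGINAL_HIT_PRICE / 10
# ASSUMPTION: 80% of input tokens are served from cache.
HIT_RATIO = 0.8

if __name__ == "__main__":
    before = blended_input_price(MISS_PRICE, ORIGINAL_HIT_PRICE, HIT_RATIO)
    after = blended_input_price(MISS_PRICE, NEW_HIT_PRICE, HIT_RATIO)
    print(f"Blended input price before cache cut: ${before:.4f} per million tokens")
    print(f"Blended input price after cache cut:  ${after:.4f} per million tokens")
```

The higher the hit ratio, the more the cache hit price dominates the blended figure, which is why workloads with large, stable prompts benefit most.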

What This Means for Business

Re-evaluate your AI vendor mix. If you locked in contracts or built assumptions around AI infrastructure costs from six or twelve months ago, those assumptions may no longer reflect reality. The landscape has shifted quickly.

Volume thresholds have dropped. Tasks that didn’t make financial sense to automate at previous pricing may cross the viability line now. Review your backlog of “too expensive to automate” use cases.

Enterprise AI is becoming a commodity in some segments. Routine, repeatable AI tasks are getting cheaper fast. The differentiation will increasingly come from how well organisations can orchestrate, govern, and layer AI capabilities, not from which model they access.

Don’t mistake cheap for best. Pricing changes don’t erase performance differences. For tasks where accuracy, reasoning depth, or security posture matters (financial analysis, legal review, high-stakes customer decisions), model selection should still be driven by capability first.

The AI pricing war is not slowing down. If anything, DeepSeek’s decision to make this permanent signals that aggressive pricing is now the strategy, not a short-term play. For business leaders building AI into their operations, that’s worth tracking, not just as news but as a material input to planning.

Enterprise DNA helps businesses understand, evaluate, and deploy AI that actually fits their operations, across strategy, infrastructure, and team capability. Learn more about Omni Advisory.

Source

DeepSeek API Docs
