DeepSeek V4 Slashed Prices 75% in Two Days: The Pricing Psychology Behind the Madness
Last Friday night, just as I was about to shut down my laptop, a push notification popped up: DeepSeek V4 Pro, 75% off for a limited time.
Wait - didn’t this thing only just launch?
I quickly pulled up the timeline: April 24, V4 officially released, priced at 24 yuan per million output tokens. April 25, the website quietly updated - boom, 75% off, output drops to 6 yuan. April 26, cache-hit pricing across all APIs slashed to one-tenth of the original.
Two days. Two rounds of price cuts. Headline discounts: 75% on output, 90% on cache hits.
My first reaction wasn’t “what a deal.” It was “this feels way too deliberate.”
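For anyone who wants to sanity-check the timeline numbers, here is a quick sketch in Python. The helper name discount_pct is my own, and the cache-hit line uses only the "one-tenth" ratio, since the timeline above doesn't give an absolute cache-hit price. Note that the two cuts apply to different line items (output tokens vs. cached input tokens), so they are computed separately.

```python
def discount_pct(old_price: float, new_price: float) -> float:
    """Percentage reduction going from old_price to new_price."""
    if old_price <= 0:
        raise ValueError("old_price must be positive")
    return (old_price - new_price) / old_price * 100


# Output price per million tokens, in yuan, taken from the timeline above.
output_cut = discount_pct(24, 6)

# Cache-hit pricing fell to one-tenth of the original. Only the ratio
# matters here, so the old price is normalized to 1.0 (an assumption
# made purely for the arithmetic, not a real price).
cache_hit_cut = discount_pct(1.0, 0.1)

print(f"output: {output_cut:.0f}% off, cache hits: {cache_hit_cut:.0f}% off")
```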
What V4 Actually Brings
DeepSeek V4 ships in two flavors: Pro (1.6 trillion parameters, 49B active) and Flash (284B parameters, 13B active), both supporting 1M token context windows. The real technical highlight is their custom sparse attention architecture - Pro’s per-token compute is just 27% of V3.2’s, with KV cache down to 10%.
Translation: same workload, roughly 73% less compute per token.
And here’s the kicker - V4 is deeply optimized for Huawei’s Ascend 950 chips. The official docs explicitly state it’s “the first trillion-parameter model to officially certify support for domestic AI chips.”
That sentence carries a lot of weight.
The Rhythm of Price Cuts
Many people assumed DeepSeek panicked after community backlash - V4 Pro’s initial pricing was several times higher than previous models. But look at the timeline: the first price cut came less than 24 hours after launch. That’s not crisis management. That’s a script.
My read: the high-then-low pricing was intentional.
Launch at a price that looks expensive, let the market buzz. Then slash 75% the next day, creating a massive psychological contrast. In consumer goods, this is called anchoring. Show a high price first, then reveal the real price, and people feel like they’re getting a steal.
Day three was even more aggressive: cache-hit pricing dropped to one-tenth across all models. This targets enterprise use cases with high cache-hit rates - RAG, customer service bots, document analysis - where costs effectively fell off a cliff.
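To see why the cache-hit cut matters most for high-hit-rate workloads, here is a rough sketch of blended input cost. Big caveat: the cache-miss price below is a made-up placeholder, not a published DeepSeek price, and the function name is my own. The cache-hit price uses the V4 Flash figure quoted later in this post, with the old price assumed to be ten times that.

```python
def blended_input_price(hit_rate: float, hit_price: float, miss_price: float) -> float:
    """Average input price per million tokens for a given cache-hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_price + (1.0 - hit_rate) * miss_price


# All prices in yuan per million input tokens.
MISS_PRICE = 1.0                    # HYPOTHETICAL cache-miss price, illustration only
NEW_HIT_PRICE = 0.02                # V4 Flash cache-hit price quoted in this post
OLD_HIT_PRICE = NEW_HIT_PRICE * 10  # "one-tenth" implies the old price was 10x

for hit_rate in (0.2, 0.5, 0.9):
    before = blended_input_price(hit_rate, OLD_HIT_PRICE, MISS_PRICE)
    after = blended_input_price(hit_rate, NEW_HIT_PRICE, MISS_PRICE)
    saving = (before - after) / before * 100
    print(f"hit rate {hit_rate:.0%}: {before:.3f} -> {after:.3f} yuan ({saving:.0f}% lower)")
```

The absolute figures depend entirely on the assumed miss price, so don't read too much into them. The shape is the point: at low hit rates the cut barely moves the bill, while at high hit rates it removes most of it.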
Ascend 950 Is the Real Play
Buried in the pricing announcement was one line: “We expect V4 Pro prices to drop significantly further once Ascend 950 super-nodes ship in volume in H2.”
That’s the core of the entire pricing strategy.
The current 75% discount expires May 5. But DeepSeek has already told you: it gets cheaper from here. Why would you switch to anyone else?
The Ascend 950 optimization gives this promise technical backing. Domestic chips plus domestic models equals full-stack cost control - not just a technology narrative, but a business one.
What This Means for the Industry
V4 Flash cache-hit pricing is now 0.02 yuan per million tokens. That’s over 100x cheaper than OpenAI’s latest models.
I’m not saying OpenAI’s models aren’t good - GPT-5.5 is genuinely impressive. But when the price gap hits two orders of magnitude, the question shifts from “which is better” to “which can I afford.”
For small companies and indie developers, this pricing means you can run trillion-parameter models on projects you wouldn’t have dared attempt before. Honestly, I’m already planning to migrate a few of my side projects.
But cheap also deserves scrutiny. When something is priced this aggressively, you have to ask: where’s the money coming from? DeepSeek is backed by High-Flyer Quant, so cash isn’t the issue. But sustained API subsidies - is this about grabbing market share, or building a price barrier that competitors can’t breach?
My Take
DeepSeek V4’s price war isn’t an isolated event. It’s a signal that Chinese LLM commercialization has entered deep water, the stage where the easy wins are gone.
The tech is real - 1.6T parameters, 1M context, Ascend optimization. But the pricing strategy deserves equal attention: lock users in with extreme value, reduce long-term costs through domestic chip integration, and create sustained price expectations.
This isn’t a tech company setting prices. This is a market-savvy team fighting a war.
Who wins? No idea. But for those of us consuming APIs, now’s a good time to jump in. The 75% discount expires May 5.