Google's TPUv8 Is Finally Real — And This Time It Might Actually Hurt Nvidia

A few days ago I saw a headline about Google partnering with Marvell on AI chips and didn’t think much of it. Then I looked closer at the specs. That got my attention.

Now the follow-up has dropped: Google is officially launching the TPUv8 series at Google Cloud Next this week.

Not one chip. Two chips.

TPUv8t, designed by Broadcom, focused on high-performance training acceleration.
TPUv8i, built by MediaTek, targeting cost-efficient inference.

This split is telling. Google isn’t trying to build another “does everything” chip. They’re going all-in on the idea that training and inference are fundamentally different workloads and should probably run on different silicon.

Honestly? I’ve been waiting for this move.

Nvidia’s GPUs are genuinely dominant for AI training. But dominant doesn’t mean “best value.” During training, GPU versatility is an asset. But inference? A purpose-built chip can absolutely crush a GPU on cost per inference. I thought about this endlessly when I was writing ML code at a major tech company. The fact that Google is actually doing it means someone inside the company finally convinced the right people.

So will it work?

My take: yes, partly — but “replacing Nvidia entirely” is nonsense.

Let’s start with what can actually happen.

TPUv8i’s positioning is refreshingly honest. It’s not trying to beat the H100. It’s aiming for “cheaper than H100, faster than CPU.” MediaTek’s track record in power efficiency for mobile SoCs is well known, and applying that to AI inference silicon is smart. If they can price it at 60-70% of what Nvidia’s comparable products cost, mid-size cloud providers will absolutely take notice.
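To make that 60-70% figure concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is made up for illustration (the $4.00 hourly rate, the 1,000 inferences per second, the function names), and it assumes the price gap shows up as a cheaper hourly rate at full utilization. None of it comes from Google, MediaTek, or Nvidia. The point is only that a cheaper chip wins on cost per inference when its throughput stays above the same fraction of the GPU's throughput.

```python
def cost_per_million(hourly_price_usd: float, inferences_per_second: float) -> float:
    """Dollars spent to serve one million inferences at full utilization."""
    inferences_per_hour = inferences_per_second * 3600
    return hourly_price_usd / inferences_per_hour * 1_000_000


def break_even_throughput(price_ratio: float, gpu_inferences_per_second: float) -> float:
    """Throughput the cheaper chip needs to match the GPU's cost per inference."""
    return price_ratio * gpu_inferences_per_second


# Hypothetical numbers, chosen only to keep the arithmetic easy to follow.
GPU_HOURLY_USD = 4.00      # assumed hourly price of the GPU
GPU_THROUGHPUT = 1000.0    # assumed inferences per second on the GPU
PRICE_RATIO = 0.65         # midpoint of the 60-70% range

gpu_cost = cost_per_million(GPU_HOURLY_USD, GPU_THROUGHPUT)
needed = break_even_throughput(PRICE_RATIO, GPU_THROUGHPUT)
chip_cost_at_break_even = cost_per_million(GPU_HOURLY_USD * PRICE_RATIO, needed)

print(f"GPU: ${gpu_cost:.2f} per million inferences")
print(f"Cheaper chip needs at least {needed:.0f} inferences/s to match that")
```

Under these assumptions, a chip priced at 65% of the GPU only has to deliver 65% of the GPU's throughput to tie on cost. Anything above that is savings, and anything below it means the discount is an illusion.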

Now for what won’t work.

Nvidia’s real moat isn’t the chip itself — it’s CUDA. Millions of developers worldwide have codebases, toolchains, and lab benchmarks all built around CUDA. Asking an enterprise to migrate their training infrastructure from Nvidia to TPUv8 is less a technical challenge and more a full ecosystem migration. That’s a CFO problem, not an engineering problem.

Google has its own TPU ecosystem, but it’s locked to Google Cloud. Use TPU, use Google Cloud. For some companies that’s a selling point. For others, it’s a dealbreaker.

So the real significance of this week’s launch is: it signals that Nvidia’s pricing power is finally facing real pushback.

OpenAI just committed to buying more than $20 billion in Cerebras chips. Nvidia itself spent $20 billion acquiring Groq. Now Google’s in the arena. Three chip companies fighting for AI infrastructure dominance — the real war is just getting started.

For regular developers, this is genuinely good news. The trend toward cheaper compute won’t be stopped by Nvidia raising GPU prices. It might just take longer than the optimistic forecasts suggest.

Let’s wait for the benchmarks.

— Lin Rui, writing from Shenzhen

Your turn: Whose AI compute are you running on right now? Nvidia GPUs through a cloud provider, or something else? Drop a comment.