Alibaba's HappyOyster Takes the Crown: How Chinese AI Video Surged as Sora Fell
Just two weeks after Sora’s demise, Alibaba dropped a bombshell.
In mid-April, Alibaba officially released HappyOyster, their world model codenamed—somewhat adorably—after a cheerful horse. Don’t let the cute name fool you though. This thing is a beast, beating Sora 2 across multiple industry benchmarks.
Honestly? I have mixed feelings about this news.
On one hand, Chinese AI finally claimed first place on the world stage. That’s worth celebrating. On the other hand, I keep wondering: why Alibaba? Why now?
Let’s talk about what makes HappyOyster special.
According to Alibaba’s official data, the model shows significant improvements in video consistency, physics simulation, and long-form stability. The wildest part? It can generate coherent 2-minute videos where character movements, lighting changes, and physical collisions mostly obey real-world physics.
That might sound underwhelming, but if you’ve ever tried early AI video tools, you know how hard ‘physical common sense’ is to achieve.
I once generated a ‘person walking’ video using a Chinese-made model. The person’s leg clipped straight through their torso. I shared the screenshot in a group chat, and we all had a good laugh. But afterward, there was this lingering frustration. That was the state of the art back then.
HappyOyster seems to be making that embarrassment ancient history.
More importantly, Alibaba didn’t just release a model—they dropped an entire world model framework.
What’s a world model? In simple terms, the AI doesn’t just generate video; it ‘understands’ the physics within it. It knows balls fall when thrown, water flows downhill, and people shouldn’t clip through themselves. This intrinsic understanding of how the world works is a crucial step toward more advanced AI.
OpenAI pitched the same ‘world model’ concept when launching Sora. Ironically, Sora shut down due to commercial struggles, while Alibaba’s HappyOyster picked up the mantle.
Here’s a thought worth exploring: why could Chinese AI video rise so quickly after Sora’s fall?
I see several reasons.
First, engineering capability. Chinese tech giants have richer experience in model optimization, cost control, and productization than OpenAI. Sora reportedly cost $15 million daily to run while generating just $2.1 million in revenue, a burn rate that was unsustainable even for OpenAI.
Alibaba designed HappyOyster with cost control as a core objective from day one. Rather than blindly chasing maximum parameters, they focused on architecture and inference optimization. This pragmatic approach is more sustainable.
Second, scenario-driven development. Chinese AI video tools—from Kling to Jimeng to HappyOyster—were built with clear commercial applications in mind. Short-form content creation, e-commerce livestreaming, ad production—these use cases have explicit needs and clearer monetization paths.
By contrast, Sora’s positioning always seemed fuzzy. Was it for professional creators or casual users? OpenAI never seemed sure.
Third, ecosystem synergy. Alibaba isn’t building HappyOyster in isolation. It’s deeply integrated with Taobao, Tmall, Youku—creating immediate application scenarios and data feedback loops for rapid iteration.
This ‘model + scenario’ playbook is where Chinese tech giants excel.
I’m not mindlessly cheerleading here though. HappyOyster is impressive, but it’s not quite a true ‘world model’ yet. Right now, it’s more of a ‘physics simulator’ than AI that genuinely understands the world.
Still, Sora’s fall and HappyOyster’s rise mark a significant inflection point. Together they signal the official divergence of Chinese and American AI video strategies: OpenAI retreating, Chinese players charging ahead.
The competition ahead will only intensify. ByteDance, Kuaishou, and Baidu are all cooking up something. Who ultimately wins depends on productization speed and on how well each player closes the commercial loop, turning a product into sustainable revenue.
As a user, I just hope this competition brings better tools and lower prices.
After all, the coolest tool in the world is useless if you can’t afford it.