Every day I load Chinese open models on a cluster with four RTX 3090s. The more I see them tuned for domestic hardware, the more I wonder what they actually run on. The answer is a map of seven companies shipping AI accelerators with specs comparable to NVIDIA's H100 and H200. Most went public in the last six months. This isn't experimentation; it's mass production.
The Three Dragons and Four Snakes
The local classification splits players into "dragons" (full-stack Big Tech) and "snakes" (recently listed pure-plays). Huawei is the dominant dragon: it shipped 812,000 AI cards last year alone, 49% of domestic supply, using its own HBM and its own fabs. The Ascend 950 reportedly targets H200-class performance. Alibaba, another dragon, is shipping a server with 16 GPUs, each with 96 GB, for a total of 1,536 GB (about 1.5 TB) of VRAM in a single chassis. That is enough to hold a frontier model in BF16 entirely on-premises.
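The 1.5 TB figure, and what it means for model size, can be checked with a short back-of-envelope sketch. It assumes BF16 weights at 2 bytes per parameter and decimal gigabytes. The function names and the `overhead_fraction` parameter are illustrative and do not come from any vendor specification.

```python
# Back-of-envelope VRAM sizing for a 16-GPU server with 96 GB per GPU.
# Assumptions (not from the article): 1 GB = 1e9 bytes, BF16 weights at
# 2 bytes per parameter, and a hypothetical overhead fraction for
# KV cache, activations, and runtime buffers.

BF16_BYTES_PER_PARAM = 2


def total_vram_gb(num_gpus: int, gb_per_gpu: int) -> int:
    """Aggregate VRAM across all GPUs in the chassis, in GB."""
    return num_gpus * gb_per_gpu


def max_params_billions(
    vram_gb: float,
    bytes_per_param: int = BF16_BYTES_PER_PARAM,
    overhead_fraction: float = 0.0,
) -> float:
    """Largest model, in billions of parameters, whose weights fit in usable VRAM."""
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    usable_gb = vram_gb * (1.0 - overhead_fraction)
    # With 1 GB = 1e9 bytes, GB divided by bytes-per-parameter is billions of parameters.
    return usable_gb / bytes_per_param


if __name__ == "__main__":
    vram = total_vram_gb(16, 96)
    # Dividing by 1024 reproduces the article's "1.5 TB" rounding.
    print(f"Total VRAM: {vram} GB (~{vram / 1024:.1f} TB)")
    print(f"Weights-only ceiling: {max_params_billions(vram):.0f}B parameters")
    reserved = max_params_billions(vram, overhead_fraction=0.2)
    print(f"With 20% reserved for runtime state: {reserved:.0f}B parameters")
```

The 20% reserve is only a placeholder. Real headroom depends on context length, batch size, and the inference runtime, so treat the result as an upper bound rather than a deployment guarantee.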
The snakes stand out for their DNA. MetaX, for example, was founded by the former global GPU leadership of AMD and is headquartered in Shenzhen; its revenue has multiplied 3,800x in three years. Several other startups have roots in NVIDIA and AMD.
Production Shift
Manufacturing has moved from TSMC to SMIC, and NVIDIA's market share in China has fallen from 95% to 55% in two years. Domestic hardware and open models are converging fast: the metal and the large language models speak the same language.
Implications for On-Premises Deployment
For those evaluating on-premises deployment, this trend expands hardware options beyond NVIDIA GPUs alone. Servers with massive VRAM density like Alibaba's enable entirely local hosting of frontier models without relying on the cloud, preserving full data control and reducing dependence on foreign suppliers. However, unknowns remain: software support, driver maturity, and compatibility with widely used inference frameworks such as vLLM or llama.cpp. AI-RADAR provides analytical tools to weigh these trade-offs, especially for calculating the real TCO of stacks based on non-NVIDIA hardware.
Future Outlook
The convergence of open models and local accelerators could reshape the enterprise AI market, particularly for organizations with strict data sovereignty requirements. The key question is whether these GPUs will earn developer trust beyond national borders. For now, China has shown it can build competitive hardware and bring it to market quickly, while the rest of the world watches.