In a move that caught many insiders off guard, China has seized the top spot in the global supercomputer ranking with a system that completely does away with GPUs. The new champion, delivering 2.198 exaflops of sustained performance (Rmax), ended the reign of El Capitan, the US system built on AMD accelerators and CPUs. The news lands at a time when AI is defined by an insatiable appetite for GPUs, yet this machine contains not a single specialized accelerator.

The rise of homogeneous compute

The result certified by the TOP500 list shows that an architecture composed entirely of tens of thousands of general-purpose processors can reach exascale without relying on discrete accelerators. The system is likely based on domestic many-core designs, capable of handling massive parallelism and traditional HPC workloads and also, as the community is beginning to speculate, deep neural network inference. The absence of GPUs brings a constraint: aggregate memory bandwidth becomes the critical factor, because each processor must feed hundreds of cores without the dedicated HBM typically found on accelerator boards. Still, the outcome opens scenarios where a CPU-only system, if engineered with a high core count and a well-tuned memory hierarchy, can compete on throughput and latency with comparable GPU clusters, especially when the model is aggressively quantized or when the bottleneck is data movement rather than peak compute.
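The link between memory bandwidth and quantization can be made concrete with a back-of-the-envelope bound. The sketch below assumes that generating one token requires reading every model weight from memory once, and it ignores KV-cache traffic, cache reuse, and compute limits. The function name, the 7B-parameter model, and the 300 GB/s bandwidth figure are hypothetical values chosen for illustration, not specifications of the system described here.

```python
def decode_tokens_per_second(params_billion: float,
                             bits_per_weight: int,
                             mem_bandwidth_gb_s: float) -> float:
    """Upper bound on single-stream decode speed when memory-bound.

    Assumption: every weight is read from memory once per generated token.
    KV-cache traffic, cache reuse, and compute limits are ignored.
    """
    # Billions of parameters times bytes per parameter gives GB read per token.
    gb_read_per_token = params_billion * bits_per_weight / 8
    return mem_bandwidth_gb_s / gb_read_per_token


# Hypothetical node: 7B-parameter model, 300 GB/s sustained memory bandwidth.
for bits in (16, 8, 4):
    rate = decode_tokens_per_second(7, bits, 300)
    print(f"{bits:>2}-bit weights: ~{rate:.1f} tokens/s upper bound")
```

Under these assumptions the bound scales inversely with bits per weight, so moving from 16-bit to 4-bit weights raises the ceiling fourfold on the same hardware. Real throughput will be lower and depends on the runtime and the workload.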

What changes for on-premises adopters

For organizations evaluating on-premises LLM deployment, the message is twofold. On one hand, accelerator silicon remains scarce, with lead times that can exceed a year for the most in-demand parts. On the other, CPU-only systems offer a gentler operational learning curve: they require no proprietary drivers or separate software stacks and can be managed through ordinary tooling, reducing lifecycle complexity. In batch inference or light fine-tuning scenarios, where throughput-per-watt matters more than peak teraflops, a high-density CPU server fleet may represent a concrete alternative to GPU nodes, particularly when paired with quantization techniques that ease memory bandwidth pressure.
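One way to compare candidates on throughput-per-watt is to divide measured tokens per second by measured wall power, which yields tokens per joule. The sketch below is a minimal illustration: the `Node` class and every figure in it are placeholders invented for the example and carry no claim about real hardware. The inputs should come from measurements on your own workload.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    tokens_per_second: float  # measured batch throughput on your own workload
    watts: float              # measured wall power under that load

    @property
    def tokens_per_joule(self) -> float:
        # tokens/s divided by watts (joules/s) gives tokens per joule
        return self.tokens_per_second / self.watts


# Placeholder figures for illustration only; replace with your own measurements.
candidates = [
    Node("cpu-node", tokens_per_second=400.0, watts=800.0),
    Node("gpu-node", tokens_per_second=3000.0, watts=5000.0),
]

# Rank candidates from most to least energy-efficient.
for node in sorted(candidates, key=lambda n: n.tokens_per_joule, reverse=True):
    print(f"{node.name}: {node.tokens_per_joule:.2f} tokens/J")
```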

AI-RADAR has long tracked the evolution of architectures aimed at those pursuing data sovereignty and direct hardware control. The Chinese case shows that the CPU-only option is not just a fallback but a viable path to HPC-grade performance while respecting budget and supply chain constraints. Trade-offs remain, of course: modern GPU compute density is still superior on matrix-bound workloads, and the software ecosystem for distributed training on CPUs is less mature. Yet if the goal is serving pre-trained models in an enterprise setting, the gap narrows.

The geopolitical factor and technological sovereignty

Behind the upset lies a strategic push. The US administration has imposed export restrictions on advanced semiconductors bound for China, with accelerators at the top of the list. Building an exascale supercomputer without GPUs thus becomes a statement of self-reliance. For European and Italian enterprises, the lesson is clear: dependence on a single silicon supplier for AI introduces operational risk. Diversification toward commodity or custom CPU architectures deserves a place in mid-term total cost of ownership (TCO) evaluations, especially when maintenance contracts and hardware lifespans extend beyond three years.
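A mid-term TCO comparison of this kind can start from a very simple model: purchase price plus energy plus maintenance over the planned lifespan. The sketch below is undiscounted and leaves out staffing, cooling overhead, and residual value. The function name and all input figures are hypothetical and exist only to show the arithmetic.

```python
def total_cost_of_ownership(hardware_cost: float,
                            avg_power_kw: float,
                            energy_price_per_kwh: float,
                            annual_maintenance: float,
                            years: int) -> float:
    """Simple undiscounted TCO: purchase + energy + maintenance."""
    hours = years * 365 * 24
    energy_cost = avg_power_kw * hours * energy_price_per_kwh
    return hardware_cost + energy_cost + annual_maintenance * years


# Hypothetical inputs: 100,000 purchase price, 1 kW average draw,
# 0.25 per kWh, 5,000 per year maintenance (same currency throughout).
for years in (3, 4, 5):
    tco = total_cost_of_ownership(100_000, 1.0, 0.25, 5_000, years)
    print(f"{years} years: {tco:,.0f} total, {tco / years:,.0f} per year")
```

Running the same function for each candidate architecture over three, four, and five years shows how the balance between upfront cost and running cost shifts as the lifespan grows.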

It remains to be seen whether this approach will influence the design of next-generation enterprise systems for AI workloads. CPU vendors are already integrating matrix compute units and reduced numerical formats suitable for inference. The Chinese milestone may accelerate a trajectory that leads from HPC rooms to corporate racks, making self-hosted CPU-based setups not a niche choice but a strategically grounded one.