The ARM server world is buzzing. A month after the first benchmarks of the upcoming NVIDIA Vera CPU surfaced, attention now turns to a direct comparison with current-generation silicon: GB10, NVIDIA’s Grace Blackwell superchip. Michael Larabel at Phoronix has crunched the numbers, and for anyone managing on-premise infrastructure the results are far from academic: they are a compass for future investments.

Grace (GB10) and Vera: dissecting the generational gap

These chips represent two stages of NVIDIA’s ARM bet. GB10 is a Grace Blackwell superchip whose ARM CPU is paired with a Blackwell GPU in a high-bandwidth, cache-coherent setup. Vera, expected in 2026, marks the leap to a new architecture and promises higher IPC and better efficiency, gains that would matter most for CPU-bound workloads in data centers.

Phoronix’s tests, run on Linux systems with mature tooling, isolate single-core performance. This isn’t an academic exercise: in real-world scenarios, many data preprocessing pipelines, inference orchestration tasks, and token-queue management jobs can saturate CPU cores before GPUs even enter the picture. The measured gap between Vera and GB10 thus provides a tangible yardstick for the generational improvement on offer.
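To make the idea of a "yardstick" concrete, here is a minimal Python sketch of one common way to summarize a set of single-core results: the geometric mean of per-test ratios. The function name `geomean_speedup`, the test names, and the scores are all hypothetical placeholders chosen for illustration; they are not Phoronix’s figures, and the sketch assumes higher-is-better scores.

```python
from math import prod


def geomean_speedup(baseline: dict[str, float], candidate: dict[str, float]) -> float:
    """Geometric mean of candidate/baseline ratios (assumes higher-is-better scores)."""
    if not baseline:
        raise ValueError("need at least one test result")
    if baseline.keys() != candidate.keys():
        raise ValueError("both result sets must cover the same tests")
    ratios = [candidate[name] / baseline[name] for name in baseline]
    return prod(ratios) ** (1.0 / len(ratios))


# Placeholder scores for illustration only, NOT measured data.
gb10_scores = {"compress": 100.0, "tokenize": 200.0}
vera_scores = {"compress": 200.0, "tokenize": 400.0}

print(f"{geomean_speedup(gb10_scores, vera_scores):.2f}x")
```

The geometric mean is used here rather than an arithmetic mean so that no single test with an unusually large ratio dominates the summary.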

On-premise, TCO, and sovereignty: why the CPU still matters

When on-premise deployment of LLMs is discussed, focus often lands on VRAM and inference throughput. Yet the CPU remains the conductor: it handles I/O, manages storage, runs the serving framework, and, in hybrid configurations, can take on part of the inference workload for quantized models. Choosing a platform with CPU headroom extends the useful life of the investment and helps keep TCO in check.

For organizations handling sensitive data that cannot be entrusted to the cloud, technological sovereignty also hinges on keeping the entire processing chain under physical control. A more performant CPU allows consolidating more workloads onto fewer nodes, reducing complexity and the attack surface.
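The consolidation argument comes down to simple arithmetic, sketched below in Python. Everything here is an assumption made for illustration: the function `nodes_needed`, the abstract "capacity units", the 20% headroom, and above all the 1.5x per-node uplift, which is a placeholder and not a measured Vera result.

```python
import math


def nodes_needed(total_cpu_demand: float, per_node_capacity: float, headroom: float = 0.2) -> int:
    """Nodes required to serve a CPU demand while keeping a fraction of each node in reserve."""
    if per_node_capacity <= 0:
        raise ValueError("per_node_capacity must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable = per_node_capacity * (1.0 - headroom)
    return math.ceil(total_cpu_demand / usable)


# Hypothetical figures in abstract capacity units, not vendor data.
demand = 1000.0
current_node = 100.0
assumed_uplift = 1.5  # placeholder generational gain, NOT a benchmark result

print(nodes_needed(demand, current_node))                   # current generation
print(nodes_needed(demand, current_node * assumed_uplift))  # hypothetical faster nodes
```

With these placeholder inputs the node count drops from 13 to 9. A real sizing exercise would substitute measured per-node throughput for the workload in question.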

The bigger picture: ARM in the data center and the NVIDIA effect

The GB10–Vera benchmarks are not just an in-house duel. They mark a chapter in ARM’s server expansion, a path already blazed by Ampere Altra and AWS Graviton. NVIDIA’s ability to iterate quickly on ARM designs, bolstered by the Mellanox acquisition and GPU synergy, is shifting the balance versus traditional x86 offerings. For teams evaluating hardware for the next refresh cycle, these performance deltas help gauge whether the ARM ecosystem has reached the maturity required for mission-critical AI workloads.

Beyond the benchmark: what to watch next

Numbers only tell part of the story. Energy efficiency, software compatibility, and actual market availability will be the real proving grounds for Vera. In the meantime, the GB10 comparison offers a useful reference for those sizing on-premise clusters today and deciding whether to wait for the next generation. For practitioners, tracking this platform evolution is no longer optional; it is a prerequisite for informed architectural choices.