Qualcomm's China data center chips: Dragonfly AI accelerators nerfed for export compliance

Qualcomm is gearing up for a new push in the data center arena, this time with a clear geographic focus: China. The Dragonfly chip lineup, according to reports, will feature AI accelerators intentionally dialed back to stay below the performance thresholds set by U.S. export controls on advanced technology. This choice speaks volumes about the state of the tech cold war between Washington and Beijing, and it will have direct consequences for companies running on-premise AI infrastructure in China.

Playing by the rules: slower chips, open market

Since 2022, the United States has progressively tightened export restrictions on semiconductors bound for China, defining caps based on compute capability and interconnect bandwidth. To keep selling into that market, manufacturers must design variants that fall within these limits. This isn't unprecedented: Nvidia did the same with its A800 and H800 GPUs, which have lower interconnect bandwidth than the full-powered A100 and H100. In the case of Qualcomm's Dragonfly, the cut likely affects the number of active cores or the memory bandwidth, so that the chips meet the restrictions without fundamental changes to the core architecture. The goal is clear: offer accelerators that, while less capable than the unconstrained versions, are still useful for training and inference of medium-sized LLMs, preserving access to a Chinese market that would otherwise be off limits.

What it means for on-premise AI in China

For Chinese data centers and enterprises committed to self-hosted infrastructure, these chips represent a necessary middle ground. On one hand, they make it possible to build on-premise clusters for model training and serving, ensuring data sovereignty and direct control, a critical point under Chinese regulations. On the other hand, reduced per-chip performance forces a reconsideration of TCO: reaching the same compute throughput requires more units, with a knock-on effect on energy consumption and operational costs. Nevertheless, for many workloads such as fine-tuning or inference on quantized models (e.g., INT8), the performance gap may be acceptable, especially when the alternative is no access to advanced silicon at all. Moreover, running accelerators on-premise avoids dependency on foreign cloud providers, something Chinese authorities strongly favor. For those evaluating on-premise deployments in China, hardware vendor selection thus introduces an additional trade-off between performance and regulatory compliance.
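The TCO reasoning above can be made concrete with a rough back-of-the-envelope sketch. Everything in it is an illustrative assumption rather than a reported figure: the `Accelerator` fields, the `fleet_cost` helper, the relative throughput values, the power draw, the prices, and the electricity rate are all hypothetical, and the model ignores cooling, networking, rack space, and staffing.

```python
import math
from dataclasses import dataclass


@dataclass
class Accelerator:
    name: str
    throughput: float  # relative throughput per chip (arbitrary units, hypothetical)
    power_kw: float    # average power draw per chip in kW (hypothetical)
    unit_price: float  # purchase price per chip (hypothetical)


def fleet_cost(chip, target_throughput, years=3,
               price_per_kwh=0.10, hours_per_year=8760):
    """Estimate fleet size, energy use, and a simplified TCO for a throughput target."""
    # Slower chips mean more units are needed to hit the same aggregate throughput.
    units = math.ceil(target_throughput / chip.throughput)
    energy_kwh = units * chip.power_kw * hours_per_year * years
    capex = units * chip.unit_price
    opex = energy_kwh * price_per_kwh  # energy only; other operating costs are ignored
    return {
        "units": units,
        "energy_kwh": energy_kwh,
        "capex": capex,
        "opex": opex,
        "tco": capex + opex,
    }


# Hypothetical chips: neither set of numbers describes a real product.
full = Accelerator("unconstrained (hypothetical)", throughput=1.0,
                   power_kw=0.7, unit_price=30_000)
capped = Accelerator("export-compliant (hypothetical)", throughput=0.6,
                     power_kw=0.6, unit_price=22_000)

if __name__ == "__main__":
    for chip in (full, capped):
        result = fleet_cost(chip, target_throughput=100)
        print(f"{chip.name}: {result['units']} units, "
              f"{result['energy_kwh']:,.0f} kWh, TCO {result['tco']:,.0f}")
```

With these made-up inputs, the capped chip needs noticeably more units and more total energy to reach the same aggregate throughput, even though each unit is cheaper and draws less power. Whether the resulting TCO is acceptable depends entirely on real prices and measured performance for the workload in question.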

The fragmentation of AI hardware: two parallel worlds

Qualcomm's move is part of a broader bifurcation in the global AI semiconductor market. While the U.S. administration aims to slow China's technological progress in AI, Beijing's response has been to accelerate investments in domestic chipmakers like Biren and Moore Threads. The result is a kind of "silicon curtain" separating two hardware ecosystems: one based on full-performance Nvidia, AMD, and Intel products for the rest of the world, and one made of nerfed or locally designed solutions for China. For multinational companies with operations in China, this scenario adds complexity: managing heterogeneous server fleets with differently optimized drivers and frameworks can increase maintenance costs and reduce software portability. Moreover, fragmentation risks slowing the adoption of shared standards, creating separate pockets of expertise.

Outlook: a precarious balance between trade and security

The strategy of nerfed versions keeps the door to the Chinese market open, but it's an unstable equilibrium. U.S. regulators could tighten the parameters further, making the newly introduced chips unsellable in China almost overnight. At the same time, China is investing heavily to close the technology gap, which could lead, in the medium term, to competitive homegrown solutions and less reliance on imports. For IT managers and decision-makers operating in China's AI sector, this regulatory and technological uncertainty becomes a key variable in on-premise infrastructure planning. The analysis of trade-offs between performance, TCO, and supply risk will only grow in importance. At AI-RADAR, we will continue to track these developments, because hardware choices for inference and training are never just about specs: they are strategic factors that shape the long-term viability of projects.