The Global AI Landscape: An Evolving Balance

The 2026 AI Index Report from Stanford University, published this week, offers an in-depth assessment of the current state of artificial intelligence. The 423-page document, produced by the Stanford Institute for Human-Centered Artificial Intelligence, examines research output, model performance, investment flows, public sentiment, and responsible AI. Among its more unsettling findings: the assumption of a durable US lead in AI model performance is no longer supported by the data.

The headline findings are striking, but the more consequential insights lie in the sections that have received less coverage, particularly those on AI safety. There, the gap between what models can do and how rigorously they are evaluated for potential harm has widened rather than narrowed, raising pointed questions for companies evaluating on-premise LLM deployments.

Model Performance and Supply Chain Vulnerabilities

The narrative that places the United States at the forefront of AI development needs an update. According to the report, US and Chinese models have traded the top performance position multiple times since early 2025. In February 2025, DeepSeek-R1 briefly matched the top US model. As of March 2026, Anthropic's leading model holds just a 2.7% advantage. Although the US still produces more top-tier AI models (50 in 2025 versus China's 30) and holds higher-impact patents, China now leads in publication volume, citation share, and patent grants. Its count of papers among the 100 most-cited AI papers grew from 33 in 2021 to 41 in 2024. The practical implication is clear: the US technological lead in AI model performance is no longer durable but shifts with each major model release.
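The raw numbers bear a quick sanity check. Below is a back-of-the-envelope sketch in Python; every input comes from the report figures cited above, and the only thing added is the arithmetic:

    # Back-of-the-envelope arithmetic using figures from the report.
    top_models_us, top_models_cn = 50, 30   # top-tier models produced in 2025
    cited_2021, cited_2024 = 33, 41         # Chinese papers among the 100 most-cited, by year
    us_lead_pct = 2.7                       # leading US model's advantage, March 2026

    # Relative growth of China's presence among the 100 most-cited papers.
    growth = (cited_2024 - cited_2021) / cited_2021
    print(f"Cited-paper growth, 2021-2024: {growth:.0%}")                    # 24%

    # US output of top-tier models relative to China's.
    print(f"US/China top-model ratio: {top_models_us / top_models_cn:.2f}")  # 1.67

    print(f"Current US performance lead: {us_lead_pct}%")                    # thin enough to flip

A 24% rise in citation presence over three years, against a 2.7-point performance lead, is the quantitative core of the report's "no longer durable" conclusion.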

The report also identifies a critical structural vulnerability in global AI infrastructure. The US hosts 5,427 data centers, ten times as many as any other country, yet a single company, TSMC, fabricates almost every leading AI chip running inside them. The entire global AI hardware supply chain runs through one foundry in Taiwan, even though a TSMC expansion in the US began operations in 2025. This single point of dependence poses a significant risk to resilience and data sovereignty, both central concerns for organizations considering self-hosted or air-gapped deployments.

The Gap in AI Safety and Governance

AI safety benchmarking is not keeping pace with model capabilities. While almost every frontier model developer reports results on capability benchmarks, the same is not true for responsible AI benchmarks. In the report's table of safety and responsible AI benchmarks, most entries are simply empty. Only Claude Opus 4.5 reports results on more than two of the tracked responsible AI benchmarks, and only GPT-5.2 reports a StrongREJECT result. For most frontier models, no data is reported on benchmarks measuring fairness, security, and human agency.
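To see how sparse that reporting is, consider a minimal tabulation sketch. The model names follow the report, but the individual scores are hypothetical placeholders; None stands in for the empty cells that dominate the report's actual table:

    # Illustrative coverage matrix for responsible AI benchmarks.
    # None means "no result reported" -- the dominant case in the report's table.
    # All numeric scores below are hypothetical, for illustration only.
    BENCHMARKS = ["fairness", "security", "human agency", "StrongREJECT"]
    coverage = {
        "Claude Opus 4.5":  [0.91, 0.88, 0.84, 0.97],
        "GPT-5.2":          [None, None, None, 0.95],
        "Frontier Model C": [None, None, None, None],   # the typical row
    }

    for model, scores in coverage.items():
        reported = sum(s is not None for s in scores)
        print(f"{model}: {reported}/{len(BENCHMARKS)} benchmarks reported")

When most rows print "0/4", cross-model safety comparison reduces to guesswork, which is precisely the problem the report documents.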

This does not mean that labs are doing no internal safety work. The report acknowledges that red-teaming and alignment testing occur, but the results are rarely disclosed against a common, externally comparable set of benchmarks. The effect is that external comparison on AI safety dimensions is effectively impossible for most models. Meanwhile, documented AI incidents rose to 362 in 2025, up from 233 in 2024. The organizational governance response is struggling to keep pace: the share of organizations rating their AI incident response as “excellent” dropped from 28% in 2024 to 18% in 2025, even as the frequency of incidents per organization increased. For enterprises, this translates directly into TCO: rising incident-management costs and no shared standard for safety evaluation.
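The scale of that governance gap is easy to quantify from the report's own figures; the sketch below adds nothing beyond the arithmetic:

    # Year-over-year growth in documented AI incidents (report figures).
    incidents_2024, incidents_2025 = 233, 362
    growth = (incidents_2025 - incidents_2024) / incidents_2024
    print(f"Incident growth 2024 -> 2025: {growth:.0%}")            # 55%

    # Organizations rating their AI incident response "excellent".
    excellent_2024, excellent_2025 = 0.28, 0.18
    drop_pts = (excellent_2024 - excellent_2025) * 100
    print(f"Drop in 'excellent' self-ratings: {drop_pts:.0f} percentage points")  # 10

    # Incidents up ~55% while self-assessed response quality falls:
    # the squeeze on incident-management costs described above.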

Public Perception and Trust in Regulation

Public anxiety is rising alongside AI adoption. Globally, 59% of people surveyed say AI's benefits outweigh its drawbacks, up from 55% in 2024. At the same time, 52% say AI products and services make them nervous, up two percentage points in one year. Both figures are climbing at once, reflecting a public that is using AI more while growing more uncertain about where it leads.

The expert-public divide on AI's employment effects is particularly sharp: 73% of AI experts expect AI to have a positive impact on how people do their jobs, compared with just 23% of the general public. These gaps matter because public trust shapes regulatory outcomes, and regulatory outcomes shape how AI is deployed. On that front, the report flags something striking: Americans reported the lowest trust in their own government to regulate AI responsibly (31%), against a global average of 54%. The EU, by contrast, is more trusted than either the US or China to regulate AI effectively, with a median of 53% across the 25 surveyed countries.