A New AI Approach to Combat Doping in Sports

Global anti-doping programs traditionally rely on biological testing, an effective yet costly methodology with inherent limitations. Testing a single sample can cost more than $800, and many prohibited substances have extremely short detection windows, so large segments of athletes go without regular checks. This reality has spurred the search for complementary approaches that analyze competition results to identify suspicious performance patterns.

A new system aims to fill this gap. The project has processed 1.6 million athletic performances from more than 19,000 competitions held between 2010 and 2025. The goal is to provide an additional screening tool that, while not replacing biological tests, can direct testing resources more precisely and promptly.

Detection Methodologies and Validation

The system employs eight detection methodologies, ranging from established statistical rules to advanced machine learning techniques and trajectory analysis. Trajectory analysis, in particular, compares an athlete's current performances with their expected career progression, looking for significant deviations that could indicate the use of prohibited substances.
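The article does not describe the model behind the trajectory analysis, so the following is only a minimal sketch of the general idea. It assumes a straight-line career trend fitted to an athlete's earlier marks and measures how far a new mark falls from that trend. The function names, the sample 100 m times, and the cutoff of 3 standard deviations are all hypothetical, chosen for illustration.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def trajectory_deviation(ages, marks, new_age, new_mark):
    """Distance of new_mark from the fitted trend, in residual standard deviations.

    Assumption: a linear trend is a stand-in for whatever progression
    model the real system uses. The spread uses the plain sample standard
    deviation of the residuals, which is a simplification.
    """
    a, b = fit_line(ages, marks)
    residuals = [y - (a + b * x) for x, y in zip(ages, marks)]
    spread = stdev(residuals)
    expected = a + b * new_age
    return (new_mark - expected) / spread


# Hypothetical season-best 100 m times (seconds) at ages 20 to 24.
ages = [20, 21, 22, 23, 24]
marks = [10.40, 10.35, 10.31, 10.28, 10.24]

THRESHOLD = 3.0  # hypothetical cutoff, not taken from the article

on_trend = trajectory_deviation(ages, marks, 25, 10.20)
sudden_jump = trajectory_deviation(ages, marks, 25, 9.95)

print("on-trend mark flagged:", abs(on_trend) > THRESHOLD)
print("sudden jump flagged:", abs(sudden_jump) > THRESHOLD)
```

A large deviation here is a reason for an expert to look closer, not evidence of a violation: injuries, coaching changes, or sparse data can all produce the same signal.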

The methods were validated by comparing the system's results against publicly confirmed anti-doping violations. Trajectory-based methods showed the best balance between identifying actual violations and minimizing false alarms. However, the system faces significant challenges, such as the incompleteness of the available data and the rarity of confirmed violations, both of which complicate model training and calibration.
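The article does not say which metrics were used, so the following is only a sketch of how such a comparison could be scored. It assumes the usual precision, recall, and false-alarm-rate measures. The function name and the athlete IDs are hypothetical, and the sets are made-up data.

```python
def evaluate_flags(flagged, confirmed, all_athletes):
    """Score a set of flagged athletes against publicly confirmed violations.

    All three arguments are sets of athlete IDs (hypothetical data layout).
    """
    tp = len(flagged & confirmed)                  # flagged and confirmed
    fp = len(flagged - confirmed)                  # flagged, no confirmed violation
    fn = len(confirmed - flagged)                  # confirmed but missed
    tn = len(all_athletes - flagged - confirmed)   # neither flagged nor confirmed
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "false_alarm_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Made-up example: 100 athletes, 4 confirmed violations, 6 flagged by a method.
all_athletes = set(range(100))
confirmed = {1, 2, 3, 4}
flagged = {1, 2, 3, 10, 11, 12}

scores = evaluate_flags(flagged, confirmed, all_athletes)
print(scores)
```

Because confirmed violations are rare and the record of them is incomplete, an athlete counted here as a false alarm is only unconfirmed, not necessarily clean, so precision computed this way is likely a lower bound.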

Implications for Data Sovereignty and Deployment

A system managing such sensitive data as athletic performances and potential anti-doping violations raises crucial questions regarding data sovereignty and regulatory compliance. Handling personal and potentially incriminating information requires strict control over the processing and storage infrastructure. For sports organizations and anti-doping agencies, choosing an on-premise or self-hosted deployment can offer significant advantages.

Adopting an on-premise approach allows for direct control over hardware, physical and logical data security, and compliance with regulations like GDPR. This can reduce risks associated with reliance on third-party cloud providers and helps keep data within the desired jurisdictional boundaries. Although an on-premise deployment might entail higher upfront capital expenditure (CapEx), a long-term Total Cost of Ownership (TCO) analysis could reveal benefits in operational costs and, crucially, risk mitigation. The ability to operate in air-gapped environments or with limited connectivity is another factor to consider where maximum security is required.

Future Prospects and Support for Human Decision-Making

The system is designed to support, not replace, existing anti-doping processes. It offers an interactive interface that allows experts to conduct in-depth investigations, emphasizing transparency and human judgment. This hybrid approach, combining the efficiency of automated analysis with the experience and discretion of investigators, is fundamental to ensuring the fairness and credibility of anti-doping programs.

Challenges related to data completeness and the rarity of confirmed violations remain central to the system's continuous improvement. Nevertheless, the integration of advanced data analysis and machine learning techniques represents a significant step towards smarter, more efficient, and proactive anti-doping programs, capable of protecting the integrity of sport in an era of increasing complexity.