LLMs Revolutionize Archives: Deciphering Handwriting at Scale

LLMs Uncover Secrets of Handwritten History

Deciphering handwriting at scale has long been one of the most formidable challenges in artificial intelligence and archival preservation. For decades, researchers and archivists have sought methods to make millions of pages of historical documents accessible, many of them written in dense cursive or in highly varied hands. Early attempts to automate this process date back to the 1960s and came with optimistic predictions of machines capable of 'devouring' handwritten texts, but in practice the problem demanded decades of specialized research and the development of entire commercial industries. Even pioneers like Yann LeCun, Turing Award laureate for his contributions to deep learning, acknowledged the problem's complexity and initially focused on more controlled contexts, such as handwritten digit recognition.

Today, the landscape is changing radically with the advent of Large Language Models (LLMs). These general-purpose AI models, while not perfect, have reached a level of accuracy sufficient to change archival practice. Pages that once required paleography training, custom software, or weeks of meticulous work can now be turned into usable transcriptions in seconds. Historical collections that were preserved but remained effectively inaccessible are becoming searchable, allowing scholars and families to explore questions that were previously prohibitive in time and cost.

The LLM Breakthrough in Archives: Precision, Speed, and Reduced Costs

The ability of LLMs to tackle the complexity of handwriting has been demonstrated by concrete research. Mark Humphries, a history professor at Wilfrid Laurier University, spent a decade grappling with 10 million pages of World War I pension records in Canada. These documents were written by hundreds of different clerks and administrators, which made it impossible to use specialized models trained on a single hand. After OpenAI released GPT-4 in 2023, Humphries began testing LLMs for transcription, with promising results.

Two years of systematic testing, with results published in May 2025 in Historical Methods, confirmed his observations. On a corpus of 50 English-language letters, legal records, and diary entries from the 18th and 19th centuries, LLMs outperformed Transkribus, a specialized handwriting recognition software used by over 150 major universities and archives, in accuracy, speed, and cost. While Transkribus recorded a character error rate of around 8% on documents it had not been trained on, Humphries' best LLM-based approach reduced this to below 2%, completing the work 50 times faster and at roughly 1/50th the cost. In response to these findings, Transkribus has announced the integration of Large Language Models directly into its platform, acknowledging the technology's potential. This trend is consistent with Richard Sutton's argument, often called the 'bitter lesson', that general methods leveraging computation eventually outperform specialized ones.
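For readers unfamiliar with the metric, character error rate is conventionally defined as the edit distance between a transcription and a trusted reference, divided by the length of the reference. The sketch below illustrates that standard definition only; it is an assumption that the study scored its results this way in detail, and the function names and sample strings are invented for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # Keep the inner loop over the shorter string to save memory.
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: a human-checked line and a machine reading of it.
    reference = "pension granted to the widow"
    hypothesis = "pensien granted to the widdow"
    print(f"CER: {character_error_rate(reference, hypothesis):.1%}")
```

Under this definition, a rate of 8% means roughly eight character-level corrections per hundred characters of reference text, while a rate below 2% means fewer than two.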

Practical Implications and Data Sovereignty Relevance

The practical consequences of this innovation are already unfolding in various institutions. Lianne Leddy, a co-author of Humphries' research, is using these tools to trace the experiences of Indigenous women across North America through fur trade post journals, baptismal records, and marriage registries. These documents were often written by men with a limited perspective on these women's lives. The ability to read thousands of documents to find a handful of relevant details is transforming the scale of historical research.

The University of North Carolina at Chapel Hill is also experimenting with AI transcription across its special collections material, with particular success in handling ledgers, which feature variable tabular structures and have long been difficult to process. Archivists such as Jackie Dean have noted that models like Gemini handle tables exceptionally well, which represents a major leap forward. Similarly, the Federal Reserve Bank of Philadelphia is using LLMs to extract data from historical vehicle registrations and property deeds, opening economic research questions that were previously too costly to pursue.

Managing sensitive historical data, such as personal journals, government records, or family histories, raises crucial questions of data sovereignty and compliance, particularly for institutions evaluating on-premise deployments. Adopting LLMs for these applications requires weighing public cloud APIs against self-hosted or air-gapped solutions. The latter keep data and inference under the institution's own control, which is fundamental to the security and privacy of archival information.

Future Prospects and the Democratization of Historical Access

The journey of AI in handwriting recognition has been long and complex. In the 1980s, Yann LeCun viewed character recognition as a means to explore computer vision in an era of limited computational resources. Since then, systems have evolved to read entire lines of text and to use language models to interpret context. While LeCun considers the problem largely solved for many general purposes, further progress remains crucial for specialized groups working with particularly challenging historical documents. Even a marginal improvement in speed or reliability can unlock new research possibilities.

Mark Humphries is advancing this work with Archive Pearl, a not-for-profit tool currently in beta, designed to allow researchers to drag and drop hundreds of pages and get clean transcriptions back in minutes. The goal is democratization: making these tools accessible not just to professional historians, but also to undergraduates and anyone conducting genealogical or family research. The ability to unlock texts in technical Latin or other archaic forms, which would otherwise require a lifetime of study to understand, represents a further step towards more inclusive access to global historical heritage. For institutions wishing to maintain control over their sensitive data, an on-premise or hybrid approach to running such tools remains critical, while still allowing broader access and reduced operational costs.