The Role of AI in World History Research: Insights from Young Scholars

Introduction

Artificial intelligence (AI) now permeates nearly every aspect of human life and is profoundly changing how we understand and transform the world. In academic research, AI makes text processing more efficient and excels at content mining and algorithmic filtering. However, it also has inherent limitations, such as value biases and ethical risks, which have made it a hot topic across disciplines. This article invites three young scholars who study the histories of different countries and regions to discuss how AI is applied in world history research, how it affects the boundaries of research, and what challenges it poses.

How AI Drives World History Research

Moderator: In recent years, AI technology has rapidly developed, and scholars across disciplines have explored its potential applications in their fields, including world history research. Can each of you share how AI plays a role in your specific research areas?

Wang Sijie: In German history, my own field, scholars in China and abroad mainly apply AI to optical character recognition and transcription of historical manuscripts and archives, and to content mining with techniques such as topic modeling and text reuse detection. AI has significantly deepened existing digital history work, for example by identifying hidden relationships and intermediary nodes in social network analyses of archives. Digital historians have long used programming languages for word frequency statistics and co-occurrence analysis to identify potential themes, but these methods are limited to statistical associations at the word level and struggle to capture deeper historical phenomena such as semantic change and rhetorical differences. Recent advances in pre-trained deep learning language models make it possible to transform texts into vector representations that reflect contextual semantics, so that the same historical theme can be identified under different expressions and explanatory summaries or labels can be generated directly.
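
To make the word-level limitation concrete, the following is a minimal sketch of document-level co-occurrence counting of the kind described above. The function name, the three sample sentences, and the stopword list are all invented for illustration; they are not drawn from any project mentioned in this discussion.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(documents, stopwords=frozenset()):
    """Count how often each unordered word pair appears in the same document."""
    pair_counts = Counter()
    for text in documents:
        # Lowercase, split on whitespace, and strip surrounding punctuation.
        tokens = {w.strip(".,;:!?\"'()").lower() for w in text.split()}
        tokens = {t for t in tokens if t and t not in stopwords}
        # Sorting gives each pair one canonical (alphabetical) order.
        for a, b in combinations(sorted(tokens), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

# Invented sample sentences, not historical sources.
docs = [
    "The guild petitioned the council.",
    "The council taxed the guild.",
    "The corporation petitioned the magistrate.",
]
counts = cooccurrence_counts(docs, stopwords={"the"})
guild_council = counts[("council", "guild")]          # pair seen in two documents
guild_corporation = counts[("corporation", "guild")]  # pair never seen together
```

In this toy corpus "guild" and "corporation" never share a document, so a pure co-occurrence count records no link between them even if a historian would treat them as the same kind of body. That gap is what contextual vector representations are meant to narrow.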

Yao Nianda: In international scholarship on American history, AI is applied as a comprehensive set of computational methods centered on natural language processing and machine learning. This approach converts diverse historical materials, such as newspapers and government documents, into quantifiable objects, and uses techniques such as topic modeling, text embedding, and semantic analysis to reveal long-term changes in language, concepts, and political discourse, providing new clues and evidence for historical interpretation. For instance, the Stanford team led by Nikhil Garg analyzed large-scale 20th-century corpora to quantify changes in gender and ethnic stereotypes in language and connect them to social structural transformations. Another American scholar, Melissa Lee, tracked the shift of the term “United States” from plural to singular usage in 19th-century newspapers and congressional debates, showing how this shift reflected Americans’ changing understanding of national sovereignty.
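
As a rough illustration of how a plural-to-singular shift can be turned into a quantifiable object, the sketch below counts which verb form follows “United States” in dated texts and reports the singular share per decade. It is a simplified assumption about one possible approach, not the method of the study cited above; the verb list, the function name, and the sample records are all invented.

```python
import re
from collections import defaultdict

# Only four verb forms are checked; a real study would need a broader grammar.
PATTERN = re.compile(r"\bUnited States (is|are|has|have)\b", re.IGNORECASE)
SINGULAR = {"is", "has"}

def singular_share_by_decade(records):
    """records: iterable of (year, text). Returns {decade: share of singular verbs}."""
    tallies = defaultdict(lambda: [0, 0])  # decade -> [singular count, total count]
    for year, text in records:
        decade = (year // 10) * 10
        for match in PATTERN.finditer(text):
            verb = match.group(1).lower()
            tallies[decade][0] += verb in SINGULAR  # True counts as 1
            tallies[decade][1] += 1
    return {d: s / n for d, (s, n) in sorted(tallies.items())}

# Invented sentences for demonstration only, not quotations from sources.
sample_records = [
    (1805, "The United States are resolved to maintain peace."),
    (1808, "The United States have agreed; the United States is patient."),
    (1885, "The United States is a nation. The United States has spoken."),
]
shares = singular_share_by_decade(sample_records)
```

With these invented records the singular share is one in three for the 1800s and two in two for the 1880s. On a real corpus the interesting work lies in what the counts cannot decide, such as OCR errors, quotations, and genre differences between newspapers and debates.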

Yi Jinming: Recently, the intersection of medieval European history and AI has focused on using AI technology for automatic transcription, completion, and structural analysis of medieval materials, enhancing the readability, retrievability, and analyzability of ancient texts. For example, through handwriting recognition and layout analysis, tools like Transkribus automatically transcribe medieval manuscripts and archival images into searchable texts. Additionally, knowledge graphs and semantic web technologies structure relationships among people, places, and institutions found in charters, ledgers, and letters into queryable data networks. A research team from Spain proposed establishing a knowledge graph for medieval charters by combining expert annotations, community contributions, and provenance mechanisms to structure dispersed charter data into a queryable knowledge network, supporting systematic analysis of medieval social, legal, and economic relationships.
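
The idea of structuring charter data into a queryable network with provenance can be sketched in a few lines. The example below stores relationships as subject-predicate-object triples, each tagged with where the statement came from. The field names, predicates, and entries are hypothetical placeholders, not the schema of the Spanish project mentioned above.

```python
from collections import namedtuple

# Each statement records its provenance in the `source` field.
Triple = namedtuple("Triple", "subject predicate obj source")

# Placeholder entries invented for illustration.
triples = [
    Triple("Charter 7", "issued_by", "Abbot Example", "expert annotation"),
    Triple("Charter 7", "concerns_place", "Mill at Riverford", "expert annotation"),
    Triple("Charter 9", "issued_by", "Abbot Example", "community contribution"),
]

def query(triples, subject=None, predicate=None, obj=None):
    """Return the triples that match every field that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]

# Which charters were issued by the same person, and who asserted it?
issued = query(triples, predicate="issued_by", obj="Abbot Example")
```

Keeping the source on every triple lets a researcher filter out, or separately review, statements that came from community contributions rather than expert annotation.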

Limitations of AI in World History Research

Moderator: While AI significantly enhances research efficiency, it also has notable limitations. What are the current bottlenecks faced by AI technology in historical research?

Yao Nianda: There are several bottlenecks in applying AI to historical research, reflecting a structural mismatch between current AI technology and historical studies. Firstly, AI struggles to resonate emotionally with human society. As Croce pointed out, all history is contemporary history. A vital historical research topic often responds to current social issues and evokes emotional resonance among readers. Therefore, determining which historical problems are meaningful today relies heavily on researchers’ sensitivity to public issues and human experiences. AI can summarize existing discussions but cannot genuinely understand the emotional connections between historical issues and human practices.

Secondly, AI faces the unavoidable problem of semantic drift when analyzing historical texts. Most language models are trained on contemporary corpora, and applying them directly to historical text analysis can lead to misinterpretations based on modern semantics and language habits. Even attempts by teams like the University of Zurich to train models on historical corpora are limited by the incompleteness and imbalance of existing historical texts.

Moreover, AI’s value judgments are not neutral and are inevitably influenced by the mainstream norms and contemporary values present in the training data. When these models are used in historical research, they may inadvertently assess the past by contemporary standards, thus weakening the historical context.

Finally, a critical bottleneck is the “black box” nature of AI. In many cases, humanists find it challenging to explain how AI reaches a particular conclusion. For humanities disciplines that prioritize explainability and discussability, a lack of clarity in the analysis process makes it difficult to hold researchers accountable for their conclusions.

Yi Jinming: In text analysis, AI is mainly applied to types of historical materials that are abundant and already digitized, such as contracts and correspondence, while its application in other areas remains limited. This limitation has two main causes. First, the training of AI models relies heavily on large-scale, machine-readable corpora. For instance, a 2024 study by a team from the University of Bern drew on more than 6,000 letters from the Florentine merchant banking network, but many medieval materials do not reach that scale or quality. Second, medieval documents often have complex handwriting, numerous abbreviations, and poor preservation, which raises the cost of text recognition and transcription. Although platforms like Transkribus have made large-scale reading more feasible, training and proofreading still require significant human effort and time, so researchers tend to prefer archival databases that have already been organized.

Wang Sijie: As mentioned, the imbalance of corpora affects the scope of AI usage. A similar issue is that general-purpose large language models are trained primarily on data from the English-speaking world, which often leads to a Western-centric perspective in historical narratives. AI still struggles with semantic recognition and with understanding long, complex sentences in materials written in less widely used languages. In addition, British and American archives enjoy significant advantages in digitization and open access, and some databases offer APIs for automated batch retrieval and deep processing. This “digital divide” is particularly pronounced in transnational history research, where researchers tend to use easily accessible and highly structured British and American materials, which distorts the reconstruction of the overall historical picture.

Coexisting with AI in Historical Research

Moderator: Given the limitations of AI, what methods can be employed to address these challenges?

Yao Nianda: The fundamental solution to these limitations would be technological advances that eliminate them. However, a more realistic approach for humanists is to mitigate these limitations through methodological design and research norms, ensuring that AI remains controllable and verifiable. First, it is crucial to maintain the leading role of human researchers in the problem-setting phase. Which historical questions are worth raising, and why they matter, must be determined by researchers’ understanding of contemporary society and historiographical traditions, not generated by models. Second, when using AI to analyze historical texts, research methods must clearly distinguish the contemporary language of the models from the historical language of the sources, and strive to restore the historical context of the materials. Lastly, faced with the “black box” nature of AI, historians should make the research process more transparent and strengthen their own sense of responsibility. Even if the algorithms themselves are not fully explainable, researchers should state the types of models used, the scope of the corpus, and the analysis steps, so that the research path remains traceable and the conclusions can withstand academic scrutiny.

Wang Sijie: We could attempt to build specialized models for specific fields, such as those serving early American history or German historiography. These specialized models can utilize retrieval-augmented generation (RAG) techniques to conduct material retrieval through local structured knowledge bases, ensuring contextual anchoring while enhancing controllability. Specialized models have independent memory and parameters and can be fine-tuned for specific languages and historical contexts. Importantly, local knowledge bases can include diverse perspectives on historical narratives, allowing researchers to incorporate insights from local historians into their prompts to counteract potential geopolitical biases in the models.
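
The retrieval step of such a setup can be sketched as follows. This is a deliberately simplified, standard-library illustration: it ranks passages in a small local knowledge base by bag-of-words cosine similarity and assembles a prompt that anchors the model to those passages. Real systems would typically use learned embeddings and then send the prompt to a language model, which is omitted here. The knowledge base entries, ids, and function names are invented placeholders.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Turn text into a bag-of-words count vector (lowercase letters only)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, knowledge_base, k=2):
    """Return the k entries most similar to the question."""
    q = vectorize(question)
    ranked = sorted(knowledge_base, key=lambda e: cosine(q, vectorize(e["text"])), reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    """Assemble a prompt that restricts the answer to the retrieved sources."""
    sources = "\n".join(f"[{p['id']}] {p['text']}" for p in passages)
    return (
        "Answer using only the sources below and cite their ids.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )

# Invented placeholder entries; the last one stands in for a local historian's view.
knowledge_base = [
    {"id": "charter-12", "text": "The council grants the weavers guild a market stall."},
    {"id": "ledger-03", "text": "Grain prices rose after a poor harvest."},
    {"id": "local-essay", "text": "A local historian argues the guild resisted the council."},
]
question = "How did the council treat the weavers guild?"
top = retrieve(question, knowledge_base, k=2)
prompt = build_prompt(question, top)
```

Because the sources are passed in explicitly and cited by id, a reader can check every claim in the eventual answer against the local knowledge base, which is the controllability the approach is meant to provide.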

Yi Jinming: AI should be viewed as a “hypothesis generation tool” rather than a “conclusion verification tool.” To avoid AI becoming merely an efficiency tool for existing historiographical propositions, it is crucial to redefine its methodological role. Instead of using models to validate already established economic trends or institutional judgments, we should position them as mechanisms for generating hypotheses, actively identifying historical problems that have not been fully explained by theoretical frameworks. For instance, algorithms can reveal latent networks of low-frequency individuals across regions or identify semantic combinations of unconventional contractual clauses. These outputs do not directly constitute historical conclusions but provide historians with new leads and research directions, which can then be interpreted and validated by researchers in the context of archives and institutional backgrounds.

Moderator: In the context of AI profoundly influencing academic research paradigms, how should young world historians seek a balance between upholding historiographical traditions and embracing technological changes?

Yi Jinming: As AI gradually enters historical research practice, the importance of historiographical training has not diminished; rather, it has become more pronounced. First, the formation of problem awareness relies on long-term historiographical training, not merely on technical mastery. Truly innovative research often stems from questioning and reconstructing existing explanations, and this ability to question comes from familiarity with historiographical traditions, theoretical lineages, and methodological debates. Without an understanding of the history of historiography, it is hard to judge whether a pattern generated by AI is a “new discovery” or a “repetition of old problems.” Second, historiographical training cultivates a keen awareness of what is missing from the record. AI relies on visible data, but historical research often focuses on absent voices, marginalized groups, and unrecorded narratives. Only scholars with long-term historiographical training will recognize which groups are systematically absent from contracts or administrative documents and design supplementary paths accordingly. Lastly, the ability to critique sources is irreplaceable. However many textual patterns a model identifies, researchers must assess whether these patterns arise from the mechanisms by which archives were generated or from biases in their preservation. Thus, while actively using AI technology, historians must continue to prioritize traditional historiographical training.

Wang Sijie: Young scholars should let AI handle preliminary tasks such as archival screening, text recognition, and literature translation, and focus their energy on more creative interpretative work. As archival materials continue to be made public and digitized, young scholars can begin early in their careers to build a personal knowledge base composed of structured materials and diverse scholarly outputs, moving from being readers of archives to being managers of data. With the support of RAG technology, a personal knowledge base can retrieve materials by keyword, identify semantic connections, and integrate research viewpoints across multilingual corpora, greatly improving work efficiency. Additionally, young scholars should actively explore potential applications of AI in history. For example, they could use generative modeling techniques to simulate dialogues with historical figures based on their letters, diaries, and writings, or use historical simulations to model key wartime decisions or diplomatic negotiations. Such applications can not only assist history education but also stimulate researchers’ academic creativity.

Yao Nianda: I believe the relationship between world historians and AI should be viewed not as adversarial or substitutive but as a conscious coexistence with boundaries. It is essential to be clear that emphasizing the importance of humans in research does not negate the value of technology. Historians are difficult for machines to replace not merely because the technology is not yet mature, but because their core value comes from their awareness of problems and the meanings they assign to history. Therefore, humanists do not need to prove their irreplaceability by rejecting the use of AI. At the same time, we must be wary of the opposite extreme, in which the efficiency brought by AI unconsciously weakens researchers’ subjectivity. If researchers merely rely on models to generate conclusions, summaries, or analysis paths, research itself may degrade into organizing and restating model outputs. The key to coexisting with AI lies in clearly distinguishing between enhancing labor efficiency and replacing human thought.

Expert Commentary

Wang Tao, Professor at Nanjing University: Research methods in history change relatively slowly, yet the discipline does not reject methodological renewal and actively incorporates interdisciplinary thinking. If Sima Qian could see the current discussions among young historians about AI in historical research, he might feel a sense of familiar strangeness. The strange part is the overwhelming high-tech terminology: after quantitative history, digital humanities, big data, spatial analysis, and text mining, the recent impact of AI has added terms such as large language models and intelligent history. The technological turn in historical research deserves to be affirmed. Historians are not pursuing technology for its own sake but hope that tedious research work can be made more efficient with technological support. Whether capturing semantics from vast bodies of text or transcribing manuscripts, these are areas where large language models can excel. Young scholars, who are naturally more sensitive to these discussions, may feel hopeful because, on the traditional academic career path, they need to publish papers quickly and efficiently to establish their reputations. With the assistance of AI, the process of producing papers is undoubtedly streamlined, which is a significant temptation. No one wants to be the last to use AI tools for historical research.

If Sima Qian were to enter the AI era, he might not understand the technical concepts mentioned by the three young scholars, but he would certainly notice that beneath the technological aura, they are still discussing the comprehensibility, discussability, significance, and evaluation of history. This remains a topic he is somewhat familiar with, and he could even join the heated discussion among the three young scholars, adding a note of his own. Therefore, it is reassuring that while young scholars closely follow the most fashionable and cutting-edge methodologies, they can still adhere to the core of historiography as a guiding principle to define or evaluate the effectiveness and limitations of AI. They emphasize that as AI enters the realm of historical research, the foundational training in historiography must not be neglected, which is especially important. Only in this way can historical research counter the illusions brought by AI, overcome the exacerbated “digital divide,” and break through the “black box” nature of technology.

That said, traditional historiographical methodologies and the inertia of the discipline’s development are becoming increasingly untenable. Undoubtedly, history practiced purely as synthesizing, survey-style research may cease to exist: compiling a thorough, summarizing academic review is a task at which AI undoubtedly leads humans. How to chart the path ahead while keeping the technology under control, for example in applying retrieval-augmented generation to world history research, is something more historians will need to work out through continued experimentation in practice.