April 23, 2025

AI and the Propagation of Digital Fossils in Scientific Literature

On Digital Fossils and the Fallibility of Artificial Intelligence

The increasing use of Artificial Intelligence (AI) in research and science presents not only enormous opportunities but also unexpected challenges. One example is the spread of the term "vegetative electron microscopy," an expression that sounds scientific but has no meaning whatsoever. How could such a nonsensical term become established in scientific publications?

The Emergence of a Digital Fossil

The origin of the misleading term can be traced back to errors in the digitization of older scientific papers from the 1950s. Presumably, during scanning and optical character recognition (OCR), words from different columns were incorrectly combined. "Vegetative" from one column and "electron microscopy" from another led to the meaningless composite. This initial error was subsequently amplified and spread through the use of AI systems.
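The column-fusion error described above can be illustrated with a minimal sketch. The page layout and phrases here are invented for illustration; they are not the actual 1950s source text.

```python
# Minimal sketch of how naive row-wise OCR can fuse text across columns.
# Column contents are illustrative, not the real scanned papers.

left_column = [
    "vegetative",
    "cell cultures were",
]
right_column = [
    "electron microscopy",
    "used for imaging",
]

# Correct reading order: finish the left column, then the right one.
correct = " ".join(left_column + right_column)

# A naive row-wise pass instead stitches each left line to its right-hand
# neighbour, producing phrases that never appeared in either column.
garbled_rows = [f"{l} {r}" for l, r in zip(left_column, right_column)]

print(garbled_rows[0])  # "vegetative electron microscopy"
```

Once such a fused phrase lands in a published text, it is indistinguishable from intentional wording to any downstream reader, human or machine.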

The term appeared in some Iranian scientific papers and subsequently found its way into further publications and even newspaper articles. Once it was included in the training data of AI models such as GPT-3, GPT-4, and Claude 3.5, the term solidified further and became a so-called "digital fossil." Like biological fossils encased in rock, these digital artifacts persist in the information ecosystem and are reproduced by AI again and again.

The Challenge of Opaque AI Models

The lack of transparency around the training data of commercial AI models considerably complicates the correction of such errors. Developers like OpenAI do not disclose detailed information about the datasets used, which makes identifying and eliminating errors nearly impossible. The question of how many more such "digital fossils" exist in AI systems therefore remains open.

Impact on Science and Research

The spread of nonsensical terms by AI raises questions about quality assurance in scientific research. AI-powered tools are increasingly used to draft scientific texts, which raises the risk of propagating false information. Some screening tools now treat the term "vegetative electron microscopy" as a warning sign of AI-generated content. However, this approach can only catch errors that are already known, not undiscovered ones.
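A screening check of this kind can be sketched as a simple phrase lookup. This is a hypothetical helper, not the implementation of any named tool; only "vegetative electron microscopy" is documented in the article, and a real phrase list would need curation.

```python
# Hypothetical screener: flag text containing known "digital fossil" phrases.
# Only the one phrase documented in the article is listed here.
KNOWN_FOSSILS = [
    "vegetative electron microscopy",
]

def flag_fossils(text: str) -> list[str]:
    """Return every known fossil phrase found in the text (case-insensitive)."""
    lowered = text.lower()
    return [phrase for phrase in KNOWN_FOSSILS if phrase in lowered]

sample = "Samples were characterized by Vegetative Electron Microscopy."
print(flag_fossils(sample))  # ["vegetative electron microscopy"]
```

As the article notes, such a list-based check is inherently reactive: it can only flag fossils that have already been discovered and added to the list.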

The Way Forward

The challenge is to develop strategies to prevent the spread of misinformation through AI. More transparency regarding the training data of AI models would be an important step. Furthermore, improved methods for verifying and validating AI-generated content are necessary. The development of more robust algorithms capable of detecting inconsistencies and errors in the data is also crucial. Collaboration between technology companies, researchers, and publishers is essential to ensure the integrity of scientific information in the age of AI.
