150,000 Fake Citations Discovered in Scientific Literature in 2025
Researchers at Cornell University, UCLA, and UC Berkeley found that approximately 150,000 fabricated references entered peer-reviewed journals in 2025, most originating from AI-generated text. The discovery emerged from analysis of 111 million citations across 2.5 million research papers published between 2020 and 2025 on arXiv, bioRxiv, SSRN, and PubMed Central.
The study tracked citations whose titles could not be verified against major academic databases including Semantic Scholar, OpenAlex, and Google Scholar. By comparing post-2022 trends against pre-ChatGPT baselines, researchers isolated the likely contribution of large language model hallucinations to the surge.
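The title check described above can be sketched in a few lines. This is only an illustration, not the researchers' pipeline: the function names, the 0.9 similarity threshold, and the in-memory list `known_titles` (standing in for lookups against Semantic Scholar, OpenAlex, or Google Scholar) are all assumptions made for the example.

```python
import re
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def is_verifiable(title: str, known_titles: list[str], threshold: float = 0.9) -> bool:
    """Return True if the title closely matches any known title.

    `threshold` is a hypothetical cutoff chosen for illustration.
    """
    target = normalize(title)
    return any(
        SequenceMatcher(None, target, normalize(known)).ratio() >= threshold
        for known in known_titles
    )


def unverifiable(citations: list[str], known_titles: list[str]) -> list[str]:
    """Collect cited titles with no close match in the reference index."""
    return [title for title in citations if not is_verifiable(title, known_titles)]


# Hypothetical stand-in for a bibliographic database.
known_titles = [
    "Attention Is All You Need",
    "Deep Residual Learning for Image Recognition",
]

# The second title is invented for this example.
citations = [
    "Attention is all you need.",
    "Quantum Gradient Tides in Federated Oncology Models",
]

print(unverifiable(citations, known_titles))
```

A title that fails this kind of lookup is not proof of fabrication, since the work may simply be missing from the index. That is presumably why the study compared post-2022 rates against a pre-ChatGPT baseline rather than counting every unmatched title as a hallucination.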
The Timeline Matches AI Tool Evolution
The steepest rise began around mid-2024, roughly 18 months after ChatGPT's public release. AI writing tools had evolved from assistants into citation-generation engines by that point.
The contamination is not concentrated in obviously fraudulent papers. Researchers found that fake references are typically scattered sparsely across otherwise legitimate manuscripts, suggesting many researchers copy AI-generated citations without verification.
Peer Review Failed to Catch Most Fake Citations
Nearly 79% of fake citations passed arXiv moderation. Among bioRxiv preprints later published in peer-reviewed journals indexed by PubMed Central, 85.3% of hallucinated references made it into final published versions.
A separate audit published in The Lancet examined biomedical papers from 2023 through early 2026. Researchers found more than 4,000 fabricated references embedded across 2,810 peer-reviewed papers.
The rate of contamination accelerated sharply. In 2023, roughly one in 2,828 papers contained at least one fabricated citation. By 2025, the figure had worsened to one in 458 papers. By early 2026, it had climbed to one in 277 papers.
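To put those "one in N" figures on a common scale, the short calculation below converts each to a percentage and to a multiple of the 2023 rate. The three values come from the audit figures quoted above; the variable and function names are illustrative.

```python
# "One in N papers" figures quoted from the audit.
one_in_n = {"2023": 2828, "2025": 458, "early 2026": 277}


def rate_percent(n: int) -> float:
    """Share of papers with at least one fabricated citation, in percent."""
    return 100 / n


def fold_increase(baseline_n: int, later_n: int) -> float:
    """How many times higher the later rate is than the baseline rate."""
    return baseline_n / later_n


baseline = one_in_n["2023"]
for period, n in one_in_n.items():
    print(
        f"{period}: {rate_percent(n):.3f}% of papers, "
        f"{fold_increase(baseline, n):.1f}x the 2023 rate"
    )
```

By this arithmetic, the 2025 rate is about 6.2 times the 2023 rate, and the early 2026 rate is about 10.2 times.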
One Paper Was 60% Fabricated
A 2025 paper in an open-access oncology journal on ureteroileal surgical techniques contained 18 fabricated references among the 30 citations checked, or 60% of the paper's bibliography. Nearly 98% of affected papers had not faced any publisher action at the time of the audit.
The Self-Reinforcing Problem
Researchers warn the problem may now be self-reinforcing. As fabricated references embed themselves in open-access repositories and citation databases, future AI models trained on that corpus risk absorbing and reproducing the same hallucinations.
Publishers need automated reference verification systems before papers are accepted for publication, researchers said. Fabricated citations could compromise clinical guidelines and systematic reviews that rely on accurate literature synthesis.
Understanding how generative AI and LLMs produce hallucinations is now essential for research professionals. AI research training can help teams develop better verification practices and recognize when citations need validation.