Fake AI-Generated Citations Flooded Research in 2025
Nearly 150,000 fabricated citations entered scientific literature during 2025 as researchers increasingly used generative AI tools without verifying references, according to a cross-institutional study. The fake citations spread across academic databases, preprints, and peer-reviewed journals at an unprecedented scale, raising questions about the reliability of the scientific record.
Researchers from Cornell University, UCLA, and UC Berkeley analyzed roughly 111 million citations across 2.5 million academic papers on platforms including arXiv, bioRxiv, SSRN, and PubMed Central. They identified references whose titles could not be verified through major academic databases.
How the problem accelerated
The sharpest increase in fabricated citations began around mid-2024, coinciding with mainstream adoption of AI writing assistants that generate references automatically. Nearly 79% of fabricated references passed arXiv screening processes. Around 85% of fake citations in bioRxiv preprints later survived into peer-reviewed journal versions indexed in PubMed Central.
A hallucinated citation is an AI-generated reference that appears legitimate but does not exist. These may include invented paper titles, fabricated Digital Object Identifiers (DOIs), incorrect journal names, or mismatched author details. Researchers using AI assistants often assume generated citations are accurate without independently verifying them.
The biomedical research risk
Separate audits identified more than 4,000 fabricated references across 2,810 peer-reviewed biomedical papers, with affected studies continuing to rise through early 2026. The concern extends beyond publication errors.
Modern AI systems are trained on massive collections of publicly available academic material. If fabricated references remain embedded in those datasets, future AI models may reproduce and amplify the same fake citations. This creates a feedback loop where false information becomes part of the training infrastructure powering future research tools.
Clinical reviews, healthcare recommendations, and evidence-based guidance could eventually be compromised if fabricated references remain unchecked.
Why verification broke down
Academic publishing systems already struggle with growing submission volumes and limited reviewer capacity. Preprint platforms prioritize rapid publication, allowing errors to remain undetected before papers move into journals.
AI writing tools increasingly automate citation generation as part of drafting workflows, which lets researchers work faster but reduces manual verification. Combined with strained review capacity, this created an environment where plausible-looking but nonexistent references could blend into legitimate research papers.
What researchers are proposing
Suggested measures include automated reference checks against databases like Crossref and Semantic Scholar, mandatory machine-readable citation identifiers, and stricter editorial screening during submission.
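An automated reference check of the kind proposed here could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the method used in the study: the function names, the 0.9 similarity threshold, and the lookup callable are all hypothetical. In practice the lookup would wrap a title search against a database such as Crossref or Semantic Scholar; here it is left as a parameter so the matching logic can be read and tested without network access.

```python
import re
from difflib import SequenceMatcher
from typing import Callable, Iterable, List


def normalize(title: str) -> str:
    """Lowercase, replace anything other than a-z, 0-9, and spaces, collapse whitespace.

    Note: this simple rule also strips non-ASCII letters, which a real
    checker would want to handle more carefully.
    """
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def best_match(title: str, candidates: Iterable[str]) -> float:
    """Return the highest similarity (0.0 to 1.0) between title and any candidate."""
    target = normalize(title)
    return max(
        (SequenceMatcher(None, target, normalize(c)).ratio() for c in candidates),
        default=0.0,  # no candidates at all means nothing matched
    )


def flag_unverified(
    references: Iterable[str],
    lookup: Callable[[str], Iterable[str]],
    threshold: float = 0.9,  # assumed cutoff, chosen only for illustration
) -> List[str]:
    """Return the reference titles with no sufficiently close database match.

    `lookup` is a hypothetical callable that takes a title and returns
    candidate titles from a bibliographic database.
    """
    return [
        ref for ref in references
        if best_match(ref, lookup(ref)) < threshold
    ]
```

A flagged title is not proof of fabrication, since it may simply be missing from the database queried. A check like this is best treated as a prompt for manual verification rather than a verdict.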
Some experts argue that universities and research institutions should introduce clearer policies around AI-assisted writing and require authors to manually verify at least a portion of citations before publication.
Researchers using generative AI and LLM tools should understand how these systems can fail, including how they produce references that look legitimate but do not exist. Knowing those failure modes helps identify when verification is necessary.
AI can accelerate scholarship dramatically, but without stronger safeguards, speed comes at the cost of scientific trust.