AI Reviewing AI: 82% of Fabricated Papers Get Accepted

LLM reviewers green-light AI-generated papers roughly 4 out of 5 times, a new analysis finds. Without human sign-off and stronger checks, shaky work can seep into the literature.

Categorized in: AI News, Science and Research
Published on: Nov 12, 2025

AI peer reviewers are green-lighting AI-fabricated papers. Often.

New evidence suggests large language model (LLM) "reviewers" recommend acceptance for AI-generated manuscripts roughly 4 out of 5 times. The analysis, posted Oct. 20 as an arXiv preprint (not yet peer reviewed), shows how easily automated review loops can normalize unsound work.

Researchers generated 600 fake manuscripts using GPT-5, then asked three other OpenAI models (o3, o4-mini, and GPT-4.1) to review them. Despite flagging some integrity issues, the AI reviewers still recommended acceptance up to 82% of the time.

"AI can be misused to attack this vulnerable system," says study author Fengqing Jiang of the University of Washington, who has not released the underlying BadScientist code to avoid misuse. While the test set focused on computer science, the same setup could be tuned to produce manuscripts in other fields.

The team submitted this work to AI Agents for Science, a conference where submissions were written and reviewed exclusively by AI. It's a live experiment in whether AI can generate hypotheses, methods, and results at acceptable quality, and where the guardrails fail.

Why this matters for researchers, editors, and lab leads

AI-only loops are now plausible: AI generates a study, AI reviews it, and the cycle repeats. That risks a flood of plausible-sounding but unsound papers that pass basic checks and pollute the literature.

Once bad citations and synthetic results enter the reference chain, they're hard to unwind. You don't need intent to deceive for this to be a problem; speed and scale are enough.

What the study signals

  • Surface-level critique isn't enough: Models can spot issues yet still over-recommend acceptance.
  • Incentives favor passivity: AI reviewers don't bear reputational cost, so they default to approval.
  • Field portability: If it works for CS, it can be adapted for chemistry, biology, and beyond.

Practical safeguards you can implement now

  • Human-in-the-loop by policy: Require named human sign-off for every accept/reject. AI can assist, not decide.
  • Integrity scoring: Add a rubric that weights red flags: unverifiable citations, missing data/code, statistical impossibilities, method-result mismatches (see the scoring sketch after this list).
  • Provenance statements: Mandate disclosure of AI use in writing, analysis, and visualization. Ask for prompts, model versions, and generation dates when feasible.
  • Identity and origin checks: Verify author identities (e.g., ORCID), require data/code deposits, and spot-check for synthetic references and duplicate text across submissions (see the DOI spot-check sketch after this list).
  • Hybrid review workflow: Allow LLMs for summaries and checklists, but not as the reviewer of record. Any AI-produced critique must be audited by humans.
  • Repro-lite: For empirical work, run a minimal reproduction: load data, execute core analysis, and compare key numbers to the manuscript (see the repro-lite sketch after this list).
  • Reviewer training: Teach editors and referees how to detect AI-generated text, images, and "too-clean" narratives, and how to use AI as a skeptical assistant, not a rubber stamp.
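
To make the integrity-scoring idea concrete, here is a minimal sketch of a weighted red-flag rubric. The flag names, weights, and escalation threshold are illustrative assumptions, not values from the study or any journal's policy.

```python
# Minimal integrity-scoring sketch: flag names, weights, and the threshold
# below are illustrative assumptions, not values from the study.
RED_FLAG_WEIGHTS = {
    "unverifiable_citation": 3,
    "missing_data_or_code": 2,
    "statistical_impossibility": 4,
    "method_result_mismatch": 3,
}

def integrity_score(flags: dict) -> int:
    """Sum weighted counts of red flags; higher means more scrutiny needed."""
    return sum(RED_FLAG_WEIGHTS.get(name, 0) * count for name, count in flags.items())

def triage(flags: dict, threshold: int = 5) -> str:
    """Route a submission: at or above the threshold, escalate to a human editor."""
    return "escalate_to_human_editor" if integrity_score(flags) >= threshold else "standard_review"

# Example: two unverifiable citations plus one method-result mismatch.
print(triage({"unverifiable_citation": 2, "method_result_mismatch": 1}))
# -> escalate_to_human_editor
```

The point of the rubric is to make the acceptance bar explicit and auditable instead of leaving it to a model's default agreeableness.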
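For the reference spot-check, one lightweight approach is to resolve each cited DOI against the public Crossref REST API; a DOI that doesn't resolve is a candidate synthetic reference, not proof of fraud. This sketch assumes the requests package is installed, and the DOIs shown are placeholders.

```python
# Synthetic-reference spot-check via the Crossref REST API.
# A non-resolving DOI is a flag for manual follow-up, not proof of fabrication.
import requests

def doi_resolves(doi: str, timeout: float = 10.0) -> bool:
    """Return True if Crossref knows the DOI, False otherwise."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=timeout)
    return resp.status_code == 200

# Placeholder DOIs for illustration only.
cited_dois = ["10.1000/example.real", "10.9999/possibly.fabricated"]
suspect = [doi for doi in cited_dois if not doi_resolves(doi)]
print("DOIs to check by hand:", suspect)
```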
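And a repro-lite pass can be as small as recomputing one headline number from the deposited data. The file name, column, and reported value here are hypothetical placeholders; the sketch assumes pandas is installed.

```python
# Repro-lite sketch: recompute one headline statistic from the deposited data
# and compare it to the manuscript. File, column, and values are placeholders.
import pandas as pd

REPORTED_MEAN = 0.73   # value claimed in the manuscript (placeholder)
TOLERANCE = 0.01       # acceptable absolute deviation

df = pd.read_csv("deposited_data.csv")      # data deposit required by policy
recomputed = df["primary_outcome"].mean()   # re-run the core analysis step

if abs(recomputed - REPORTED_MEAN) > TOLERANCE:
    print(f"Mismatch: manuscript reports {REPORTED_MEAN}, reproduction gives {recomputed:.3f}")
else:
    print("Headline number reproduces within tolerance.")
```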

Community sentiment

A 2025 Nature survey of more than 5,200 scientists reports that over 90% find it acceptable to use generative AI to edit or translate their own work. But 60% say it isn't acceptable to use generative AI to conduct peer review, though 57% are fine with AI assisting reviewers by answering questions about a paper.

What to watch next

  • Policy updates: Expect clearer rules from journals and funders on AI disclosure, reviewer conduct, and minimum reproducibility.
  • Verification tech: Growth in provenance tools, reference validation, and automated checks for statistical anomalies and data leakage.
  • Benchmarking: Shared test suites to evaluate AI reviewers on rigor, not just convenience.

Bottom line: AI can accelerate peer review, but it should raise the bar for scrutiny, not lower it. Put humans on the hook for decisions, make integrity measurable, and treat AI as a tool to stress-test claims, not wave them through.

If you need to upskill reviewers and authors on safe, high-impact AI use in research, see our curated programs: AI courses by job.

