Hallucinations Up, Trust Down: Scientists Lose Faith in AI

Scientists are using AI more but trusting it less: adoption hit 62%, worries about hallucinations climbed to 64%, and security and privacy concerns rose 11 percentage points. Confidence that AI outperforms humans fell to under a third of use cases.

Published on: Oct 14, 2025

Scientists Are Using AI More - And Trusting It Less

Scientists are skeptical by training. The latest preview from Wiley's 2025 report on research and AI shows that skepticism is growing as hands-on use rises.

In 2024, 51% of surveyed scientists were worried about AI "hallucinations." In 2025, that jumped to 64% - even as AI adoption among researchers rose from 45% to 62%. Security and privacy concerns climbed by 11 percentage points, and confidence that AI is surpassing human ability dropped from "over half of use cases" to less than a third.

Why deeper use is eroding trust

Hallucinations aren't edge cases; they are common failure modes. We've already seen bogus citations in legal filings, misleading clinical suggestions, and fabricated travel details make it into the real world. Higher-capacity models don't automatically fix this; some tests show hallucinations persisting - and in certain settings, getting worse.

There's also an incentive problem. Users prefer confident systems over cautious ones, even if the confidence is misplaced. If a model hedges, engagement drops. If it bluffs, engagement holds. That bias pushes vendors to optimize for fluency and speed, not verifiability.

What this means for your lab

AI is useful for ideation, code scaffolding, and literature triage - but it is not a source of truth. Treat outputs like unreviewed notes from a keen intern: helpful, fast, and error-prone.

  • Force citations and provenance: Require sources, DOIs, and links in every answer. Reject unsourced claims (a minimal gate for this is sketched after this list).
  • Use retrieval over recall: Pair models with your vetted corpora (RAG) and log the exact passages used.
  • Add a second pass: Run a separate "critic" prompt to fact-check names, numbers, units, and references.
  • Benchmark tasks, not vibes: Track precision/recall on your actual workflows (screening, summarization, coding) with gold sets.
  • Keep humans in the loop: Assign review ownership. No AI-generated content should bypass a named reviewer.
  • Protect data: Use enterprise or on-prem options. Disable training on your prompts and outputs by default.
  • Version everything: Log model, temperature, system prompt, retrieval source, and time for each run to ensure reproducibility (see the logging sketch after this list).
  • Red-team the edge cases: Unit conversions, rare diseases, homonyms, negations, out-of-distribution data. That's where errors hide.
  • Set stop conditions: Define "no answer" rules. A model that admits uncertainty is valuable - wire your process to accept it.
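
Two of these practices are simple enough to wire up in a few lines. First, the citation-and-stop-condition gate: the sketch below (plain Python, standard library only) passes an answer through only if it contains at least one DOI or PMID and otherwise returns the agreed refusal string. The patterns, function name, and refusal text are illustrative assumptions, not part of any particular tool.

    import re

    # Illustrative patterns: DOIs follow the "10.<registrant>/<suffix>" form;
    # PMIDs are matched here only when written with an explicit "PMID:" prefix.
    DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/\S+")
    PMID_PATTERN = re.compile(r"\bPMID:\s*\d{6,9}\b", re.IGNORECASE)

    REFUSAL = "Insufficient evidence in sources."

    def gate_answer(answer: str) -> str:
        """Pass an AI answer through only if it carries at least one DOI or PMID."""
        if DOI_PATTERN.search(answer) or PMID_PATTERN.search(answer):
            return answer
        return REFUSAL

    if __name__ == "__main__":
        sourced = "See doi:10.1234/placeholder-2025 (PMID: 12345678)."  # placeholder IDs, not real references
        unsourced = "Everyone knows this compound is safe at any dose."
        print(gate_answer(sourced))    # passes through unchanged
        print(gate_answer(unsourced))  # -> "Insufficient evidence in sources."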
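
Second, the run log: a minimal sketch that appends one JSON record per model call, capturing model, temperature, system prompt, retrieval source, and a UTC timestamp. The file path and field names are assumptions to adapt to your own pipeline.

    import json
    from datetime import datetime, timezone
    from pathlib import Path

    LOG_PATH = Path("ai_run_log.jsonl")  # assumed location; one JSON object per line

    def log_run(model: str, temperature: float, system_prompt: str,
                retrieval_source: str, prompt: str, answer: str) -> None:
        """Append an audit record so any AI-assisted result can be traced and rerun."""
        record = {
            "timestamp_utc": datetime.now(timezone.utc).isoformat(),
            "model": model,
            "temperature": temperature,
            "system_prompt": system_prompt,
            "retrieval_source": retrieval_source,
            "prompt": prompt,
            "answer": answer,
        }
        with LOG_PATH.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

    if __name__ == "__main__":
        log_run(
            model="example-model-v1",  # placeholder identifiers
            temperature=0.2,
            system_prompt="Answer only from the provided sources.",
            retrieval_source="lab_corpus_2025-10-01",
            prompt="Summarize the three retrieved abstracts.",
            answer="Insufficient evidence in sources.",
        )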

Practical prompts that reduce risk

  • "Answer only from the provided sources." If missing, respond: "Insufficient evidence in sources."
  • "List every assumption you made." Forces transparency you can audit.
  • "Cite with DOI/PMID and quote the exact sentence." Enables instant verification.
  • "If two sources conflict, show both and do not resolve." Prevents confident fiction.

Where this is headed

As researchers get closer to the machinery, the shine fades and the utility sharpens. Hype gives way to workflow design, measurement, and governance. That's progress.

If your team needs a structured way to implement safe, verifiable AI workflows in research settings, see our practical AI training by job function.

Bottom line: use AI for speed, but make your systems allergic to unverified claims. Curiosity plus rigor beats confidence every time.

