Why AI Hallucinations Are Increasing and Why They Won't Disappear Anytime Soon

Newer AI models are hallucinating more often, producing inaccurate output even as their capabilities advance. This persistent issue limits AI’s reliability across applications.

Published on: May 10, 2025

AI Hallucinations Are Getting Worse – And They're Here to Stay

Recent evaluations reveal that newer AI reasoning models integrated into chatbots are producing more inaccurate results due to increased hallucination rates. This issue, which has persisted since the early days of large language models (LLMs), appears unlikely to disappear anytime soon.

What Are AI Hallucinations?

Hallucinations refer to errors made by LLMs like OpenAI’s ChatGPT or Google’s Gemini. These errors range from confidently stating false information as fact to providing answers that, while factually accurate, are irrelevant or fail to comply with the prompt’s instructions.

For instance, OpenAI’s recent technical report showed that their o3 and o4-mini models—released in April 2025—had hallucination rates of 33% and 48%, respectively, when summarizing factual information about people. This contrasts with their earlier o1 model from late 2024, which had a 16% hallucination rate.

Widespread Problem Across AI Models

This problem is not exclusive to OpenAI. A leaderboard by Vectara, which tracks hallucination rates, indicates that some reasoning models, such as DeepSeek-R1, have seen double-digit increases in hallucination compared with their previous versions. Reasoning models work through a problem in multiple intermediate steps, laying out a chain of logic before producing a final answer.

However, OpenAI clarifies that reasoning itself isn’t the root cause. “Hallucinations are not inherently more prevalent in reasoning models,” says an OpenAI spokesperson. The company is actively researching ways to reduce hallucinations across all its models.

Why Hallucinations Matter

High hallucination rates limit the usefulness of AI in practical applications. A research assistant that frequently provides false information demands extensive fact-checking. A paralegal-bot citing nonexistent cases could lead to legal complications. Customer service agents delivering outdated or incorrect policies create operational headaches.

Initially, AI developers expected hallucinations to decrease over time, and early updates did show improvements. But recent spikes in hallucination rates challenge this expectation, regardless of whether the model uses reasoning steps.

Limitations of Current Evaluation Methods

Vectara’s leaderboard ranks AI models on factual consistency when summarizing documents. By this measure, hallucination rates are similar for reasoning and non-reasoning models from OpenAI and Google. Forrest Sheng Bao of Vectara notes that the specific hallucination figures matter less than how the models rank relative to one another.

However, this ranking mixes different types of hallucinations. For example, DeepSeek-R1’s 14.3% hallucination rate mainly involves “benign” hallucinations—answers logically supported but absent from the original text. This distinction is important when interpreting results.
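Vectara’s exact pipeline isn’t detailed here, but the general shape of a summarization-consistency benchmark is straightforward to sketch. In the hypothetical Python below, `summarize_with_model` stands in for the chatbot under test and `consistency_score` for a factual-consistency classifier; both names and the 0.5 threshold are assumptions made for illustration, not Vectara’s actual code.

```python
# Rough sketch of a summarization-based hallucination benchmark.
# `summarize_with_model` and `consistency_score` are hypothetical stand-ins
# for the model under test and a factual-consistency classifier.

def hallucination_rate(documents, summarize_with_model, consistency_score,
                       threshold=0.5):
    """Fraction of summaries judged inconsistent with their source document."""
    flagged = 0
    for doc in documents:
        summary = summarize_with_model(doc)       # model under test produces a summary
        score = consistency_score(doc, summary)   # 1.0 = fully supported by the source
        if score < threshold:                     # below threshold counts as a hallucination
            flagged += 1
    return flagged / len(documents)
```

A rate computed this way counts any unsupported claim, which is one reason a single figure can lump benign additions together with outright fabrications.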

Emily Bender from the University of Washington points out that testing based solely on text summarization doesn’t reflect performance on other tasks. LLMs generate responses by predicting likely next words rather than truly understanding or verifying content. This makes the term “hallucination” problematic, as it implies an unusual or fixable error, and anthropomorphizes AI systems that don’t actually “perceive” anything.
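Bender’s point can be made concrete with a toy sketch of next-token prediction. At each step the model turns scores over candidate tokens into probabilities and samples a likely continuation; nothing in the loop verifies whether the resulting sentence is true. The three-word vocabulary and the scores below are invented purely for illustration.

```python
import math
import random

# Toy illustration of next-token prediction: probabilities over candidate
# tokens, then a sample. Fluency and truth are separate properties; no step
# here checks facts. Vocabulary and scores are made up.

def softmax(logits):
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["Paris", "Lyon", "Berlin"]
logits = [4.0, 1.5, 0.5]   # hypothetical scores for "The capital of France is ..."
probs = softmax(logits)

next_token = random.choices(vocab, weights=probs, k=1)[0]
print(dict(zip(vocab, [round(p, 3) for p in probs])), "->", next_token)
```

Most of the time the sample is the plausible answer, but the distribution always leaves room for a confident, fluent continuation that happens to be wrong.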

The Bigger Picture: Living with Imperfect AI

Arvind Narayanan from Princeton University highlights that hallucinations are just one part of the problem. Models also draw from unreliable or outdated data, and increasing training data or computational power hasn’t necessarily improved accuracy.

Given these challenges, it may be practical to use AI models only when fact-checking their output is quicker than doing the research yourself. Otherwise, relying on AI chatbots for factual information remains risky.

