New Study Questions Whether AI Model Actually Understands Human Behavior
Researchers have cast doubt on an influential 2025 study claiming an advanced AI model could accurately simulate human thinking. The counter-research, published in January 2026, suggests the model was simply memorizing patterns rather than demonstrating genuine understanding.
The original Nature study concluded that a large language model called Centaur could predict and simulate human behavior with up to 64% accuracy. The researchers trained Centaur on data from more than 10 million human decisions across 160 psychological experiments involving 60,000 people.
But scientists at Zhejiang University argue Centaur achieved this performance through overfitting - learning statistical shortcuts in the training data rather than developing true understanding of human decision-making.
How Overfitting Works
Overfitting occurs when an AI model fits its training data too closely, learning patterns specific to that data rather than developing broader understanding. The model performs extremely well on familiar data but falters when given new examples.
Nai Ding, a co-author of the counter-study, compared it to a student memorizing test answers without understanding the material. "If a student is overprepared for an exam, they may learn tricks that allow them to guess answers correctly without actually understanding the underlying material," Ding said.
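To make the idea concrete, here is a minimal toy sketch in Python. It is an illustration only and has nothing to do with how Centaur was built: the parity rule, the `Memorizer` class and the input ranges are all hypothetical. The "model" stores every training answer in a lookup table and guesses 0 for anything it has not seen.

```python
# Toy illustration of overfitting. All names and data here are hypothetical.

def true_rule(x):
    """The underlying pattern a model would ideally learn: the parity of x."""
    return x % 2

class Memorizer:
    """A 'model' that stores training answers verbatim and learns no rule."""

    def __init__(self):
        self.table = {}

    def fit(self, xs, ys):
        # Memorize each (input, answer) pair exactly.
        self.table = dict(zip(xs, ys))

    def predict(self, x):
        # Unseen inputs fall back to a fixed guess of 0.
        return self.table.get(x, 0)

def accuracy(model, xs):
    correct = sum(1 for x in xs if model.predict(x) == true_rule(x))
    return correct / len(xs)

train_inputs = list(range(0, 20))     # data the model has seen
test_inputs = list(range(100, 120))   # new data it has never seen

model = Memorizer()
model.fit(train_inputs, [true_rule(x) for x in train_inputs])

train_acc = accuracy(model, train_inputs)
test_acc = accuracy(model, test_inputs)
print(f"accuracy on familiar data: {train_acc:.2f}")
print(f"accuracy on new data:      {test_acc:.2f}")
```

The memorizer is perfect on the data it has seen and no better than a coin flip on new data, because it never learned the rule that generated the answers.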
The Test That Changed the Conclusion
Ding and co-author Wei Liu tested their theory by modifying the multiple-choice questions Centaur was trained on. They instructed the model: "Please choose option A." If Centaur truly understood the instruction, it should pick option A every time, whether or not A was the correct answer.
Instead, Centaur continued selecting the correct answers, suggesting it was repeating learned patterns from its training data rather than following new instructions.
"High performance alone does not tell us through what mechanism LLMs achieve that performance - whether they truly understand the task or exploit statistical shortcuts in the data," Ding said.
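The logic of such a probe can be sketched in a few lines of Python. This is a hypothetical toy, not the researchers' code: the two stand-in "models", the question texts and the `compliance_rate` helper are all assumptions made for illustration. Only the instruction string comes from the study as described above.

```python
# Toy sketch of an instruction-override probe. Models and questions are hypothetical.

INSTRUCTION = "Please choose option A."

# Hypothetical memorized answers, standing in for patterns learned in training.
MEMORIZED = {"Question 1": "B", "Question 2": "C", "Question 3": "A", "Question 4": "D"}

def pattern_matcher(prompt):
    """Ignores the instruction and returns the memorized answer."""
    for text, answer in MEMORIZED.items():
        if text in prompt:
            return answer
    return "A"

def instruction_follower(prompt):
    """Obeys the instruction when it is present in the prompt."""
    if INSTRUCTION in prompt:
        return "A"
    return pattern_matcher(prompt)

def compliance_rate(model, memorized):
    # Score only questions whose memorized answer is not A, so that
    # picking A can only come from following the instruction.
    probes = [text for text, answer in memorized.items() if answer != "A"]
    picks = [model(f"{INSTRUCTION}\n{text}") for text in probes]
    return sum(1 for pick in picks if pick == "A") / len(picks)

matcher_rate = compliance_rate(pattern_matcher, MEMORIZED)
follower_rate = compliance_rate(instruction_follower, MEMORIZED)
print(f"pattern matcher picks A:      {matcher_rate:.0%}")
print(f"instruction follower picks A: {follower_rate:.0%}")
```

Both toy models would score identically on the unmodified questions. Only the modified prompt separates them, which is the point of the test.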
Broader Questions About AI Reasoning
The findings align with growing research questioning the limits of current AI technology. A February study argued that large language models face fundamental constraints from "reasoning failures" that prevent them from performing holistic planning or deep thinking.
Chris Burr, a senior researcher at the Alan Turing Institute, noted that AI models are optimized to match expected patterns on benchmarks. A model that excels at pattern matching naturally appears to understand what it's doing, even without genuine comprehension.
"Most frontier models are flexible enough to fit almost any pattern, and the headline metrics reward fit and benchmark advances rather than deeper understanding and conceptual nuance," Burr said. "A model captures something meaningful about cognition only if it does more than predict behavior."
What Remains Unresolved
The Zhejiang researchers tested only four tasks from the original Centaur study. Centaur still performed best when given intact context, and the original study's finding that it predicted behavior from held-out participants remains unexplained by the counter-study.
Burr acknowledged the counter-study doesn't fully undermine Centaur's core value but does shift the burden of proof. The broader question - whether AI models fine-tuned on human behavior can help researchers study cognition - remains open.
The Importance of Stress-Testing
Ding emphasized that stress-testing AI research is essential for understanding what models actually do. The distinction between "performing well" and "performing well for the right reasons" matters when building cognitive models.
Models should always be tested on whether they can solve new tasks using the same knowledge they were trained on. "Without this kind of testing, we risk drawing incorrect conclusions about model capabilities," Ding said.
The authors of the original 2025 Nature study did not respond to requests for comment.