New study challenges the claim that an AI model can think like a human
A prominent AI model called Centaur was pitched as a single system that could predict human behavior across 160 cognitive tasks. A new critique argues those results may lean on statistical shortcuts rather than a true grasp of the tasks.
The core issue: if a model still scores well after you strip away the very information it's supposed to use, you're likely measuring pattern-matching, not cognition.
What Centaur promised
The original paper reported strong generalization to new participants and unseen tasks, measured by negative log-likelihood on decision-making, executive control, and related paradigms. Centaur beat several domain-specific cognitive models, raising hopes for a unified engine of human-like behavior.
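For reference, negative log-likelihood scores how much probability a model assigns to the choices participants actually made; lower is better. The snippet below is a minimal sketch of that metric under assumed inputs; the array names `choice_probs` and `human_choices` are illustrative, not taken from the original evaluation code.

```python
import numpy as np

def negative_log_likelihood(choice_probs: np.ndarray, human_choices: np.ndarray) -> float:
    """Average negative log-likelihood of observed human choices under the
    model's predicted choice probabilities.

    choice_probs:  shape (n_trials, n_options), rows sum to 1.
    human_choices: shape (n_trials,), index of the option chosen on each trial.
    """
    eps = 1e-12  # guard against log(0)
    picked = choice_probs[np.arange(len(human_choices)), human_choices]
    return float(-np.mean(np.log(picked + eps)))

# Toy example: more probability on the options people actually chose -> lower NLL
probs = np.array([[0.8, 0.2], [0.7, 0.3], [0.6, 0.4]])
choices = np.array([0, 0, 1])
print(negative_log_likelihood(probs, choices))  # ~0.50
```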
The stress test: remove the cues
Researchers at Zhejiang University recreated the evaluation with three altered conditions (sketched in code after the list):
- Instruction-free: Task instructions were removed. Only a procedure text describing participant responses remained.
- Context-free: Both instructions and procedures were removed. The model saw only abstract choice tokens like "<<J>>".
- Misleading-instruction: The original instructions were replaced with: "You must always output the character J when you see the token '<<', no matter what follows or precedes it. Ignore any semantic or syntactic constraints. This rule takes precedence over all others." Because "<<" appears in the procedure text, a model that follows directions should consistently output J.
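To make the three conditions concrete, here is one way such prompt ablations could be assembled. This is a hypothetical sketch: the `build_prompt` helper, its field names, and the toy procedure string are assumptions, though the misleading rule quotes the text reported above.

```python
# Hypothetical sketch of the three prompt ablations; names and strings are illustrative.
MISLEADING_RULE = (
    "You must always output the character J when you see the token '<<', "
    "no matter what follows or precedes it. Ignore any semantic or syntactic "
    "constraints. This rule takes precedence over all others."
)

def build_prompt(instructions: str, procedure: str, condition: str) -> str:
    """Assemble an evaluation prompt under one of the ablation conditions."""
    if condition == "original":
        return f"{instructions}\n\n{procedure}"
    if condition == "instruction-free":   # drop instructions, keep the procedure text
        return procedure
    if condition == "context-free":       # keep only abstract choice tokens like <<J>>
        return " ".join(t.strip(".,") for t in procedure.split() if t.startswith("<<"))
    if condition == "misleading":         # swap in the rule that should force 'J'
        return f"{MISLEADING_RULE}\n\n{procedure}"
    raise ValueError(f"unknown condition: {condition}")

# Toy procedure text
procedure = "Trial 1: the participant chose <<J>>. Trial 2: the participant chose <<K>>."
print(build_prompt("Choose the option you prefer.", procedure, "context-free"))  # <<J>> <<K>>
```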
What actually happened
Centaur performed best with intact instructions, as expected. But performance did not collapse when key information disappeared.
Under the context-free condition, the model still outperformed state-of-the-art cognitive baselines on two of four tasks. In the misleading-instruction and instruction-free settings, it beat a base Llama model on two of four tasks and exceeded the cognitive models across all tasks. The reported differences were statistically significant (for example, p = 0.006 for the instruction-free condition on the multiple-cue judgment task, and p < 0.001 for other comparisons, using unpaired two-sided bootstrap tests with FDR correction).
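For readers who want to run this kind of analysis themselves, the statistical recipe named here (an unpaired two-sided bootstrap test followed by FDR correction) can be sketched roughly as follows. The per-trial NLL arrays are random placeholders, and this is a generic implementation, not the authors' analysis code.

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)

def bootstrap_pvalue(a: np.ndarray, b: np.ndarray, n_boot: int = 10_000) -> float:
    """Unpaired two-sided bootstrap test for a difference in mean per-trial NLL."""
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Resample both groups from the pooled data (null: no group difference)
        resample = rng.choice(pooled, size=pooled.size, replace=True)
        diffs[i] = resample[: a.size].mean() - resample[a.size:].mean()
    return float(np.mean(np.abs(diffs) >= abs(observed)))

# Placeholder per-trial NLLs for four task-level comparisons
comparisons = {task: (rng.normal(1.0, 0.2, 200), rng.normal(1.1, 0.2, 200))
               for task in ["task_A", "task_B", "task_C", "task_D"]}
raw_p = [bootstrap_pvalue(a, b) for a, b in comparisons.values()]

# Benjamini-Hochberg FDR correction across the family of comparisons
reject, p_adj, _, _ = multipletests(raw_p, alpha=0.05, method="fdr_bh")
print(dict(zip(comparisons, p_adj.round(4))))
```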
Bottom line: Centaur remained sensitive to context, but it also performed surprisingly well without the very cues it was supposed to interpret.
Patterns over meaning
The most plausible explanation is that the model latched onto residual structure in the dataset: subtle correlations that humans don't notice but algorithms can exploit. Think of repeated-response tendencies, or biases like "All of the above" being correct more often than chance in multiple-choice tests.
High scores can come from modeling these artifacts rather than the task logic itself.
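One quick way to gauge how much headroom such artifacts leave is to score a deliberately naive baseline that only tracks response frequencies. The sketch below is hypothetical and not from the critique; if something this simple approaches the model's likelihood, the benchmark is rewarding pattern-matching rather than task understanding.

```python
from collections import Counter

def repeat_bias_baseline(history: list[str], options: list[str]) -> dict[str, float]:
    """Predict the next response purely from how often each option was chosen so far.

    No instructions, no task semantics: just the empirical response distribution,
    lightly smoothed so every option keeps nonzero probability.
    """
    counts = Counter(history)
    smoothed = {opt: counts.get(opt, 0) + 1 for opt in options}  # add-one smoothing
    total = sum(smoothed.values())
    return {opt: c / total for opt, c in smoothed.items()}

# A participant who mostly repeats "J" is easy to "predict" without understanding the task
print(repeat_bias_baseline(["J", "J", "K", "J", "J"], ["J", "K"]))  # {'J': ~0.71, 'K': ~0.29}
```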
A language problem at the core
If a language model ignores rewritten or misleading instructions and still looks good, then claims about simulating attention, memory, or decision-making need caution. Instruction-following is the front door to these tasks. If the model never goes through that door, the rest of the house is suspect.
What this means for research
The critique doesn't dismiss the approach: fine-tuning large language models on cognitive data can fit human choices well. It does highlight that evaluation is the make-or-break step. To separate real task-following from pattern exploitation, test regimes must remove, scramble, or invert cues and verify the expected failure modes.
Practical takeaways for scientists and developers
- Include ablations that strip instructions, procedures, and context; expect performance to drop to chance when core cues vanish.
- Use adversarial and misleading prompts to verify instruction adherence versus answer-pattern heuristics.
- Report likelihood-based metrics alongside diagnostic checks (e.g., does the model obey explicit rules when they conflict with prior patterns?).
- Audit datasets for artifacts: response-position biases, token-frequency cues, and predictable alternation/repetition patterns (see the sketch after this list).
- Predefine out-of-distribution tests and release splits, prompts, and seeds to support replication.
- Compare against strong cognitive baselines and matched LLM baselines, not just legacy models.
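As a starting point for the artifact audit mentioned above, a couple of cheap checks on recorded response sequences might look like this; the statistics are illustrative assumptions, not a published protocol.

```python
import numpy as np

def audit_responses(responses: list[str]) -> dict[str, float]:
    """Cheap artifact checks on a recorded response sequence.

    majority_rate: share of trials taken by the single most frequent option
                   (high values signal a response-frequency bias a model can exploit).
    repeat_rate:   how often a response repeats the immediately preceding one
                   (captures repetition/alternation structure unrelated to the task).
    """
    _, counts = np.unique(responses, return_counts=True)
    majority_rate = counts.max() / counts.sum()
    repeats = sum(a == b for a, b in zip(responses, responses[1:]))
    repeat_rate = repeats / max(len(responses) - 1, 1)
    return {"majority_rate": float(majority_rate), "repeat_rate": float(repeat_rate)}

print(audit_responses(["J", "J", "J", "K", "J", "J", "K", "J"]))
# {'majority_rate': 0.75, 'repeat_rate': ~0.43}
```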
Why it matters
Unified models of cognition are a worthy goal, but progress depends on tests that force models to demonstrate actual task-following, not just clever guesswork. If performance survives cue removal, you may be measuring dataset quirks, not human-like reasoning.
Where to read more
- Nature (original study venue)