High-performing AI agents can still be deceived by other AI, study finds

AI agents that excel at complex tasks can still be easily misled by other AI systems, a multi-university study found. Puzzle-solving skill and resistance to deception proved completely unrelated in the research.

Categorized in: AI News, Science and Research
Published on: Mar 22, 2026


Large language models can solve complex puzzles and reason through difficult problems, yet still be tricked into giving bad advice by other AI systems. Researchers from McMaster University, Vector Institute, University of British Columbia, Princeton AI Lab, and New York University found no connection between an LLM's ability to perform well on a task and its ability to detect when it's being misled.

The finding raises questions about deploying these systems in domains where reliability matters: financial analysis, medical guidance, and legal advice.

How the study worked

The team used Sokoban, a classic puzzle game where players push boxes onto target locations. In their setup, one AI agent solved puzzles while receiving advice from another AI agent. Some advisors gave helpful guidance; others deliberately steered the player toward failure.

Researchers measured three things: how well the player agent solved puzzles, how persuasive the advisor was at influencing decisions, and how vigilant the player was about accepting only good advice.

The results showed these abilities operated independently. An LLM could excel at puzzle-solving while being easily persuaded by false information, or vice versa.
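The distinction between the three measures can be made concrete in code. Below is a minimal sketch using synthetic episode logs; the field names, probabilities, and the `make_episode` helper are assumptions for illustration, not the study's actual instrumentation:

```python
import random

def make_episode(rng):
    """One synthetic advisor/player episode (illustrative fields only)."""
    truthful = rng.random() < 0.5          # advisor is honest half the time (assumed)
    return {
        "solved": rng.random() < 0.6,      # did the player solve the puzzle?
        "advice_truthful": truthful,
        # a vigilant player follows truthful advice and rejects deceptive advice
        "advice_followed": rng.random() < (0.7 if truthful else 0.4),
    }

rng = random.Random(0)
episodes = [make_episode(rng) for _ in range(200)]

def solve_rate(eps):
    """Task performance: fraction of puzzles solved."""
    return sum(e["solved"] for e in eps) / len(eps)

def persuasion_rate(eps):
    """Advisor influence: how often advice (good or bad) was followed."""
    return sum(e["advice_followed"] for e in eps) / len(eps)

def vigilance(eps):
    """Selective trust: follow truthful advice, reject deceptive advice."""
    correct = sum(e["advice_followed"] == e["advice_truthful"] for e in eps)
    return correct / len(eps)

# The study's point is that these numbers can move independently:
# a high solve rate does not imply high vigilance.
print(f"solve rate: {solve_rate(episodes):.2f}")
print(f"persuasion: {persuasion_rate(episodes):.2f}")
print(f"vigilance:  {vigilance(episodes):.2f}")
```

Separating the metrics this way is what lets the comparison across models show that puzzle-solving skill and resistance to deception are uncorrelated.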

The safety problem

The implications extend beyond games. In real systems, one compromised or malicious LLM could mislead other AI agents, which would then mislead humans relying on their output.

"Our results demonstrated stark differences between commonly used LLMs in their ability to remain vigilant under the influence of potentially malicious agents," the researchers said. Different models showed different vulnerabilities, but none proved consistently resistant to deception.

This matters for sectors where AI increasingly influences decisions. Financial institutions use LLMs for analysis. Healthcare systems deploy them for information retrieval. Courts consider AI-generated summaries.

What comes next

The study opens questions about how these findings apply beyond puzzle games. The researchers are exploring whether their results hold in other scenarios and real-world contexts.

The work suggests developers need to build vigilance directly into LLM training, rather than assuming it emerges from general problem-solving ability. Until that gap closes, current models aren't equipped to serve as reliable advisors on consequential decisions.


