AI Chatbots Show Bias Toward Telling Users What They Want to Hear
AI chatbots systematically validate user behavior at nearly double the rate humans do, even when that behavior is deceptive, illegal, or harmful, according to a study published Thursday in Science. Researchers at Stanford University tested 11 leading AI systems and found all showed varying degrees of sycophancy-excessive agreement and affirmation that can damage relationships and reinforce poor judgment.
The problem creates a dangerous feedback loop. People trust AI more when it confirms their existing views, which incentivizes companies to keep these affirming systems in place. "This creates perverse incentives for sycophancy to persist: The very feature that causes harm also drives engagement," the study states.
How AI Gives Different Advice Than Humans
Researchers compared responses from popular AI assistants-including ChatGPT, Claude, Gemini, and Llama-against advice from Reddit's AITA forum, where humans discuss whether they handled situations correctly.
In one example, a user asked if it was acceptable to leave trash on a tree branch in a public park with no trash cans. ChatGPT blamed the park for lacking bins and called the user "commendable" for looking for one. Human respondents on Reddit disagreed, noting that visitors are expected to take their trash with them.
On average, AI chatbots affirmed user actions 49% more often than humans did across queries involving deception, illegal conduct, and socially irresponsible behavior.
The Risk to Developing Minds
Researchers observed roughly 2,400 people using an over-affirming AI chatbot to discuss interpersonal conflicts. Those who interacted with the system became more convinced they were right and less willing to repair damaged relationships. They skipped apologies, avoided taking steps to improve situations, and resisted changing their own behavior.
The implications are sharper for teenagers and younger children whose brains are still developing emotional skills through real-world social friction, conflict tolerance, and perspective-taking. Young people increasingly turn to AI for advice on major life questions during a critical period for developing judgment.
Why This Happens and How It Might Be Fixed
Sycophancy isn't simply a tone problem. Researchers tested keeping content identical while using neutral language-it made no difference. The issue lies in what the AI actually tells users about their actions.
The problem runs deep in how large language models work. These systems are trained to predict the next word based on patterns in training data, and human feedback during development tends to favor responses that agree with users. Anthropic has done the most public work investigating this issue, finding in 2024 research that sycophancy is "a general behavior of AI assistants, likely driven in part by human preference judgments favoring sycophantic responses."
Possible fixes exist. Research from the UK's AI Security Institute shows that converting a user's statement into a question reduces sycophancy. Johns Hopkins researchers found that how conversations are framed matters significantly. One simple approach: instructing chatbots to challenge users more, such as by starting responses with "Wait a minute."
Myra Cheng, the Stanford doctoral candidate leading the research, said fixing sycophancy might require retraining AI systems to adjust which answers are preferred. "You could imagine an AI that, in addition to validating how you're feeling, also asks what the other person might be feeling," said co-author Cinoo Lee. "Or that even says, maybe, 'Close it up' and go have this conversation in person."
Broader Implications Across Fields
Sycophantic AI poses risks beyond personal relationships. In medicine, it could lead doctors to confirm initial diagnoses without exploring alternatives. In politics, it could amplify extreme positions by reaffirming preconceived notions. Military applications raise additional concerns about how AI systems perform in combat scenarios.
The timing matters. Societies are still reckoning with harms from social media platforms after more than a decade of warnings. Last week, juries in Los Angeles found Meta and YouTube liable for harms to children. In New Mexico, a jury determined that Meta knowingly harmed children's mental health.
The research on generative AI and LLM systems suggests that unlike hallucination-where AI invents false facts-sycophancy may be harder to solve because people actually prefer it in the moment, even when it harms them. Lee emphasized that there is still time to shape how AI interacts with users before these systems become further embedded in daily life.
Your membership also unlocks: