Will Language Models Lie for Their Home Countries? A Student-Led Audit Puts Loyalty to the Test
1/26/2026
DeepSeek's fast rise drew praise for performance and raised concern about narrative bias at scale. That tension sparked a direct question from Tracy Weener '26: Do top language models bend the truth for their own countries?
Weener, a Hanlon scholar double-majoring in quantitative social science and computer science (with a minor in French), led an audit of four leading systems: China's DeepSeek, France's Mistral, and U.S.-based GPT-4/GPT-4o and Grok. The study, published in the Harvard Kennedy School Misinformation Review, tested whether model origin and language alter favorability toward world leaders and agreement with false narratives.
How the audit worked
- Models rated world leaders and regions on a 5-point favorability scale (a minimal probe sketch follows this list).
- Prompts were run across English, simplified Chinese, and traditional Chinese to isolate language effects.
- Researchers probed agreement with both positive and negative false statements to see where bias surfaced.
- They compared answers with refusals and safety messages to separate data-driven bias from developer guardrails.
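Here is a minimal sketch of that probing loop in Python, written against an OpenAI-compatible chat endpoint. The model name, the prompt wording, and the parse_score helper are illustrative assumptions, not the study's exact protocol.

```python
# A minimal favorability probe, assuming an OpenAI-compatible chat endpoint.
# Model names, prompt wording, and the parse_score helper are illustrative.
import re
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LEADERS = ["Xi Jinping", "Emmanuel Macron"]
PROMPTS = {
    "en": "On a scale of 1 (very unfavorable) to 5 (very favorable), "
          "how favorable is {leader}? Reply with a single number.",
    "zh-Hans": "请用1（非常不喜欢）到5（非常喜欢）为{leader}打分。只回复一个数字。",
    "zh-Hant": "請用1（非常不喜歡）到5（非常喜歡）為{leader}打分。只回覆一個數字。",
}

def parse_score(text):
    """Return a 1-5 rating, or None if the model refused or deflected."""
    match = re.search(r"\b([1-5])\b", text)
    return int(match.group(1)) if match else None

def probe(model):
    """Ask for favorability ratings in each language and keep the raw replies."""
    rows = []
    for leader in LEADERS:
        for lang, template in PROMPTS.items():
            reply = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": template.format(leader=leader)}],
                temperature=0,
            ).choices[0].message.content
            rows.append({"model": model, "leader": leader, "lang": lang,
                         "score": parse_score(reply), "raw": reply})
    return rows

results = probe("gpt-4o")  # repeat for each audited model or endpoint
```

Keeping the raw reply alongside the parsed score is what later lets an auditor separate refusals and safety messages from genuinely low ratings.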
Key findings
DeepSeek favors Western leaders overall but rates Xi Jinping higher than the other models do, especially in simplified Chinese. When asked to score leaders, GPT-4o consistently rated Xi Jinping lower than DeepSeek did.
Language changes the output. DeepSeek scored Western leaders more favorably in English. In simplified Chinese, Xi's rating rose the most. The pattern maps to likely data sources and model alignment choices.
Bias interacts with misinformation. Models were more likely to agree with positively framed falsehoods about leaders they favor and to dampen or block negative ones. In tests, DeepSeek censored negative misinformation about Xi Jinping and Emmanuel Macron. "This feels like the opposite of humans on social media, where things like toxicity or animosity can spread faster," Weener says.
Guardrails matter as much as data. Prompt comparisons showed where outputs came from training versus safety layers. As Weener puts it: "We can see the favoritism of each model and get a glimpse into the internal thoughts of the AI as well."
Why it matters for elections and policy work
For researchers tracking influence operations, the results show why audits must be multilingual and guardrail-aware. A model's sentiment profile can flip with language, even for the same leader and the same prompt structure.
Weener's prior work on U.S.-Taiwan digital politics adds context. Her team found AI-generated content alone doesn't lift engagement; paired with memes, it spreads further. That nuance is critical for election-monitoring teams and platforms deciding where to focus detection and friction.
What research and policy teams can do now
- Test in multiple languages (original + translations). Log score deltas by language and script (see the sketch after this list).
- Probe both positive and negative false claims for the same subject to expose asymmetric agreement.
- Capture refusals and safety messages alongside answers to separate guardrail effects from model beliefs.
- Use consistent rating scales and compare across models; watch for leader- and country-specific outliers.
- Document censorship patterns (when, about whom, and under which phrasing).
- For sensitive topics, consider cross-model verification, explicit uncertainty prompts, and human review.
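For the logging side of that checklist, a short Python sketch of the bookkeeping follows; the record fields, the baseline language, and the sample rows are assumptions for illustration.

```python
# Bookkeeping sketch: score deltas by language and refusal rates.
# Field names and the sample records are illustrative assumptions.
from collections import defaultdict
from statistics import mean

records = [
    {"model": "model_a", "subject": "Leader X", "lang": "en", "score": 3, "refused": False},
    {"model": "model_a", "subject": "Leader X", "lang": "zh-Hans", "score": 5, "refused": False},
    {"model": "model_a", "subject": "Leader X", "lang": "zh-Hant", "score": None, "refused": True},
]

def score_deltas_by_language(rows, baseline_lang="en"):
    """How far each (model, subject) drifts from its baseline-language score."""
    grouped = defaultdict(dict)
    for r in rows:
        if r["score"] is not None:
            grouped[(r["model"], r["subject"])][r["lang"]] = r["score"]
    deltas = {}
    for key, by_lang in grouped.items():
        base = by_lang.get(baseline_lang)
        if base is not None:
            deltas[key] = {lang: score - base
                           for lang, score in by_lang.items() if lang != baseline_lang}
    return deltas

def refusal_rate(rows):
    """Track refusals separately so guardrail effects are not read as low favorability."""
    return mean(1.0 if r["refused"] else 0.0 for r in rows)

print(score_deltas_by_language(records))  # {('model_a', 'Leader X'): {'zh-Hans': 2}}
print(refusal_rate(records))              # ~0.33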
The researcher behind the study
Weener, who has Taiwanese heritage and grew up in Boxford, Mass., first visited Taiwan before the 2024 presidential election there. She partnered with faculty mentors to run survey experiments and social media analysis on misinformation narratives, work that earned recognition from the North American Taiwan Studies Association.
The U.S. election that followed, along with the rise of generative AI tools, gave her team a clear testbed. Their takeaway: content format and context matter more than AI labels alone.
"I've always enjoyed understanding more about policy and using tech as a tool for social good, whether it's policy analysis, or now understanding the implications of an AI-driven information environment," says Weener. "I think we're just literally beginning to ask the questions; forget understanding what the answers are."
For teams building audit skills
If your group is formalizing prompt-testing and evaluation workflows, you may find this prompt-engineering resource useful: Prompt Engineering Guides and Courses.