Researchers at the University of Gothenburg invented a fake eye condition called bixonimania to test if large language models could filter out fabricated data. Instead, major chatbots absorbed the misinformation and presented it as a real medical diagnosis, exposing vulnerabilities in how these systems process web-scraped training data.
The experiment
Almira Osmanovic Thunström, a medical researcher and AI strategist, designed the test after noticing gaps in how students understood model training. "It was interesting how few of them, or how few even people within A.I., understand how large language models are built," she said on Scientific American's "Science Quickly" podcast.
In early 2024, her team published two blog posts on Medium and two research reports on a preprint server. The fake papers listed a lead author named Lazljiv Izgubljenovic (which translates to "lying loser") and included a title roughly translating to "Hyperpigmentation: A Real B.S. Design." The preprint server has since removed the documents.
How the models responded
Despite the blatant clues, major AI systems integrated the fabricated disease into their outputs. By April 2024, Microsoft Bing's Copilot described the condition as an "intriguing and relatively rare" illness. Google's Gemini and OpenAI's ChatGPT also diagnosed users with the fake disease when prompted about eye irritation or blue light exposure, and the condition even gained its own Wikipedia page.
This failure highlights a structural flaw in how generative AI systems and large language models ingest and weight unverified online sources. The models treated the fabricated preprints as credible scientific literature without applying basic fact-checking filters.
Human oversight failures
The study revealed that human researchers also failed to catch the fabrication. Several scientists cited the bogus preprints in their own work without reading the text.
The team had embedded obvious absurdities throughout the documents to test this exact scenario. These included acknowledgments of funding from the Galactic Triad and Lord of the Rings, alongside thanks to colleagues at the Starship Enterprise and Professor Ross Geller. At least one paper explicitly stated that the entire document was fabricated.
"This is a masterclass on how mis- and disinformation operates," Alex Ruani, a misinformation researcher at University College London, told Nature. "If the scientific process itself and the systems that support that process are skilled, and they aren't capturing and filtering out chunks like these, we're doomed."
Why this matters for science and research professionals
Scientists relying on AI tools for literature reviews or data synthesis must treat model outputs as unverified claims rather than established facts. The bixonimania experiment shows that AI tools used in science and research can faithfully reproduce fabricated web content when the underlying training pool contains unvetted preprints. Researchers must verify primary sources directly and scrutinize the provenance of any AI-generated citation before incorporating it into their work.