Can ChatGPT Recognize Its Own Writing in Scientific Abstracts?
With generative AI becoming more common in scientific writing, telling AI-generated text apart from human-written content is a real challenge. But can ChatGPT itself identify whether a scientific abstract was written by it or by a human? A recent study explored this question by testing ChatGPT-4.0's ability to recognize its own output.
Study Design
The research randomly selected 100 medical articles published in 2000—well before AI writing tools existed—from top internal medicine journals. For each, ChatGPT-4.0 generated a structured abstract based only on the article’s full text (with the original abstract removed). This resulted in 100 human-written and 100 AI-generated abstracts.
Then, ChatGPT-4.0 was asked to score each abstract twice on a scale from 0 to 10, where 0 meant “definitely human,” 10 meant “definitely ChatGPT,” and 5 was “undecided.” Scores from 0–4 were classified as human, 6–10 as AI, and 5 as uncertain.
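The scoring thresholds described above can be sketched as a small function. This is purely illustrative; the study did not publish code, and the function name and error handling here are assumptions:

```python
def classify(score: int) -> str:
    """Map a 0-10 score to a label using the study's thresholds:
    0-4 -> human, 5 -> undecided, 6-10 -> AI (ChatGPT)."""
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= 4:
        return "human"
    if score == 5:
        return "undecided"
    return "chatgpt"
```

For example, `classify(3)` returns `"human"` and `classify(7)` returns `"chatgpt"`; only an exact score of 5 counts as undecided, which matters for the finding below that the model never used it.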
Key Findings
- ChatGPT-4.0 misclassified nearly half of the abstracts in both evaluation rounds (49% and 47.5%).
- There was no significant difference in score distributions between human and AI abstracts.
- The model never used the “undecided” score (5), suggesting it forced a choice even when unsure.
- Consistency was low: agreement between the two rounds was only about 66.5%, with only fair agreement beyond chance (Cohen's kappa = 0.33).
In short, ChatGPT-4.0 failed to reliably and consistently tell if an abstract was written by itself or by humans.
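Cohen's kappa, cited above, measures how much two sets of ratings agree beyond what chance alone would produce (1 is perfect agreement, 0 is chance-level). A minimal sketch of the calculation, with a hypothetical helper not taken from the study:

```python
from collections import Counter

def cohens_kappa(round1, round2):
    """Cohen's kappa for two equal-length sequences of labels."""
    assert len(round1) == len(round2)
    n = len(round1)
    # Observed agreement: fraction of items labeled identically.
    observed = sum(a == b for a, b in zip(round1, round2)) / n
    # Expected chance agreement from each round's label frequencies.
    c1, c2 = Counter(round1), Counter(round2)
    labels = set(c1) | set(c2)
    expected = sum((c1[l] / n) * (c2[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)
```

With two rounds that agree no better than chance, kappa is 0; the study's 0.33 indicates the model's two rounds agreed only modestly more often than chance would predict, despite the raw 66.5% agreement figure.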
What This Means for Writers and Editors
As AI tools integrate into writing workflows, distinguishing AI-generated content from human work is becoming critical—especially in scientific publishing where transparency matters. This study shows that relying on ChatGPT itself to detect AI-written text is not effective. Writers, editors, and reviewers can’t assume that ChatGPT can self-identify its outputs.
External detection tools exist but also struggle with accuracy. Human reviewers often can’t tell AI-generated writing apart either. The best current practice is to encourage clear disclosure of AI use in manuscript preparation and to develop better, specialized detection methods that combine linguistic analysis with contextual clues, such as editing history or metadata.
How Does This Compare to Other Research?
Other studies have shown that both humans and AI-detection software find it hard to detect AI-generated scientific writing. For example:
- Human reviewers performed just slightly better than chance when judging abstracts written by earlier ChatGPT versions.
- Many AI-detection tools fail to reach high accuracy, often dropping below 80%.
- Linguistic analysis tools like GLTR highlight differences in word predictability but require manual interpretation and are less effective with newer AI models like GPT-4.
A preprint study testing earlier language models reported higher self-detection accuracy (up to 83%), but it focused on simpler, short essays rather than complex scientific abstracts. This suggests that the difficulty of detection rises with text complexity and domain specificity.
Limitations to Keep in Mind
- The study only looked at abstracts from internal medicine journals published in 2000. Other fields or more recent articles might yield different results.
- The 0–10 scoring scale and classification thresholds were not externally validated and may influence results.
- Only one version of ChatGPT (GPT-4.0) and a single prompt were used for classification; different setups could affect outcomes.
Final Thoughts
For writers and editors, this study highlights the current limits of AI self-detection. ChatGPT can generate fluent scientific abstracts, but it can’t reliably recognize its own writing. This means that transparency policies, author disclosures, and continued development of detection tools remain essential to maintain trust in scientific publishing.
If you want to learn more about AI tools and how to work effectively with them in writing and publishing, check out some practical courses and resources on Complete AI Training.