Google Gemini produces writing that fools detection tools most effectively
Google's Gemini generated text that passed AI detection tests more consistently than ChatGPT and 10 other major chatbots, according to an experiment comparing how well detection software identifies machine-written content.
Open Resource Application tested 12 AI models by asking each to write a long-form article indistinguishable from human writing. The resulting texts were run through Grammarly, QuillBot, and GPTZero, three widely used detection platforms.
Gemini's output received the lowest AI-probability score on Grammarly and registered zero detections on QuillBot. ChatGPT performed poorly by comparison.
Why Gemini's writing proved harder to detect
ORA attributed Gemini's performance to its sentence structure and narrative development. The model varies its phrasing rather than cycling through predictable patterns that detection tools recognize.
Most AI detectors flag repetitive sentence structures and formulaic language. Gemini diverges from these patterns. GPTZero, which assesses both predictability and overall structure, still identified most AI text, but models that develop ideas rather than recycle familiar phrases create harder detection targets.
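The intuition behind this kind of pattern-based flagging can be sketched in a few lines. The toy function below scores "burstiness," the variation in sentence length that human prose tends to show and formulaic output tends to lack. This is an illustrative heuristic only; commercial detectors such as GPTZero rely on language-model predictability measures, not this simple calculation, and the sentence splitter here is a naive assumption.

```python
import re
from statistics import mean, pstdev

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths, in words.

    Human prose often mixes short and long sentences (high score);
    uniform, formulaic output scores near zero. A toy proxy only,
    not how any real detection tool computes its verdict.
    """
    # Naive sentence split on terminal punctuation (an assumption).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2 or mean(lengths) == 0:
        return 0.0
    return pstdev(lengths) / mean(lengths)

uniform = ("The cat sat on the mat. The dog ran in the park. "
           "The bird flew over the lake.")
varied = ("Stop. The storm had been building for hours over the "
          "distant ridge. Rain came.")

# The varied passage scores higher than the uniform one.
print(burstiness(uniform), burstiness(varied))
```

A model that varies its phrasing, as ORA says Gemini does, would score closer to the "varied" example and give a pattern-based detector less to latch onto.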
Detection tools show wildly different results
The same text could pass one detector and fail another. Grammarly identified only 43.5 percent of AI-generated content overall. GPTZero caught approximately 99 percent.
For anyone who writes with AI, this inconsistency creates real problems. A student assignment might pass plagiarism checks in one system and trigger alerts in another. Office workers face the same uncertainty: their writing could draw suspicion depending on which software their organization uses.
The detection problem gets harder
AI writing styles are diverging rather than converging. ChatGPT's distinctive voice, established early in the market, remains recognizable to detectors. Newer models have developed their own styles, making pattern-based detection less reliable across the board.
Research suggests approximately half of online content may now be AI-generated. As models multiply and styles fragment, detection methods built on the assumption of a single AI writing pattern face fundamental limits.
The distinction between human and AI writing is becoming less stable. Detection tools may improve, and other models may follow Gemini's approach. For now, the criteria for judging whether text came from a human or a machine depend heavily on which tool does the judging.