A systematic analysis of more than 14,000 Google search results reveals that AI-generated summaries often surface inconsistent, lower-quality sources while suppressing content from reputable publishers that block Gemini's training crawlers. The research, set to be presented at the ACM Conference on Research and Development in Information Retrieval in Australia this summer, offers the first large-scale empirical evidence of how generative AI reshapes the information ecosystem for both publishers and everyday users.
Riley Grossman, a fourth-year doctoral student in business data science at New Jersey Institute of Technology, led the study with advisers Professor Yi Chen and Professor Cristian Borcea. The team originally focused on online privacy regulations but shifted emphasis after seeing publishers blame AI-driven search changes for newsroom cuts. "Generative AI disrupts the way that users see information presented to them," Grossman said. "It changes both the way that they get information from the web - so now instead of clicking links, clicking sources, you might just read an AI summary of those sources - and then secondarily it redirects you or refers you to a different list of sources."
A split in the source lists
The separation between AI Overview sources and traditional organic search results is stark. "If you only look at the links in the AI overview, you're going to end up on very different websites than if you look at the links that are in the organic search results below the AI overview," Grossman said. The researchers found that well-known outlets such as Nature and The New York Times, which have chosen to opt out of Gemini training, appear with dramatically lower frequency - or not at all - in Google's AI overviews. Google has not explained the specific mechanics behind this divergence.
"Google is notoriously not willing to play ball on some of those questions," Grossman said. "When publishers say you're stealing our traffic, Google's response so far has been we're actually not stealing your traffic, we're still referring the same amount of traffic to publishers … Google's maybe in denial about some of this stuff."
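To make the idea of a split between the two source lists concrete, the comparison can be sketched as a domain-level overlap check. This is an illustration only, not the NJIT team's method: the function names, the choice of Jaccard similarity as the overlap measure, and the example URLs are all assumptions introduced here.

```python
from urllib.parse import urlparse


def domain(url: str) -> str:
    """Return the host of a URL, without a leading 'www.' (empty if unparseable)."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def source_overlap(overview_links, organic_links):
    """Compare the domains cited in an AI overview with those in organic results.

    Hypothetical helper: Jaccard similarity is an illustrative choice,
    not a metric taken from the study.
    """
    overview = {domain(u) for u in overview_links} - {""}
    organic = {domain(u) for u in organic_links} - {""}
    union = overview | organic
    jaccard = len(overview & organic) / len(union) if union else 0.0
    return {
        "jaccard": jaccard,  # 1.0 = identical domain sets, 0.0 = no shared domains
        "overview_only": sorted(overview - organic),
        "organic_only": sorted(organic - overview),
    }


# Made-up links for a single query (placeholders, not data from the study)
overview_links = ["https://www.example-blog.com/post", "https://news.example.org/a"]
organic_links = ["https://news.example.org/b", "https://www.example-journal.com/x"]
result = source_overlap(overview_links, organic_links)
print(result)
```

A low score across many queries would correspond to the pattern Grossman describes, in which the links inside the AI overview lead to different websites than the organic results beneath it.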
Hallucinations and zero-click erasure
The study documented a particularly glaring error. A query about an upcoming fight between YouTuber Jake Paul and boxer Anthony Joshua produced an AI summary claiming Paul won by unanimous decision "in a major upset." When the fight actually occurred, Joshua knocked out Paul and broke his jaw in two places. Many users never click beyond the summary, a behavior Grossman described as a zero-click search. "This is just a complete erasure for the site traffic. It's completely removing that traffic from any website," he said.
Without referral clicks, legitimate news operations may shrink or vanish - yet Google depends on those same outlets to produce the original reporting that feeds its search index. The relationship is co-dependent and increasingly strained.
Licensing deals and the power imbalance
One potential remedy, Grossman pointed out, is more licensing agreements like Google's existing deal with the Associated Press. "The problem is that if they're only one-off deals and the AI company, without any sort of regulation to be backing these deals, really holds all the power because they're coming to the table saying, 'You can either take this deal or we can continue to scrape your content for free and use it for free, and the only thing you can do is enter a really long, arduous legal battle with us' - that's certainly my concern," he said.
Grossman argued for a combination of regulatory oversight and an industry framework that brings transparency to payment terms, letting publishers negotiate based on actual market rates.
The spillover into politics and healthcare
Co-author Cristian Borcea, professor of computer science at NJIT, said the same research methods apply to politically sensitive queries and medical information. Political searches frequently returned AI overviews featuring less-credible, biased sources, the team found. Separately, the researchers are tracking a significant decline in traffic to government healthcare websites over the past 18 months - a drop they attribute partly to AI overviews and the removal of established health content from the web.
Why this matters for science and research
When AI summaries deprioritize sources that restrict bot access, the information shown to users can bypass peer-reviewed or rigorously fact-checked material. The same mechanism can distort evidence landscapes in fields where researchers rely on search for literature discovery, methodology checks, or policy data. This dynamic underscores why professionals who use AI in science and research need to treat AI-curated search results with the same scrutiny they would apply to any unverified dataset. The NJIT team's work provides a repeatable method for auditing how generative AI filters knowledge - a method that, as Borcea noted, is just as urgent in public health and democratic discourse as it is in publishing.
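For readers who want to check whether a given site restricts bot access in this way, the opt-out is usually declared in the site's robots.txt file. The sketch below assumes the publisher uses the Google-Extended user-agent token, which Google documents as its control for Gemini training use; the article does not name the token, and the sample robots.txt and URL are hypothetical. It shows only what a file declares, not how Google acts on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks the Google-Extended token (assumed opt-out
# signal for Gemini training) while leaving all other crawlers unrestricted.
SAMPLE_ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allows_agent(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


article_url = "https://publisher.example/news/story"  # placeholder URL
blocks_ai_training = not allows_agent(SAMPLE_ROBOTS_TXT, "Google-Extended", article_url)
allows_search_crawl = allows_agent(SAMPLE_ROBOTS_TXT, "Googlebot", article_url)
print(blocks_ai_training, allows_search_crawl)
```

In this sample the site remains open to ordinary search crawling while opting out of AI training, which is the configuration the study associates with reduced visibility in AI overviews.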