AI Engines Cite Different Sources Than People Read, New Index Shows
5W AI Communications published The Retrieval Index, a 220-page research volume mapping which publications ChatGPT, Claude, Perplexity, Gemini, and Google AI Overviews actually cite when answering questions across 38 sectors of the global economy.
The central finding: the most-read journalism is not the most-cited journalism. When AI engines answer buyer questions about pharma, fintech, beauty, cybersecurity, or any of the other sectors covered, they pull from a different set of sources than traditional media consumption patterns would suggest.
What This Means for PR Professionals
A founder asking ChatGPT which publications would cover a product launch gets a specific list of sources. A general counsel asking Claude about regulatory matters gets another. An AI engine answers by citing particular publications, vendors, and data sources, and these are not necessarily the ones with the largest audiences.
This distinction matters because it changes where PR professionals need to place stories and build relationships. The Index documents which sources actually appear in AI responses, ranked on a composite score from 0 to 100.
Patterns Across Sectors
The Index identifies recurring structural patterns in how AI engines retrieve information. In AI media, content published by OpenAI, Anthropic, DeepMind, and Google AI Research is cited more often than paywalled prestige publications. In beauty, Reddit communities like r/SkincareAddiction account for more cited content than WWD, Business of Fashion, and Vogue Business combined.
In cybersecurity, federal databases like CVE.org, NVD, CISA, and NIST operate as the citation backbone. In pharma, peer-reviewed journals like NEJM and The Lancet dominate over company sources from Pfizer, Merck, or Novartis.
Government sectors rely heavily on the Federal Register, Congress.gov, and GAO reports as citation anchors.
Sectors Covered
Volume I covers 38 sectors: AI Media, Beauty, Cybersecurity, Fintech, Venture Capital, Pharma, Luxury, Crypto, Marketing & Advertising, AdTech, B2B SaaS, Cloud Infrastructure, Application Security, Wealth Management, Private Equity, Family Offices, Legal Services, Banking, Insurance, Capital Markets, Commercial Real Estate, Digital Health, Biotech, Creator Economy, Travel, Hospitality, Residential Real Estate, Food & Beverage, Restaurants, Retail & DTC, Fashion & Apparel, CPG, Entertainment, Sports, Automotive, Education & EdTech, Government & Public Sector, and Energy & Utilities.
Methodology and Limitations
The Index uses directional estimates derived from cross-engine retrieval analysis, public citation observation, and comparative modeling across the five major AI systems. It is not a precision audit: the scores indicate general patterns rather than exact citation counts.
What's Next
Volume II, covering the remaining 22 sectors, is scheduled for publication in Q4 2026. An annual flagship report, The State of AI Sources, is scheduled to follow in December 2026.
The Index is available now.