Wikipedia to AI Developers: Attribute, pay for scale, and stop scraping like humans
Wikipedia laid out a clear plan to keep its knowledge base sustainable as AI use surges. The ask is simple: give proper attribution to human contributors and use the paid Wikimedia Enterprise product when accessing content at scale.
This follows a spike in bot traffic that tried to mimic humans. After improving detection, Wikipedia reported that much of the unusual traffic in May and June came from evasive AI bots, while human page views fell 8% year-over-year.
What Wikimedia is asking for
- Attribution by default: Credit the volunteer editors whose work trains your models and powers your product.
- Use Wikimedia Enterprise for scale: It provides high-throughput access without overloading Wikipedia's servers and helps fund the nonprofit mission.
- Be transparent: Make it clear where information comes from and link back so people can visit, verify, and contribute.
Why this matters to IT, engineering, and product teams
Provenance isn't a nice-to-have anymore. Clear sourcing builds user trust, protects you from brittle data pipelines, and keeps the commons that your systems depend on healthy.
If fewer people visit and contribute to Wikipedia, content quality and freshness suffer over time. That's bad news for search, LLM outputs, and any feature that leans on open knowledge.
Practical steps to comply
- Adopt the official feed: Route high-volume access through Wikimedia Enterprise. Budget for it like any core data dependency.
- Show your work: Display citations and link back to the source article. Document attribution in your help center and release notes.
- Respect policies: Honor robots.txt, send a truthful user agent, and don't evade rate limits. Treat 429s as a signal, not an obstacle.
- Track provenance end-to-end: Update model cards and data statements with sourcing details. Keep logs that prove compliant access.
- Harden your bots: Identify them clearly, avoid human-like behavior patterns, and schedule fetches in off-peak windows.
- Support the ecosystem: Link users back to Wikipedia where possible so editors, donors, and moderators can keep the content improving.
No legal threats, just a clear line
Wikimedia isn't threatening penalties here. The message is cooperative: use the paid channel for bulk access, attribute contributors, and stop scraping in ways that pretend to be human.
Wikipedia's AI plans (for editors, not replacements)
The organization also shared an AI strategy focused on helping editors with repetitive work, translations, and tooling, supporting humans rather than swapping them out.
Bottom line
If you build or run generative AI products, treat Wikipedia like any critical vendor. Pay for scaled access, credit your sources, and keep your bots honest. It's cleaner for your stack and better for the knowledge base your users rely on.
Related resources: Wikimedia Foundation