When AI Trains on Your Books Without Permission: What Writers Need to Know
Authors are facing a new challenge: their creative works are being used without consent to train artificial intelligence models. Once trained, these AI systems have no practical way to erase or "unlearn" the content they've absorbed. For writers whose books have been scraped from unauthorized sources, this raises serious questions about control, rights, and future implications.
The Scale of Unauthorized Use
Books in vast numbers, including multiple editions and foreign-language versions, have been collected from pirate websites and incorporated into datasets used to train large language models (LLMs) for major tech companies. The affected works span genres and authors, with copyright infringements reaching hundreds of thousands of titles.
Copyright law on this point is clear, and investigations indicate that some tech employees knew this data collection was illegal. The practice persisted anyway, leaving many writers feeling powerless as their intellectual property became fodder for AI training without consent or compensation.
Taking Action: What Can Writers Do?
One step authors can take is to work with licensing companies to explicitly exclude their works from use in AI training. This involves cataloging every edition and format of a book to set up formal restrictions. While this is a proactive approach, it is labor-intensive and only effective if legitimate AI companies respect these licenses moving forward.
Unfortunately, this does not undo the unauthorized use that has already occurred. Since AI models cannot "unlearn" scraped content, any data already incorporated remains part of their training, making retroactive control impossible.
The Legal Landscape and Its Challenges
Current legislative proposals in some countries, such as the UK, would permit training AI on copyrighted works by default under a copyright exception. Under these proposals, authors must "opt out" of having their works included, rather than being asked to "opt in." This means writers must actively exclude every ISBN, edition, and territory to protect their work. The process is complicated further by issues like short stories in anthologies or overlooked older editions.
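To make the bookkeeping burden concrete, here is a minimal Python sketch of how an author might track which edition and territory combinations still lack an opt-out. Everything in it is hypothetical: the `Edition` record, the function names, the titles, and the ISBNs are placeholders invented for illustration, and no real licensing platform or registry is being modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edition:
    """One published form of a work (hypothetical record layout)."""
    title: str
    isbn: str                     # placeholder values below, not real ISBNs
    fmt: str                      # e.g. "hardback", "ebook", "paperback"
    territories: tuple[str, ...]  # where this edition is sold


def required_opt_outs(editions):
    """Every (isbn, territory) pair that would need its own opt-out."""
    return {(e.isbn, t) for e in editions for t in e.territories}


def missing_opt_outs(editions, registered):
    """Pairs that still need an opt-out, sorted for stable reporting."""
    return sorted(required_opt_outs(editions) - set(registered))


# Illustrative catalog: a novel in two formats plus a story in an anthology.
editions = [
    Edition("Example Novel", "978-0-00-000000-2", "hardback", ("UK", "US")),
    Edition("Example Novel", "978-0-00-000001-9", "ebook", ("UK", "US", "AU")),
    Edition("Example Anthology (one story)", "978-0-00-000002-6", "paperback", ("UK",)),
]

# Opt-outs the author has already filed (again, made-up data).
registered = {
    ("978-0-00-000000-2", "UK"),
    ("978-0-00-000000-2", "US"),
    ("978-0-00-000001-9", "UK"),
}

gaps = missing_opt_outs(editions, registered)
for isbn, territory in gaps:
    print(f"Still unprotected: ISBN {isbn} in {territory}")
```

Even in this tiny catalog, three combinations slip through, including the anthology story. A real backlist with older editions and translations would multiply the pairs quickly, which is the practical weakness of an opt-out regime.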
An opt-in system, where AI companies can only use works explicitly permitted by authors, would offer clearer protection. However, such a system is not yet widely adopted.
Adding to the concern, proposed laws may allow AI-generated derivatives to be copyrighted, even if they are based on unlawfully used source material. This would effectively protect outputs created from stolen content, raising questions about fairness and the future of creative rights.
Why This Matters Beyond Money
While debates often focus on revenue and wealth distribution, there is a deeper cultural cost. Storytelling is a cornerstone of human culture, carrying knowledge, wisdom, and shared experience across generations. Allowing AI companies to freely exploit this cultural heritage risks diluting its value and undermines the role of human creativity.
Moreover, widespread unauthorized AI training may affect how society processes information, potentially weakening critical thinking and the value placed on truth in communication.
What Writers Should Watch For
- Stay informed about copyright laws related to AI training and how they evolve in your region.
- Consider registering your works with licensing platforms that offer AI usage controls.
- Engage with writer advocacy groups pushing for fair legislation and transparency from AI developers.
- Be cautious about sharing your work on platforms that may not respect copyright.
Looking Ahead
Some efforts mark progress, such as demands that AI crawlers identify themselves, respect copyright, and notify creators when their work has been scraped. However, much remains unresolved.
Writers must balance protecting their current works with continuing to create. While the fight to safeguard authors' rights continues, understanding the landscape and taking strategic steps are essential.
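For writers who run their own websites, crawler self-identification already has one practical use: a crawler that announces itself by name can be refused in a site's robots.txt file. The sketch below uses Python's standard `urllib.robotparser` module to check which named crawlers a given robots.txt turns away. `GPTBot` and `CCBot` are user-agent names published by OpenAI and Common Crawl. The robots.txt content, the `blocked_agents` helper, and the example URL are assumptions made for illustration. Compliance with robots.txt is voluntary, so this only restrains crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an author's website: refuse two named
# AI crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt, agents, url):
    """Return the agents that this robots.txt forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


blocked = blocked_agents(
    ROBOTS_TXT,
    ["GPTBot", "CCBot", "Mozilla"],
    "https://example.com/books/",
)
print(blocked)
```

A crawler that hides its identity, or ignores the file, is untouched by this, which is why the demands for identification and notification matter.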
For writers interested in how AI might impact their profession, exploring AI courses tailored for various jobs can provide insight into the technology and its implications.
The intersection of creativity and AI is complex, but staying informed and proactive helps maintain control over your work and its future.