Authors Sue Apple, Claim Apple Intelligence Was Trained on Thousands of Pirated Books

Scandal in Silicon Valley: Apple Faces Lawsuit Over Alleged Use of Thousands of Pirated Books to Train Apple Intelligence

Writers are suing Apple for allegedly training Apple Intelligence on a massive stash of pirated books. The case targets the heart of modern AI: where training data comes from, who gets paid, and what happens when tech and creative work collide.

For authors, the stakes are obvious. If your work trained systems that now mimic your voice or compete with your titles, you want answers - and compensation.

Background of Apple Intelligence

Apple Intelligence is Apple's suite of AI features across iPhone, iPad, and Mac. It summarizes messages, rewrites text, automates actions, and personalizes content.

Behind the scenes are large language models such as OpenELM and Apple's Foundation Models. These systems need huge volumes of text. The lawsuit claims Apple pulled from sources it shouldn't have.

How the Pirated Books Controversy Emerged

Researchers and digital rights groups flagged a dataset called Books3 - a giant library scraped from pirate sites. Multiple AI companies have been linked to it. The suit says Apple is among them. Books3 reportedly contains:

Over 183,000 digital books
Works by major authors and publishers
Thousands of titles that still earn royalties

The Plaintiffs Behind the Lawsuit

Authors Grady Hendrix and Jennifer Roberson filed a class-action case on behalf of writers whose books appear in Books3. Their core claims:

Apple used their copyrighted books without permission
Their writing was fed into Apple Intelligence models
Apple concealed the sources of its training data
Apple kept a private internal library of pirated content
The models output text that competes with their work

They're seeking damages, restitution, and an injunction that would stop Apple from using unlicensed texts in its models.

What Is Books3 and Why It Matters

Books3 is part of The Pile, a dataset assembled by EleutherAI. It includes books scraped from piracy hubs such as Bibliotik. For writers, the key issues are simple:

No consent from authors
Copyrighted works from major houses included
Used for commercial AI training
No notice or payment to rights holders

With Apple Intelligence embedded across millions of devices, the complaint argues Apple benefits from IP it didn't license.

How AI Training Works - And Where Copyright Issues Arise

Language models are trained on vast text corpora to learn patterns, style, and context. Sources can include public domain books, open datasets, Wikipedia, partner licenses - and, in disputed cases, scraped or unlicensed material.

The legal fight centers on whether ingesting copyrighted works for training is infringement. Some argue it's fair use; others say it's equivalent to scanning books to build a product without permission.

For context on fair use, see U.S. Copyright Office: Fair Use.

Key Allegations Against Apple

Use of pirated books: Apple allegedly used Books3 and similar datasets to train OpenELM and its Foundation Models.
Lack of transparency: Apple has not disclosed full training data sources.
Private training library: Plaintiffs claim Apple kept an internal archive of copyrighted books.
Market dilution: Outputs can imitate an author's style, reducing demand for original work.
Unfair commercial advantage: Apple Intelligence helps sell devices using unlicensed IP, according to the suit.

Apple's Position and Industry Context

Apple hasn't issued a detailed public response to this case. Historically, it has said its models use licensed, publicly available, and user-provided data, with more on-device processing and user controls than rivals. The complaint challenges those claims.

Industry-Wide Problem: AI Models and Copyright

Apple is not alone. OpenAI, Google, and Meta have all faced claims tied to training data sourced from books, news, and websites. The bigger picture: AI development has outpaced clear licensing norms, and courts are being asked to draw the lines.

Potential Legal Consequences for Apple

If the plaintiffs prevail, possible outcomes include:

Monetary damages
Removal of copyrighted books from training sets
Limits on how Apple Intelligence operates
Mandatory data licensing for future training
Public disclosure of training sources

Impact on Authors and the Creative Economy

Writers worry about lost income, style imitation, and weaker copyright protections. A 2024 Authors Guild survey reported strong concern among authors about AI's impact on livelihoods and voice imitation.

For more context on advocacy and author rights, see the Authors Guild.

Summary of Key Allegations vs. Apple's Expected Defenses

Use of pirated books: Plaintiffs say Books3 and similar libraries were used. Apple may argue: Data was mixed at scale; individual works aren't identifiable.
Copyright infringement: Works used without permission. Apple may argue: Training is transformative and falls under fair use.
Market dilution: Outputs compete with real books. Apple may argue: Features are assistive, not substitutes for authors.
Lack of transparency: Sources concealed. Apple may argue: Disclosure is limited for proprietary reasons.
Commercial benefit: Unlicensed inputs boosted features. Apple may argue: Any benefit is indirect and not tied to specific titles.

Why This Lawsuit Could Set a New Precedent for AI Regulation

Apple is a dominant player in consumer tech and fiercely protective of IP. A ruling against the company could ripple across the industry.

Clearer rules on what data can train models
Pressure to license books and pay rights holders
Greater transparency requirements
New licensing markets for authors
Slower release cycles for AI features across Big Tech

What Happens Next in Court

The case, filed in the Northern District of California, seeks a jury trial, injunctions, damages, restitution, attorney fees, and class certification. If the class is certified, thousands of authors could join.

What Writers Can Do Right Now

Audit where your books appear online, including piracy sites.
Register your copyrights and keep records of editions and formats.
Join or follow author advocacy groups for legal updates.
Clarify licensing terms with your agent or publisher for AI-related uses.
Track and document AI outputs that appear to imitate your style.
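
For the first step, writers comfortable with a little scripting could check a dataset listing against their own titles. The following is only a minimal sketch, assuming you have obtained a plain-text listing with one entry per line; the function names and the sample entries are hypothetical and do not come from Books3 or from the complaint.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def find_titles(my_titles, listing_lines):
    """Return {title: [matching listing lines]} for titles found in the listing."""
    matches = {}
    for title in my_titles:
        key = normalize(title)
        if not key:
            continue
        # Pad with spaces so only whole-word matches count.
        hits = [
            line for line in listing_lines
            if f" {key} " in f" {normalize(line)} "
        ]
        if hits:
            matches[title] = hits
    return matches


# Hypothetical sample data; a real listing would be read from a file.
listing = [
    "Doe, Jane - The Glass Orchard (2014).epub",
    "Smith, Alex - Night Ferry.epub",
]
found = find_titles(["The Glass Orchard", "Unlisted Novel"], listing)
print(found)
```

A title match alone can be a false positive, especially for short or common titles, so it is worth confirming the author name and edition on each hit before recording it.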

Conclusion

This case asks a blunt question: if your work trained AI, do you deserve a check and a choice? As courts weigh that question, writers should push for contracts, licensing models, and product disclosures that respect creative labor.

FAQ

What is Apple being sued for?
Apple is accused of using thousands of pirated books to train Apple Intelligence models without permission or payment.

Who filed the lawsuit?
Authors Grady Hendrix and Jennifer Roberson filed a class action on behalf of writers whose works appear in the Books3 dataset.

What is Books3?
A dataset of more than 183,000 books scraped from piracy sources, reportedly used by several AI companies for model training.

Could Apple face serious penalties?
Yes. Potential outcomes include damages, training set removals, operational limits, and new licensing requirements.

Why does this matter to writers?
It could shape how courts treat the data used to train AI and whether authors must be compensated when their works are ingested.



