Publishers and authors sue Meta over unauthorized use of copyrighted works to train Llama AI

Five major publishers and author Scott Turow sued Meta on May 5, alleging it scraped millions of copyrighted books from pirate sites to train its Llama AI. It's the first copyright suit by book publishers against a tech company over AI training data.


Published on: May 08, 2026


Five major publishers and bestselling author Scott Turow filed a class action lawsuit against Meta and CEO Mark Zuckerberg on May 5, alleging the company scraped millions of copyrighted works from pirate sites to train its Llama language model without permission or payment. The suit, filed in New York federal court, marks the first copyright infringement case brought directly by book publishers against a tech company over AI development.

The plaintiffs (Elsevier, Cengage, Hachette, Macmillan, and McGraw Hill) organized with the Association of American Publishers. They claim Meta deliberately downloaded and torrented unauthorized copies of literary works to build Llama's training dataset.

What the Lawsuit Alleges

The complaint describes Llama as an "infinite substitution machine" that competes directly with human-authored works. According to the complaint, the model can generate verbatim copies of training material, produce near-verbatim paraphrases, create knockoffs of popular novels, and flood markets with AI-generated alternatives.

The plaintiffs cite evidence that Meta employees understood their use of shadow libraries was legally questionable and "attempted to conceal" the practice. Internal emails suggest the company knew it was accessing unauthorized sources.

The lawsuit seeks a court declaration that Meta violated copyright law, an injunction against future infringement, maximum monetary damages, an accounting of Llama's training materials and methods, and destruction of all infringing copies.

Mixed Precedent in AI Copyright Cases

Courts have delivered conflicting rulings on whether AI training on copyrighted material qualifies as fair use. In June 2025, Judge William Alsup found that Anthropic's use of unauthorized books to train Claude AI was fair use, but ruled that keeping millions of unauthorized downloads in a permanent research library was not. That finding led to a $1.5 billion settlement.

In the same month, Judge Vincent Chhabria found AI training on copyrighted works to be fair use in a separate case against Meta. However, Chhabria suggested there may be an issue with "market dilution," the possibility that AI-generated works could disrupt the market for human-authored content.

The publishers' complaint leans heavily on Chhabria's market dilution theory, arguing that Llama's outputs are already displacing human-authored works. They point to AI-generated books flooding Amazon's Kindle store as evidence the risk is not theoretical.

Industry Response

Meta said it will "fight this lawsuit aggressively," noting that courts have found AI training on copyrighted material can be fair use.

Maria Pallante, president and CEO of the Association of American Publishers, said Meta "made calculated decisions to enrich itself with literary properties that it did not create and does not own, when instead it could have partnered with publishers and authors."

McGraw Hill CEO Philip Moyer said the company supports AI's role in education but believes "protecting the foundational intellectual property rights of human authors" is essential. Hachette CEO David Shelley called the alleged conduct "wholesale theft," while Macmillan CEO Jon Yaged said it was "unconscionable" for one of the world's most valuable companies to steal from creators.

Broader Legal Landscape

Over 100 copyright lawsuits related to AI development are now pending in U.S. courts. This suit comes as two of the publisher plaintiffs, Cengage and Hachette, await a ruling on their bid to intervene in a separate case against Google over its Gemini AI service.

The publishers' complaint is broader than most publishing-related AI suits, proposing a class that includes copyright owners of novels, poems, nonfiction works, scientific journals, and other literary content. It says Meta's Llama can generate travel guides, book summaries, study guides, and imitations of published works on demand.

