Authors Appeal Meta Ruling on AI Training With Pirated Books
A group of prominent authors including Ta-Nehisi Coates, Junot Diaz, and Laura Lippman filed a motion Wednesday asking a federal judge to authorize an appeal of a ruling that sided with Meta in a copyright dispute over the company's use of pirated books to train its Llama language model.
The authors argue the case raises a "novel and highly consequential" issue with implications for how AI companies acquire training data and how copyright law applies to generative AI and LLM development.
The Original Dispute
The lawsuit began in July 2023 when authors alleged Meta downloaded pirated books from free online "shadow libraries" and used them to train Llama. Meta claimed the use qualified as fair use under copyright law, arguing that large language models benefit "billions of people."
U.S. District Judge Vince Chhabria ruled in Meta's favor last year. He found the authors failed to demonstrate that Meta's copying would "flood the market with similar works, causing market dilution." Chhabria also determined Meta's use was transformative, a key factor in fair use analysis.
Yet Chhabria also wrote that using copyrighted works to train chatbots would "probably infringe copyright in most cases" because generative AI "has the potential to flood the market with endless amounts of images, songs, articles, books, and more." The contradiction underscores the unsettled legal territory.
A Split Decision in the Courts
The same week Chhabria issued his ruling, U.S. District Judge William Alsup reached a different conclusion on pirated source material in a separate case against Anthropic. Alsup ruled that Anthropic was not entitled to a fair use defense for downloading millions of books from piracy sites. Anthropic later settled for $1.5 billion.
The conflicting rulings highlight why the authors say appellate review is necessary. They argue that without clear guidance from a higher court, "Meta and other AI companies will likely continue to use shadow libraries to take copyrighted works without permission."
The Broader Stakes for Writers
The authors contend that permitting acquisition-by-piracy based on downstream fair use creates a feedback loop that undermines efforts to shut down illegal book repositories. Those shadow libraries "were once on the brink of collapse," the authors argue, but continued AI company demand keeps them operational and profitable.
The question of whether AI companies can legally rely on pirated books as training data remains unresolved at the appellate level. Meta has until June 22 to respond to the authors' motion.