Five Major Publishers and Author Sue Meta Over Llama Training Data
Five publishing houses and author Scott Turow filed a class-action lawsuit against Meta and CEO Mark Zuckerberg on May 5, 2026, alleging the company used millions of copyrighted texts to train its Llama 3 generative AI platform without permission.
The plaintiffs are Hachette Book Group, Macmillan Publishing Group, Cengage Learning, McGraw Hill, and Elsevier. Publishing services company Scribe Inc joined the suit.
The lawsuit claims Meta committed willful copyright infringement by training its generative AI and large language model (LLM) systems on copyrighted works. The legal action signals escalating tension between AI developers and content creators over training data sourcing.
The case follows similar copyright disputes in the publishing and creative industries. Authors and publishers have challenged multiple AI companies over whether training on published works without licensing constitutes fair use.
For IT and development professionals working with AI, the lawsuit underscores growing legal exposure around data sourcing and model training. Organizations building or deploying generative AI systems face increasing scrutiny over how training datasets are obtained and used.
The outcome could affect how companies approach data acquisition for future AI development and set precedent for copyright liability in the sector.