Independent publishers signal legal action against AI companies in the UK
Nearly 40 members of the Independent Publishers Guild (IPG) have sent pre-action letters to leading generative AI companies through Fox Williams LLP, alleging unlawful use of their works in AI training. The correspondence states that large-scale copying of books, journals, and other materials occurred without permission or a valid legal basis. The publishers signal they will litigate if engagement and remediation do not follow.
Why this matters for legal teams
This is a coordinated, industry-level move, not a one-off claim. It squarely targets the core inputs of foundation models and raises immediate questions about copying, dataset provenance, and licensing. Expect requests for disclosure of training sources, retention policies, and opt-out mechanisms.
The likely legal theories (UK-focused)
- Copyright infringement (CDPA 1988): Reproduction for model training may be infringing absent a license or an applicable exception.
- Text and data mining (TDM) exception limits: Section 29A of the CDPA permits TDM only for non-commercial research, and only where the user has lawful access to the work; it does not cover broad commercial training. See CDPA s29A.
- Contract/TPM circumvention: Scraping behind paywalls, DRM, or contrary to licensing terms could support additional claims.
- Database right: Systematic extraction from protected databases may trigger sui generis database claims.
- Moral rights/attribution: Limited in scope here but may arise depending on use and presentation.
Exposure and remedies on the table
- Injunctions: To restrain ongoing use of specific datasets or of models trained on them.
- Damages or account of profits: Including potential additional damages for flagrancy.
- Disclosure orders: Norwich Pharmacal or similar relief to identify data sources and data flows.
- Declaratory relief: On the lawfulness of past and ongoing training practices.
Immediate actions for AI companies
- Audit datasets: Catalogue training sources, acquisition paths, licenses, and any DRM or contractual restrictions. Preserve evidence.
- Validate legal bases: Map use cases against CDPA exceptions and licenses. Do not rely on non-commercial TDM for commercial models.
- Tighten procurement: Centralize rights clearance, vet data vendors, and require provenance warranties/indemnities.
- Segment and quarantine: If provenance is unclear, isolate suspect data and retraining pipelines pending review.
- Transparency posture: Prepare to disclose high-level dataset information and opt-out processes without compromising trade secrets.
- Engage early: A structured response under the Practice Direction on Pre-Action Conduct and Protocols can narrow issues and reduce injunction risk.
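For engineering teams supporting the audit and quarantine steps above, a minimal sketch of a provenance manifest might look like the following. Everything here is an assumption made for illustration: the SourceRecord fields, the rule set in quarantine_reasons, and the sample source names are hypothetical, and the rules are triage flags for legal review, not legal tests.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SourceRecord:
    """One training source in a hypothetical provenance manifest."""
    name: str
    acquisition_path: str          # e.g. "publisher license", "web crawl"
    license_ref: Optional[str]     # license or contract identifier, if any
    commercial_use_cleared: bool   # has counsel confirmed a basis for commercial training?
    access_restricted: bool        # paywall, DRM, or terms prohibiting scraping


def quarantine_reasons(rec: SourceRecord) -> List[str]:
    """Return illustrative triage flags; an empty list means no flag was raised."""
    reasons = []
    if rec.license_ref is None:
        reasons.append("no license on file")
    if not rec.commercial_use_cleared:
        reasons.append("commercial use not cleared")
    if rec.access_restricted and rec.license_ref is None:
        reasons.append("restricted access without license")
    return reasons


def partition(records: List[SourceRecord]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split sources into cleared names and quarantined names with reasons."""
    cleared: List[str] = []
    quarantined: Dict[str, List[str]] = {}
    for rec in records:
        reasons = quarantine_reasons(rec)
        if reasons:
            quarantined[rec.name] = reasons
        else:
            cleared.append(rec.name)
    return cleared, quarantined


# Hypothetical sample data, not drawn from any real dataset.
records = [
    SourceRecord("licensed-journal-feed", "publisher license", "LIC-001", True, True),
    SourceRecord("web-crawl-2021", "web crawl", None, False, True),
]
cleared, quarantined = partition(records)
print("cleared:", cleared)
print("quarantined:", quarantined)
```

Recording the reasons alongside each quarantined source, rather than a bare pass/fail flag, keeps a written trail that counsel can review and that can be preserved as evidence.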
Immediate actions for publishers
- Evidence collection: Compile ISBN lists, licensing terms, and digital access logs relevant to suspected scraping or bulk access.
- Contract review: Check platform terms, DRM settings, and vendor agreements for scraping prohibitions and audit rights.
- Rights posture: Clarify ownership, collective management arrangements, and any prior grants that could be asserted as defenses.
- Coordinated strategy: Consider group claims, cost-sharing, and venue selection (High Court vs. IPEC) based on case value and complexity.
Context you can use in briefing notes
The UK government considered expanding the TDM exception to cover commercial uses but stepped back after opposition from rights holders. Current law leaves limited room for unlicensed commercial training. For background, see the government's "AI and IP: call for views" consultation.
What to watch next
- Whether AI companies offer licensing frameworks or dataset transparency to head off filings.
- Test cases clarifying whether model training is "copying" and the scope of fair dealing and related exceptions in the UK.
- The use of targeted injunctions or disclosure orders focused on specific datasets rather than entire models.
Bottom line: the letters escalate a live risk. If your models rely on large-scale literary corpora, assume scrutiny, document your chain of rights, and be ready to show your homework.