Third Circuit Showdown: Is Training AI on Legal Headnotes Fair Use?

The Third Circuit will decide whether training AI on proprietary legal headnotes is fair use or infringement. The ruling could shape licensing, safeguard, and output rules across industries.

Published on: Oct 12, 2025

AI training on proprietary headnotes: what the 3rd Circuit could decide

Oct. 13, 2025

A Third Circuit appeal in Thomson Reuters v. ROSS Intelligence is poised to address a core question for the profession: does using proprietary legal headnotes to train AI infringe copyright, or does it qualify as fair use? The answer will influence how legal research tools are built and licensed, and it will echo across medical, financial, and other data-heavy sectors.

Why this appeal matters

This is expected to be the first U.S. Circuit Court opinion to squarely weigh fair use in the AI training context. The court's framing of "training use," protectable expression in headnotes, and market harm could set the compliance baseline for commercial AI products that rely on premium databases.

The core legal questions

  • Are headnotes protectable expression distinct from unprotectable judicial opinions and facts?
  • Is copying headnotes to train a model "transformative," or is it a substitute for the original market?
  • Does intermediate copying for model training receive different treatment than outputs shown to users?
  • What evidence of market harm matters: present substitution, future licensing markets, or both?

Fair use factors at issue

  • Purpose and character: Training is functional, not expressive display. After the Supreme Court's guidance in Warhol v. Goldsmith (2023), "transformative" use must be justified by a distinct purpose that does not usurp the original's market.
  • Nature of the work: Primary law is not copyrightable, but headnotes and editorial enhancements can be. The more creative the selection and phrasing, the stronger the protection.
  • Amount and substantiality: Training often involves copying at scale. Courts may ask whether wholesale ingestion of headnotes was necessary and whether copying less could have achieved similar performance.
  • Market effect: If training avoids or erodes demand for licensed headnotes, or impairs a plausible licensing market for training, this factor cuts against fair use. See the statutory factors in 17 U.S.C. § 107.

Possible outcomes and how they affect you

  • Training is fair use (with limits): Courts may permit training on proprietary annotations if outputs don't expose or replicate protectable expression. Expect stricter scrutiny on output controls and provenance.
  • Training requires a license: Vendors will need paid access to premium editorial content for pretraining and fine-tuning. Costs rise; licensing and compliance teams gain leverage.
  • Split outcome: Training may be allowed, but certain uses (e.g., verbatim or near-verbatim replication of editorial text) are restricted or penalized. Auditable safeguards become essential.

What in-house and firm leaders should do now

  • Audit model inputs: Inventory datasets used for training, fine-tuning, and retrieval. Flag any premium editorial content (headnotes, citators, treatises, proprietary tax or medical annotations).
  • Demand data provenance: Require vendors to disclose sources, licenses, and "clean room" processes. Push for documentation on scraping practices and removal of suspect datasets.
  • Tighten contracts: Add warranties on lawful data sourcing, no use of unlicensed editorial content, and indemnities for IP claims. Include audit rights and incident notification duties.
  • Control outputs: Configure models and RAG systems to avoid reproducing proprietary annotations. Use filters, retrieval constraints, and sampling tests to detect leakage.
  • Segment use-cases: Separate experimental models from client-facing tools. Restrict any dataset with uncertain rights to sandbox environments.
  • Plan for two budgets: One for data licenses if required; another for compliance tooling (monitoring, logging, dataset governance).
  • Preserve evidence: Keep records of data lineage, training runs, and red-team findings. This reduces exposure if a litigant or regulator requests details.
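
The "sampling tests to detect leakage" mentioned under "Control outputs" can be sketched in a few lines of Python. This is a minimal illustration, not a vetted compliance tool and not a legal standard: the function names, the 5-word window, and the 0.5 threshold are all assumptions chosen for the example, and the sample texts are invented. It measures how many word sequences from a protected annotation reappear verbatim in a model output.

```python
import re


def word_ngrams(text: str, n: int) -> set:
    """Return the set of lowercase word n-grams found in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def leakage_score(output: str, protected: str, n: int = 5) -> float:
    """Fraction of the protected text's n-grams that reappear in the output."""
    protected_grams = word_ngrams(protected, n)
    if not protected_grams:
        return 0.0  # protected text shorter than n words: nothing to compare
    shared = protected_grams & word_ngrams(output, n)
    return len(shared) / len(protected_grams)


def flag_outputs(outputs, protected_texts, n=5, threshold=0.5):
    """Return (output_index, protected_index, score) for pairs at or above threshold."""
    flags = []
    for i, out in enumerate(outputs):
        for j, prot in enumerate(protected_texts):
            score = leakage_score(out, prot, n)
            if score >= threshold:
                flags.append((i, j, score))
    return flags


# Invented sample data for illustration only.
protected_texts = [
    "A party who fails to object at trial waives the right "
    "to raise the issue on appeal."
]
outputs = [
    "The court held: a party who fails to object at trial waives "
    "the right to raise the issue on appeal.",
    "Objections not made at trial are generally forfeited on review.",
]

for out_idx, prot_idx, score in flag_outputs(outputs, protected_texts):
    print(f"output {out_idx} overlaps protected text {prot_idx}: {score:.2f}")
```

A check like this only catches verbatim or near-verbatim reuse. Paraphrase needs a different approach, and whatever threshold a team picks is a policy choice to document alongside the sampling results.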

Signals to watch in the opinion

  • How the court defines "training use" vs. public display of expressive content.
  • Weight given to potential licensing markets for training datasets.
  • Whether wholesale copying for intermediate machine processing gets distinct treatment.
  • Any standard for acceptable safeguards to prevent output leakage of protected editorial text.

Impact beyond legal research

A ruling that favors licensing will push hospitals, banks, and publishers to review any model trained on paid, annotated databases. A more permissive fair use holding will still force vendors to prove that outputs do not reproduce proprietary editorial content.

Action step

If you oversee AI or knowledge systems, update procurement and product policies so they hold up under both outcomes.

Bottom line

The Third Circuit's approach to headnotes and AI training will set the operating rules for commercial research platforms. Prepare for either a licensing-first future or a fair-use-with-guardrails future, and make your documentation and contracts ready for both.

