AI Assets in M&A Now Require Proof of Origin Before Closing
Buyers of AI-heavy companies are increasingly walking away from deals if sellers cannot prove where their training data came from and whether it was legally obtained. This marks a fundamental shift in how the technology sector approaches mergers and acquisitions.
The change reflects a hard reality: defects in an AI model's training data cannot be fixed the way software bugs can. Unlike traditional code, which developers patch after closing, corrupted training data often requires wholesale retraining or renegotiating rights at enterprise scale, if a fix is possible at all.
The Bartz v. Anthropic Settlement Changed the Calculus
A $1.5 billion settlement last year in Bartz v. Anthropic forced the destruction of infringing datasets and signaled that courts would not automatically grant fair-use protection to AI developers. That outcome prompted the industry to stop assuming copyright infringement claims would disappear quietly.
Developers historically operated under a "move fast and break things" philosophy, prioritizing speed over compliance. The assumption was that the legal risk of copyright infringement posed less danger than missing the data-gathering window. That calculus no longer holds.
Buyers Now Demand a "Pedigree Log"
In AI-heavy deals, buyers increasingly require technical walk rights: the ability to terminate the transaction if due diligence reveals unverifiable model lineage or foundational licensing problems. They request forensic documentation of every dataset and training weight used to build the model.
If a seller cannot produce a verifiable history of where data came from, how it was acquired, and whether retention practices comply with law, the asset gets marked as impaired from day one. This directly suppresses deal value.
Buyers assess whether sellers have:
- Legal rights to use and retain all training data
- Clear mapping of data sources and model components to their risk tier
- Documentation proving compliance with applicable regulations
- Destruction or retention practices that can survive regulatory and litigation scrutiny
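The checklist above can be made concrete as a machine-readable provenance record kept per dataset. Below is a minimal sketch in Python of what such a "pedigree log" entry and a diligence audit might look like; every field name here is illustrative, not an industry standard:

```python
from dataclasses import dataclass

# Fields a buyer's diligence team would expect to see populated (illustrative)
REQUIRED_FIELDS = ("source", "acquisition_method", "license", "retention_policy")

@dataclass
class DatasetRecord:
    """One entry in a hypothetical pedigree log for a training dataset."""
    name: str
    source: str = ""              # where the data came from (vendor, URL, internal)
    acquisition_method: str = ""  # e.g. "licensed", "scraped", "user-contributed"
    license: str = ""             # governing license or contract reference
    retention_policy: str = ""    # how long data is kept, and on what legal basis
    risk_tier: str = "unclassified"  # buyer-side risk classification

def audit(records):
    """Return names of records missing any provenance field a buyer would check."""
    return [r.name for r in records
            if any(not getattr(r, f) for f in REQUIRED_FIELDS)]

log = [
    DatasetRecord("corpus-a", source="vendor X", acquisition_method="licensed",
                  license="agreement-2024-17", retention_policy="7 years",
                  risk_tier="low"),
    DatasetRecord("corpus-b", source="web crawl"),  # no license or retention basis
]
print(audit(log))  # → ['corpus-b']
```

A record that fails this kind of check is exactly the "asset marked as impaired from day one" described above: the data may still work technically, but its lineage cannot be verified.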
Regulatory Pressure Is Accelerating the Trend
The EU AI Act and laws in California, Colorado, Texas, and Utah now impose obligations on AI developers and deployers. While enforcement remains uncertain, particularly after recent federal challenges to state AI measures, the regulations are already shaping deal terms.
Buyers and licensees are asking harder questions about their own legal exposure. A seller who cannot demonstrate compliance with frameworks like the NIST AI Risk Management Framework faces longer closing timelines, carve-outs from the deal, and higher insurance costs.
What Sellers Must Do Now
Developers who can document lawful data origins, verifiable lineage, and governance practices consistent with emerging standards will close deals faster and at higher valuations. Those who cannot will face significant friction.
Sellers should prepare:
- Up-to-date development, testing, and training documentation aligned with legal frameworks
- Model cards that meet recognized standards
- Evidence of compliance with EU AI Act, state regulations, and industry frameworks
- Clear records of data acquisition, licensing, and retention practices
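A "model card that meets recognized standards" is, at minimum, a structured summary of a model's purpose, training data, and limitations that a reviewer can check mechanically. A minimal sketch in Python, loosely following the section headings popularized by the Model Cards framework (all field values are hypothetical):

```python
import json

# Hypothetical model card: section names loosely follow common model-card practice
model_card = {
    "model_details": {"name": "example-model", "version": "1.0", "owners": ["ml-team"]},
    "intended_use": {"primary_uses": ["document summarization"],
                     "out_of_scope": ["legal advice"]},
    "training_data": {
        "datasets": ["corpus-a"],       # should map 1:1 to the pedigree log
        "provenance_documented": True,  # all sources and licenses on file
    },
    "limitations": ["English-only", "max input 8k tokens"],
}

def missing_sections(card, required=("model_details", "intended_use",
                                     "training_data", "limitations")):
    """Flag sections a diligence reviewer would expect to find."""
    return [s for s in required if s not in card]

print(missing_sections(model_card))  # → []
print(json.dumps(model_card["training_data"], indent=2))
```

The point of keeping the card machine-readable is that diligence questions ("does every dataset in the card appear in the pedigree log?") become scripted checks rather than document review.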
What Buyers Must Verify
Legal teams conducting AI acquisitions should treat data provenance as a closing condition, not a post-closing problem. They should assess the legal right to use the training data, map model components to risk tiers and jurisdictions, and confirm that every dataset is lawful and documented.
Unresolved licensing conflicts should be negotiated before closing, not absorbed into the purchase price. Clear covenants and closing conditions prevent silent acceptance of risk.
Legal, technical, and acquisition teams need to understand AI-specific risks and applicable governance requirements. This is no longer optional expertise.
The Real Risk Is Asset Destruction
In traditional software M&A, the toolkit of indemnities, holdbacks, escrows, and insurance addresses most risks. In AI deals, monetary remedies miss the point. The risk is that the core asset becomes unusable because its origin, licensing, or governance cannot withstand enterprise-scale scrutiny.
A model cannot be patched into compliance if its training data was obtained illegally. It cannot be fixed if regulators demand destruction of infringing materials. The most valuable AI asset is not the one with the flashiest performance metrics; it is the one with lawful origins and documentation that survives audit.
Sellers who build with this standard in mind from the start will find buyers. Those who do not will find doors closing before negotiations even begin.