Munich court: LLM memorization is copying; TDM exception fails in GEMA v OpenAI

Munich court: LLM memorisation is reproduction; TDM doesn't shield training. Providers face injunctions and liability; outputs can amount to making available to the public; the UK diverges.

Published on: Nov 13, 2025

GEMA vs. OpenAI: Munich Court says LLM memorisation is "reproduction," TDM exception doesn't shield training

On 11 November 2025, the Munich I Regional Court (42 O 14139/24) issued a ruling that will ripple through AI licensing and risk programmes. The Court held that large language models can memorise protected texts in their parameters and that this fixation already counts as reproduction under Sec. 16 UrhG and Art. 2 of the InfoSoc Directive.

If a model outputs original and recognisable elements of those texts in response to simple prompts, that is a further act of reproduction, and providing that output to users is making available to the public (Sec. 19a UrhG; Art. 3 InfoSoc). The Court granted injunctive relief and information claims and found OpenAI liable for damages in principle (quantum pending). Personality rights claims over incorrect attribution were dismissed. The judgment is not final.

What sparked the dispute?

GEMA sued OpenAI, alleging that ChatGPT (GPT-4 and GPT-4o) had "fixed" nine well-known German song lyrics in the model such that simple prompts produced outputs largely true to the originals. OpenAI argued that the model stores statistical knowledge rather than specific training data and that any infringements arose from user prompts. It also invoked the text and data mining (TDM) exception for training.

Memorisation as reproduction

The Court treated memorisation as reproduction within the meaning of Sec. 16 UrhG and Art. 2 InfoSoc. Copies "in any form and by any means" do not need to be directly perceptible. Encoding protected texts in the parameters, so that they can be extracted with simple prompts, was enough. The use of probability weights did not change the legal classification.

Pointing to research on extracting training data from LLMs, the Court rejected coincidence as an explanation for the near-verbatim outputs observed.

Outputs and responsibility

Outputs containing original, recognisable elements of lyrics amounted to further reproduction. Serving such content on demand to an indefinite number of users qualified as making available to the public.

Responsibility sat primarily with the provider. Because the provider controls architecture, data selection, and training, and thus the risk of memorisation, simple user prompts did not shift liability to end users.

Limits and "customary consent" rejected

The TDM exception (Sec. 44b UrhG) did not apply. The Court read it as permitting only preparatory copies for analysis. Once a model can produce reproducible works, right holders' exploitation interests are affected and the exception's scope is exceeded. The Court also declined any extended or analogous application.

Sec. 57 UrhG on insignificant accessories was inapplicable because there was no protected main work and a training dataset is not a work. Research exceptions for TDM (Sec. 60d UrhG; Art. 3 DSM Directive) were irrelevant. Consent "by virtue of custom" was rejected; training generative AI is not a use right holders must ordinarily expect.

Contrast with London: Getty Images v. Stability AI

Just a week earlier, the London High Court (4 November 2025) dismissed a secondary infringement claim in Getty Images v. Stability AI, finding the model at issue was not an "infringing copy." Munich went the other way conceptually, expressly accepting that a model may "contain" copyright-relevant copies when protected works are reproducibly fixed in its parameters.

Important nuance: the High Court did not reach Getty's core allegation about training on Getty's photos, due to jurisdictional limits on primary infringement claims. How UK courts will treat training on third-party works remains open.

What this means for legal teams

  • Licensing pressure rises: Treat training and output as separate infringement vectors. Inventory training sources and close licensing gaps, especially for short works (lyrics, poems, slogans).
  • Data governance as defense: Document data provenance, rights status, intake filters, and deduplication. Keep audit trails for datasets, snapshots, and model versions.
  • Anti-memorisation engineering: Prioritise deduplication, regularisation, and targeted anti-memorisation techniques. Apply rigorous prompt/output filters for high-risk categories like lyrics and poetry.
  • Red-team and log: Systematically probe for verbatim recall with simple prompts. Log prompts and outputs to demonstrate controls, suppression, and remediation.
  • Usage controls: Calibrate rate limits, token caps, and post-processing to reduce extractability. Consider per-domain suppression lists and hash-based checks for known texts (see the sketch after this list).
  • Contract hygiene: Update customer terms, AUPs, and indemnities to reflect output filtering, restricted use cases, and reporting obligations.
  • Jurisdictional strategy: Expect divergence between Germany/EU and the UK for now. Track appeals and potential CJEU guidance on reproduction and TDM.
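
To make the hash-based check concrete, here is a minimal Python sketch of an output filter that compares hashed word windows of a model response against a provider-maintained catalogue of protected texts. The catalogue, the 8-word window, and the match threshold are illustrative assumptions, not requirements drawn from the judgment.

```python
# Minimal sketch of a hash-based output check against known texts.
# Assumes the provider maintains a catalogue of protected works (e.g. licensed
# or flagged lyrics); window size and threshold are illustrative.
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace before hashing."""
    return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", "", text.lower())).strip()


def shingle_hashes(text: str, n: int = 8) -> set:
    """Hash every n-word window so partial, near-verbatim overlaps are caught."""
    words = normalize(text).split()
    windows = [" ".join(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))]
    return {hashlib.sha256(w.encode("utf-8")).hexdigest() for w in windows}


def build_suppression_index(catalogue: dict) -> dict:
    """Precompute shingle hashes for each protected work in the catalogue."""
    return {title: shingle_hashes(body) for title, body in catalogue.items()}


def flag_output(output: str, index: dict, threshold: int = 3) -> list:
    """Return titles whose hashed shingles overlap the model output above the threshold."""
    out_hashes = shingle_hashes(output)
    return [title for title, hashes in index.items() if len(out_hashes & hashes) >= threshold]


# Usage: block, truncate, or escalate any response that matches a catalogued work.
index = build_suppression_index({"Example Song": "full licensed lyric text would be stored here"})
matches = flag_output("...model response text...", index)
if matches:
    print("Suppress or review output; matched:", matches)
```

Hashing overlapping windows rather than whole texts catches partial quotes, which matters for short works such as lyrics, poems, and slogans.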

Playbook for right holders and collecting societies

  • Dual strategy: Assert claims at both stages, training (memorisation) and output (reproduction/making available).
  • Evidence capture: Preserve prompts, time-stamped outputs, and near-verbatim matches (a logging sketch follows this list). Use short prompts to show extractability.
  • Licensing offers: Prepare tiered licences for training and for output reuse. Define permitted excerpts and transformation thresholds.
  • Notifier tooling: Set up automated notices for lyrics/poetry outputs and track model/version identifiers in responses.
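
As a companion to the evidence-capture item, the following Python sketch records a prompt, the model's time-stamped output, a hash of that output, and a similarity score against the original work. The file name, field names, and the 0.8 "near-verbatim" cut-off are assumptions for illustration only.

```python
# Minimal evidence-capture sketch: append time-stamped prompt/output records
# with a similarity score against the original work. Fields, file name, and
# the 0.8 "near-verbatim" threshold are illustrative assumptions.
import difflib
import hashlib
import json
from datetime import datetime, timezone


def capture_evidence(prompt: str, output: str, original: str, model_id: str,
                     path: str = "evidence_log.jsonl") -> dict:
    """Append one time-stamped record comparing a model output with the original text."""
    similarity = difflib.SequenceMatcher(None, original, output).ratio()
    record = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "model_id": model_id,                       # track model/version identifiers
        "prompt": prompt,
        "output": output,
        "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
        "similarity_to_original": round(similarity, 3),
        "near_verbatim": similarity >= 0.8,         # illustrative cut-off
    }
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record
```

An append-only log of this kind also supports tracking model/version identifiers, since each record ties a given output to the model that produced it.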

What's likely on appeal

  • Whether parameter-level encoding fits "reproduction" under Sec. 16 UrhG and Art. 2 InfoSoc.
  • How to test "extractability" and "simple prompts" as a threshold.
  • The scope of Sec. 44b UrhG TDM once outputs are involved, and any role for Art. 3-4 DSM TDM provisions.
  • Allocation of responsibility between provider and user for prompt-driven outputs.

Sources

Press release of 11/11/2025 (German): Munich I Regional Court

Directive 2001/29/EC (InfoSoc): EUR-Lex

Need to level up internal AI literacy?

For legal and compliance teams rolling out practical AI policies and controls, see curated training options here: Complete AI Training - Courses by Job

