Reddit Sues Anthropic for $1B, Alleging Claude Was Trained on Scraped Reddit Data
Reddit sues Anthropic for $1B, alleging Claude was trained on scraped subreddit data without consent, breaching user terms and privacy controls. Reddit seeks damages and a ban.

Reddit sues Anthropic for $1B over alleged unauthorized Claude training
Reddit filed suit against Anthropic in San Francisco Superior Court on June 4, 2025, alleging the AI company scraped and used Reddit content to train Claude without permission. The complaint frames the conduct as commercial exploitation of user-generated content and a direct breach of Reddit's User Agreement and policies.
According to the filing, Anthropic trained on data from subreddits such as r/explainlikeimfive, r/WritingPrompts, and r/AskHistorians starting in December 2021. Reddit cites public research acknowledgments and internal audit logs that it claims contradict Anthropic's statements about blocking Reddit from its crawlers in mid-2024.
Core allegations and factual background
Reddit alleges Anthropic's use of Reddit data was unauthorized and did not comply with platform rules requiring written consent for commercial use. The complaint highlights a mismatch between Anthropic's public statements about blocking Reddit and Reddit's logs showing more than 100,000 subsequent automated accesses by ClaudeBot and other agents.
The filing emphasizes user privacy risks from scraping without a license. Reddit claims unlicensed use bypasses its Compliance API, which is designed to notify licensees of deletions and block prohibited content categories.
Causes of action pleaded
- Breach of contract: Alleged violations of Reddit's User Agreement prohibiting commercial exploitation without consent.
- Unjust enrichment: Benefit derived from Reddit data without compensation or permission.
- Trespass to chattels: Interference with Reddit's systems via automated access contrary to stated terms and technical controls.
- Tortious interference: Disruption of Reddit's economic relationships, including data licensing.
- Unfair competition (Cal. Bus. & Prof. Code ยง17200): Alleged unlawful and unfair business practices tied to unauthorized scraping and use.
Technical and policy controls Reddit points to
Reddit cites robots.txt declarations (pre-May 2024 noting applicability to search engines, not commercial scrapers), IP rate limits, and anomaly detection meant to deter automated access. The company argues Anthropic evaded or ignored these measures.
Central to Reddit's privacy posture is the Compliance API, which propagates user deletion signals to licensees and restricts sensitive categories. Reddit contends unlicensed scraping sidesteps these protections and undermines user control.
Industry context and stakes
The dispute lands amid broader litigation over AI training data, where courts have issued mixed rulings on fair use and dataset provenance. The complaint notes Anthropic's separate author-case settlement for $1.5 billion in September 2025 and ongoing expansion of Claude's features.
Financially, Reddit highlights Anthropic's commercial gains and Amazon's multibillion-dollar investment and integrations, arguing that profits derived from Reddit data flowed to partners without compensation to Reddit or its users. Reddit also underscores its own shift toward licensing as a meaningful revenue stream alongside advertising.
Requested remedies
Reddit seeks specific performance, compensatory damages, disgorgement of profits, and injunctive relief blocking continued use of Reddit data. The filing also asks the court to prohibit commercial deployment of any technology derived from Reddit content, potentially including aspects of Claude, and demands a jury trial.
Practical implications for counsel
- For platforms: Tighten contractual gating for commercial access; require licensees to integrate deletion-notice APIs; document crawler access with audit logs; make non-permissible use explicit in terms and technical notices.
- For AI developers: Maintain verifiable dataset provenance and blocklist enforcement; implement deletion propagation and data minimization; treat robots.txt and terms as binding in commercial contexts; audit vendors and partners for derivative use and redistribution.
- Risk posture: Expect discovery pressure on crawl logs, training datasets, model cards, and content filters; consider model retraining or dataset purging protocols if injunctive relief or compliance orders issue.
- Contract vs. fair use: Even where fair use arguments exist, contract claims can proceed on separate grounds; address both in risk assessments and defenses.
Timeline
- December 2021: Anthropic allegedly begins training Claude on Reddit content per research papers co-authored by CEO Dario Amodei.
- March 2023: Claude launches publicly; responses indicate exposure to Reddit subreddits.
- May 2024: Anthropic claims Reddit is on crawler block list; Reddit audit logs purportedly show continued access.
- July 2024: Anthropic spokesperson says crawling stopped since mid-May; Reddit logs show 100,000+ accesses.
- June 4, 2025: Reddit files suit alleging breach of contract, unjust enrichment, and unfair competition.
- June 23, 2025: Federal court issues mixed ruling on fair use in a separate Anthropic copyright case.
- September 5, 2025: Anthropic agrees to a $1.5B settlement in a copyright class action over book datasets.
What to watch next
- Early motions testing contract enforceability against AI training and the scope of ยง17200 in scraping contexts.
- How courts weigh robots.txt, API gating, and terms as signals of authorization in commercial data use.
- Discovery into crawler identities, frequency, and referral ratios; Cloudflare telemetry and bot verification practices.
- Remedial scope: feasibility and contours of model takedown, retraining, or data deletion orders.
- Impact on licensing norms between platforms and AI developers, especially around deletion compliance.
Sources and references
- Joint statement on data scraping by data protection authorities (2023)
- California Unfair Competition Law (Bus. & Prof. Code ยง 17200)
Resource for legal teams
If your organization is evaluating AI data governance and training practices, upskilling cross-functional teams helps. See curated options by role at Complete AI Training.