NYT vs Perplexity: Paywall Scrapes, False Attributions, and a High-Stakes Test for AI and Journalism

NYT sues Perplexity over repackaging paywalled work and made-up answers shown with Times branding. Perplexity denies scraping and says it will fight in court.

Categorized in: AI News Legal
Published on: Dec 07, 2025
NYT vs. Perplexity: Copyright, Trademark, and the Fault Lines of "Answer Engines"

The New York Times has sued Perplexity in federal court, alleging that the "answer engine" copies and repurposes millions of paywalled articles, videos, and podcasts. The complaint claims Perplexity's responses are often verbatim or substantially similar to Times content, directly substituting for the publisher's offerings and eroding revenue and editorial control.

According to the filing, The Times sent multiple cease-and-desist letters over 18 months in an effort to reach a licensing agreement. The suit alleges that Perplexity nonetheless continued using protected material without permission or compensation.

The Trademark Angle: Hallucinations and Brand Harm

Beyond copyright, The Times alleges violations under the Lanham Act. The complaint ties this to fabricated answers ("hallucinations") that are falsely attributed to The Times and displayed with its registered marks, risking source confusion and reputational damage.

Perplexity's Position

Perplexity says it does not scrape data to build its core foundation models. It argues that it indexes public webpages and provides factual citations, positioning itself as an intelligent research assistant. The company, valued at around $20 billion, frames the suit as a familiar clash between publishers and new technologies, drawing parallels to earlier fights over radio and television, and says it will defend itself vigorously.

Why This Case Matters

This suit lands amid a broader conflict between publishers and AI developers over the use of proprietary material. The Times is already in high-stakes litigation with OpenAI and Microsoft, and Perplexity faces similar claims from Dow Jones, the New York Post, and the Chicago Tribune. Outcomes here will shape what's permissible for AI products and how creators get paid.

Key Legal Questions to Watch

  • Fair use: Are the outputs transformative or market substitutes? How extensive is copying, especially of paywalled content? What's the economic impact on the original works? See 17 U.S.C. § 107 for the factors.
  • Substantial similarity and verbatim reproduction: Do outputs reproduce expressive elements beyond facts? How frequently and at what length?
  • Access and paywalls: Were technical or contractual restrictions respected (robots.txt, terms of service)? Do any anti-circumvention issues arise if paywalls were bypassed?
  • Lanham Act: Do hallucinated attributions and trademark displays create likely confusion or false association? Are disclaimers or UI design choices mitigating or aggravating risk?
  • Model training vs. output liability: How do claims differ between training data ingestion, indexing, and response generation? Is there evidence of willful copying or systemic replication?
  • Remedies: Injunctive relief to curb outputs and data use; statutory and actual damages; profits; corrective statements; and attorneys' fees in appropriate circumstances.
  • Proof and discovery: Crawl logs, dataset lineage, paywall traversal records, citation systems, prompt/output logs, and A/B tests on attribution or hallucination handling.

Practical Moves for Legal Teams

  • Map data pipelines: sources, crawl rules, paywall status, and license coverage. Document robots.txt compliance and any access controls.
  • Strengthen product UX: precise citations, clear labeling, and guardrails for hallucinations; careful use of third-party marks; escalation paths for takedowns.
  • License proactively: negotiate coverage for training, indexing, and output display; bake in indemnities, audit rights, and suspension triggers for disputed sources.
  • Preserve evidence: training data catalogs, vendor contracts, scraping configurations, logs, and evaluations tied to attribution accuracy.
  • Risk transfer and contracts: review vendor and customer agreements for IP indemnity scope, caps, exclusions, and defense control.
  • Coordinate comms and legal: align public statements with litigation strategy to avoid admissions and manage brand risk.
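
For the "document robots.txt compliance" step above, one way to produce a reviewable record is to evaluate each crawl target against the publisher's robots.txt rules and store the result. The following is a minimal sketch using Python's standard-library urllib.robotparser. The function name audit_crawl_targets, the user agent ExampleBot, and the news.example.com URLs are hypothetical placeholders, not anything taken from the complaint or from Perplexity's systems. It parses a robots.txt string supplied by the caller rather than fetching one, so the same snapshot can be archived alongside the results.

```python
from urllib.robotparser import RobotFileParser


def audit_crawl_targets(robots_txt, user_agent, urls):
    """Return one record per URL stating whether robots.txt permits the fetch.

    robots_txt is the raw text of a robots.txt file (a saved snapshot),
    user_agent is the crawler's declared name, urls is a list of targets.
    """
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [
        {
            "url": url,
            "user_agent": user_agent,
            "allowed": parser.can_fetch(user_agent, url),
        }
        for url in urls
    ]


# Hypothetical rules: the named bot is barred from /premium/, others are not.
SAMPLE_ROBOTS = """\
User-agent: ExampleBot
Disallow: /premium/

User-agent: *
Allow: /
"""

records = audit_crawl_targets(
    SAMPLE_ROBOTS,
    "ExampleBot",
    [
        "https://news.example.com/premium/story-1",
        "https://news.example.com/public/story-2",
    ],
)

for record in records:
    print(record)
```

A record like this covers only the robots.txt question. Paywall status, terms of service, and license coverage would need their own fields in whatever pipeline map the team maintains.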

Relevant References

Fair use overview: 17 U.S.C. § 107
Lanham Act false designation: 15 U.S.C. § 1125
