New York lawmakers approve bill targeting concealed web crawlers
New York's legislature has approved a bill that would prohibit automated web crawlers from hiding their identities when accessing news websites. The measure, A.11292/S.9934A, passed both chambers and now awaits further action.
The legislation targets what industry groups call "stealth crawlers": bots that access news sites without properly identifying themselves or disclosing their purpose. Publishers say these crawlers collect articles and other content to train large language models used in AI products, depriving news organizations of revenue while shifting the economic value of their reporting to technology companies.
What the bill requires
If enacted, the law would require crawlers to accurately identify themselves when accessing covered news websites. It would also establish a private right of action, allowing publishers and broadcasters to sue violators.
The bill applies to newspapers, broadcasters, and other journalism providers. Disclosures would be required when AI systems use crawlers to gather content from any of these sources.
Industry support
The New York News Publishers Association backs the measure. Diane Kennedy, the association's president, said news organizations invest substantial resources in original reporting. "The proliferation of stealth crawlers enables technology companies and other actors to access the fruits of that investment without consent or transparency," Kennedy said.
Broadcasters also support the bill. David Donovan, president of the New York State Broadcasters Association, said local TV and radio stations face large volumes of unauthorized automated traffic seeking access to news content. "By protecting broadcast news operations from unauthorized access by Big Tech, the legislation ensures the economic foundations of producing original, local news," Donovan said.
The broader problem
Publishers describe two distinct harms. First, crawlers collect content without authorization or compensation, and that content is then used to train AI systems. Second, the volume of bot traffic itself creates operational costs by increasing server loads and infrastructure demands.
Danielle Coffey, president and CEO of the News/Media Alliance, said publishers are seeing growing volumes of bot traffic. "Right now, news websites are drowning in bot traffic," Coffey said. "Bad bots are disguising their identities to overload publisher servers and access the quality content on our sites."
Litigation context
The New York bill reflects a broader legal dispute over how AI systems should access and use published content. The New York Times and the Chicago Tribune have sued AI chatbot developers, alleging their materials were used to train language models without authorization or payment. CNN sued Perplexity last month on similar allegations.
Assembly Member Steven Otis and State Senator Mike Gianaris sponsored the measure. Otis chairs the Assembly Science and Technology Committee.
For legal professionals, this legislation addresses emerging questions about content ownership, automated access rights, and enforcement mechanisms in the context of generative AI and LLM development. The bill's private right of action gives publishers a direct legal remedy, a detail that distinguishes it from other proposed regulatory approaches to AI content use.