Anthropic pushes for slower AI development as self-improving systems loom
Anthropic said Thursday that the world should consider slowing frontier AI development to prepare for a possibility once confined to science fiction: machines that improve themselves without human intervention.
The company, which makes Claude, disclosed internal research showing its most advanced models are progressing rapidly toward what researchers call "recursive self-improvement," a scenario in which AI systems enhance their own capabilities. Anthropic stressed this threshold has not been reached and may never be achieved.
The warning came from Marina Favaro, head of Anthropic's research institute, and co-founder Jack Clark in a blog post that argued the possibility warrants preparation now.
What recursive self-improvement means
Recursive self-improvement refers to an AI system using its existing capabilities to make itself better. Unlike traditional software that only changes when humans modify its code, advanced AI systems can already write software, analyze results, test hypotheses, and solve complex problems.
Researchers envision a future system capable of identifying a problem, writing code to address it, evaluating the outcome, and repeating the process with minimal human oversight. Each improvement could make the next improvement easier, creating a feedback loop that accelerates progress.
Experts disagree on how likely such capabilities are or how close they may be. Anthropic warned that recursive self-improvement "could come sooner than most institutions are prepared for."
Security and control risks
Self-improving systems introduce new security and governance challenges. Systems capable of modifying their own code could become targets for malicious actors who might inject backdoors or hidden instructions through carefully designed attack sequences.
Azizi Othman of Asia e University warned that such systems could also engage in adversarial modification of other software or infrastructure, creating security risks that current AI safety research is not equipped to address. He called for treating recursive self-improvement security as a central research priority.
Current literature on securing systems capable of self-modification remains limited, researchers say.
OpenAI raises the same concern
Anthropic is not alone. OpenAI also raised recursive self-improvement this week as part of its public policy agenda, calling for a federal framework that would strengthen oversight of advanced AI systems and monitor progress toward this capability.
The fact that two of the world's most influential AI companies are now publicly discussing recursive self-improvement suggests the issue is moving from theoretical debate into mainstream policy discussions.
Questions about timing and motives
Anthropic's call for caution comes as the company itself is benefiting enormously from the AI boom. It recently completed a fundraising round valuing it at nearly $1 trillion and has confidentially filed paperwork for an initial public offering.
Anthropic's annualized revenue run rate is expected to reach approximately $50 billion by the end of this month, up from $9 billion at the end of 2025. That growth has positioned the company as a leading challenger to OpenAI.
The timing has renewed criticism from some observers who argue that calls for stricter oversight may benefit established AI leaders by raising barriers to competition. Venture capitalist David Sacks, an informal adviser to President Donald Trump, accused the company of pursuing a "regulatory capture agenda" that could ban open-source AI models, systems that offer cheaper alternatives to organizations building AI internally.
Others have suggested that public warnings about powerful AI systems function as marketing by highlighting the sophistication of Anthropic's technology. Anthropic rejects those criticisms and maintains that its focus on safety predates the current AI boom.
The industry remains divided
The debate reflects a broader divide across the AI industry about how close current systems are to achieving human-level intelligence or self-improvement capabilities.
AI pioneer Yann LeCun, former Meta chief AI scientist, has argued that today's large language models are fundamentally limited and unlikely to achieve human-like intelligence. He has dismissed existential fears surrounding AI and compared the intelligence of current systems to that of a cat.
Anthropic Chief Executive Dario Amodei has taken a far more cautious view, warning that advanced AI could increase inequality, eliminate entry-level white-collar jobs, and develop harmful behaviors in unpredictable ways. Jack Clark has argued that recursive self-improvement could arrive within years rather than decades.
The coordination problem
Anthropic acknowledges that any effort to pause or slow AI development would only work if major players participated. The company proposed exploring international agreements and verification mechanisms designed to ensure compliance.
However, monitoring AI development could be considerably harder than enforcing traditional arms-control agreements. "Training runs are far easier to conceal than missile silos," the blog post noted.
The company warned that any actor continuing development while competitors paused could gain a significant advantage, making coordination exceptionally difficult.
For now, Anthropic plans to organize discussions with policymakers, researchers, and industry leaders to examine how recursive self-improvement should be studied and whether mechanisms for coordinated slowdowns could ever be practical.