Anthropic, the artificial intelligence company built on warnings about AI risk, is now documenting what it calls early evidence of AI systems accelerating their own development. The stakes implicit in that claim: machines that improve themselves without human intervention.
Jack Clark, co-founder of Anthropic and head of the newly launched Anthropic Institute, made a bold prediction this week. By the end of 2028, he said, there's a 60% chance or better that an AI model will be able to autonomously train a better version of itself. "You'd be able to say to it: 'Make a better version of yourself.' And it just goes off and does that completely autonomously," Clark explained from Anthropic headquarters in San Francisco.
The shift matters because it marks a conceptual border. For decades, humans have fed ideas into AI systems to make them smarter. The new scenario flips that: machines generating their own ideas for self-improvement, a process researchers call recursive self-improvement.
A five-page research agenda published by the institute this week lays out the territory. Anthropic researchers are tracking what AI safety theorists have long called an "intelligence explosion": a moment when AI systems begin improving so rapidly that their trajectory becomes unpredictable and potentially unstoppable. Clark defined it plainly: when machines suddenly start advancing at blinding speed.
The document doesn't shy from the consequences. An intelligence explosion could enable cyberattacks or biological threats. But it could also unlock scientific breakthroughs at a scale currently impossible. Clark posed the paradox: if AI generates tremendous abundance across many scientific fields at once, how do institutions designed around scarcity adapt? Today's drug development pipelines are narrow and slow. How would they expand to handle an avalanche of new candidates?
A Research Agenda for the Unknown
The Anthropic Institute functions as both research arm and early-warning system, operating alongside Anthropic's Long-Term Benefit Trust. The institute's four-part agenda tackles economic disruption, security threats, AI agents in real systems, and recursive self-improvement. On that last point, Anthropic committed to publishing detailed reports on how its own tools have sped up internal research and what recursive self-improvement might mean.
Translation: A frontier AI lab is committing, on the record, to tell the public when its machines start building themselves.
The agenda also proposes conducting "fire drills" for an intelligence explosion: tabletop exercises that test decision-making by lab leaders, boards, and governments. That labs are designing drills for scenarios they consider plausible within five years signals genuine concern, not theoretical speculation.
Clark drew a parallel to Cold War infrastructure. The U.S. and Soviet Union established a hotline during that era specifically to manage crisis communication around existential risk. "Rival nations dealing with technology that has an existential impact on the human race found ways to talk to each other," Clark said. "And we are going to need to do the same here." The implication: international coordination mechanisms for AI may become necessary faster than policymakers anticipate.
The institute plans monthly reports tracking how AI reshapes employment, framed as early warning signals for disruption. It also floated a provocative idea: AI companies might partner with governments to adjust "dials" controlling how quickly AI diffuses sector by sector, much as central banks manage inflation. An AI lab openly proposing coordinated industrial policy over its own technology represents new territory.
"We are planning for success here," Clark said. "We're planning for a world where the technology gets as powerful as we think, and we deal with these issues of misuse or misalignment en route."
The timing carries undeniable strategic weight. Anthropic built its brand identity around responsible AI development. Launching a research institute anchored to safety concerns extends that positioning before the next major model release. Clark acknowledged the PR dimension but insisted on the deeper motive: "Tell the whole story. Sometimes that means talking about risks. Sometimes that means talking about amazing amounts of abundance."
Whether Anthropic's public commitment to transparency on recursive self-improvement reflects genuine conviction or calculated positioning matters less than what the commitment itself reveals: a frontier lab believes self-improving AI is no longer theoretical. It's near enough to document, discuss, and plan for operationally.
"When an AI company publicly announces it might need to coordinate with governments on its own technology's deployment, the conversation has moved from laboratories into the real world," writes author James Rodriguez.