Anthropic unveiled Claude Opus 4.7 on Thursday, marketing the flagship model as a substantial leap forward with sharper vision and better code-writing abilities. Yet the company simultaneously acknowledged a glaring limitation: the new release cannot match the performance of Mythos, a far more powerful system sitting in Anthropic's vault.
Mythos remains locked away, available only to a carefully vetted circle of tech and cybersecurity firms due to safety concerns. That Anthropic would publicly highlight a model it's unwilling to release speaks to the tension between advancing AI capabilities and containing potential risks.
In benchmark comparisons, Opus 4.7 outperformed rivals including ChatGPT 5.4 and Google's Gemini 3.1 Pro. The gap between it and Mythos Preview, however, was unmistakable even in Anthropic's own published data.
The timing matters. For weeks before this announcement, users had complained bitterly that Opus 4.6 had degraded in performance. A senior director at AMD posted on GitHub that "Claude has regressed to the point it cannot be trusted to perform complex engineering," a message that rippled across developer communities.
Speculation swirled over whether Anthropic had deliberately throttled the model, or "nerfed" it in industry parlance. The theory: the company was either cutting costs or hoarding computational resources for Mythos development. Anthropic denied redirecting resources away from Claude to other projects, though it offered no comprehensive explanation for the earlier complaints.
The new Opus 4.7 attempts to reset expectations. Anthropic highlighted improvements in vision capabilities, allowing the model to process images at higher resolution, and touted advances in creative and professional tasks like interface design and document preparation. The company emphasized that users can now delegate complex coding work "with confidence" rather than requiring constant oversight.
Adding to its toolkit, Anthropic introduced a new reasoning tier called "xhigh," sitting between standard "high" and maximum effort levels. This gives developers finer control over the tradeoff between computation time and reasoning depth on difficult problems. The company is also testing "task budgets" to help manage how Claude allocates reasoning effort across longer sequences of work.
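The interplay between effort tiers and task budgets can be sketched in miniature. The tier names "high" and "xhigh" come from the announcement; the specific token budgets, the `TaskBudget` class, and the `allocate` method are illustrative assumptions, not Anthropic's actual API.

```python
# Illustrative sketch only: tier names follow the article, but the
# budget numbers and this interface are assumptions, not a real API.
from dataclasses import dataclass

# Hypothetical mapping of effort tiers to a reasoning-token budget,
# with "xhigh" sitting between "high" and the maximum tier.
EFFORT_BUDGETS = {
    "standard": 4_000,
    "high": 16_000,
    "xhigh": 32_000,
    "max": 64_000,
}

@dataclass
class TaskBudget:
    """Caps total reasoning tokens across a longer sequence of work."""
    total_tokens: int
    spent: int = 0

    def allocate(self, effort: str) -> int:
        """Grant up to the tier's budget, bounded by what remains."""
        want = EFFORT_BUDGETS[effort]
        grant = min(want, self.total_tokens - self.spent)
        self.spent += grant
        return grant

# A 50k-token task budget spread over several reasoning steps:
budget = TaskBudget(total_tokens=50_000)
print(budget.allocate("xhigh"))  # 32000 — full tier budget granted
print(budget.allocate("xhigh"))  # 18000 — capped by the task budget
print(budget.allocate("high"))   # 0 — budget exhausted
```

The point of the sketch is the two independent knobs: the per-call effort tier bounds how hard the model thinks on one problem, while the task budget bounds total effort across the whole job.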
Safety considerations underpin the release strategy. Anthropic said it will use Opus 4.7's real-world deployment to test guardrails aimed at preventing misuse for cybersecurity attacks. The company framed this as groundwork toward eventually releasing Mythos-class models more broadly, suggesting that each safeguard tested in the wild brings that moment closer.
The broader picture reveals the AI industry's central tension: capabilities racing ahead of safety infrastructure. Anthropic has a more powerful tool ready but cannot risk opening it to the general public. For now, it is asking developers and enterprises to work with Opus 4.7 while it figures out how to responsibly scale up to Mythos.
As writer James Rodriguez put it: "Anthropic's candor about Mythos being superior is refreshing, but it also underscores an uncomfortable truth: the safest AI systems are the ones that never see the light of day."