OpenAI is giving select cybersecurity teams access to a less restricted version of GPT-5.5 designed to hunt for software vulnerabilities, the company announced Thursday. The move reflects a critical moment in AI development: powerful language models can now find and exploit bugs almost as effectively as human hackers.
The release puts OpenAI squarely in competition with Anthropic, whose Mythos model has dominated early testing for cybersecurity applications. Recent benchmarks show the two models performing at nearly identical levels, with Mythos holding only a marginal edge in some tests. Both have accomplished what no AI system has done before: autonomously executing multi-step corporate cyberattacks in controlled conditions.
Called GPT-5.5-Cyber, the new tool will be available only to organizations that clear OpenAI's vetting process for the highest tier of its Trusted Access for Cyber program. Approved defenders will get a model with significantly loosened guardrails compared to the public version. They can use it to write proof-of-concept exploits for newly discovered bugs, simulate attacks against their own systems, and reverse engineer malicious code. The system still blocks certain dangerous functions like credential theft and malware generation.
OpenAI is also releasing a second variant of GPT-5.5 for broader access within its cyber program. This version retains stricter safeguards but helps defenders understand unfamiliar code, identify affected systems, and review security patches.
The capabilities have sparked intense debate in Washington and Silicon Valley about risk management. The U.K. AI Security Institute recently tested both models on a simulated 32-step corporate attack chain. GPT-5.5 succeeded in 2 out of 10 runs. Mythos managed 3 out of 10. Before either model existed, no AI system had ever completed the test.
Anthropic and OpenAI are taking divergent paths to the same goal: giving defenders powerful tools while keeping them from malicious actors. Anthropic has restricted Mythos access to roughly 40 vetted organizations, many of which participate in Project Glasswing, a coordination effort where members share findings about the model's security implications. OpenAI is pursuing a more permissive strategy, distributing public versions with strong safeguards while creating parallel tracks for approved users with weaker restrictions.
The White House is monitoring these rollouts closely. Federal officials are drafting executive actions that could reshape how the government oversees future AI model releases, particularly those with national security implications.
"This is a high-wire act: OpenAI needs to move fast enough to help defenders keep pace with threats, but the margin for error is razor thin when you're handing out tools this powerful," author James Rodriguez said.