White House talks with Anthropic shift to setting AI security rules

The White House and Anthropic are working on a framework to assess AI security flaws, POLITICO exclusively reports.
Anthropic's AI models, Fable 5 and Mythos 5, face export controls due to security-flaw concerns.
The talks aim to set benchmarks for AI security risk assessment.

The White House and Anthropic are working on a framework that would assess the severity of security flaws in new AI models and guide potential government intervention, according to a senior White House official and an administration official familiar with the matter granted anonymity to discuss it with POLITICO.

The effort comes after the White House imposed export controls on Anthropic, which forced the company to suspend access for all users to Fable 5 and Mythos 5, its latest powerful AI models, over a perceived security flaw, known in the industry as a jailbreak.

Administration officials and Anthropic CEO Dario Amodei disagreed over the severity of the jailbreak, POLITICO previously reported, but the technology has outpaced the government's infrastructure for defining and assessing such disputes. POLITICO, like Business Insider, is part of the Axel Springer Global Reporters Network.

The attempt to create a standardized method for evaluating this incident and similar future ones underscores how the administration is racing to establish guardrails for powerful new models that some fear could, if left unchecked, threaten economic and national security.

The negotiations between Anthropic and the administration also reflect an understanding that no AI model can be completely immune to hacking, which was part of Anthropic's initial defense of its model, and that the government should lay out the rules by which companies measure security risks. Other leading AI companies and country leaders relayed that sentiment at G7 meetings earlier this week in France.

The discussions between the White House and Anthropic — led on the company's side by Sarah Heck, head of public policy, and Tom Brown, cofounder — are aimed at developing a common set of benchmarks that could be used to assess future jailbreaks, including the extent to which safeguards were bypassed, the capabilities exposed, and the practical consequences of the breach.

Anthropic and the White House did not immediately respond to a request for comment.

While the export controls on Anthropic have yet to be lifted, the shift toward a technical standards-setting exercise is a sign that negotiations are progressing. On Friday, talks had effectively collapsed after Anthropic rejected demands to de-deploy Fable, arguing the vulnerability was limited and did not amount to a meaningful security flaw.

The White House responded by imposing export controls that barred foreign users from accessing the model, forcing the company to pull it from the market.

Over the weekend, however, senior administration officials and Anthropic leaders held a series of lengthy calls involving Anthropic cofounder Tom Brown, Commerce Secretary Howard Lutnick, and National Cyber Director Sean Cairncross. Those conversations led to nearly a week of in-person meetings in Washington. Anthropic dispatched senior researchers and safeguards experts to the Commerce Department on Monday to patch things up with administration officials.

This story originally appeared on POLITICO and is courtesy of the Axel Springer Global Reporters Network, which harnesses the resources of the company's newsrooms to publish ambitious scoops, investigations, interviews, opinion pieces, and analysis. It allows journalists — including those from POLITICO, Business Insider, WELT, BILD, Onet, and Fakt — to collaborate on major stories for an international audience of hundreds of millions across platforms.

Read the original article on Business Insider
