Business Insider, 2 days ago
White House talks with Anthropic shift to setting AI security rules

The White House and Anthropic are developing a framework to assess AI security flaws after export controls were imposed on Anthropic's latest models due to a jailbreak vulnerability.

Intelligence Insights

The Big Picture
The White House and Anthropic are collaborating on a framework to standardize the assessment of security flaws in AI models, following a dispute over a jailbreak vulnerability in Anthropic's Fable 5 and Mythos 5 models that led to export controls. The talks aim to create benchmarks for evaluating the severity of jailbreaks, including the extent of safeguard bypass, capabilities exposed, and practical consequences. This effort reflects the administration's push to establish guardrails for powerful AI models amid concerns over economic and national security. The negotiations, led by Anthropic's Sarah Heck and Tom Brown, have progressed from near-collapse to in-person meetings in Washington, signaling a shift toward technical standards-setting. The framework would guide potential government intervention and help companies measure security risks, acknowledging that no AI model is completely immune to hacking.
Why It Matters
This shift from reactive export controls to proactive security benchmarks signals a new era of AI regulation where the government and industry co-define risk. If successful, it could create a standardized framework for evaluating AI vulnerabilities, impacting how all AI companies deploy models globally. The outcome will set a precedent for balancing innovation with national security, potentially shaping global AI governance.

Anthropic CEO Dario Amodei at a recent G7 lunch with world leaders and CEOs.

Anna Moneymaker/Getty Images

  • The White House and Anthropic are working on a framework to assess AI security flaws, POLITICO exclusively reports.
  • Anthropic's AI models, Fable 5 and Mythos 5, face export controls due to security-flaw concerns.
  • The talks aim to set benchmarks for AI security risk assessment.

The White House and Anthropic are working on a framework that would assess the severity of security flaws in new AI models and guide potential government intervention, according to a senior White House official and an administration official familiar with the matter granted anonymity to discuss it with POLITICO.

The effort comes after the White House imposed export controls on Anthropic, which forced the company to suspend access for all users to Fable 5 and Mythos 5, its latest powerful AI models, over a perceived security flaw, known in the industry as a jailbreak.

Administration officials and Anthropic CEO Dario Amodei disagreed over the severity of the jailbreak, POLITICO previously reported, but the technology has outpaced the government infrastructure to define and assess such disputes. POLITICO — like Business Insider — is part of the Axel Springer Global Reporters Network.

The attempt to create a standardized method to evaluate this and future such incidents underscores how the administration is racing to establish guardrails for new and powerful models that some fear can, if left unchecked, threaten economic and national security.

The negotiations between Anthropic and the administration also reflect an understanding that no AI model can be completely immune to hacking — part of Anthropic's initial defense of its model — and that the government should lay out the rules for companies to measure security risks by, a sentiment relayed by other leading AI companies and country leaders at G7 meetings earlier this week in France.

The discussions between the White House and Anthropic — led on the company's side by Sarah Heck, head of public policy, and Tom Brown, cofounder — are aimed at developing a common set of benchmarks that could be used to assess future jailbreaks, including the extent to which safeguards were bypassed, the capabilities exposed, and the practical consequences of the breach.

Anthropic and the White House did not immediately respond to a request for comment.

While the export controls on Anthropic have yet to be lifted, the shift toward a technical standards-setting exercise is a sign that negotiations are progressing. On Friday, talks had effectively collapsed after Anthropic rejected demands to de-deploy Fable, arguing the vulnerability was limited and did not amount to a meaningful security flaw.

The White House responded by imposing export controls that barred foreign users from accessing the model, forcing the company to pull it from the market.

Over the weekend, however, senior administration officials and Anthropic leaders held a series of lengthy calls whose participants included Anthropic cofounder Tom Brown, Commerce Secretary Howard Lutnick, and National Cyber Director Sean Cairncross. Those conversations led to nearly a week of in-person meetings in Washington. Anthropic dispatched senior researchers and safeguards experts to the Commerce Department on Monday to patch things up with administration officials.

This story originally appeared on POLITICO and is courtesy of the Axel Springer Global Reporters Network, which harnesses the resources of the company's newsrooms to publish ambitious scoops, investigations, interviews, opinion pieces, and analysis. It allows journalists — including those from POLITICO, Business Insider, WELT, BILD, Onet, and Fakt — to collaborate on major stories for an international audience of hundreds of millions across platforms.

