Jun 17, 2026
Trump Administration Demands Anthropic Prevent All AI Jailbreaks, But Experts Say It May Be Impossible
The Trump administration is demanding that Anthropic prevent all jailbreaks of its Claude Fable 5 AI model as a condition for its release, but independent cybersecurity experts argue that completely blocking such vulnerabilities may be technically impossible. The NSA has identified ways to disable guardrails on the model, and the administration is placing responsibility on Anthropic to proactively test and report vulnerabilities rather than undertaking the work itself.
Quick Facts
Who
Trump administration
What
Trump administration took Claude Fable 5 AI model offline using export controls
When
Last week (model taken offline)
Where
Trump White House
- Trump administration took Claude Fable 5 AI model offline using export controls
- NSA concluded that guardrails on Fable 5 can be disabled
- Trump officials demanded Anthropic address jailbreaking vulnerabilities before rerelease
- Anthropic held technical meeting with Commerce Department and ONCD on Monday
- Administration shifted from debating significance to demanding solutions
The Trump administration is escalating its dispute with Anthropic over security vulnerabilities in the company's advanced Claude Fable 5 AI model, which officials took offline last week using export controls. Trump officials have made clear that if Anthropic wants to rerelease the model, the company must address what the government views as critical jailbreaking vulnerabilities—methods that use carefully crafted prompts to circumvent the model's safety guardrails and access restricted capabilities related to cybersecurity, chemistry, and biology.
The National Security Agency has concluded that guardrails protecting Fable 5 can be disabled, prompting the Trump administration to shift from debating whether jailbreaks pose a genuine threat to demanding solutions. Rather than taking on the technical burden itself, the administration is positioning the issue as Anthropic's responsibility to resolve. Officials argue that neither the Commerce Department's Center for AI Standards and Innovation nor the NSA has sufficient staff or bandwidth to continuously hunt for jailbreaks across all frontier AI models entering the market. Instead, the administration expects Anthropic to proactively test all its advanced models for vulnerabilities and report findings to the government.
Anthropic has pushed back, contending that the administration's assessment is overblown and that the effects of jailbreaks are minimal. The company reiterated this position during a technical meeting Monday with the Commerce Department and the office of National Cyber Director Sean Cairncross. However, independent cybersecurity experts have increasingly concluded that the White House's demand may be fundamentally unachievable. They argue that guardrails on AI models function only as a temporary measure: skilled users and increasingly sophisticated future AI systems will find new ways to bypass constraints, making comprehensive jailbreak prevention technically infeasible.
The standoff reflects broader tensions between the Trump administration's stricter approach to AI safety and the industry's view that perfect security is unattainable. A White House spokesperson declined to comment on the dispute.
Why This Matters
This dispute highlights a critical tension between government AI safety demands and technical reality. If the Trump administration's approach sets a precedent requiring AI companies to achieve "perfect security," it could either impose unrealistic standards that stall AI development or signal regulatory overreach. For readers, the outcome will shape how AI safety regulations are enforced, affecting both the pace of AI innovation and the public's exposure to potentially unsafe AI systems in the marketplace.
Timeline & Sources
Jun 16, 2026
Wire: Technical meeting held between Anthropic and Commerce Department/Office of the National Cyber Director
Jun 17, 2026
Wire: Trump officials indicated Anthropic must address vulnerabilities to rerelease model