I have been following the slow and quite interesting regulatory path taken by the U.S. government respecting the management of AI. That management has proceeded sometimes through regulation, but more often through acts of administrative direction under color of regulation, and almost always with reference to the rapidly evolving approach of the U.S. government (in parallel with similar developments by state organs in China, though not in Europe) to the architecture (including the regulatory architecture) of national security (see, e.g., HERE; generally The Conceptual Architecture of America First). Anthropic has been a major player in the formulation of regulatory policy and, though less successfully, in shaping the structures and guardrails within which administrative discretion may be exercised and regulatory objects (like Anthropic in some respects) protected against abuse of administrative discretion by public organs (in Anthropic's case, see, e.g., HERE), all without touching on the abusive exercise of decisional authority of their own, mostly in and through markets. But that is the American way, aligned with the core premises of a U.S. regulatory order that is driven by and through markets and structured around a foundational suspicion of government that tends to see in the state a necessity only in limited form (though Anthropic has been no slouch in advocating strong State measures when it suits them, see here). And that is the problem here: the fundamental open space for state action includes the protection of markets and, more broadly, protection against foreign interference. Reconciling the two is not easy, and the balance changes with circumstance and politics (see HERE, and HERE).
All of this now (again) comes to a head. Anthropic is vexed by a decision with significant market effects: the issuance, by organs of the U.S. government empowered to do so, of an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. In response, Anthropic issued, and circulated widely, its Statement on the US government directive to suspend access to Fable 5 and Mythos 5. This was the basis of substantial reporting by press and social media organs that form part of the communications neural network of state-market structural coupling. To their credit, some of these outlets noted the irony:
The timing is striking. Just two days before receiving the directive, Anthropic CEO Dario Amodei published an essay arguing that governments should have the power to block dangerous AI deployments. The essay compared frontier AI regulation to the Federal Aviation Administration’s aircraft testing standards. "Their release should be blocked or reversed as a threat to public safety if they do not meet high standards of safety," Amodei wrote. Technology influencer and Polyweb Founder Sara Tortoli (@sarainwondertech) captured the market's reaction in an Instagram post Thursday: "When you spend years describing your model as potentially civilization-ending, you should not be surprised when governments start treating your model like weapons." (Forbes)
The Anthropic Statement, which follows in full below, also noted the irony, and used it to refine the company's position: while markets are nonlinear, computational spaces where risk is the lubricant of innovation, the state must, in its role as protector of market spaces and defender of the nation, apply a two-dimensional, linear, and sequential framework to its engagement with market risk that may adversely impact the integrity of markets or of economic policy, now understood as a species of national security. "As we have stated publicly, we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles."
That is a very cool trick, which in the short term may advantage Anthropic, but which in the long term merely postpones the fundamental issue of aligning state and market measures being built around the care and protection, the development and use, of AI and tech-based innovation. To those ends some sort of public-private convergence is necessary, and will necessarily require a fundamental transformation of the state from its text-based, blockchain-type logic system (discussed HERE, and HERE by an AI agent) to the sort of computational spaces that are at the heart of emerging AI structures.
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Anthropic models will not be affected.
We received the directive from the government today at 5:21pm (ET). The letter did not provide specific details of its national security concern. Our understanding is that the government believes it has become aware of a method of bypassing, or “jailbreaking” Fable 5. We reviewed a demonstration of this specific technique being used to identify a small number of previously known, minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly-available models are able to discover them as well without requiring a bypass.
Anthropic’s posture with respect to Fable’s safeguards, as laid out in our launch blog post, is the following:
- We have instituted strong safeguards that greatly reduce the likelihood that Fable is misused for tasks related to cybersecurity (among others). In fact, our safeguards are so strong that many users have complained that they are overly broad.
- In the weeks leading up to the launch of Fable, Anthropic worked with the US government, the UK AISI, multiple private third-party organizations and internal teams to red-team Fable’s safeguards for thousands of hours in total.
- These tests showed that Fable’s safeguards are substantially more effective than those of any previously deployed model.
- No testers have yet been able to find a universal jailbreak—a jailbreak method that can very broadly bypass the model’s safeguards, unblocking a wide range of cyber capabilities.
- We suspect that perfect jailbreak resistance is not currently possible for any model provider. Every safeguard used in the industry is vulnerable to non-universal jailbreaks (which can elicit some cyber information in specific circumstances), and it is likely that universal jailbreaks will eventually be found in the future. We stated this clearly when we released Fable 5.
- Given that perfect jailbreak resistance does not appear to be possible today, Anthropic adopted a defense in depth strategy with Fable 5. We aimed to make jailbreaks either narrow (in the case of non-universal jailbreaks) or very expensive to produce (in the case of universal jailbreaks), and to combine this with thorough monitoring to quickly detect and shut down any successful attacks. This is also why Anthropic has required 30-day retention of customer data with Fable—a policy change that carries real costs for us with customers, but that allows us to research and mitigate jailbreaks.
- We stand by this defense in depth strategy. It reduces the risks posed by Fable, making them comparable to the risks of existing models already deployed across the industry.
- We have not even received a disclosure of a concerning non-universal potential jailbreak that led to a harmful result. The potential jailbreaks that have been disclosed to us are either entirely benign responses or are minor findings that provide no Mythos-specific uplift.
To date, the government has only given us verbal evidence of a potential narrow, non-universal jailbreak, which essentially consists of asking the model to read a specific codebase and fix any software flaws. Our understanding is that one potential jailbreak was shared with the government. We have reviewed a report that we believe is the basis of the government's directive and validated that the level of capability displayed there is widely available from other models (including OpenAI’s GPT-5.5), and is used every day by the defenders who keep systems safe. We will share more details over the next 24 hours.
We are complying with the government’s legal directive and are removing access to Fable 5 and Mythos 5 for all users. However, we disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people. If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.
As we have stated publicly, we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles.
We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible.
