Anthropic said it received a government directive at 5:21 p.m. Eastern Time on June 12 requiring the suspension of Fable 5 and Mythos 5. According to the company, the order was issued under national security authorities and applies to any foreign national regardless of location, including foreign-national employees within Anthropic. To comply with the instruction, the company said it must immediately disable access to both models. It added that the measure does not affect availability of its other AI systems.
π Key Highlights
- US directive requires suspension of Fable 5 and Mythos 5
- Order applies to foreign nationals globally, including employees
- Anthropic says other models remain available to customers
- Company disputes rationale behind the government action
- Access removal takes effect to comply with legal requirements
The company stated that the government communication did not explain the specific security concern behind the action. Based on its understanding, authorities believe a method exists to bypass safeguards built into Fable 5. Anthropic said it reviewed a demonstration connected to the issue and concluded that it revealed only a limited number of previously identified vulnerabilities. The company further stated that those findings appeared straightforward and could also be identified by other publicly available models without requiring any safeguard bypass.
Anthropic pointed to extensive testing conducted before the release of Fable 5. The company said it worked with government organizations, third-party groups and internal teams to evaluate the modelβs protections over thousands of hours. According to Anthropic, those efforts indicated that the safeguards performed better than those of earlier deployed models. The company also said testers had not identified a broadly effective method capable of removing protections across a wide range of cybersecurity-related capabilities.
The company acknowledged that complete resistance to jailbreak attempts may not be achievable for any model provider. It said that industry safeguards can be vulnerable to narrower techniques that produce limited results under specific conditions. In response, Anthropic said it adopted a layered security approach designed to make bypass attempts either highly restricted in scope or costly to develop. The company also highlighted its requirement to retain customer data for 30 days, describing the policy as a tool that supports investigation and mitigation of jailbreak activity.
Anthropic said the government has so far provided only verbal evidence regarding a potential narrow jailbreak. The company stated that the reported technique involved asking the model to review a codebase and identify software flaws. After examining a report it believes informed the directive, Anthropic concluded that the demonstrated capability is already available through other models and routinely used for defensive security work. While complying with the order, the company said it believes the action reflects a misunderstanding and is working to restore access.
π What This Means (Our Analysis)
This development stands out because it places a commercially available AI system under an immediate access restriction tied to concerns about safeguard circumvention. The company's response highlights a tension between efforts to deploy advanced models and the challenge of evaluating security risks when vulnerabilities, even limited ones, can be discovered and discussed. The dispute centers not on whether safeguards can be bypassed, but on how significant a reported bypass must be before authorities intervene.
The statement also underscores the importance of clear standards for assessing AI safety measures. Anthropic argues that layered protections, monitoring practices and extensive testing should be considered alongside any reported weakness. Based on the information provided, the disagreement appears to focus on how evidence is evaluated and what threshold should trigger regulatory action, making the case notable for future discussions about oversight and deployment practices.
π Our Take: The outcome of this dispute may influence how future AI safety concerns are assessed and addressed.