A version of Mythos 5, an Artificial Intelligence model whose creator recently said was too risky to cybersecurity to be released to the public, has now been made available with additional “safeguards” attached.
Claude Fable 5 is a version of Anthropic's Claude Mythos, which was released for previewing to select number of clients in April with a warning from the publisher that it posed a risk to global cybersecurity due to its ability to uncover vulnerabilities in digital infrastructure.
The model successfully uncovered vulnerabilities that had remained undetected for between 16 and 27 years, including flaws in OpenBSD and FFmpeg, software that underpins substantial portions of global digital infrastructure. In some cases, the model generated working exploits end to end with minimal human prompting, including by engineers without formal cybersecurity training.
Anthropic published a detailed system card outlining the model’s evaluation, risk assessments and reasoning behind the restricted release. The company said that, in its judgement, Claude Mythos Preview exceeds internal thresholds for general release under its Responsible Scaling Policy, particularly in relation to autonomy and cybersecurity risk.
Anthropic said this week that the safeguards it has put in place “are now robust enough for a general release”.
“We’ve deliberately tuned the safeguards to be cautious, and they are still stricter than would be ideal,” it said. “We recognize that this will be frustrating to some users, and our aim is to reduce false positives as we update and refine the safeguards after launch.
Anthropic said that it had added a number of safety “classifiers”: separate AI systems that detect potential misuse and prevent the main model (in this case Fable 5) from responding. These, the company said, would prevent the model being used for cyberhacking as well as ben used for the development of bio-weapons and risky biological research. Attempts to use Fable 5 for these tasks will cause it to fall back to an older version - Opus 4.8 - on most requests related to biology and chemistry. The new system will also include safeguards against “distillation” of the model by authoritarian governments to train competing AI models.
Claude Fable 5 will be made generally available to the public but Claude Mythos 5 – which is the same model as Fable 5 without the safeguards installed - will be made available to those organisations that were included in the preview release but will also be gradually released to new partners (in consultation with the US government). Anthropic also said that it will be developing a “trusted access program” that allows cybersecurity organizations to apply for access in a “more systematic manner”.

