Anthropic disables Claude Fable, a Mythos-class AI model

2026-06-13

A state switched off a frontier AI model by letter, three days after it launched. The instrument was export control. The precedent is process-free discretion, and it is not the first move in this fight.

Robert Nogacki · Kancelaria Prawna Skarbiec · 12 June 2026

On 12 June 2026, at 5:21 in the evening Washington time, the Commerce Department sent a letter, reportedly from Secretary Howard Lutnick to Anthropic’s chief executive, and a frontier AI model went dark. The directive required the company to suspend access to Fable 5 and its more capable sibling Mythos 5 for every foreign national, inside or outside the United States, the company’s own foreign-national engineers included. To comply, Anthropic disabled both models for all users; the rest of the Claude line was untouched. The models were three days old. Fable 5 had launched on 9 June.

The stated ground was national security. The specifics, by Anthropic’s own account, were never provided. An administration official later told Axios that the action followed an unnamed “other company” claiming it had found a way to jailbreak Mythos, and that access would remain restricted until the government “strengthens its national security systems,” possibly “within the next few weeks.” Hold that last phrase. A few weeks is the cadence of a software sprint, not of a genuine emergency of state.


What was actually found

A jailbreak is not one thing, and the distinction matters more than the word. The technique apparently shown to the government amounts to asking the model to read a codebase and fix its flaws. That is automated vulnerability discovery and remediation, dual-use by construction: the identical capability that lets an attacker find a hole is what lets a defender close it. Anthropic’s account is that the demonstration surfaced a small number of already-known, minor vulnerabilities, that the same findings are reproducible with other public models including OpenAI’s GPT-5.5, and that the underlying capability is relied upon every day by the engineers who keep systems standing. The comparison to GPT-5.5 is Anthropic’s own, not a rhetorical flourish added here.

Technical deep-dive: universal vs. non-universal jailbreaks, and why the difference is decisive

Anthropic’s safeguard design was explicitly a defense in depth. The aim was to force any jailbreak into one of two unattractive shapes: non-universal (narrow, eliciting some restricted capability only in specific, engineered circumstances) or universal but expensive (broadly bypassing the safeguards, yet costly enough to produce that it cannot be casually scaled). Layered on top sat continuous monitoring and a deliberate thirty-day data-retention policy, so that any successful attempt could be detected and shut down rather than merely prevented in the abstract.

The numbers were not trivial. By Anthropic’s launch documentation, the attack success rate for offensive cyber tasks fell from 56.6 percent on the prior public model to 5.4 percent on Fable 5, and an external bug bounty logged more than a thousand hours without producing a single universal jailbreak. The items actually disclosed were either benign outputs or minor findings offering no Mythos-specific uplift.
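To put the two quoted percentages on a common footing, the short Python sketch below computes the absolute and the relative drop in attack success rate. The two inputs are the figures cited above; the variable names are illustrative, and nothing here is drawn from Anthropic’s own tooling.

```python
# Attack success rates as quoted in the text (percent).
PRIOR_MODEL_ASR = 56.6  # prior public model
FABLE_5_ASR = 5.4       # Fable 5

# Absolute drop, in percentage points.
absolute_drop_points = PRIOR_MODEL_ASR - FABLE_5_ASR

# Relative reduction: the share of previously successful attacks now blocked.
relative_reduction_pct = 100.0 * absolute_drop_points / PRIOR_MODEL_ASR

print(f"Absolute drop: {absolute_drop_points:.1f} percentage points")
print(f"Relative reduction: {relative_reduction_pct:.1f} percent")
```

On these figures, roughly nine in ten of the attacks that succeeded against the prior model no longer succeed against Fable 5.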

The question a security regulator should be asking is therefore not “can the model do something dangerous” but “does this hand a capable adversary meaningful uplift beyond what is already freely available.” When the same output can be obtained from a competing commercial model, the capability is commoditised, and the marginal contribution of this particular vendor approaches zero.

This reframes the economics entirely. Disabling one vendor does not subtract a capability from the world; it redistributes the users to the vendor next door. That is substitution, not containment, a point export-control scholarship has made for decades about poorly calibrated restrictions that damage competitiveness without addressing the underlying risk. A control that changes the logo on the invoice while leaving the capability untouched is doing something, but it is not doing security.
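The substitution argument can be made concrete with a toy calculation. The sketch below is an illustration only: the vendor names and capability scores are hypothetical placeholders on an arbitrary scale, not measurements of any real model. It treats "marginal uplift" as whatever a vendor adds beyond the best alternative still on the market.

```python
from typing import Dict, FrozenSet


def marginal_uplift(vendor: str, capability: Dict[str, float]) -> float:
    """Capability the vendor adds beyond the best remaining alternative."""
    others = [score for name, score in capability.items() if name != vendor]
    best_alternative = max(others, default=0.0)
    return max(0.0, capability[vendor] - best_alternative)


def best_available(capability: Dict[str, float],
                   banned: FrozenSet[str] = frozenset()) -> float:
    """Highest capability an adversary can still reach after a ban."""
    remaining = [s for name, s in capability.items() if name not in banned]
    return max(remaining, default=0.0)


# Hypothetical scores: two commercial vendors at parity, one weaker model.
scores = {"vendor_a": 0.70, "vendor_b": 0.70, "open_model": 0.55}

uplift = marginal_uplift("vendor_a", scores)
before = best_available(scores)
after = best_available(scores, banned=frozenset({"vendor_a"}))

print(f"Marginal uplift of vendor_a: {uplift:.2f}")
print(f"Best available before ban: {before:.2f}, after ban: {after:.2f}")
```

When a competitor sits at parity, the banned vendor’s marginal uplift is zero and the adversary’s best available capability is the same before and after the ban. The control only bites when the target is strictly ahead of every alternative.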

The instrument

In the United States, export controls run through the Export Administration Regulations, administered by the Bureau of Industry and Security under the Export Control Reform Act of 2018, with emergency reach available through the International Emergency Economic Powers Act. They were built for tangible dual-use goods and munitions: centrifuges, machine tools, missile guidance. The extension to software was recent, brief, and, by June 2026, withdrawn, which is where the legal interest lies.

Technical deep-dive: the legal hook for model weights was built, then quietly removed

On 13 January 2025, BIS published a Framework for Artificial Intelligence Diffusion, creating a new control, ECCN 4E091, over the weights of the most advanced closed-weight models, those trained on more than ten-to-the-twenty-sixth computational operations. For the first time, an AI model’s weights were expressly an export-controlled item.
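As a rough illustration of how such a threshold sorts models, the sketch below checks an estimated training-compute figure against the ten-to-the-twenty-sixth line. Two things are assumptions added for illustration and are not part of the rule: the rule-of-thumb estimate of about six operations per parameter per training token, and the two example models, whose sizes are hypothetical.

```python
THRESHOLD_OPS = 1e26  # the threshold described in the text


def estimate_training_ops(parameters: float, training_tokens: float) -> float:
    """Rough estimate of training operations.

    Assumption: about 6 operations per parameter per token, a common
    rule of thumb. The regulation itself does not prescribe this formula.
    """
    return 6.0 * parameters * training_tokens


def weights_in_scope(training_ops: float, closed_weights: bool) -> bool:
    """True if the weights would fall under the control as described above:
    closed-weight models trained on more than the threshold."""
    return closed_weights and training_ops > THRESHOLD_OPS


# Hypothetical models, for illustration only.
large = estimate_training_ops(parameters=1e12, training_tokens=2e13)
small = estimate_training_ops(parameters=7e10, training_tokens=1.5e13)

print(f"large: {large:.2e} ops, in scope: {weights_in_scope(large, True)}")
print(f"small: {small:.2e} ops, in scope: {weights_in_scope(small, True)}")
```

The design point is that the test turned on how much compute went into training and on whether the weights were closed, not on what the finished model could be shown to do.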

That hook did not last. Beginning 13 May 2025, BIS under the current administration moved to rescind the AI Diffusion Rule and instructed enforcement officials not to apply it; ECCN 4E091 was effectively removed. So by 12 June 2026, the one regulation that had placed model weights squarely “inside the tent” was no longer operative, and the June directive did not cite it.

Which leaves the central legal question open: on what authority does the directive rest? Residual EAR provisions, a fresh IEEPA emergency order (whose procedural predicates are far thinner than notice-and-comment rulemaking), or something else entirely? The question cannot be answered, because the directive’s text has not been made public. The point is not a technicality. A state that suspends a product without naming the rule it is enforcing has, in substance, enforced no rule at all.

Technical deep-dive: the deemed-export logic, and how it locks out a company’s own engineers

Under the EAR, “any release in the United States of ‘technology’ or source code to a foreign person is a deemed export to the foreign person’s most recent country of citizenship or permanent residency.” No border crossing is required. The rule was designed for the laboratory and the engineering bay: show a controlled schematic to a visiting researcher and you have, in law, exported it.
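The quoted rule is mechanical enough to be written down as a small decision function. The sketch below is a simplified model for illustration, not legal advice and not a compliance tool: the `Person` type, its fields, and the example engineers are hypothetical, and the question of who counts as a "foreign person" is collapsed into a single flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Person:
    name: str
    is_foreign_person: bool  # simplification: the legal definition is richer
    # Most recent country of citizenship or permanent residency.
    most_recent_country: Optional[str] = None


def deemed_export_destination(person: Person,
                              released_in_us: bool) -> Optional[str]:
    """Return the country a release is deemed exported to, or None.

    Models only the quoted sentence: a release inside the United States
    to a foreign person is treated as an export to that person's most
    recent country of citizenship or permanent residency.
    """
    if not released_in_us:
        return None  # an ordinary export question, outside this sketch
    if not person.is_foreign_person:
        return None  # no deemed export
    return person.most_recent_country


# Hypothetical engineers working in the same office.
colleague_a = Person("Engineer A", is_foreign_person=False)
colleague_b = Person("Engineer B", is_foreign_person=True,
                     most_recent_country="Poland")

for p in (colleague_a, colleague_b):
    destination = deemed_export_destination(p, released_in_us=True)
    print(f"{p.name}: deemed export to {destination}")
```

Nothing in the function looks at what was released or what it can do. The outcome turns entirely on the recipient’s status.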

The directive’s structure tracks that logic. It does not seize the model weights; it bars persons. The line it draws is nationality, not capability. Commentators have already flagged the novelty of applying that reasoning to a deployed consumer product accessed through an API, rather than to controlled technology reviewed in a lab.

The consequence is the part worth pausing on. A company can be ordered to lock its own engineers out of the product they built, on the territory where they built it, because of where those engineers were born. Export control, conceived as a rule about goods crossing borders, here operates as a rule about people who never moved.

The asymmetry

Maximum instrument, minimal disclosed evidence, no published process. Export control is among the most coercive economic tools a state possesses, carrying licensing requirements for export, re-export, and even domestic transfer, backed by civil and criminal penalties. Here, on the public record, it appears to have rested on a verbal description of a narrow, undisclosed flaw.

That is not a safety regime. It is discretion wearing a lab coat.

There is a darker irony folded inside it. Anthropic built thirty-day retention and active monitoring precisely so that misuse could be caught and shut down. That policy overrode existing zero-retention enterprise agreements with no opt-out, and the company absorbed a real commercial cost to make oversight possible. The order did not work through that apparatus; it reached for export-control power and suspended the model wholesale. The firm that invested most in being governable was governed least carefully.

The pattern

One directive might be an aberration. This one is not isolated, and that is what should worry the reader. The June shutdown is the latest move in a documented and escalating conflict between Anthropic and the current administration.

On 27 February 2026, the President directed federal agencies to cease using Anthropic’s technology and ordered a six-month phaseout. In March, the Defense Secretary designated Anthropic a “Supply-Chain Risk to National Security,” barring military contractors from dealing with the company, reportedly the first such public designation of a U.S. firm. A district court blocked that designation and halted the federal-use ban on 25 March. On 8 April, the D.C. Circuit declined to stay the designation pending appeal, leaving it partially operative.

Read against that record, “national security” begins to look less like a finding and more like an instrument of first resort. The fairness point cuts both ways, and intellectual honesty requires conceding it: a government locked in genuine, adversarial litigation with a vendor has real reasons for friction, and not every one of them is pretextual. But the concession sharpens rather than blunts the concern. Coercive action taken without a disclosed standard does not become reassuring by being repeated. It becomes more alarming, because repetition suggests the discretion is becoming a habit.

The precedent

A template, then. If an unspecified national-security concern, plus an undisclosed demonstration from an unnamed third party, can recall a deployed model overnight, then every frontier system sits, in principle, one anonymous slide deck away from suspension. Anthropic itself asked that any such intervention be “transparent, fair, clear, and grounded in technical facts.” Four conditions named because, on this occasion, none was visibly met. The practical precedent is set, and the licensing architecture now hanging over the company, covering export, re-export, and domestic transfer alike, is its enforcement scaffolding.

Mood, not law

Capital and talent read these signals faster than regulators draft them. The lesson the market takes from 12 June is not that American AI is safer. It is that a deployment decision can be reversed by an opaque national-security move delivered after hours, on an authority the government declines to name, against a flaw it declines to specify, with restoration pencilled in for “the next few weeks” once internal systems are tidied. That reads as mood, not law, and mood, unlike law, cannot be appealed.

The deeper point survives the rhetoric. A security policy that cannot state its reason, cannot show its process, cannot name the rule it enforces, and cannot demonstrate that it subtracts any capability from an adversary has not protected anyone. It has only discovered how cheaply it can act.

Part Two: A primer for the non-specialist

Frontier models, in one breath

A “frontier model” is whatever sits at the present outer edge of general-purpose artificial intelligence: the largest, most capable systems, trained on vast quantities of text and an almost vulgar amount of computing power. The previous generation of software was a drawer of single-purpose tools, a calculator here, a spell-checker there. A frontier model is closer to a fast, widely read generalist to whom you can hand almost any task: write this, debug that, read this contract, distil that dataset. Its defining trait is that the capability is not coded in feature by feature. It emerges from scale, which means even the builders are occasionally surprised by what the thing can do. Hold that detail. It is the source of both the promise and the unease, and it explains why governments have started to treat these systems the way they once treated enriched uranium.

The Anthropic / government clash, with the slogans removed

Strip away the headlines and the dispute is old and simple: who decides how a powerful tool may be used, the company that built it or the state that hosts it. Anthropic built its public identity on guardrails, declining certain military and high-risk uses and advertising restraint as a feature. The administration, by temperament and mandate, wanted fewer locks and more direct access, particularly for defence and security. This is not speculation. In February 2026 the President ordered federal agencies to stop using Anthropic’s products over a six-month phaseout; the Pentagon then branded the company a supply-chain risk; and the courts intervened, blocking the measure before an appeals panel let part of it stand. The June export action is the newest chapter of that book, not a stand-alone surprise. The useful way to hold it is structural, not partisan: a safety-first vendor and a capability-hungry state were always going to collide over the same question, and export control simply turned out to be the nearest lever.

What Fable is

Fable 5 is the publicly available, deliberately restrained version of Anthropic’s most capable internal system, the Mythos-class model. The two “use the same underlying model” and differ in how safeguards are applied and who may access each. Same engine, extra locks. It is built less like a chatbot than like an agent: it plans, executes, and iterates across multi-step work such as large software projects, long technical analysis, and sophisticated reasoning. And it is de-fanged precisely where the danger lives, refusing high-risk requests or routing them down to a weaker model. The short version for an outsider: Fable is the powerful model made safe enough for the public, which is exactly why a claim that its safety could be bypassed was always going to be politically combustible.

How a “jailbreak” actually works

A jailbreak is social engineering aimed at software. A model is trained to be helpful, coherent, and sensitive to context; a jailbreak weaponises those very virtues, dressing a forbidden request in a context the model misreads as legitimate: a role to play, a hypothetical to entertain, a long and innocent-seeming path that arrives somewhere it should not. No recipe is needed here, and none is given: the conceptual point is enough. The flexibility that makes these systems useful is the same surface an attacker pushes against.

The under-reported truth is that most jailbreaks are narrow and brittle. They work in one phrasing and fail in the next, they target a specific gap rather than the whole edifice, and they are patched once seen. A universal jailbreak, one that broadly and durably unlocks a model, is the genuinely serious object, and on the public account no one has produced one against Fable across more than a thousand hours of bounty testing. The distinction is not pedantry. It is the difference between a picked lock and a master key, and the June directive, by all available signs, concerned a picked lock.

Why most of the alarm is misplaced

Three risks are real and deserve adult attention. Capable models lower the cost of cybercrime, drafting more convincing phishing messages and basic attack scaffolding at scale. They make disinformation and harassment cheaper and more industrial. And the concentration of the most powerful systems in a few firms and a few governments is a genuine question of power, not science fiction.

Three fears are inflated. The first is the rogue-AI apocalypse: today’s frontier models are formidable prediction engines, not willful agents with hands on the power grid. The second is the “instant cyber-weapon for anyone”: offensive capability is throttled, logged, monitored, and, crucially, largely obtainable elsewhere already, so the marginal teenager is not thereby promoted to a nation-state threat. The third is the faith that a ban settles anything. It does not. It relocates the user; it does not delete the capability.

Which deposits us back at the door where the essay began. A control that cannot subtract a capability from the world has not made anyone safer. It has only chosen a target, and then mistaken the choosing for protection.