Anthropic Leaked Its Own Secret Model Through a Misconfigured CMS

You trust Anthropic with the future of AI because they did the work. They told the Pentagon no. They publish safety research before shipping products. Their entire origin story is "we left OpenAI because they weren't careful enough." If any lab was going to guard its own secrets with the same paranoia it applies to AI alignment — the science of keeping AI systems honest — you'd bet on this one.

Except AI safety culture and basic operational security turned out to be entirely different muscles. Last week, Anthropic proved it can't flex both at once.

3,000 Files, Zero Locks

On March 26, security researchers Roy Paz (LayerX Security) and Alexandre Pauwels (University of Cambridge) revealed that Anthropic's CMS — a content management system, basically the software that stores blog drafts and uploads — held roughly 3,000 unpublished assets in a publicly searchable data store with no authentication required. Blog drafts, images, PDFs, even employee parental leave documents. The CMS defaulted every upload to "public" unless someone manually switched it to private. The kind of configuration mistake you learn to avoid in your first week managing servers.
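The fix for a "public by default" setting is to fail closed: every upload starts private, and going public takes a deliberate step. Here is a minimal sketch of that idea in Python. Nothing in it reflects Anthropic's actual CMS. The `Asset` record, the `Visibility` enum, the `find_exposed` helper, and the sample paths are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    PRIVATE = "private"
    PUBLIC = "public"


@dataclass
class Asset:
    """Hypothetical record for one uploaded file in a CMS data store."""
    path: str
    published: bool = False
    # Fail closed: an upload stays private unless someone opts in.
    visibility: Visibility = Visibility.PRIVATE


def find_exposed(assets):
    """Return unpublished assets that are readable without authentication."""
    return [
        a for a in assets
        if a.visibility is Visibility.PUBLIC and not a.published
    ]


# Made-up sample data: one exposed draft, one legitimately public post,
# and one upload that kept the private default.
assets = [
    Asset("drafts/model-announcement.md", visibility=Visibility.PUBLIC),
    Asset("blog/launch-post.md", published=True, visibility=Visibility.PUBLIC),
    Asset("hr/leave-policy.pdf"),
]

for asset in find_exposed(assets):
    print(asset.path)
```

Run on a schedule, an audit like `find_exposed` is the kind of check that could flag a public draft before a reporter does. It only helps if someone actually reads the output.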

Buried in those drafts: detailed descriptions of Claude Mythos, codenamed Capybara internally. Not a minor upgrade — Anthropic's own unpublished copy called it "larger and more intelligent than our Opus models" and "currently far ahead of any other AI model in cyber capabilities."

The drafts described Mythos scoring "dramatically higher" than Claude Opus 4.6 on coding, academic reasoning, and cybersecurity benchmarks — standardized tests that measure how well a model performs specific tasks. More ominously, the draft warned that Mythos "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders." Anthropic confirmed the model exists, calling it "a step change" currently in testing with a small group of early-access customers.

The cost detail is revealing too: their own drafts admitted Mythos is "very expensive for us to serve." So the most dangerous model they've ever built is also the one they can barely afford to run.

A Journalist Fixed Their Security

Fortune contacted Anthropic on Thursday, March 26. Anthropic locked the data store after the call. Not before. Not because their monitoring caught it. Because a reporter told them.

Anthropic's spokesperson called it "an issue with one of our external CMS tools" and stressed the materials were "early drafts" that "did not involve our core infrastructure, AI systems, customer data, or security architecture." Technically true. Entirely beside the point. Nobody worried about customer data. They worried that the company building models capable of autonomous cyber offense can't lock a storage bucket — a cloud container where files live.

Also leaked: details of an invite-only two-day retreat for European CEOs at an 18th-century English countryside manor, with Dario Amodei offering private strategy briefings. The safety lab meets enterprise customers in manor houses now.

Wall Street Treated a Draft Like a Weapon

On March 27, cybersecurity stocks cratered. CrowdStrike dropped 7%. Palo Alto Networks fell 6-7%. Okta lost 7%. The iShares Cybersecurity ETF shed 4.5%. SentinelOne and Fortinet each dropped 3%.

Not because Mythos shipped. Not because anyone got hacked. Because the description of a sufficiently capable model is now a market event. Investors read Anthropic's own language — "far ahead of any other AI model in cyber capabilities" — and priced in the possibility that AI-powered offense could commoditize premium cybersecurity products. The model doesn't need to be deployed to move billions in market cap. It just needs to exist credibly.

And every competitor — OpenAI, Google, xAI — now knows exactly what Anthropic is building, roughly where it benchmarks, and approximately when it ships. That's competitive intelligence companies pay millions for, handed out free via an unchecked default setting.

Ops Discipline Beats Manifestos

Your AI provider's safety manifesto means nothing if their content team can misconfigure a storage bucket and leak the entire product roadmap. Anthropic publishes some of the best alignment research in the industry. They also left their crown jewel in a public directory because someone didn't toggle a checkbox.

Judge companies by their operational discipline, not their blog posts. In this case, the blog posts were the problem.

Anthropic must now release Mythos with its headline benchmark claims pre-spoiled, the cybersecurity industry bracing for impact, and the permanent irony of being the safety company that couldn't secure a CMS. They built the most capable model they've ever made. Then they demonstrated that the biggest risk wasn't the model. It was the humans around it.
