
Anthropic's Mythos Found 10,000 Vulnerabilities in a Month. Here's What That Means for Your Code.

Anthropic's unreleased Claude Mythos model found over 10,000 vulnerabilities across partners including Cloudflare (2,000 bugs), Mozilla (271 in Firefox) and Microsoft in just one month. Project Glasswing represents a 10x increase in bug-finding rate compared to traditional methods and prior Claude models.

The Numbers

Anthropic published results from Project Glasswing that should get the attention of every security team and every developer shipping AI-generated code. Their unreleased model, Claude Mythos, spent one month scanning code for partners including Cloudflare, Mozilla, Microsoft, Apple, Google and CrowdStrike. The results are staggering.

| Target | Findings | Detail |
|---|---|---|
| Overall | 10,000+ vulnerabilities | Across all Glasswing partners in one month |
| Cloudflare | 2,000 bugs total | 400 rated critical or high severity |
| Mozilla Firefox | 271 vulnerabilities | 10x improvement over prior Claude model |
| Microsoft | Patch releases growing | Microsoft says patches are "trending larger" due to Mythos findings |
| Open source (1,000 projects) | 23,019 total vulnerabilities | 6,202 rated high or critical (27%) |
| macOS | System breached | Mythos used its own capabilities to breach macOS |

What This Actually Means

The 10x multiplier is the number to focus on. Mozilla had already been scanning Firefox with an earlier version of Claude. Mythos found ten times more bugs in the same codebase. That is not a marginal improvement. That is a generational leap in automated vulnerability discovery.

Consider what happened with Cloudflare. This is a security company. Their entire business is protecting other people's infrastructure. Mythos found 400 critical and high-severity bugs in their code. If a security-focused engineering team is sitting on 400 serious vulnerabilities, the question every CTO should be asking is: what is hiding in code that has never been audited at all?

The open-source numbers are equally sobering. Across 1,000 projects, 27% of all vulnerabilities found were rated high or critical. These are the libraries and dependencies your applications rely on. If you are not tracking your software bill of materials (SBOM), you are flying blind on supply chain risk.
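
The severity shares quoted above can be rechecked from the raw counts. The following is a minimal sketch that only redoes the arithmetic on the figures reported in this article; the dictionary layout and the function name `severity_share` are illustrative choices, not anything published by Anthropic.

```python
# Counts copied from the results reported in this article.
findings = {
    "Cloudflare": {"total": 2000, "high_or_critical": 400},
    "Open source (1,000 projects)": {"total": 23019, "high_or_critical": 6202},
}


def severity_share(total: int, high_or_critical: int) -> float:
    """Return the percentage of findings rated high or critical."""
    return 100.0 * high_or_critical / total


# Percentage of serious findings per target, rounded to one decimal place.
shares = {name: round(severity_share(**row), 1) for name, row in findings.items()}

for name, share in shares.items():
    print(f"{name}: {share}% high or critical")
```

The open-source share works out to about 26.9%, which the article rounds to 27%. By the same calculation, Cloudflare's 400 serious bugs are 20% of its 2,000 total.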

Microsoft's larger patches are not evidence of worse code. They are evidence that bugs were always there and nobody could find them at scale. Mythos can. That should change how every organization thinks about the gap between "we've passed our pentest" and "our code is actually secure."

Why Mythos Is Not Public (And What That Tells You)

Anthropic has explicitly stated that safeguards are not strong enough to prevent misuse of Mythos. A model this good at finding vulnerabilities is inherently good at creating exploits. This is responsible disclosure applied at the AI model level.

That decision is revealing. Anthropic is telling you that the gap between "finding a vulnerability" and "weaponizing it" has narrowed to near-zero when AI is the operator. The same capability that protects Cloudflare's infrastructure could, in the wrong hands, generate working exploits at scale. The NIST AI Risk Management Framework anticipated dual-use concerns like this, but the speed of capability growth has outpaced the policy response.

What You Should Do Right Now

If you are using AI to write code with tools like Cursor, Copilot or Claude Code, the code being generated carries the same vulnerability patterns Mythos is finding. AI writes code fast. It does not audit its own output. The volume that makes AI coding productive is the same volume that makes manual review impossible.

We wrote about this exact problem in our analysis of 100 AI-built websites and in our AI code slop breakdown. The patterns are consistent: hardcoded secrets, missing input validation, broken authentication flows and vulnerable dependencies. These are the same classes of bugs Mythos is flagging in production code at Cloudflare and Microsoft.
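
To make the first of those patterns concrete, here is a minimal sketch of a hardcoded-secret check. It has nothing to do with how Mythos works. The three regular expressions and the function name `find_hardcoded_secrets` are assumptions chosen for illustration, and a rule set this small will produce both false positives and misses.

```python
import re

# Illustrative patterns only; a real scanner needs a far larger, tuned rule set.
SECRET_PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase alphanumerics.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM-encoded private key header.
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    # A secret-looking name assigned a quoted literal of 8+ characters.
    "assigned_secret": re.compile(
        r"(?i)(?:api_?key|secret|token|password)\w*\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a pattern."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, rule))
    return hits
```

Calling `find_hardcoded_secrets(open(path).read())` on each file in a repository gives a rough first pass. Values read from the environment, such as `os.environ["API_KEY"]`, are not flagged because no quoted literal is assigned.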

Here is what we recommend:

  1. Audit your AI-generated code before shipping. Use our AI code audit checklist as a starting point. If you are deploying code written by Cursor or Copilot without review, you are accumulating the same technical debt Mythos is now exposing in major enterprises.
  2. Do not wait for Mythos to go public. When it does become available, the bar for acceptable code quality will rise overnight. Organizations that have already been audited will be ahead. Those that have not will scramble.
  3. Track your open-source dependencies. The 27% critical/high rate across 1,000 open-source projects means your supply chain almost certainly contains unpatched vulnerabilities. Build and maintain an SBOM.
  4. Get a human-led audit now. Automated scanning finds patterns. Human auditors find logic flaws, business logic bypasses and architectural weaknesses that even Mythos will miss. The best security posture combines both.
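
For recommendation 3, a first step is simply knowing what an SBOM lists. The sketch below assumes an SBOM in the CycloneDX JSON layout, where a top-level `components` array holds entries with `name`, `version` and `purl` fields. The sample document and the helper names `list_components` and `unversioned` are hypothetical.

```python
import json


def list_components(sbom_json: str) -> list[dict]:
    """Extract name, version and purl for each component in a CycloneDX-style SBOM."""
    bom = json.loads(sbom_json)
    rows = []
    for comp in bom.get("components", []):
        rows.append({
            "name": comp.get("name", "<unnamed>"),
            "version": comp.get("version"),  # None when the SBOM omits it
            "purl": comp.get("purl"),
        })
    return rows


def unversioned(rows: list[dict]) -> list[str]:
    """Names of components with no recorded version, which cannot be matched to advisories."""
    return [row["name"] for row in rows if not row["version"]]


# Hypothetical sample data for illustration.
SAMPLE_SBOM = json.dumps({
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {"type": "library", "name": "requests", "version": "2.31.0",
         "purl": "pkg:pypi/requests@2.31.0"},
        {"type": "library", "name": "internal-utils"},
    ],
})

components = list_components(SAMPLE_SBOM)
print(unversioned(components))
```

An inventory like this does not find vulnerabilities by itself. It gives you the names and versions you need to look up in an advisory database, and it surfaces entries too incomplete to check.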

FAQ

Can I use Claude Mythos to scan my code?
Not yet. Mythos Preview is only available to select partners through Anthropic's Project Glasswing program. Anthropic has not announced a public release date. The model's vulnerability-finding capabilities are restricted because the same skills that find bugs can also be used to create exploits.

Does this mean AI-written code is less secure?
Not inherently. But AI generates code at volume without security review. The same vulnerability patterns Mythos finds in human-written code exist in AI-generated code too. The difference is speed: AI produces vulnerable code faster than teams can audit it.

Should I stop using AI coding tools?
No. But you should audit what they produce before deploying it, the same way you would review code from a junior developer. Tools like Cursor, Copilot and Claude Code write functional code quickly, but they do not perform security analysis on their own output.

We Audit AI-Generated Code

Our vibe coding security audit checks your Cursor, Copilot and Claude Code output for misconfigurations, exposed secrets and vulnerable dependencies. The same vulnerability patterns Mythos is finding in enterprise code exist in AI-generated projects. We find them before attackers do.
