Anthropic's unreleased Claude Mythos model found over 10,000 vulnerabilities in one month across partners including Cloudflare (2,000 bugs), Mozilla (271 in Firefox) and Microsoft. On Firefox, Project Glasswing delivered a 10x increase in bug-finding rate over a prior Claude model.
The Numbers
Anthropic published results from Project Glasswing that should get the attention of every security team and every developer shipping AI-generated code. Their unreleased model, Claude Mythos, spent one month scanning code for partners including Cloudflare, Mozilla, Microsoft, Apple, Google and CrowdStrike. The results are staggering.
| Target | Findings | Detail |
|---|---|---|
| Overall | 10,000+ vulnerabilities | Across all Glasswing partners in one month |
| Cloudflare | 2,000 bugs total | 400 rated critical or high severity |
| Mozilla Firefox | 271 vulnerabilities | 10x improvement over prior Claude model |
| Microsoft | Patch releases growing | Microsoft says patches are "trending larger" due to Mythos findings |
| Open source (1,000 projects) | 23,019 total vulnerabilities | 6,202 rated high or critical (27%) |
| macOS | System breached | Mythos used its own capabilities to breach macOS |
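The severity shares in the table can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted above; the dictionary layout and the function name `severe_share` are our own choices for illustration, not anything published by Anthropic.

```python
# Counts copied from the table above: (total findings, high-or-critical findings).
FINDINGS = {
    "Cloudflare": (2_000, 400),
    "Open source (1,000 projects)": (23_019, 6_202),
}


def severe_share(total: int, severe: int) -> float:
    """Return the fraction of findings rated high or critical."""
    if total <= 0:
        raise ValueError("total must be positive")
    return severe / total


for target, (total, severe) in FINDINGS.items():
    print(f"{target}: {severe_share(total, severe):.1%} high or critical")
```

Run as written, this prints 20.0% for Cloudflare and 26.9% for the open-source set, which rounds to the 27% quoted in the table.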
What This Actually Means
The 10x multiplier is the number to focus on. Mozilla had already been scanning Firefox with an earlier version of Claude. Mythos found ten times more bugs in the same codebase. That is not a marginal improvement. That is a generational leap in automated vulnerability discovery.
Consider what happened with Cloudflare. This is a security company. Their entire business is protecting other people's infrastructure. Mythos found 400 critical and high-severity bugs in their code. If a security-focused engineering team is sitting on 400 serious vulnerabilities, the question every CTO should be asking is: what is hiding in code that has never been audited at all?
The open-source numbers are equally sobering. Across 1,000 projects, 6,202 of the 23,019 vulnerabilities found (27%) were rated high or critical. These are the libraries and dependencies your applications rely on. If you are not tracking your software bill of materials (SBOM), you are flying blind on supply chain risk.
Microsoft's larger patches are not evidence of worse code. They are evidence that bugs were always there and nobody could find them at scale. Mythos can. That should change how every organization thinks about the gap between "we've passed our pentest" and "our code is actually secure."
Why Mythos Is Not Public (And What That Tells You)
Anthropic has explicitly stated that safeguards are not strong enough to prevent misuse of Mythos, which is why the model remains unreleased. A model this good at finding vulnerabilities is inherently good at creating exploits. Withholding it is responsible disclosure applied at the AI model level.
That decision is revealing. Anthropic is telling you that the gap between "finding a vulnerability" and "weaponizing it" has narrowed to near-zero when AI is the operator. The same capability that protects Cloudflare's infrastructure could, in the wrong hands, generate working exploits at scale. The NIST AI Risk Management Framework anticipated dual-use concerns like this, but the speed of capability growth has outpaced the policy response.
What You Should Do Right Now
If you are using AI to write code with tools like Cursor, Copilot or Claude Code, the generated code can carry the same classes of vulnerability Mythos is finding. AI writes code fast, but it does not reliably audit its own output. The volume that makes AI coding productive is the same volume that makes line-by-line manual review impractical.
We wrote about this exact problem in our analysis of 100 AI-built websites and in our AI code slop breakdown. The patterns are consistent: hardcoded secrets, missing input validation, broken authentication flows and vulnerable dependencies. These are the same classes of bugs Mythos is flagging in production code at Cloudflare and Microsoft.
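To make the first of those patterns concrete, here is a minimal sketch of a hardcoded-secret check. The three regular expressions are illustrative assumptions, not a vetted rule set, and the names `SECRET_PATTERNS` and `find_hardcoded_secrets` are hypothetical. A production scanner would need far broader rules and a way to handle false positives.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    # A secret-looking name assigned a quoted literal of 8+ characters.
    "assigned_literal": re.compile(
        r"""(?:api[_-]?key|secret(?:[_-]?key)?|token|password)\b\s*[:=]\s*["'][^"']{8,}["']""",
        re.IGNORECASE,
    ),
}


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each suspicious line in source."""
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits


sample = (
    "import os\n"
    'API_KEY = "example-not-a-real-key"\n'
    'token = os.environ["TOKEN"]\n'
)
print(find_hardcoded_secrets(sample))  # [(2, 'assigned_literal')]
```

The sample flags the literal assigned on line 2 and leaves line 3 alone, because reading a value from the environment is the safer pattern. A check like this catches only the most obvious cases and complements review rather than replacing it.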
Here is what we recommend:
- Audit your AI-generated code before shipping. Use our AI code audit checklist as a starting point. If you are deploying code written by Cursor or Copilot without review, you are accumulating the same technical debt Mythos is now exposing in major enterprises.
- Do not wait for Mythos to go public. When it does become available, the bar for acceptable code quality will rise overnight. Organizations that have already been audited will be ahead. Those that have not will scramble.
- Track your open-source dependencies. The 27% critical/high rate across 1,000 open-source projects means your supply chain almost certainly contains unpatched vulnerabilities. Build and maintain an SBOM.
- Get a human-led audit now. Automated scanning finds patterns. Human auditors find logic flaws, business logic bypasses and architectural weaknesses that automated tools, Mythos included, can miss. The best security posture combines both.
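As a starting point for the dependency-tracking recommendation above, the sketch below lists the Python packages installed in the current environment using the standard library's `importlib.metadata`. It is a plain inventory, not a standards-compliant SBOM. The `components` layout and the function names are our own assumptions, and a real SBOM would normally follow a format such as CycloneDX or SPDX and cover every ecosystem you ship.

```python
import json
from importlib import metadata


def installed_components() -> list[dict[str, str]]:
    """List installed Python distributions as name/version records, sorted by name."""
    components = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.version
        if name and version:  # skip broken installs with missing metadata
            components[name.lower()] = {"name": name, "version": version}
    return [components[key] for key in sorted(components)]


def inventory_as_json() -> str:
    """Serialize the inventory; the 'components' key is this sketch's own layout."""
    return json.dumps({"components": installed_components()}, indent=2)


if __name__ == "__main__":
    print(inventory_as_json())
```

Committing the output alongside each release gives you something to diff when a new advisory lands, which is the habit an SBOM is meant to build.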
FAQ
Can I use Claude Mythos to scan my code?
Does this mean AI-written code is less secure?
Should I stop using AI coding tools?
We Audit AI-Generated Code
Our vibe coding security audit checks your Cursor, Copilot and Claude Code output for misconfigurations, exposed secrets and vulnerable dependencies. The same vulnerability patterns Mythos is finding in enterprise code exist in AI-generated projects. We find them before attackers do.