50 Audits. One Pattern.
Over the past 12 months, Sherlock Forensics has conducted security audits on 50 applications built with AI coding tools. These range from solo founder MVPs built entirely with Cursor to enterprise applications where professional development teams used Copilot and Claude Code as assistants.
The data is reported in aggregate; no individual client is identified. But the patterns are consistent enough that the numbers tell a clear story: AI-generated code has a security problem, and most teams do not know it exists until someone looks.
Here is what we found.
The Headline Numbers
Across all 50 engagements, these are the top-line findings:
- 92% had at least one critical-severity vulnerability
- 78% stored secrets in plaintext or committed them to git
- 65% had no rate limiting on authentication endpoints
- 54% had SQL injection in at least one query
- 41% had exposed admin panels with no authentication
- 38% used Math.random() for security-sensitive tokens
The median number of vulnerabilities per application was 7. The worst had 23. Only 4 of the 50 applications had zero critical findings, and all 4 were professionally developed with dedicated security review processes already in place.
Vulnerability Breakdown by Category
| Vulnerability Category | % of Apps | Avg Severity | OWASP Category |
|---|---|---|---|
| Secrets in plaintext or git history | 78% | Critical | A07 - Identification and Authentication Failures |
| Missing rate limiting on auth | 65% | High | A07 - Identification and Authentication Failures |
| SQL injection | 54% | Critical | A03 - Injection |
| Broken access control (IDOR) | 49% | High | A01 - Broken Access Control |
| Exposed admin panels | 41% | Critical | A01 - Broken Access Control |
| Weak cryptographic randomness | 38% | High | A02 - Cryptographic Failures |
| Missing CSRF protection | 35% | Medium | A01 - Broken Access Control |
| Session tokens in localStorage | 31% | Medium | A07 - Identification and Authentication Failures |
| Verbose error messages in production | 27% | Low | A05 - Security Misconfiguration |
Why Secrets Exposure Tops the List
78% is a staggering number. Nearly 4 out of 5 AI-built applications had API keys, database credentials or JWT secrets either hardcoded in source files or present in git history.
This happens because AI coding tools optimize for getting the feature working. When you ask Cursor or Claude Code to integrate Stripe, the fastest path is to drop the API key directly into the code. The AI does this because it produces a working result. It does not consider that the key will be committed to version control, pushed to GitHub and indexed by automated scanners within minutes.
In several audits, we found Stripe live keys, OpenAI API keys and Supabase service role keys sitting in client-side JavaScript. Not in environment variables. Not in a secrets manager. In the source code that ships to every user's browser.
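The fix is a one-step habit: read every secret from the environment at startup and fail fast if it is missing. The sketch below shows the pattern; `STRIPE_SECRET_KEY` is an assumed variable name, so adapt it to your own configuration.

```javascript
// Sketch of the pattern the AI should have produced: secrets read from
// the environment, never hardcoded in source.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    // Fail fast at boot rather than discovering a missing key mid-request.
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage at application startup (the key lives in a gitignored .env file
// or a secrets manager, never in the repository):
// const stripeKey = requireEnv("STRIPE_SECRET_KEY");
```

Server-side environment variables never ship to the browser, which is the property the client-side keys we found were missing entirely.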
The SQL Injection Problem
54% of applications had at least one injectable query. It is 2026, and SQL injection remains the most exploitable vulnerability in the stack. AI coding tools generate string-concatenated queries because that pattern appears frequently in their training data. The code works. It returns the right results. It also lets an attacker dump your entire database with a single crafted input.
The fix is parameterized queries, and every major database library supports them. But AI tools default to concatenation because it requires fewer lines of code and the training data contains millions of examples of it.
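The contrast is easy to see side by side. This sketch uses node-postgres-style `$1` placeholders and a hypothetical `users` table; the point is that the parameterized form fixes the SQL text and sends values separately, so input can never rewrite the query.

```javascript
// The injectable pattern AI tools commonly emit: attacker input becomes
// part of the SQL text itself.
function unsafeQuery(email) {
  return `SELECT * FROM users WHERE email = '${email}'`;
}

// The parameterized form: the SQL text is fixed and the driver binds
// the values, so the input is always treated as data, never as SQL.
function safeQuery(email) {
  return { text: "SELECT * FROM users WHERE email = $1", values: [email] };
}

const payload = "' OR '1'='1";
// unsafeQuery(payload) yields:
//   SELECT * FROM users WHERE email = '' OR '1'='1'
// which matches every row in the table.
```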
Vibe-Coded vs. Professionally Developed
Of the 50 applications, 28 were vibe-coded (built entirely by non-developers using AI tools like Cursor, Bolt or Lovable) and 22 were professionally developed by engineering teams using AI as an assistant (Copilot, Claude Code, ChatGPT).
The difference was significant:
| Metric | Vibe-Coded (n=28) | Pro + AI (n=22) |
|---|---|---|
| Avg vulnerabilities per app | 11 | 4 |
| % with critical findings | 100% | 73% |
| % with exposed secrets | 93% | 59% |
| % with SQL injection | 71% | 32% |
| Avg remediation time | 5 days | 2 days |
Every single vibe-coded application had at least one critical vulnerability. Not most. All of them. The professional teams fared better because experienced developers recognize common security anti-patterns even when AI generates them. They know to use environment variables for secrets. They know to parameterize queries. They catch the obvious mistakes, even if they miss the subtle ones.
But 73% of professionally developed applications still had critical findings. AI assistance does not eliminate the need for a dedicated security audit. It just reduces the number of findings when the audit happens.
The Exposed Admin Panel Problem
41% of applications had admin panels accessible without authentication. The pattern is predictable: the founder asks the AI to build an admin dashboard. The AI creates it at /admin or /dashboard. It works perfectly. It shows user data, revenue numbers and system configuration. And it has no login requirement because the founder never asked for one and the AI did not add one unprompted.
An attacker does not need to find a vulnerability when the front door is open. A simple directory scan reveals these panels in seconds. We found admin interfaces exposing full user databases, Stripe subscription data and environment variables in production.
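The missing guard is a few lines of middleware. This is a minimal Express-style sketch; the `req.session.user` shape and the `"admin"` role name are assumptions about your session setup, not a prescription.

```javascript
// Auth guard for admin routes: checks authentication (are you logged in?)
// and authorization (are you allowed here?) as two separate steps.
function requireAdmin(req, res, next) {
  const user = req.session && req.session.user;
  if (!user) {
    return res.status(401).json({ error: "Authentication required" });
  }
  if (user.role !== "admin") {
    // Being logged in is not enough for /admin.
    return res.status(403).json({ error: "Admin access required" });
  }
  next();
}

// Mounted once, it protects every route under the prefix:
// app.use("/admin", requireAdmin);
```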
Math.random() in Security Contexts
38% of applications used JavaScript's Math.random() to generate session tokens, password reset links or API keys. Math.random() is not cryptographically secure. Its output is predictable. An attacker who observes a few values can reconstruct the internal state and predict every future output.
The correct function is crypto.getRandomValues() in the browser or crypto.randomBytes() in Node.js. AI tools know both exist. They choose Math.random() because it appears in more training examples and produces shorter code. The result looks random to a human reviewer. It is trivially predictable to an attacker.
What These Numbers Mean for You
If you built an application with AI assistance, the probability that it contains at least one critical vulnerability is over 90%. That is not an estimate. It is what the data shows.
The good news: every vulnerability we found was fixable. Most took hours, not weeks. The typical remediation cycle after a quick audit is 2 to 5 days. The cost of finding and fixing these issues before an attacker does is a fraction of the cost of a breach.
The IBM 2024 Cost of a Data Breach Report puts the average breach at $4.88 million. A quick audit from Sherlock Forensics costs $1,500 CAD. The math favors the audit every time.
Recommendations
Based on 50 engagements, here is what we recommend for any team shipping AI-generated code:
- Audit before launch. Not after. Not when you get users. Before the first deployment. A quick audit covers the critical vulnerability classes in 3 to 5 business days.
- Scan git history for secrets. It is not enough to remove secrets from the current codebase. If they were ever committed, they are in the history. Rotate every key that has touched version control.
- Parameterize every database query. No exceptions. If your code builds SQL strings with user input, it is injectable.
- Add authentication to every admin route. If it shows data that is not meant for every user, it needs a login check and an authorization check.
- Replace Math.random() in all security contexts. Session tokens, reset links, API keys and anything else that needs to be unpredictable.
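To make the git-history recommendation concrete, this is a rough sketch of the kind of pattern matching a history scan performs, the sort of thing you would feed the output of `git log -p`. The patterns are illustrative only; dedicated scanners such as gitleaks or trufflehog cover far more credential formats and should be preferred in practice.

```javascript
// Illustrative secret patterns; real scanners maintain hundreds of these.
const SECRET_PATTERNS = [
  /sk_live_[A-Za-z0-9]{16,}/g,              // Stripe live secret key
  /AKIA[0-9A-Z]{16}/g,                      // AWS access key ID
  /-----BEGIN (?:RSA )?PRIVATE KEY-----/g,  // PEM private key header
];

// Return every matched secret-looking string in a blob of text,
// e.g. a diff from `git log -p`.
function findSecrets(text) {
  return SECRET_PATTERNS.flatMap((re) => text.match(re) || []);
}
```

Remember that detection is only half the step: any key that ever appeared in history must be rotated, because removing it from the current tree does not remove it from every clone.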
If your application handles user data, processes payments or stores PII, a security audit is not optional. It is the minimum responsible step before production. Vibe-coded applications need it most, but even professionally developed AI-assisted apps benefit from an independent review.