What We Found Auditing 50 AI-Built Applications

Sherlock Forensics audited 50 AI-built applications and found that 92% had at least one critical vulnerability. 78% stored secrets in plaintext, 54% had SQL injection, and 41% had exposed admin panels. Vibe-coded apps averaged 11 vulnerabilities compared to 4 in professionally developed apps using AI assistance. Quick audits from $1,500 CAD. Full assessments from $5,000 CAD.

50 Audits. One Pattern.

Over the past 12 months, Sherlock Forensics has conducted security audits on 50 applications built with AI coding tools. These range from solo founder MVPs built entirely with Cursor to enterprise applications where professional development teams used Copilot and Claude Code as assistants.

The data is composite. No individual client is identified. But the patterns are consistent enough that the aggregate numbers tell a clear story: AI-generated code has a security problem, and most teams do not know it exists until someone looks.

Here is what we found.

The Headline Numbers

Across all 50 engagements, these are the top-line findings:

  • 92% had at least one critical-severity vulnerability
  • 78% stored secrets in plaintext or committed them to git
  • 65% had no rate limiting on authentication endpoints
  • 54% had SQL injection in at least one query
  • 41% had exposed admin panels with no authentication
  • 38% used Math.random() for security-sensitive tokens

The median number of vulnerabilities per application was 7. The worst had 23. Only 4 of the 50 applications had zero critical findings, and all 4 were professionally developed with dedicated security review processes already in place.

Vulnerability Breakdown by Category

Vulnerability Category                 % of Apps   Avg Severity   OWASP Category
Secrets in plaintext or git history    78%         Critical       A05 - Security Misconfiguration
Missing rate limiting on auth          65%         High           A07 - Identification and Authentication Failures
SQL injection                          54%         Critical       A03 - Injection
Broken access control (IDOR)           49%         High           A01 - Broken Access Control
Exposed admin panels                   41%         Critical       A01 - Broken Access Control
Weak cryptographic randomness          38%         High           A02 - Cryptographic Failures
Missing CSRF protection                35%         Medium         A01 - Broken Access Control
Session tokens in localStorage         31%         Medium         A07 - Identification and Authentication Failures
Verbose error messages in production   27%         Low            A05 - Security Misconfiguration

Why Secrets Exposure Tops the List

78% is a staggering number. Nearly 4 out of 5 AI-built applications had API keys, database credentials or JWT secrets either hardcoded in source files or present in git history.

This happens because AI coding tools optimize for getting the feature working. When you ask Cursor or Claude Code to integrate Stripe, the fastest path is to drop the API key directly into the code. The AI does this because it produces a working result. It does not consider that the key will be committed to version control, pushed to GitHub and indexed by automated scanners within minutes.

In several audits, we found Stripe live keys, OpenAI API keys and Supabase service role keys sitting in client-side JavaScript. Not in environment variables. Not in a secrets manager. In the source code that ships to every user's browser.
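
The fix is mechanical: keep secrets in the environment (or a secrets manager) and fail fast at startup when one is missing. A minimal Node.js sketch; the variable name is illustrative:

```javascript
// ANTI-PATTERN: a hardcoded key ships in source, lands in git history,
// and -- if bundled client-side -- in every user's browser:
//   const stripe = require('stripe')('sk_live_...');   // never do this

// PATTERN: read secrets from the environment and fail loudly at startup.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// At startup, before serving any traffic:
// const stripe = require('stripe')(requireEnv('STRIPE_SECRET_KEY'));
```

Failing at startup beats a runtime `undefined` that surfaces later as a confusing API error. Add `.env` to `.gitignore`, and rotate any key that has ever been committed.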

The SQL Injection Problem

54% of applications had at least one injectable query. It is 2026, and SQL injection remains one of the most exploitable vulnerabilities in the stack. AI coding tools generate string-concatenated queries because that pattern appears frequently in their training data. The code works. It returns the right results. It also lets an attacker dump your entire database with a single crafted input.

The fix is parameterized queries, and every major database library supports them. But AI tools default to concatenation because it requires fewer lines of code and the training data contains millions of examples of it.
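
The difference is easy to see side by side. A sketch of the vulnerable pattern and its parameterized replacement (placeholder syntax varies by driver: `$1` for pg, `?` for mysql2 and sqlite):

```javascript
// VULNERABLE: string concatenation lets user input rewrite the query.
function findUserVulnerable(email) {
  return `SELECT * FROM users WHERE email = '${email}'`;
}

// One crafted input turns a single-row lookup into a full-table dump:
const attack = "' OR '1'='1";
const injected = findUserVulnerable(attack);
// injected: SELECT * FROM users WHERE email = '' OR '1'='1'

// SAFE: a parameterized query. The driver sends the value separately,
// so it can never be interpreted as SQL (pg client shown as an example):
// const { rows } = await pool.query(
//   'SELECT * FROM users WHERE email = $1',
//   [email]
// );
```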

Vibe-Coded vs. Professionally Developed

Of the 50 applications, 28 were vibe-coded (built entirely by non-developers using AI tools like Cursor, Bolt or Lovable) and 22 were professionally developed by engineering teams using AI as an assistant (Copilot, Claude Code, ChatGPT).

The difference was significant:

Metric                        Vibe-Coded (n=28)   Pro + AI (n=22)
Avg vulnerabilities per app   11                  4
% with critical findings      100%                73%
% with exposed secrets        93%                 59%
% with SQL injection          71%                 32%
Avg remediation time          5 days              2 days

Every single vibe-coded application had at least one critical vulnerability. Not most. All of them. The professional teams fared better because experienced developers recognize common security anti-patterns even when AI generates them. They know to use environment variables for secrets. They know to parameterize queries. They catch the obvious mistakes, even if they miss the subtle ones.

But 73% of professionally developed applications still had critical findings. AI assistance does not eliminate the need for a dedicated security audit. It just reduces the number of findings when the audit happens.

The Exposed Admin Panel Problem

41% of applications had admin panels accessible without authentication. The pattern is predictable: the founder asks the AI to build an admin dashboard. The AI creates it at /admin or /dashboard. It works perfectly. It shows user data, revenue numbers and system configuration. And it has no login requirement because the founder never asked for one and the AI did not add one unprompted.

An attacker does not need to find a vulnerability when the front door is open. A simple directory scan reveals these panels in seconds. We found admin interfaces exposing full user databases, Stripe subscription data and environment variables in production.
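
The remediation is a gate in front of every admin route, not just the index page. A minimal Express-style middleware sketch; the session shape and role name are assumptions to adapt to your auth system:

```javascript
// Deny by default: authentication first, then authorization.
function requireAdmin(req, res, next) {
  const user = req.session && req.session.user;
  if (!user) {
    res.statusCode = 401;              // not logged in at all
    return res.end('Authentication required');
  }
  if (user.role !== 'admin') {
    res.statusCode = 403;              // logged in, but not an admin
    return res.end('Forbidden');
  }
  next();                              // only admins reach the handler
}

// Mounted on the prefix so every /admin/* route is covered:
// app.use('/admin', requireAdmin);
```

Note the two distinct checks: the exposed panels above are failures of the first, and the IDOR findings (49%) are failures of the second.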

Math.random() in Security Contexts

38% of applications used JavaScript's Math.random() to generate session tokens, password reset links or API keys. Math.random() is not cryptographically secure. Its output is predictable. An attacker who observes a few values can reconstruct the internal state and predict every future output.

The correct function is crypto.getRandomValues() in the browser or crypto.randomBytes() in Node.js. AI tools know both exist. They choose Math.random() because it appears in more training examples and produces shorter code. The result looks random to a human reviewer. It is trivially predictable to an attacker.

What These Numbers Mean for You

If you built an application with AI assistance, the probability that it contains at least one critical vulnerability is over 90%. That is not speculation. It is what 50 audits showed.

The good news: every vulnerability we found was fixable. Most took hours, not weeks. The typical remediation cycle after a quick audit is 2 to 5 days. The cost of finding and fixing these issues before an attacker does is a fraction of the cost of a breach.

The IBM 2024 Cost of a Data Breach Report puts the global average breach at US$4.88 million. A quick audit from Sherlock Forensics costs $1,500 CAD. The math favors the audit every time.

Recommendations

Based on 50 engagements, here is what we recommend for any team shipping AI-generated code:

  1. Audit before launch. Not after. Not when you get users. Before the first deployment. A quick audit covers the critical vulnerability classes in 3 to 5 business days.
  2. Scan git history for secrets. It is not enough to remove secrets from the current codebase. If they were ever committed, they are in the history. Rotate every key that has touched version control.
  3. Parameterize every database query. No exceptions. If your code builds SQL strings with user input, it is injectable.
  4. Add authentication to every admin route. If it shows data that is not meant for every user, it needs a login check and an authorization check.
  5. Replace Math.random() in all security contexts. Session tokens, reset links, API keys and anything else that needs to be unpredictable.

If your application handles user data, processes payments or stores PII, a security audit is not optional. It is the minimum responsible step before production. Vibe-coded applications need it most, but even professionally developed AI-assisted apps benefit from an independent review.

FAQ

AI Code Audit Questions

What are the most common AI code vulnerabilities?
Based on 50 audits, the most common are: secrets in plaintext (78%), missing rate limiting on auth endpoints (65%), SQL injection (54%), broken access control (49%), exposed admin panels (41%) and weak cryptographic randomness (38%). These patterns are consistent across AI tools including Cursor, Copilot, Claude Code and ChatGPT.
How many vulnerabilities does a typical AI-built app have?
The median is 7 vulnerabilities per application. Vibe-coded apps average 11. Professionally developed apps using AI assistance average 4. 92% of all AI-built applications had at least one critical-severity finding.
Are vibe-coded apps more vulnerable than professionally developed ones?
Yes. Our data shows vibe-coded applications average 2.75 times as many vulnerabilities (11 versus 4). 100% of vibe-coded apps had critical findings compared to 73% of professionally developed apps. Professional developers catch common anti-patterns that non-developers miss, but both groups benefit from independent security review.