
Why Your AI-Built App Gets an F (And How to Get an A)

March 30, 2026 · 6 min read

We launched VibeArmor with a traditional security scoring model. Within 48 hours, we realized it was completely broken.

Shopify scored an F. Stripe scored a D. Meanwhile, a todo app with zero authentication and an exposed database got a B+ because it had all its HTTP headers in place.

That is the fundamental problem with traditional security scanners: they measure hygiene instead of hackability.

The Hygiene Problem

Traditional scanners check for things like missing X-Frame-Options and X-Content-Type-Options headers, or whether your cookies have the Secure flag. These are real issues, but they are not how apps get hacked.

Here is what a typical hygiene-based scanner produces:

Stripe.com — "Missing X-Frame-Options, missing Permissions-Policy, server version disclosed." Grade: D.
Shopify.com — "Open CORS policy, missing CSP on some routes, cookie without SameSite." Grade: F.
Random todo app — "All headers present, HTTPS configured, cookies look good." Grade: B+.

That todo app had no authentication, no row-level security (RLS), and the Supabase service role key in the client bundle. But it had great HTTP headers, so it scored well.

This is not useful. It is actively misleading.

The Tier Pivot: Hackability Over Hygiene

We rebuilt our entire scoring system around one question: can someone actually hack this app?

Every check is now classified into one of three tiers:

Tier 1: Can Someone Steal Your Data? (36 checks)

These prove exploitability: exposed secrets, authentication bypass, SQL injection, cross-user data access. If you fail any Tier 1 check, your grade drops hard, because these are the findings that lead to data breaches, not theoretical risks.

A single exposed service role key is worth more than 50 missing HTTP headers.
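To make the exposed-key case concrete, here is a minimal sketch of how such a check could work. It assumes the key is a JWT-format Supabase key whose payload carries a `role` claim set to `service_role`. The function names are hypothetical, and this is not a description of VibeArmor's actual detection logic.

```python
import base64
import json
import re

# Matches strings shaped like a JWT: three base64url segments separated by dots.
JWT_PATTERN = re.compile(r"eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")

def jwt_role(token):
    """Return the `role` claim of a JWT payload, or None if it cannot be read.

    The signature is NOT verified; we only inspect the claims.
    """
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    try:
        payload = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return None
    return payload.get("role") if isinstance(payload, dict) else None

def find_service_role_keys(bundle_text):
    """Return every JWT in the client bundle whose role is `service_role`."""
    return [t for t in JWT_PATTERN.findall(bundle_text)
            if jwt_role(t) == "service_role"]
```

A token with the `anon` role would be ignored here, since that key is designed to ship to the browser. Only a `service_role` token in client-side code would count as a finding.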

Tier 2: Are Your Defenses Solid? (34 checks)

These are real security gaps that require specific conditions to exploit. HTTPS misconfigurations, missing Content-Security-Policy, no rate limiting on login, cookie security flags. They matter, but they do not prove someone can walk in and take data right now.

Tier 3: Informational (30 checks)

Best practices that are good to know but never affect your grade. Missing X-Frame-Options on a marketing page is not a vulnerability. Server version disclosure is trivia, not a hack. These checks are shown for completeness but carry zero weight.
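One way to picture the three tiers is as a lookup from check to tier, where only Tiers 1 and 2 feed the grade. The sketch below is illustrative. The check identifiers are hypothetical names for the examples mentioned above, not VibeArmor's real check IDs.

```python
from enum import IntEnum

class Tier(IntEnum):
    DATA_THEFT = 1      # Tier 1: proves exploitability
    DEFENSES = 2        # Tier 2: real gap, needs specific conditions to exploit
    INFORMATIONAL = 3   # Tier 3: shown in the report, zero weight

# Hypothetical check identifiers, named after the examples in the text.
CHECK_TIERS = {
    "exposed_secret": Tier.DATA_THEFT,
    "auth_bypass": Tier.DATA_THEFT,
    "sql_injection": Tier.DATA_THEFT,
    "cross_user_data_access": Tier.DATA_THEFT,
    "https_misconfiguration": Tier.DEFENSES,
    "missing_csp": Tier.DEFENSES,
    "no_login_rate_limit": Tier.DEFENSES,
    "insecure_cookie_flags": Tier.DEFENSES,
    "missing_x_frame_options": Tier.INFORMATIONAL,
    "server_version_disclosed": Tier.INFORMATIONAL,
}

def affects_grade(check_id: str) -> bool:
    """Only Tier 1 and Tier 2 checks can move the grade."""
    return CHECK_TIERS[check_id] != Tier.INFORMATIONAL

def graded_findings(failed_checks: list[str]) -> list[str]:
    """Drop informational findings and order the rest by tier, Tier 1 first."""
    graded = [c for c in failed_checks if affects_grade(c)]
    return sorted(graded, key=lambda c: CHECK_TIERS[c])
```

For example, `graded_findings(["missing_x_frame_options", "missing_csp", "exposed_secret"])` keeps only the exposed secret and the missing CSP, with the Tier 1 finding listed first.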

What Changed in Practice

With the new scoring model:

Stripe scores an A+ — because it has zero exploitable vulnerabilities, even if it is missing some informational headers.
Shopify scores an A — because its CORS is intentional (they run a public API) and nothing is actually exploitable.
That todo app scores an F — because an exposed service role key and no RLS means anyone can read every user's data.

This is the correct ranking. The scanner now agrees with what a penetration tester would tell you.

How the Scoring Works

The math is intentionally weighted toward hackability:

Tier 1 critical finding: -25 points. One exposed secret drops you from A to C.
Tier 1 high finding: -15 points.
Tier 1 medium finding: -5 points.
Tier 2 findings: -10 / -5 / -2 points depending on severity.
Tier 3 findings: 0 points. Always zero. They show up in your report but never hurt your grade.

There are also floors to prevent absurd results:

If you have zero Tier 1 findings, your score cannot drop below 75 (C+) no matter how many Tier 2 issues exist.
If you have zero Tier 1 and zero Tier 2 findings, your score cannot drop below 90 (A-).
The absolute floor is 40 (F) — we do not go lower.
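The penalties and floors above can be sketched as a small function. This is a sketch under stated assumptions, not the production scorer. It assumes every app starts at 100, that penalties simply add up, and that the three Tier 2 values correspond to critical, high, and medium severities in that order.

```python
# Penalties per finding, taken from the list above.
TIER1_PENALTY = {"critical": 25, "high": 15, "medium": 5}
# Assumption: the three Tier 2 values map to the same severity labels.
TIER2_PENALTY = {"critical": 10, "high": 5, "medium": 2}

ABSOLUTE_FLOOR = 40   # F
NO_TIER1_FLOOR = 75   # C+
CLEAN_FLOOR = 90      # A-, applies with no Tier 1 and no Tier 2 findings

def score(findings):
    """Score a scan. `findings` is an iterable of (tier, severity) tuples."""
    findings = list(findings)
    total = 100  # assumption: every app starts from a perfect score
    for tier, severity in findings:
        if tier == 1:
            total -= TIER1_PENALTY[severity]
        elif tier == 2:
            total -= TIER2_PENALTY[severity]
        # Tier 3 findings subtract nothing.

    has_tier1 = any(tier == 1 for tier, _ in findings)
    has_tier2 = any(tier == 2 for tier, _ in findings)
    if not has_tier1 and not has_tier2:
        # With Tier 3 worth 0 points this floor never binds in this sketch;
        # it is kept to mirror the rule as stated.
        floor = CLEAN_FLOOR
    elif not has_tier1:
        floor = NO_TIER1_FLOOR
    else:
        floor = ABSOLUTE_FLOOR
    return max(total, floor)
```

Under these assumptions, one Tier 1 critical finding gives 100 - 25 = 75. Ten Tier 2 critical findings would sum to 0, but the no-Tier-1 floor holds the score at 75. Five Tier 1 critical findings would go negative, and the absolute floor holds the score at 40.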

What This Means for You

If your app gets an F, do not panic. It means we found something that is actively exploitable — and every finding comes with a specific fix you can paste into Cursor.

The path from F to A is usually three to five fixes: move your secrets server-side, enable RLS, and add rate limiting to your login endpoint. Those alone will get most apps to a B or higher.
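As an illustration of the rate-limiting fix, here is a minimal sliding-window limiter for login attempts. The class name, the default of 5 attempts per 60 seconds, and keying by client IP are all assumptions made for this sketch. It keeps state in memory, so it only suits a single process. A real deployment would typically use a shared store instead.

```python
import time
from collections import defaultdict, deque

class LoginRateLimiter:
    """Allow at most `max_attempts` login attempts per client per window."""

    def __init__(self, max_attempts=5, window_seconds=60.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._attempts = defaultdict(deque)  # client key -> attempt timestamps

    def allow(self, client_key: str) -> bool:
        """Record an attempt and return True, or return False if over the limit."""
        now = self.clock()
        attempts = self._attempts[client_key]
        # Discard timestamps that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True
```

A login handler would call `limiter.allow(client_ip)` before checking credentials and reject the request (for example with HTTP 429) when it returns `False`.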

Stop worrying about HTTP headers. Start worrying about whether someone can read your users' data.

Scan your app free

Paste a URL, get a letter grade and Cursor-ready fixes in 3 minutes. No signup required.
