AI-verified findings: drop false positives, see reasoning
Scanning catches a lot. Most of it's worth fixing — some of it isn't. Telling those apart used to be on you. Now there's a second pass that does it for you.
decoy-scan --verify runs every finding through a two-stage AI pipeline.
Haiku 4.5 triages each one as P0/P1/P2 in a single batched call, then
Sonnet 4.6 revalidates the P0/P1 set with full context — server config,
tool definitions, surrounding findings — and decides whether to keep
each one or drop it as a false positive. You get back a ranked list
with confidence scores and short reasoning per finding. False positives
collapse into an expandable group with the reason they were dropped.
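The two-stage flow described above can be sketched roughly as follows. This is an illustrative sketch only: the finding shape, helper names, and stubbed model calls are assumptions, not Decoy's actual internals.

```typescript
// Illustrative sketch of the two-stage verify pipeline.
// Types, names, and stubbed logic are assumptions for clarity.
type Priority = "P0" | "P1" | "P2";

interface Finding { id: string; rule: string; snippet: string }
interface Triaged extends Finding { priority: Priority }
interface Verified extends Triaged {
  kept: boolean;       // false => dropped as a false positive
  confidence: number;  // 0..1
  reasoning: string;   // short explanation shown per finding
}

// Stage 1: one cheap, batched call triages every finding at once.
// (In the real pipeline this is a single Haiku call; stubbed here.)
async function triage(findings: Finding[]): Promise<Triaged[]> {
  return findings.map(f => ({ ...f, priority: "P1" as Priority }));
}

// Stage 2: only the P0/P1 set gets the expensive full-context pass.
// (In the real pipeline this is Sonnet with server config, tool
// definitions, and surrounding findings in the prompt; stubbed here.)
async function revalidate(triaged: Triaged[], context: string): Promise<Verified[]> {
  const worthChecking = triaged.filter(f => f.priority !== "P2");
  return worthChecking.map(f => ({
    ...f, kept: true, confidence: 0.9, reasoning: "stub",
  }));
}
```

The point of the split is cost shaping: the batched cheap pass decides which findings deserve the per-finding expensive pass at all.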
How to try it
Run:

npx decoy-scan --verify

A free Decoy account is required (we need an email — the verify
pipeline calls Anthropic and we'd rather not pay anonymously into the
void). On first run the CLI prints a one-time claim URL —
app.decoy.run/d/<installId> — that links the local install to an
account in two clicks. After that, --verify just works.
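The claim step amounts to the CLI minting a random install ID and printing a URL built from it; opening that URL while signed in links the install to your account. A minimal sketch, assuming this shape (the function name and URL construction are illustrative, only the app.decoy.run/d/ path comes from the post):

```typescript
import { randomUUID } from "crypto";

// Illustrative sketch of minting the one-time claim URL.
// The server-side linking that happens when you open it is not shown.
function mintClaimUrl(): { installId: string; url: string } {
  const installId = randomUUID();
  return { installId, url: `https://app.decoy.run/d/${installId}` };
}
```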
The CLI prints raw and verified counts side by side. On the dashboard, the scan detail view shows a blurred verification panel right next to the findings list with a single "Verify all" button — one click reveals the verified set with reasoning expanded inline.
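The raw/verified split and the collapsed false-positive group boil down to a simple partition over the verified findings. A sketch under assumed names (the `VerifiedFinding` shape and `summarize` helper are hypothetical):

```typescript
// Hypothetical shape of a verified finding and the summary the
// CLI/dashboard could derive from it.
interface VerifiedFinding { id: string; kept: boolean; reasoning: string }

function summarize(findings: VerifiedFinding[]) {
  const kept = findings.filter(f => f.kept);
  const dropped = findings.filter(f => !f.kept);
  return {
    raw: findings.length,       // count before verification
    verified: kept.length,      // count after false positives drop out
    // Dropped findings collapse into one group, each with its reason.
    falsePositives: dropped.map(f => ({ id: f.id, reason: f.reasoning })),
  };
}
```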
What's included
- Free: 5 AI verifications per month, per account. Enough to vet the worst findings on most repos.
- Team: 30 verifications per seat per day. Full reasoning per finding, fingerprinting, and false-positive correlation across your install base.
- Business: 60 verifications per seat per day, plus the rest of Team's features.
Caps are fair-use guards (most users never hit them) sized to keep the unit economics honest — verify isn't unlimited because Anthropic API costs aren't.
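The cap numbers above imply a simple remaining-quota check: monthly per account on Free, daily per seat on Team and Business. A sketch with the post's numbers (the function and its enforcement details are illustrative assumptions):

```typescript
type Plan = "free" | "team" | "business";

// Illustrative quota check. Numbers come from the post; how usage is
// actually tracked and enforced is an assumption.
function remainingVerifications(
  plan: Plan,
  usedToday: number,
  usedThisMonth: number,
  seats = 1,
): number {
  // Free is 5/month per account; Team and Business are per seat per day.
  const cap: Record<Plan, number> = { free: 5, team: 30 * seats, business: 60 * seats };
  const used = plan === "free" ? usedThisMonth : usedToday;
  return Math.max(0, cap[plan] - used);
}
```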
Verification reuses what's already there: Haiku for cheap triage, Sonnet for the heavy lift, your existing token for auth. Total round-trip is typically 5–10 seconds for a scan with a dozen findings.
Why this exists
Regex-based scanners over-flag by design — they have to, because the cost of a missed real finding is much higher than the cost of a false positive. AI verification flips that ratio at the moment you're staring at the results, so the actionable list is shorter and trustworthy.