Suppress KeywordDetector false positives (0.5.1) #76
Open
debu-sinha wants to merge 2 commits into
Deeper pass over the top-50 MCP scan surfaced ~73 medium "Secret Keyword"
findings that were not secrets. KeywordDetector captures the value after a
password=/secret=/token= keyword, which is often a descriptive identifier, a
validation regex, a CI expression, or translated UI text.
Fixes (scoped to KeywordDetector so provider keys are never affected):
- Identifier values like SecurityApiKey, clientSecretBasic, security.apiKey,
CryptoService, password_auth. Values are split on separators and camelCase
boundaries: identifiers yield real words and are suppressed, while
high-entropy keys decompose into 2-char fragments and are still flagged.
- Validation regex values like AIza[0-9A-Za-z-_]{35} and sk-[a-zA-Z0-9]{20}.
- CI templating expressions like ${{ secrets.GITHUB_TOKEN }}.
- Localization files (locales/, i18n/, *.lang.ts) treated as low confidence.
Adds regression tests with guards proving real high-entropy custom secrets,
AWS keys, and private keys still flag. Bumps to 0.5.1.
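
The identifier check described above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the separator set, the camelCase regex, the function names, the two-token minimum, and the 3-character threshold standing in for "real words" are all assumptions, and the high-entropy sample value is made up.

```python
import re

# Assumed separator set; the PR does not list the exact characters.
_SEPARATORS = re.compile(r"[._\-/\s]+")
# Splits camelCase / PascalCase runs and digit runs into separate pieces.
_CAMEL = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def split_tokens(value: str) -> list[str]:
    """Split a captured value on separators, then on camelCase boundaries."""
    tokens: list[str] = []
    for part in _SEPARATORS.split(value):
        if not part:
            continue
        pieces = _CAMEL.findall(part)
        if "".join(pieces) == part:
            tokens.extend(pieces)
        else:
            # Characters outside [A-Za-z0-9] were present; keep the part whole
            # so it cannot be mistaken for a clean word.
            tokens.append(part)
    return tokens


def looks_like_identifier(value: str, min_len: int = 3) -> bool:
    """Hypothetical heuristic: at least two tokens, each alphabetic and word-sized.

    High-entropy strings tend to split into 1-2 character fragments and digit
    runs, so they fail this check and stay flagged.
    """
    tokens = split_tokens(value)
    return len(tokens) >= 2 and all(
        t.isalpha() and len(t) >= min_len for t in tokens
    )


if __name__ == "__main__":
    for sample in ("SecurityApiKey", "security.apiKey", "aB3xK9pQzR7mW2vL"):
        print(sample, split_tokens(sample), looks_like_identifier(sample))
```

A length threshold is a crude stand-in for a real word list; a dictionary lookup would be stricter but needs a word source, which this sketch deliberately avoids.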
0ded821 to
1e1e7ac
Dropped the dashboard regeneration from this PR: the weekly MCP scan updated the dashboard on main in the meantime, causing conflicts on generated files. The dashboard will refresh with these 0.5.1 fixes on the next weekly run. This PR is now code-only (the credential scanner fixes + tests + version bump).
Summary
Second false-positive pass over the top-50 MCP ecosystem scan, focused on the
credential scanner's detect-secrets KeywordDetector results. These produced ~73
medium "Secret Keyword" findings that were not secrets. Ships as 0.5.1.
Result on the same corpus: medium "Secret Keyword" 73 -> 22 (-70%). The
remaining matches are genuinely ambiguous (telemetry keys, datastore config,
config-schema prose) where further suppression would risk missing real secrets.
Changes
All fixes are scoped to KeywordDetector, so provider-key detectors (AWS, Private
Key, OpenAI, etc.) are never affected. Each has a regression test.
- Suppress descriptive identifier names such as SecurityApiKey,
clientSecretBasic, security.apiKey, CryptoService, password_auth. Detected by
splitting on separators and camelCase into real word tokens; high-entropy keys
decompose into two-char fragments and are kept.
- Suppress validation regex values (AIza[0-9A-Za-z-_]{35}, sk-[a-zA-Z0-9]{20})
and CI templating expressions (${{ secrets.GITHUB_TOKEN }}).
- Treat localization files (locales/, i18n/, *.lang.ts) as low-confidence
context.
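
For the regex, CI-expression, and localization cases, a rough sketch of what such checks might look like is given below. All function names and patterns here are hypothetical and chosen only to match the examples quoted in this PR; the actual rules in the scanner may differ.

```python
import re
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Assumed patterns, written to cover the examples in the PR description.
_CI_EXPRESSION = re.compile(r"^\$\{\{.*\}\}$")    # ${{ ... }}
_CHAR_CLASS = re.compile(r"\[[^\]]+\]")           # [0-9A-Za-z-_]
_QUANTIFIER = re.compile(r"\{\d+(?:,\d*)?\}")     # {35} or {20,40}
_LOCALE_DIRS = {"locales", "i18n"}
_LOCALE_GLOBS = ("*.lang.ts",)


def is_ci_expression(value: str) -> bool:
    """True when the whole value is a ${{ ... }} templating expression."""
    return bool(_CI_EXPRESSION.match(value.strip()))


def looks_like_validation_regex(value: str) -> bool:
    """True when the value contains both a character class and a {n} quantifier."""
    return bool(_CHAR_CLASS.search(value) and _QUANTIFIER.search(value))


def is_localization_path(path: str) -> bool:
    """True for files under locales/ or i18n/, or named *.lang.ts."""
    p = PurePosixPath(path)
    if _LOCALE_DIRS.intersection(p.parts[:-1]):
        return True
    return any(fnmatch(p.name, glob) for glob in _LOCALE_GLOBS)


if __name__ == "__main__":
    print(is_ci_expression("${{ secrets.GITHUB_TOKEN }}"))
    print(looks_like_validation_regex("sk-[a-zA-Z0-9]{20}"))
    print(is_localization_path("src/locales/de.json"))
```

Requiring both a character class and a counted quantifier keeps the regex check narrow, so a value that merely contains a bracket or brace is not suppressed by this check alone.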
Testing Performed
Local Unit
- pytest tests/ -q -> 658 passed, 2 skipped, 3 xfailed (5 new tests).
Guard tests confirm real high-entropy custom secrets, AWS keys, and private
keys still flag.
- ruff check src/ tests/ -> clean.
Read-only Smoke Test
- Remaining matches spot-checked as ambiguous or legitimate, not a clear FP.
Risk & Rollback
Low. KeywordDetector-scoped suppression only; provider detectors untouched.
Guard tests prevent over-suppression. Rollback is a branch revert.
Docs