Skip to content

Add --exclude glob flag to skill scan#17

Open
serval-frenchie wants to merge 1 commit into
NVIDIA:mainfrom
serval-frenchie:cli/exclude-glob-flag
Open

Add --exclude glob flag to skill scan#17
serval-frenchie wants to merge 1 commit into
NVIDIA:mainfrom
serval-frenchie:cli/exclude-glob-flag

Conversation

@serval-frenchie

Copy link
Copy Markdown

Summary

  • Adds a repeatable --exclude PATTERN option to skillspector scan
  • Patterns are applied during file discovery in build_context, so excluded files are absent from both the Components table and any analyzer findings
  • Uses fnmatch semantics against the path relative to the scan root

Problem

SkillSpector's static analyzers read every file with utf-8 + errors='replace' and run regex patterns against the result. When a skill bundle ships binary assets (PDFs, images), raw bytes can incidentally match patterns like shell=True, -rf /, or --no-verify, producing HIGH-severity false positives. There is currently no way to suppress these short of moving the file out of the skill folder, which is a non-starter for adoption in CI: every push surfaces a HIGH finding that has to be dismissed manually in the GitHub Security tab.

Solution

skillspector scan ./my-skill/ --exclude '*.pdf' --exclude 'assets/*'
  • Repeatable flag (List[str] via Typer), threaded into graph state as exclude_patterns
  • Filtering happens once, in _walk_skill_files, before components, file_cache, and component_metadata are built — so the rest of the pipeline is unchanged
  • Each exclusion is logged at DEBUG (visible with --verbose)
  • Excluding everything is valid: the scan succeeds with no findings

Tests

  • tests/nodes/test_build_context.py: exclusion filters components/file_cache/metadata, directory glob patterns, no-match leaves the list intact, exclude-everything yields empty state
  • tests/unit/test_cli.py: end-to-end CLI invocation with a fixture skill containing a PDF whose bytes match TM1 (Tool Parameter Abuse) — asserts the PDF is absent from components and produces no issues; also covers the repeatable form and the exclude-everything case
  • Existing suite passes (make test-unit)

Test plan

  • make test-unit is green
  • make lint and make format-check are clean
  • Manual: scan a skill containing a binary asset that previously triggered TM1; confirm the finding disappears with --exclude '*.pdf' and reappears without it

Adds a repeatable `--exclude PATTERN` option to `skillspector scan`.
Patterns use fnmatch semantics against the path relative to the scan
root and are applied during file discovery in build_context, so
excluded files are absent from both the components list and any
analyzer findings.

The motivating case is binary assets (e.g. a marketing-template PDF
in `assets/`) whose raw bytes happen to match static regex patterns
like `shell=True` or `-rf /`, producing HIGH-severity false positives
that block adoption in CI. With this flag, those files can be
filtered without moving them out of the skill bundle.

Excluded files are logged at DEBUG, surfaced via `--verbose`.
Excluding everything is valid; the scan succeeds with no findings.
@serval-frenchie serval-frenchie marked this pull request as ready for review June 6, 2026 02:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant