Skip to content

feat: add Rulehound detection rule import integration (refs #38)#42

Draft
JJcyborg wants to merge 2 commits into
mainfrom
feat/issue-38-rulehound-rules
Draft

feat: add Rulehound detection rule import integration (refs #38)#42
JJcyborg wants to merge 2 commits into
mainfrom
feat/issue-38-rulehound-rules

Conversation

@JJcyborg

@JJcyborg JJcyborg commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator

Adds a script to import and convert Rulehound (Sigma) detection rules into Falco-compatible format:

  • scripts/import_rulehound_rules.py — fetches Sigma Linux rules from SigmaHQ, converts to Falco YAML with proper event-type and MITRE tag mapping
  • tests/test_rulehound_import.py — 33 tests covering conversion, field extraction, and output validation
  • docs/rulehound.md — documentation on usage, category mapping, and integration steps

Closes #38

@JJcyborg

JJcyborg commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator Author

PR Review: Rulehound Detection Rule Import Integration

Reviewing the diff for code correctness, error handling, test coverage, documentation quality, style consistency, and security concerns.


🔴 Critical Issue: Falco condition values are unquoted

In scripts/import_rulehound_rules.py the condition builder generates unquoted values for proc.name comparisons:

# Current (broken):
condition_parts.append(f"proc.name = {procs[0]}")           # → proc.name = rm
condition_parts.append(f"proc.name in ({procs_str})")       # → proc.name in (rm, shred)

# Required (Falco needs quoted strings):
condition_parts.append(f"proc.name = \"{procs[0]}\"")       # → proc.name = "rm"
condition_parts.append(f"proc.name in ({quoted_procs_str})") # → proc.name in ("rm", "shred")

Without quoted string values, Falco will fail to parse these rules. This affects every rule that extracts process names.


🟡 Should Fix

1. pr_stub.txt should be removed
This is a development artifact containing draft PR notes unrelated to this change. It does not belong in the repository.

2. yaml import should be at module level
PyYAML is a required dependency. The import yaml inside fetch_sigma_rules() and convert_and_write() is a pattern for optional deps, but here yaml is essential. Move it to the top-level imports.

3. proc_name_exists added unconditionally to all rules
The condition builder always bolts on proc_name_exists, even for file_event rules that only have path conditions and no process name references. The result is conditions like open_write and proc_name_exists and fd.name endswith "..." which adds an unnecessary check. Only include proc_name_exists when procs are actually used.

4. Silent error swallowing in _github_request
Returning [] on any HTTPError means a 401 auth failure or 403 rate-limit is indistinguishable from an empty directory. Consider raising on 4xx class errors, or at least on 401/403.


🟡 Test Coverage Gaps

5. No mock-based tests for network functions
_github_request and _fetch_raw have zero test coverage. These are critical for a network-fetching script; they should have unit tests using unittest.mock to verify error handling.

6. Integration tests skip in CI
TestOutputValidity tests use pytest.skip when the output directory is absent, so they never run in CI. Consider creating a temp directory and running convert_and_write with mock data.


✅ What Looks Good

  • Documentation (docs/rulehound.md): Well-structured, covers what Rulehound is, how the import works, mapping tables, usage, and integration steps.
  • Conversion architecture: The flatten → extract → map pipeline is clean and well-separated.
  • MITRE tag mapping: Thorough and correct.
  • Severity-to-priority mapping: Complete and sensible.
  • Unit tests for conversion logic: Good coverage of _flatten_detection, _mitre_tags, _extract_procs, _extract_paths, and sigma_to_falco_rule.
  • Description truncation: Properly handles long descriptions with a 300-char cap.
  • Wildcard filtering: Correctly excludes glob patterns from process name extraction.

Recommendation: Fix the quoting bug (item 🔴) and remove pr_stub.txt, then this is ready for another review pass.

@JJcyborg JJcyborg force-pushed the feat/issue-38-rulehound-rules branch from 7666600 to 2714ca6 Compare June 9, 2026 02:39
@JJcyborg

JJcyborg commented Jun 9, 2026

Copy link
Copy Markdown
Collaborator Author

Addressed all review feedback:

  • Critical fix: Quoted Falco condition values -- proc.name values now generate with quotes: proc.name = "rm" and proc.name in ("rm", "shred") instead of bare strings that would cause parser failures
  • Moved import yaml to module top level -- yaml is a required dependency, no longer lazily imported inside loops
  • Made proc_name_exists conditional -- only added to condition when there are process-related conditions (not for file_event rules with no process context)
  • Improved HTTP error handling -- _github_request now raises RuntimeError on 401/403 auth errors instead of silently returning empty list, making auth failures distinguishable from empty directories
  • Removed stray pr_stub.txt (was not present in this rebased branch)
  • All 31 tests passing (2 skipped for integration)

Ready for re-review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorporate detection rules

1 participant