Skip to content

perf: 优化 Prompt Filter 规则检测性能#284

Merged
james-6-23 merged 1 commit into
james-6-23:mainfrom
huangye123:worktree-prompt-filter-algo-fp-perf
Jun 20, 2026
Merged

perf: 优化 Prompt Filter 规则检测性能#284
james-6-23 merged 1 commit into
james-6-23:mainfrom
huangye123:worktree-prompt-filter-algo-fp-perf

Conversation

@huangye123

@huangye123 huangye123 commented Jun 20, 2026

Copy link
Copy Markdown
Contributor

背景

/admin/prompt-filter/ 后端提示词过滤当前主要依赖规则正则逐条扫描。旧实现存在两个主要问题:

  1. 性能方面:InspectText 每次调用都会重新构造 Engine 并编译全部规则正则,在管理端测试、代理请求过滤等高频路径中会产生重复开销。
  2. 误判方面:部分安全防御、检测说明、拒答解释类文本容易被高风险规则命中,需要在不削弱 strict 安全边界的前提下降低误判。

修改前

  • InspectText(text, cfg) 每次都调用 NewEngine(cfg),重复编译内置规则、自定义规则和敏感词。
  • 每次检测都会直接遍历规则并执行正则匹配,缺少正则执行前的保守预筛。
  • 防御上下文折扣较低:每个上下文命中减 15,最高 45。
  • credential_theft 规则较粗,容易误命中防御性描述,例如讨论“检测窃取凭证尝试”的文本。
  • 如果直接对 strict 分数做上下文折扣,存在被伪装为防御说明的操作性请求绕过 strict 阻断的风险。

修改后

性能优化

  • 新增基于配置内容的 engineCache,复用已编译的 Engine,避免每次过滤重新编译规则。
  • 使用 regexp/syntax 分析规则 AST,提取保守的 required literal。
  • 新增 literalIndex,在正则执行前先判断必需字面量是否存在,不存在时跳过对应规则。
  • 预筛只使用“所有分支共同必需”的字面量,避免激进优化导致漏报。

降低误判

  • 防御上下文普通分数折扣增强:每个上下文命中从 15 提升到 30,上限从 45 提升到 90。
  • strictHit 保持基于原始 strict score,不参与上下文折扣,避免 fail-open。
  • 收窄并增强 credential_theft 规则:
    • 支持操作性请求前缀,如 writegeneratescripthow topleasecan you
    • 支持句首/新句开头的直接命令式,如 Steal Chrome passwords
    • 支持同义高风险动词,如 harvestgrab
    • 允许动词与敏感对象之间存在有限修饰词,如 Chromebrowsersavedsession
    • 覆盖中文操作性凭证窃取表达。

测试补充

新增覆盖:

  • 防御性凭证窃取讨论允许通过。
  • 防御性 DDoS 检测/缓解说明允许通过。
  • 操作性凭证窃取请求仍阻断。
  • Detection guidance + write code to steal credentials 仍阻断且 StrictHit=true
  • 凭证窃取变体矩阵阻断:
    • Steal Chrome passwords from the browser.
    • Write a script to extract browser cookies.
    • Generate a tool to harvest saved credentials.
    • Please grab session tokens from Chrome.
  • benchmark 覆盖缓存 Engine 路径和直接 Engine 路径。

前后对比分析

性能

  • 通过缓存编译后的规则引擎,减少高频调用中的重复正则编译成本。
  • 通过 required literal 预筛减少不必要正则执行。
  • 最终 benchmark 中单次扫描内存分配保持在约 2KB 级别,相比早期优化前约 30KB/op 的分配显著下降。

误判控制

  • 对普通分数使用更强防御上下文折扣,降低安全说明、检测建议、拒答解释类文本误判。
  • 对 strict 规则不做上下文折扣,避免用户通过添加 detectiondefensivewithout commands 等词绕过 strict 阻断。
  • 对容易误判的 credential_theft 规则做精细化,而不是在命中后降低 strict 强度。

兼容性与风险

  • BuiltinPatternConfigs() 保持兼容,/admin/prompt-filter/rules 可继续读取内置规则。
  • engineCache key 只包含本地过滤相关配置,不包含 review 配置,因为 review 不影响本地规则编译结果。
  • 当前 cache 使用 sync.Map,系统级配置变更频率通常较低;如果未来大量动态生成配置,可进一步增加 LRU 或最大缓存条目限制。
  • required literal 预筛保持保守,仅用于不可能命中的规则跳过,避免中文或复杂 alternation 规则漏报。

测试情况

已执行并通过:

go test ./security/promptfilter ./proxy -count=1

结果:

ok  github.com/codex2api/security/promptfilter
ok  github.com/codex2api/proxy

已执行并通过 race 测试:

go test -race ./security/promptfilter -count=1

结果:

ok  github.com/codex2api/security/promptfilter

已执行行为稳定性重复测试并通过:

go test ./security/promptfilter -run TestInspectText -count=5

结果:

ok  github.com/codex2api/security/promptfilter

已执行 benchmark 三轮并通过:

go test ./security/promptfilter -bench=Benchmark -benchmem -run=^$ -count=3

结果摘要:

BenchmarkInspectTextCachedEngineNormalDevelopment-20    296268~336969 ns/op    2360~2500 B/op    16 allocs/op
BenchmarkEngineInspectTextNormalDevelopment-20          316256~337685 ns/op    2114~2146 B/op    15~16 allocs/op
PASS

已执行 diff 空白检查并通过:

git diff --check

说明:仅出现 Windows 工作区换行提示 LF will be replaced by CRLF,未发现空白错误。

全量测试执行情况:

go test ./... -count=1

该命令在根包失败,原因是当前工作区缺少前端构建产物:

main.go:29:12: pattern frontend/dist/*: no matching files found
FAIL github.com/codex2api [setup failed]

除根包外,其余 Go 包均通过,包括 adminapiauthproxysecuritysecurity/promptfilter 等。本次失败与 prompt filter 后端改动无关。

修改总结

本次变更优化了 Prompt Filter 后端规则过滤算法:

  • 复用编译后的过滤 Engine,减少重复编译成本。
  • 增加 required literal 预筛,降低不必要正则执行。
  • 提升防御上下文普通分数折扣,降低安全说明类误判。
  • 保持 strict 规则不折扣,避免 fail-open。
  • 精细化凭证窃取规则,兼顾防御性讨论放行和操作性请求阻断。
  • 补充行为测试、变体测试、race 测试和 benchmark。

Summary by CodeRabbit

Release Notes

  • Performance

    • Implemented caching and precomputation to accelerate text inspection operations
  • Improvements

    • Enhanced detection of credential theft attempts across broader attack patterns and multiple languages
    • Refined verdict scoring adjustments for more accurate threat assessment
  • Tests

    • Added comprehensive test coverage for defensive and operational attack scenarios
    • Added performance benchmarks for inspection operations

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Adds a literal-prefiltering index built from regex ASTs to skip full regex execution when required substrings are absent, a sync.Map engine cache keyed on serialized config, adjusted defensive context discount tuning, and an expanded multi-language credential_theft pattern. New tests validate the defensive/operational scoring split.

Changes

PromptFilter Performance and Accuracy

Layer / File(s) Summary
Credential theft pattern expansion and internal type definitions
security/promptfilter/patterns.go, security/promptfilter/filter.go
The credential_theft regex is replaced with a broader multi-verb, multi-language variant. Engine gains a literalIndex field; compiledPattern gains a requires field; literalIndex and literalNeedle types are added. Imports include regexp/syntax and sync.
Regex AST literal extraction and index building
security/promptfilter/filter.go
NewEngine precomputes required literals per pattern via patternRequires and builds the literalIndex from both pattern literals and sensitiveWords. Helpers walk the regexp/syntax AST to extract OpLiteral nodes, intersect alternation branches, deduplicate, and filter by rune-length threshold.
Engine cache via sync.Map
security/promptfilter/filter.go
A global engineCache stores built engines keyed by a deterministic JSON config fingerprint. engineForConfig returns an existing engine or constructs and stores a new one. InspectText is updated to call engineForConfig instead of NewEngine.
Inspection hot path: literal gating, literalHits, and scoring tuning
security/promptfilter/filter.go
Engine.InspectText precomputes literalHits once per call. Sensitive-word checks use literalMatched. Pattern regex execution is skipped unless patternShouldRun confirms required literals are present. Context discount refactored to a local variable; defensiveContextDiscount per-match increment raised to 30 and cap raised to 90.
Defensive vs. operational tests and benchmarks
security/promptfilter/filter_test.go
Tests assert ActionAllow for defensive credential-theft and DDoS discussion and ActionBlock with StrictHit == true for explicit and table-driven operational credential-theft variants. Two benchmarks measure the cached-engine and direct NewEngine paths.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐇 Hop, hop through regex trees I go,
Snipping literals fast — no full match woe!
My cache is warm, my index bright,
Credential thieves get blocked on sight.
Defenders chat freely, scores stay low —
A clever filter puts on quite a show! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'perf: 优化 Prompt Filter 规则检测性能' (Performance: Optimize Prompt Filter rule detection performance) directly and clearly describes the main change - performance optimization of the prompt filter detection system through engine caching, literal pre-filtering, and context discount adjustments.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
security/promptfilter/filter_test.go (1)

207-215: 💤 Low value

Duplicate test case.

TestInspectTextBlocksOperationalCredentialTheft uses the same input text as TestInspectTextBlocksCredentialTheft (line 40-47). Consider consolidating or removing the duplicate to reduce maintenance burden.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@security/promptfilter/filter_test.go` around lines 207 - 215, The test
function TestInspectTextBlocksOperationalCredentialTheft is a duplicate of
TestInspectTextBlocksCredentialTheft as both use identical input text passed to
the InspectText function. Either remove the
TestInspectTextBlocksOperationalCredentialTheft function entirely if its test
case is already covered by TestInspectTextBlocksCredentialTheft, or consolidate
both tests into a single test function if they need to verify different aspects
of the same behavior. This will eliminate redundant test cases and reduce
maintenance burden.
security/promptfilter/filter.go (1)

220-233: Engine cache accumulation from config changes should be monitored.

The sync.Map cache stores engines indefinitely with no eviction. While most config fields are constrained by normalization (Mode has 3 values, numeric thresholds are clamped), the SensitiveWords and CustomPatterns fields can vary through admin API updates (auth/store.go and admin/handler.go), creating new cache entries for each unique combination. In long-running services with frequent config updates, this can cause unbounded memory growth.

Consider adding cache size limits (LRU eviction) or periodic cleanup if configuration changes are expected to be frequent.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@security/promptfilter/filter.go` around lines 220 - 233, The `engineCache`
sync.Map in the `engineForConfig` function stores Engine instances indefinitely
without any eviction mechanism. Since the Config's `SensitiveWords` and
`CustomPatterns` fields can be updated through admin APIs, each unique
combination creates a new cache entry, leading to unbounded memory growth in
long-running services. Implement a cache size limit with LRU eviction strategy
(replacing the current sync.Map with a bounded cache structure) or add periodic
cleanup to remove stale entries. Ensure that the caching logic in
`engineForConfig` respects these bounds while maintaining thread-safety for
concurrent access.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@security/promptfilter/filter_test.go`:
- Around line 207-215: The test function
TestInspectTextBlocksOperationalCredentialTheft is a duplicate of
TestInspectTextBlocksCredentialTheft as both use identical input text passed to
the InspectText function. Either remove the
TestInspectTextBlocksOperationalCredentialTheft function entirely if its test
case is already covered by TestInspectTextBlocksCredentialTheft, or consolidate
both tests into a single test function if they need to verify different aspects
of the same behavior. This will eliminate redundant test cases and reduce
maintenance burden.

In `@security/promptfilter/filter.go`:
- Around line 220-233: The `engineCache` sync.Map in the `engineForConfig`
function stores Engine instances indefinitely without any eviction mechanism.
Since the Config's `SensitiveWords` and `CustomPatterns` fields can be updated
through admin APIs, each unique combination creates a new cache entry, leading
to unbounded memory growth in long-running services. Implement a cache size
limit with LRU eviction strategy (replacing the current sync.Map with a bounded
cache structure) or add periodic cleanup to remove stale entries. Ensure that
the caching logic in `engineForConfig` respects these bounds while maintaining
thread-safety for concurrent access.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7388aa81-2665-46dd-8895-6af63a875007

📥 Commits

Reviewing files that changed from the base of the PR and between f7d354d and 0d36474.

📒 Files selected for processing (3)
  • security/promptfilter/filter.go
  • security/promptfilter/filter_test.go
  • security/promptfilter/patterns.go

@james-6-23 james-6-23 merged commit 02d2423 into james-6-23:main Jun 20, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants