perf: Rules lazy cache evaluator#2325
Open
pindo696 wants to merge 3 commits into
Open
Conversation
use local cache with ttl or warm cache time instead of lazy db cache load on each evaluation. Lets say a part of RHINENG-24321
Reviewer's GuideIntroduces an in-memory, TTL-based cache for insights_rule metadata with lazy per-rule loading during advisor evaluation, configurable via environment variable, and adds tests to validate cache behavior and advisor merge semantics. Sequence diagram for advisor evaluation using lazy TTL rule cachesequenceDiagram
participant Evaluator
participant Config
participant DBPool
participant DB
Note over Evaluator,DB: Initialization warmup
Evaluator->>DBPool: connection()
DBPool-->>Evaluator: AsyncConnection
Evaluator->>DB: _load_rule_cache(conn)
DB-->>Evaluator: all insights_rule rows
Evaluator->>Evaluator: _set_rule_cache_expiry()
Evaluator->>Evaluator: store rule_cache
Note over Evaluator,DB: Per-system evaluation
Evaluator->>DBPool: connection()
DBPool-->>Evaluator: AsyncConnection conn
Evaluator->>Evaluator: _get_rule_cache(conn)
Evaluator->>Config: read insights_load_rule_cache_ttl_sec
Config-->>Evaluator: ttl
Evaluator->>Evaluator: _rule_cache_expired()
alt cache expired
Evaluator->>DB: _load_rule_cache(conn)
DB-->>Evaluator: refreshed insights_rule rows
Evaluator->>Evaluator: _set_rule_cache_expiry()
else cache valid
Evaluator->>Evaluator: reuse existing rule_cache
end
Evaluator->>Evaluator: snapshot = dict(rule_cache)
loop for each rule hit in rule_results
Evaluator->>Evaluator: resolve_rule_cache(rule_name)
alt rule in snapshot
Evaluator-->>Evaluator: return cached RuleCache
else rule missing in snapshot
alt rule_name in missing_rules_not_found
Evaluator-->>Evaluator: return None
else
Evaluator->>DB: _load_rule_cache_for_name(conn, rule_name)
DB-->>Evaluator: RuleCache or None
alt rule found
Evaluator->>Evaluator: update rule_cache and snapshot
Evaluator-->>Evaluator: return RuleCache
else rule not found
Evaluator->>Evaluator: add to missing_rules_not_found
Evaluator-->>Evaluator: return None
end
end
end
Evaluator->>Evaluator: merge advisor result using RuleCache
end
Class diagram for evaluator rule cache and config changesclassDiagram
class EvaluatorLogic {
- db_pool AsyncConnectionPool
- module_cache Dict~str, ModuleCache~
- vulnerable_package_cache Dict~(int, int, Optional~int~), VulnerablePackageCache~
- skipped_rules List~str~
- rule_cache Dict~str, RuleCache~
- rule_cache_expires_at Optional~datetime~
+ __init__(db_pool AsyncConnectionPool)
+ init() Async
- _load_cve_impact_cache() Dict~str, CveImpactCache~
- _load_cve_cache() Dict~str, CveCache~
- _load_rule_cache(conn AsyncConnection) Dict~str, RuleCache~
- _set_rule_cache_expiry() None
- _rule_cache_expired() bool
- _load_rule_cache_for_name(conn AsyncConnection, name str) Optional~RuleCache~
- _get_rule_cache(conn Optional~AsyncConnection~) Dict~str, RuleCache~
- _load_package_name_cache() Dict~str, PackageNameCache~
- _evaluate_advisor_res(rule_results Dict, sys_vuln_rows Dict~str, SystemVulnerabilitiesRow~, rule_cache Dict~str, RuleCache~, system_platform SystemPlatform, unpatched_cves Set~str~, conn AsyncConnection) Dict~str, SystemVulnerabilitiesRow~
+ evaluate_vulnerabilities(system_platform SystemPlatform, conn AsyncConnection) Dict~str, SystemVulnerabilitiesRow~
}
class RuleCache {
+ id int
+ playbook_count int
}
class Config {
+ insights_load_rule_cache_ttl_sec int
}
EvaluatorLogic --> RuleCache : uses
EvaluatorLogic --> Config : reads TTL from
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
Contributor
SC Environment Impact AssessmentOverall Impact: ⚪ NONE No SC Environment-specific impacts detected in this PR. What was checkedThis PR was automatically scanned for:
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The new rule cache (
self.rule_cache/self.rule_cache_expires_at) is mutated from multiple async call paths without any locking, so under concurrent evaluations you may see races (e.g., redundant reloads or partially updated caches); consider adding anasyncio.Lockaround cache refresh/mutation to make cache updates atomic per process. - In
_get_rule_cacheyou re-readCFG.insights_load_rule_cache_ttl_secand recompute expiry on every access; if runtime reconfiguration is not required, caching the TTL on theEvaluatorLogicinstance at init time would simplify the logic and avoid subtle behavior changes ifCFGis mutated later.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new rule cache (`self.rule_cache` / `self.rule_cache_expires_at`) is mutated from multiple async call paths without any locking, so under concurrent evaluations you may see races (e.g., redundant reloads or partially updated caches); consider adding an `asyncio.Lock` around cache refresh/mutation to make cache updates atomic per process.
- In `_get_rule_cache` you re-read `CFG.insights_load_rule_cache_ttl_sec` and recompute expiry on every access; if runtime reconfiguration is not required, caching the TTL on the `EvaluatorLogic` instance at init time would simplify the logic and avoid subtle behavior changes if `CFG` is mutated later.
## Individual Comments
### Comment 1
<location path="evaluator/logic.py" line_range="60" />
<code_context>
self.module_cache: Dict[str, ModuleCache] = {}
self.vulnerable_package_cache: Dict[(int, int, Optional[int]), VulnerablePackageCache] = {}
self.skipped_rules = ["CVE_2017_5715_cpu_virt|VIRT_CVE_2017_5715_CPU_3_ONLYKERNEL", "CVE_2017_5715_cpu_virt"]
+ self.rule_cache: Dict[str, RuleCache] = {}
+ self.rule_cache_expires_at: Optional[datetime] = None
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the rule caching logic to use a single canonical cache with helper methods instead of a local snapshot and inner async closure.
A lot of the complexity comes from maintaining both `self.rule_cache` and a separate `rule_cache` snapshot plus the inner `resolve_rule_cache` closure. You can keep the same behavior (TTL + lazy load on miss) with a single canonical cache and a small helper method, which removes duplication and makes ownership clearer.
### 1. Make `_get_rule_cache` return the canonical cache (no copy)
```python
async def _get_rule_cache(
self, conn: Optional[AsyncConnection] = None
) -> Dict[str, RuleCache]:
"""Return the rule cache, refreshing from DB when cache TTL expired."""
if self._rule_cache_expired():
if conn is None:
async with self.db_pool.connection() as conn:
self.rule_cache = await self._load_rule_cache(conn)
else:
self.rule_cache = await self._load_rule_cache(conn)
self._set_rule_cache_expiry()
return self.rule_cache # <- no dict(...) copy
```
This removes the need to keep a snapshot in sync with `self.rule_cache`.
### 2. Extract `resolve_rule_cache` into a dedicated method operating only on `self.rule_cache`
```python
async def _get_rule_from_cache(
self,
conn: AsyncConnection,
rule_name: str,
missing_rules_not_found: Set[str],
) -> Optional[RuleCache]:
cache = await self._get_rule_cache(conn)
cached = cache.get(rule_name)
if cached:
return cached
if rule_name in missing_rules_not_found:
return None
fetched = await self._load_rule_cache_for_name(conn, rule_name)
if fetched:
# update canonical cache only
self.rule_cache[rule_name] = fetched
return fetched
missing_rules_not_found.add(rule_name)
return None
```
Now all cache logic (TTL + lazy load) is encapsulated, and there is a single source of truth: `self.rule_cache`.
### 3. Simplify `_evaluate_advisor_res` to use the helper (no local cache or inner async)
Change the signature to drop `rule_cache` (internal method, so safe to adjust call sites):
```python
async def _evaluate_advisor_res(
self,
rule_results: dict,
sys_vuln_rows: Dict[str, SystemVulnerabilitiesRow],
system_platform: SystemPlatform,
unpatched_cves: Set[str],
conn: AsyncConnection,
) -> Dict[str, SystemVulnerabilitiesRow]:
"""Merge results from vmaas package evaluation with advisor rule evaluation"""
missing_rules_not_found: Set[str] = set()
for cve, hit_details in rule_results["rule_hits"].items():
...
rule = hit_details["rule_id"]
if rule in self.skipped_rules:
continue
if not isinstance(hit_details["details"], dict):
hit_details["details"] = json.loads(hit_details["details"])
rule_db = await self._get_rule_from_cache(conn, rule, missing_rules_not_found)
if not rule_db:
continue
...
```
And in the second loop:
```python
rule = pass_details["rule_id"]
if rule in self.skipped_rules:
continue
rule_db = await self._get_rule_from_cache(conn, rule, missing_rules_not_found)
if not rule_db:
continue
...
```
### 4. Adjust the call site
```python
with RULES_EVAL_TIME.time():
# warm/refresh cache; returned dict is `self.rule_cache`, but caller
# no longer needs to keep its own snapshot
await self._get_rule_cache(conn)
sys_vuln_rows = await self._evaluate_advisor_res(
system_platform.rule_results,
sys_vuln_rows,
system_platform,
unpatched_cves_set,
conn,
)
```
Outcome:
- Single cache (`self.rule_cache`), no snapshot that needs manual synchronization.
- No inner async closure capturing multiple mutable structures.
- TTL and lazy loading remain, but are hidden behind `_get_rule_cache` and `_get_rule_from_cache`, making `_evaluate_advisor_res` easier to read and reason about.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Return rule cache from get rule cache (no dict copy), inner closure to class fnc, test updated
Contributor
Author
|
/retest |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Use local cache for rules and reuse it while the cache is still warm (ttl). After cache ttl expires, preload rules from DB cache.
Secure Coding Practices Checklist GitHub Link
Secure Coding Checklist
Summary by Sourcery
Introduce a TTL-based in-memory cache for insights rules and lazy loading of missing rule metadata during advisor evaluation to reduce repeated full-table database queries.
Enhancements:
Build:
Tests: