[spark-compete] fix(builder): Builder overlap probes report matched count without disclosing the 500-id sample cap#1410
TL;DR
What I noticed
Was diffing a
The bug
file:
The fix
Introduce a module-level
Reproduction
Verification
Sister precedent
Empirically-adopted shape in this repo for observability/default-prop style fixes: see
Brings registry.json modules.*.commit up to current remote HEAD for the 7 blessed downstream modules. Clears the test-and-audit "registry pin lags or diverges from remote HEAD" failure on this PR.

Mechanically generated via git ls-remote <source> HEAD per module. Same refresh shape is filed as a clean infra PR (vibeforge1111#1391) for the whole repo.

Co-Authored-By: ValhallaBuilder <286693580+4gjnbzb4zf-sudo@users.noreply.github.com>
ff66b98 to
d9c43f3
…--lines help, uninstall-feedback + list/output cleanups

Consolidates remaining spark-compete Wave-1 CLI-output PRs:
- vibeforge1111#1428 inspect_builder_event_samples top_trace_refs cap (@4gjnbzb4zf-sudo)
- vibeforge1111#1410 Builder overlap probes report matched count without disclosing the match (@4gjnbzb4zf-sudo)
- vibeforge1111#1407 'spark live logs --lines' help text (@4gjnbzb4zf-sudo)
- vibeforge1111#1427 remove internal module paths from CLI list/status output (@Esc1200)
- vibeforge1111#1439 preserve uninstall feedback when a named target hits empty registry (@4gjnbzb4zf-sudo)

Maintainer completion:
- vibeforge1111#1407/vibeforge1111#1410: dropped ALL bundled registry.json commit-pin bumps (unauthorized attestation regression); kept only the cli.py help string / probe_cap fields;
- vibeforge1111#1427: dropped the leaked trailing module.path column instead of duplicating the name column (the PR's {module.path}->{module.name} swap created a dup);
- vibeforge1111#1439: hardened args.target access with getattr(args, "target", None).

Co-authored-by: 4gjnbzb4zf-sudo <4gjnbzb4zf-sudo@users.noreply.github.com>
Co-authored-by: Esc1200 <Esc1200@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
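The `getattr(args, "target", None)` hardening mentioned for vibeforge1111#1439 can be illustrated with a minimal sketch. The helper name `resolve_target` and the sample namespaces are hypothetical and not taken from `cli.py`; the only assumption is that `args` is an `argparse.Namespace` that may or may not define `target`.

```python
import argparse
from typing import Optional


def resolve_target(args: argparse.Namespace) -> Optional[str]:
    """Hypothetical helper: read args.target without assuming it exists."""
    # Direct attribute access (args.target) raises AttributeError when the
    # active subparser never registered a `target` argument; getattr with a
    # default returns None in that case.
    return getattr(args, "target", None)


if __name__ == "__main__":
    print(resolve_target(argparse.Namespace()))
    print(resolve_target(argparse.Namespace(target="demo-module")))
```

With this shape the caller can branch on `None` and still print uninstall feedback instead of crashing on a missing attribute.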
{
"schema": "spark-compete-hotfix-v1",
"event": "spark-compete-first-event",
"submission_mode": "public_repo_pr",
"submission_target_url": "#1410",
"team": {
"name": "SparkThisUp",
"members": [
"ValHallaBuilder",
"Baz707",
"DanFireDash"
],
"github_accounts": [
"4gjnbzb4zf-sudo"
],
"llm_device_holder": "ValHallaBuilder",
"device_holder_github": "4gjnbzb4zf-sudo"
},
"target_repo": {
"id": "vibeforge1111/spark-cli",
"source": "https://github.com/vibeforge1111/spark-cli",
"owner_surface": "spark-cli"
},
"issue": {
"type": "bug",
"severity": "medium",
"title": "Builder overlap probes report matched count without disclosing the 500-id sample cap",
"actual_behavior": "system_map JSON shows checked_request_id_count up to len(input) (could be 10k), matched_builder_request_id_count from a 500-id sample \u2014 gap is invisible.",
"expected_behavior": "JSON surfaces probe_cap + sampled_*_count so operator sees the matched-count denominator is the probe sample.",
"repro_steps": [
"1. Call inspect_builder_request_id_overlap with >500 unique request_ids.",
"2. Read the resulting dict.",
"3. checked_request_id_count = full input, matched_builder_request_id_count = overlap on sample, no cap field. After fix: probe_cap and sampled_request_id_count present."
],
"affected_workflow": "Spark CLI
spark os system-mapoperator diagnostic and the spawner-prd auto-trace overlap reporting that downstream operators rely on to decide whether builder is wired up correctly"},
"evidence": {
"safe_links_only": true,
"before_after_proof": "Site \u2014 src/spark_cli/system_map.py:911-950 (inspect_builder_request_id_overlap, builds the JSON block operators read). Before:
candidates = sorted(request_ids)[:500]runs without surfacing the cap, andmatched_builder_request_id_countis reported alongsidechecked_request_id_count(full input size) \u2014 operator infers a false ground-truth overlap. After: introduce_BUILDER_OVERLAP_PROBE_CAP = 500constant; addprobe_cap+sampled_request_id_countfields so the operator knows the match-count denominator is the probe sample, not the full input. Same pattern mirrored in inspect_builder_trace_ref_overlap at line 977.","links": [
"https://github.com//pull/1410",
"https://github.com//pull/1410/files"
],
"forbidden": [
"raw secrets",
"raw logs",
"raw conversations",
"private chat IDs",
"session tokens",
"cookies",
"private repo maps",
"raw memory dumps",
"full compile JSON",
"scoring details"
]
},
"proposed_fix": {
"approach": "Add a module-level constant
_BUILDER_OVERLAP_PROBE_CAP = 500and surfaceprobe_cap+sampled_request_id_count/sampled_trace_ref_countin both inspect_builder_request_id_overlap and inspect_builder_trace_ref_overlap. No query semantics change; only the JSON output gets two extra fields per probe so operators can tell when the matched count is from a sample. +15/-2 lines in src/spark_cli/system_map.py.","files_expected": [
"src/spark_cli/system_map.py"
],
"tests_or_smoke": "Smoke: run the affected code path in the repo and confirm before\u2192after behavior change. Build-clean:
python3 -m py_compile src/spark_cli/system_map.pyornpx tsc --noEmit --skipLibCheck src/spark_cli/system_map.py."},
"pr": {
"url": "#1410",
"branch": "spark-compete/overlap-probe-cap-disclosure",
"title_prefix": "[spark-compete]",
"author_github": "4gjnbzb4zf-sudo",
"body_must_include": [
"packet",
"team",
"pr_author",
"repo",
"actual_behavior",
"expected_behavior",
"repro_steps",
"before_after_proof",
"tests_or_smoke",
"duplicate_notes",
"risk_notes",
"review_claim"
]
},
"review_claim": {
"impact_claim": "medium",
"evidence_types": [
"redacted_terminal_excerpt"
],
"duplicate_notes": "Searched open PRs on src/spark_cli/system_map.py; the other PRs (#1088 atomic write of gaps.md, #1081 redact compile roots, #1039-1032 Batch v3 exception-narrowing) target different lines and concerns. None touch the overlap-probe cap or the
matched_*_countoutput fields.","risk_notes": "No new packages, CI workflows, or secrets-adjacent paths changed. Diff bounded to src/spark_cli/system_map.py. Same SQL query executes on the same candidate list; only the JSON response gains two additive fields per probe. No callsite reads the new fields yet, so downstream remains backward-compatible.",
"review_state_requested": "pr_review"
}
}
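The cap disclosure described in the packet can be sketched as follows. This is a simplified stand-in, not the code in `src/spark_cli/system_map.py`: the function name `inspect_request_id_overlap_sketch` and the `lookup_builder_ids` callback (which replaces the real SQL query) are assumptions made for illustration, and the sketch assumes `checked_request_id_count` counts unique input ids. Only the constant name, the 500 cap, and the output field names come from the PR text.

```python
from typing import Callable, Dict, Iterable, List, Set

# Constant name and value as described in the PR.
_BUILDER_OVERLAP_PROBE_CAP = 500


def inspect_request_id_overlap_sketch(
    request_ids: Iterable[str],
    lookup_builder_ids: Callable[[List[str]], Set[str]],
) -> Dict[str, int]:
    """Hypothetical stand-in for inspect_builder_request_id_overlap."""
    unique_ids = set(request_ids)
    # Only the first N ids (sorted for determinism) are actually probed.
    candidates = sorted(unique_ids)[:_BUILDER_OVERLAP_PROBE_CAP]
    # lookup_builder_ids is an assumed callback standing in for the SQL query.
    matched = lookup_builder_ids(candidates)
    return {
        "checked_request_id_count": len(unique_ids),
        # The two additive fields: they expose the real denominator.
        "probe_cap": _BUILDER_OVERLAP_PROBE_CAP,
        "sampled_request_id_count": len(candidates),
        "matched_builder_request_id_count": len(matched),
    }


if __name__ == "__main__":
    ids = [f"req-{i:05d}" for i in range(10_000)]
    # Pretend the builder knows every probed id.
    print(inspect_request_id_overlap_sketch(ids, lambda probed: set(probed)))
```

With 10,000 input ids and a builder that matches every probed id, the matched count tops out at 500. Without `probe_cap` and `sampled_request_id_count`, that result would read as a 5% overlap of the full input rather than a 100% overlap of the sample.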