diff --git a/.gitignore b/.gitignore
index 64a994c..7861c6a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -16,6 +16,7 @@ test-temp-*
 # skills-package-manager
 .agents/skills
 .gemini/skills
+.claude/skills
 
 # skills-test
 skills-test/**/*
diff --git a/README.md b/README.md
index 27fafac..da630a3 100644
--- a/README.md
+++ b/README.md
@@ -62,6 +62,14 @@ Helps Rspack users and developers debug crashes or deadlocks/hangs in the Rspack
 
 Use this Skill when users encounter "Segmentation fault" errors during Rspack builds or when the build progress gets stuck.
 
+### rstack-eco-ci-debug
+
+```bash
+npx skills add rstackjs/agent-skills --skill rstack-eco-ci-debug
+```
+
+Debug Rstack ecosystem CI failures and attribute the real source PR behind Rspack eco-ci red suites.
+
 ### rspack-tracing
 
 ```bash
diff --git a/skills-test/rstack-eco-ci-debug/evals/evals.json b/skills-test/rstack-eco-ci-debug/evals/evals.json
new file mode 100644
index 0000000..a21ac0f
--- /dev/null
+++ b/skills-test/rstack-eco-ci-debug/evals/evals.json
@@ -0,0 +1,44 @@
+{
+  "skill_name": "rstack-eco-ci-debug",
+  "evals": [
+    {
+      "id": 1,
+      "eval_name": "plugin-suite-empty-lines",
+      "prompt": "The plugin suite is failing in rstack-ecosystem-ci and the status data shows it turned red around Rspack PR #14254. Can you help me figure out whether that PR is really the cause? My local Rspack checkout is at /Users/bytedance/Documents/codes/rspack.",
+      "expected_output": "The skill identifies PR #14254 as the actual source of the plugin suite failure, explains that the failure signature involves extra empty lines being added to the error stack, and provides supporting evidence such as run URLs, log snippets, or commit inspection.",
+      "files": [],
+      "expectations": [
+        "The output names the plugin suite as the failing suite",
+        "The output identifies PR #14254 as the actual source, not just the surface pivot",
+        "The output explains the failure signature (extra empty lines in the error stack)",
+        "The output includes at least one piece of evidence (run URL, log snippet, or commit reference)"
+      ]
+    },
+    {
+      "id": 2,
+      "eval_name": "rstest-suite-misattribution",
+      "prompt": "The rstest suite started failing in rstack-ecosystem-ci and the green-to-red pivot points to Rspack PR #14353. I'm skeptical that #14353 is the real cause. Can you investigate and tell me what actually broke it? My local Rspack checkout is at /Users/bytedance/Documents/codes/rspack.",
+      "expected_output": "The skill does not blame PR #14353. Instead it identifies rstest PR #1357 as the actual source, explaining that #1357 upgraded to Rspack 2.0.8 (released between PR #14283 and PR #14350) and updated snapshots, which caused the rstest suite failure when the newer Rspack artifact was tested. It distinguishes surface attribution from actual source.",
+      "files": [],
+      "expectations": [
+        "The output does not attribute the failure to PR #14353 as the actual source",
+        "The output identifies rstest PR #1357 as the actual source or points to it",
+        "The output explains the snapshot update and Rspack 2.0.8 timing",
+        "The output distinguishes surface attribution from actual source"
+      ]
+    },
+    {
+      "id": 3,
+      "eval_name": "rsdoctor-swc-semantic-bug",
+      "prompt": "In this rstack-ecosystem-ci run the rsdoctor suite failed: https://github.com/rstackjs/rstack-ecosystem-ci/actions/runs/27249648948. Can you find the real source PR and explain the mechanism? My local Rspack checkout is at /Users/bytedance/Documents/codes/rspack.",
+      "expected_output": "The skill identifies PR #14256 (the swc refactoring) as the actual source of the rsdoctor suite failure. It explains the mechanism: swc exp produced semantic results that differed from swc core, causing the concatenated module not to treat the top-level lightColor variable and the for-loop-init lightColor variable as the same variable during renaming.",
+      "files": [],
+      "expectations": [
+        "The output names the rsdoctor suite as the failing suite",
+        "The output identifies PR #14256 as the actual source",
+        "The output explains the swc semantic inconsistency mechanism",
+        "The output mentions the concatenated module variable renaming issue with lightColor"
+      ]
+    }
+  ]
+}
diff --git a/skills-test/rstack-eco-ci-debug/report.md b/skills-test/rstack-eco-ci-debug/report.md
new file mode 100644
index 0000000..7677cd5
--- /dev/null
+++ b/skills-test/rstack-eco-ci-debug/report.md
@@ -0,0 +1,101 @@
+# rstack-eco-ci-debug Eval Report
+
+**Skill:** `rstack-eco-ci-debug`
+**Skill commit:** `ad2cc4b` (`syt/codex-rstack-eco-ci-debug` branch)
+**Date:** 2026-06-18
+**Model:** Claude (Opus 4.8)
+**Workspace:** `skills-test/rstack-eco-ci-debug/workspace/iteration-1`
+
+---
+
+## Summary
+
+One round of evaluation was run against 3 real rstack-ecosystem-ci failures. Each eval ran once with the skill and once without the skill.
+
+| Metric | With Skill | Without Skill | Delta |
+| -------- | ------------ | --------------- | ------- |
+| Pass rate | **100%** (12/12) | **75%** (9/12) | **+25 pp** |
+| Avg. wall time | 956.7 s | 2,135.5 s | −1,178.8 s |
+| Avg. tokens | 92,102 | 46,862 | +45,240 |
+
+The skill produced materially better attribution on the hardest case (rsdoctor SWC semantic bug) and converged faster on the plugin-suite empty-line case. Token usage is higher with the skill because it performs a structured two-phase investigation (Phase 1 PR location, Phase 2 deep root cause).
+
+---
+
+## Eval Cases
+
+### Eval 1 — plugin-suite-empty-lines
+
+**Question:** Why did the `plugin` suite turn red, and is Rspack PR #14254 the real source?
+**Surface pivot:** PR #14254 (`feat(runtime): introduce experimental.runtimeMode`).
+**Actual source:** PR #14254 — but the failure mechanism is *incidental* trailing newlines in 5 EJS templates, not the runtimeMode feature itself.
+
+| Configuration | Pass Rate | Time | Tokens |
+| ------------- | --------- | ---- | ------ |
+| with_skill | 4/4 (100%) | 856.1 s | 91,265 |
+| without_skill | 4/4 (100%) | 5,374.1 s | 42,605 |
+
+Both configurations correctly identified PR #14254 and the extra-blank-line signature. The with-skill run reached the same conclusion in ~16% of the wall time by following the structured eco-ci workflow.
+
+---
+
+### Eval 2 — rstest-suite-misattribution
+
+**Question:** The eco-ci bisect points at Rspack PR #14353; is it actually the source of the `rstest` suite failure?
+**Surface pivot:** PR #14353.
+**Actual source:** rstest PR #1357 (downstream snapshot/timeout expectation change).
+
+| Configuration | Pass Rate | Time | Tokens |
+| ------------- | --------- | ---- | ------ |
+| with_skill | 4/4 (100%) | 1,014.0 s | 105,041 |
+| without_skill | 4/4 (100%) | 334.8 s | 29,430 |
+
+Both configurations correctly exonerated PR #14353 and pointed to rstest PR #1357. The without-skill run was faster here because it gave a brief, shallow answer that happened to be correct; the skill ran its full two-phase workflow anyway. No quality regression, but a token/time trade-off.
+
+---
+
+### Eval 3 — rsdoctor-swc-semantic-bug
+
+**Question:** The `rsdoctor` suite failed with `ReferenceError: lightColorCount is not defined`; what is the real source PR and root cause?
+**Surface attribution:** release branch at commit `ac3fa6a2d0`.
+**Actual source:** PR #14256 (`refactor: swc exp for javascript parser plugin`), interacting with PR #14335's scope-info rewrite.
+
+| Configuration | Pass Rate | Time | Tokens |
+| ------------- | --------- | ---- | ------ |
+| with_skill | 4/4 (100%) | ~1,000 s* | 80,000* |
+| without_skill | 1/4 (25%) | 697.7 s | 68,552 |
+
+This is the discriminating case. Without the skill, the run latched onto a different nearby PR (#14335) and missed the SWC exp/core semantic inconsistency and the `lightColorCount` variable-renaming failure signature. With the skill, the run used the canary-date bisect and revert-commit evidence to identify PR #14256 as the actual source and explained the concatenated-module scope bug.
+
+\* Eval 3 with_skill ran asynchronously; timing was not instrumented by the harness, so values are rough estimates based on comparable runs.
+
+---
+
+## Where the Skill Helped
+
+1. **Surface vs. actual source distinction** — The skill explicitly separates "what the eco-ci dashboard says" from "which PR actually introduced the regression," which prevented the wrong-PR attribution on Eval 3.
+2. **Failure-signature anchoring** — It requires tying conclusions to concrete signatures (`lightColorCount is not defined`, extra blank lines in snapshots), not just commit positions.
+3. **Structured evidence gathering** — Use of revert commits, green-to-red pivots, and canary-date bisect kept the investigation on track.
+
+## Costs / Trade-offs
+
+- **Higher token usage** with the skill (≈2×) because of the explicit Phase 1 → Phase 2 workflow.
+- **Not always faster** when the answer is shallow (Eval 2).
+- **Relies on eval cases with clear public artifacts** (run URLs, revert commits, data JSON). Cases without these will regress toward the baseline.
+
+---
+
+## Artifacts
+
+- Eval definitions: `skills-test/rstack-eco-ci-debug/evals/evals.json`
+- Raw outputs + grading: `skills-test/rstack-eco-ci-debug/workspace/iteration-1/`
+- Quantitative benchmark: `workspace/iteration-1/benchmark.json` and `workspace/iteration-1/benchmark.md`
+- Static review viewer: `workspace/iteration-1/review.html`
+
+---
+
+## Next Steps (Suggested)
+
+1. Add a few more discriminating cases where the surface pivot is *not* the real source, to confirm the skill's value isn't driven by a single eval.
+2. Consider a shorter "fast path" in the skill for cases where the surface pivot is clearly correct, to reduce token/time overhead on easy attributions.
+3. Commit `report.md` and `evals/evals.json`; raw workspace outputs are gitignored.
diff --git a/skills.json b/skills.json
index c196351..6580c66 100644
--- a/skills.json
+++ b/skills.json
@@ -1,7 +1,7 @@
 {
   "$schema": "https://unpkg.com/skills-package-manager@0.11.0/skills.schema.json",
   "installDir": ".agents/skills",
-  "linkTargets": [],
+  "linkTargets": [".claude/skills"],
   "skills": {
     "skill-creator": "github:anthropics/skills#57546260929473d4e0d1c1bb75297be2fdfa1949&path:/skills/skill-creator",
     "rstack-skill-evaluator": "link:./dev-skills/rstack-skill-evaluator",
diff --git a/skills/rstack-eco-ci-debug/SKILL.md b/skills/rstack-eco-ci-debug/SKILL.md
new file mode 100644
index 0000000..64afd3c
--- /dev/null
+++ b/skills/rstack-eco-ci-debug/SKILL.md
@@ -0,0 +1,219 @@
+---
+name: rstack-eco-ci-debug
+description: Debug Rstack ecosystem CI failures and attribute the real source PR or downstream change. Always use this skill when the user mentions Rspack eco-ci, rstack-ecosystem-ci, a suite turning red, a downstream regression, a green-to-red pivot, canary bisect, @rspack-canary/core, or daily eco-ci triage — even if they only ask "why is this suite failing", "which PR broke it", or "is this Rspack's fault". Use it to avoid over-blaming the first Rspack commit that appears red in status data.
+compatibility:
+  - gh CLI authenticated for web-infra-dev/rspack and rstackjs/rstack-ecosystem-ci
+  - Local Rspack checkout (for commit/PR inspection and canary mapping)
+  - Local downstream project checkout (for pnpm.overrides reproduction)
+---
+
+# Rstack Eco CI Debug
+
+Use this skill to debug Rstack ecosystem CI failures without over-blaming the first Rspack commit that appears red in status data.
+
+This version covers the Rspack stack.
+
+## Preconditions
+
+Before starting, ask the user which local checkout paths they have available. Do not assume machine-specific paths.
+
+- **Local Rspack checkout** — required for inspecting commits, resolving canary SHAs, and reviewing PR diffs. Ask for it before running any `git -C <rspack-path>` command.
+- **Local downstream project checkout** — required when using `pnpm.overrides` to test specific Rspack versions (for example, during canary bisect). Ask for it before modifying `pnpm-workspace.yaml`, `package.json`, or lockfiles.
+- **Local `rstack-ecosystem-ci` checkout** — optional. If available, use its `data/rspack.json` as the first local status source. Otherwise use the ecosystem CI site and GitHub Actions.
+
+Fetch the local Rspack repo before resolving commits:
+
+```bash
+git -C <rspack-path> fetch origin main --tags
+```
+
+- Treat GitHub Actions job logs as the source of truth for failure signatures.
+- Do not modify project files unless the user explicitly asks for a fix.
+
+## Investigation Model
+
+Rspack eco-ci runs a downstream project matrix against a freshly built Rspack artifact. A suite turning red means that a specific combination failed:
+
+```text
+current downstream project state + tested Rspack artifact
+```
+
+It does not automatically mean the visible Rspack pivot PR is the true root cause. Downstream dependency updates, snapshot changes, test logic changes, and Rspack release/canary windows can all create misleading pivots.
+
+Always distinguish:
+
+- `Surface attribution`: the Rspack commit/PR where status data first shows the suite red.
+- `Actual source`: the PR, version window, or downstream change that actually introduced the failing condition.
+- `Failure signature`: the stable error text, command, assertion diff, stack, or log block used to compare runs.
+
+## Optional Tools
+
+Read the linked reference before using any of these tools. Do not ask the user generically "which tool do you want"; instead, suggest the specific tool that matches the situation. Only invoke a tool when its strict trigger conditions are met; do not run it "just in case".
+
+- **Canary date bisect** — use in Phase 1 only when the Rspack commit window is too coarse to attribute a PR and downstream causes have already been ruled out. Trigger this when **all** of the following are true:
+  - The green-to-red pivot spans **more than 3 Rspack commits** or crosses a release/canary boundary.
+  - The failure signature is stable across the red rows in that window.
+  - The same Rspack commit does **not** appear in both green and red runs (which would indicate a downstream cause).
+  Do **not** trigger when the pivot is a single commit or when the surface PR diff already explains the signature.
+  Read [references/canary-date-bisect.md](references/canary-date-bisect.md) and ask the user for the local downstream checkout path and the narrowest failing command.
+
+- **Deep PR debug** — use in Phase 2 only after a specific Rspack source PR or version window has been identified and the user wants the technical reason behind the failure. Trigger this when **all** of the following are true:
+  - The user asks "why did this PR break it", "what is the mechanism", or "how should we fix it".
+  - The actual source is a Rspack PR or Rspack version window (not a downstream test change, snapshot update, or dependency bump).
+  - Phase 1 has already produced evidence linking the PR to the failure signature.
+  Do **not** run deep PR debug on downstream PRs; in those cases, Phase 1 output plus a short note about the downstream change is enough.
+  Read [references/deep-pr-debug.md](references/deep-pr-debug.md) automatically once a candidate PR is accepted for deep inspection.
+
+- **PR report comment** — use only after strict attribution identifies a merged Rspack PR as the cause and the user wants to notify the PR author. Trigger this only when:
+  - The failure is confidently attributed to a merged PR (not just a surface pivot).
+  - Downstream changes, dependency bumps, release windows, and flaky signals have been ruled out.
+  - The user gives explicit approval to post to GitHub.
+  Read [references/pr-report-comment.md](references/pr-report-comment.md) and prepare a draft comment first; do not post without approval.
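
The canary date bisect trigger conditions above can be read as a single predicate. The following is a minimal sketch, assuming the five inputs have already been determined by hand from status data and logs. The function name `should_run_canary_bisect` and its argument order are hypothetical and not part of the skill.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): encodes the canary date bisect
# trigger conditions as one check. Exit status 0 means "trigger".
# Args: <commits-in-pivot>
#       <crosses-release-boundary: yes|no>
#       <signature-stable: yes|no>
#       <same-sha-green-and-red: yes|no>
#       <surface-diff-explains-signature: yes|no>
should_run_canary_bisect() {
  local commits=$1 crosses_boundary=$2 signature_stable=$3
  local same_sha_mixed=$4 surface_diff_explains=$5

  # Explicit exclusions: single-commit pivots, and pivots whose surface PR
  # diff already explains the signature.
  [ "$commits" -gt 1 ] || return 1
  [ "$surface_diff_explains" = no ] || return 1

  # Window must be coarse: more than 3 commits, or a release/canary boundary.
  if [ "$commits" -le 3 ] && [ "$crosses_boundary" != yes ]; then
    return 1
  fi

  # Signature must be stable across the red rows in the window.
  [ "$signature_stable" = yes ] || return 1

  # The same SHA in green and red runs points downstream, so do not bisect.
  [ "$same_sha_mixed" = no ] || return 1
  return 0
}
```

For example, `should_run_canary_bisect 5 no yes no no` succeeds, while a three-commit pivot inside one release window does not.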
+
+## Two-Phase Debug Workflow
+
+Eco-ci debugging has two phases. Do not mix them up.
+
+### Phase 1: PR Location
+
+Goal: identify the actual source PR, date window, or downstream change that caused the suite to become red.
+
+#### Fast-Exit Checks
+
+Run these checks before doing deep pivot analysis. If any check fires with high confidence, produce the Phase 1 output immediately and skip unnecessary steps.
+
+1. **Same Rspack commit, different outcome**
+   - If the exact same tested Rspack commit appears in both a green run and a red run of the same suite, the cause is **not** that Rspack commit.
+   - Stop and attribute the failure to a downstream change, test expectation change, dependency update, or environment difference between the two runs.
+   - Output example: `Actual source: downstream/test change (same Rspack SHA succeeded in run <green-run-id> and failed in run <red-run-id>)`.
+
+2. **Surface PR diff is unrelated to the failure signature**
+   - After fetching the surface PR, if its changed files and diff have no plausible connection to the observed error text, assertion, stack frame, or generated output, treat the surface PR as innocent.
+   - Shift focus to downstream changes or an earlier Rspack commit that actually touched the failing path.
+   - Output example: `Actual source: not surface PR #<pr-number> (diff only touches <changed files>; failure signature is <short signature>)`.
+
+3. **Failure signature directly maps to surface PR changed files**
+   - If the error text, failing command, or changed generated output directly involves files or APIs modified by the surface PR, the surface PR is likely the actual source.
+   - Move to a lightweight Phase 2 to confirm the mechanism; do not spend time hunting alternative culprits.
+   - Output example: `Actual source: surface PR #<pr-number> (failure signature matches changed files <changed files>)`.
+
+Only continue with the full process below if none of the fast-exit checks gives a clear answer.
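
Fast-exit check #1 is mechanical enough to sketch. The snippet below is a minimal illustration, assuming the run history has first been flattened into `<sha> <status>` pairs, one per line. That two-column format and the name `mixed_outcome_shas` are assumptions for illustration, not the eco-ci data schema.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): reads "<sha> <status>" pairs,
# one per line, on stdin and prints every SHA seen as both green and red.
# The two-column format is an assumption; eco-ci status data must be
# converted into it first.
mixed_outcome_shas() {
  awk '
    $2 == "green" { green[$1] = 1 }
    $2 == "red"   { red[$1] = 1 }
    END {
      for (sha in green) {
        if (sha in red) print sha
      }
    }
  ' | sort
}
```

Any SHA this prints matches fast-exit check #1, so that Rspack commit is not the cause and attention shifts to the downstream side.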
+
+Use these evidence sources:
+
+- Eco-ci status data, including current failed runs and previous green runs.
+- GitHub Actions logs for current failure and candidate pivot failure.
+- Rspack commit history, release tags, and canary versions.
+- Downstream project history, dependency updates, snapshots, and test/config changes.
+- `@rspack-canary/core` overrides in the downstream repo when the date or PR window is still too coarse.
+
+Process:
+
+1. Identify the latest completed Rspack eco-ci run and the previous completed Rspack commit run.
+2. List failed suites, failed count, run URL or run id, and the tested Rspack commit.
+3. For each failed suite, find the green-to-red pivot in the visible history. If the same tested Rspack commit appears in both green and red runs, apply fast-exit check #1 and stop.
+4. Pull logs for the current failure and the candidate pivot failure.
+5. Compare failure signatures before attributing a root cause. After inspecting the surface PR diff, apply fast-exit checks #2 and #3 when the relationship between the diff and the signature is clear.
+6. If the signature changed, search forward or binary-search red rows until the current signature appears.
+7. Check whether the downstream project changed in the same window.
+8. Reproduce enough combinations to decide whether the failure comes from Rspack, downstream, or their interaction.
+9. If release versions or eco-ci rows are too coarse, ask whether to run the canary date bisect tool.
+
+Phase 1 output:
+
+```text
+Surface attribution:
+Actual source:
+Failure signature:
+Evidence:
+Confidence: high | medium | low
+Notes:
+```
+
+Only move to Phase 2 when there is a specific source PR or version window with enough evidence to inspect deeply.
+
+### Phase 2: Deep Root Cause Debug
+
+Goal: explain why the identified PR caused the observed behavior.
+
+Use this phase after Phase 1 has identified a candidate source. Read [references/deep-pr-debug.md](references/deep-pr-debug.md) when the user asks for root cause, mechanism, or a fix direction.
+
+Process:
+
+1. Review the candidate PR metadata, commit, and diff.
+2. Re-read the concrete failure log block and failing downstream assertion or stack.
+3. Locate the downstream code path that produces the failure.
+4. Trace from downstream behavior into Rspack APIs, plugin hooks, loaders, generated output, source maps, runtime modules, or diagnostics.
+5. Compare before/after behavior when needed, using canaries or local builds.
+6. State the mechanical behavior change, not only the PR number.
+7. Separate confirmed evidence from inference.
+
+Phase 2 output:
+
+```text
+Candidate PR: <pr-number> <title>
+Suite: <suite>
+Verdict: caused | likely caused | not caused | inconclusive
+Mechanism: <3-5 sentence explanation>
+Evidence: <log URLs, code refs, reproduction results>
+Confidence: high | medium | low
+Next action: <fix in Rspack | fix downstream expectation | gather more evidence>
+```
+
+Use `gh` for specific job logs when available:
+
+```bash
+gh run view --job <job-id> --repo rstackjs/rstack-ecosystem-ci --log
+```
+
+For noisy logs, first isolate likely terminal failure blocks:
+
+```bash
+gh run view --job <job-id> --repo rstackjs/rstack-ecosystem-ci --log \
+  | grep -E -i -C 3 'error|fail|panic|✖' \
+  | head -n 200
+```
+
+Fall back to full logs when the filtered output misses the real failure.
+
+## Reproduce Combination Relationships
+
+Use combination testing in Phase 1 to separate Rspack changes from downstream changes.
+
+Start with four conceptual combinations:
+
+```text
+old downstream + old Rspack
+old downstream + new Rspack
+new downstream + old Rspack
+new downstream + new Rspack
+```
+
+Keep the downstream command fixed and use the narrowest failing command possible.
+
+For finer Rspack windows, ask whether to use the canary date bisect tool, then follow [references/canary-date-bisect.md](references/canary-date-bisect.md).
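
Once the four combinations have been run, their outcomes can be turned into a tentative verdict. The sketch below is one possible reading of that matrix, not a rule stated by the skill. The function name `classify_combinations` and the verdict labels are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill). Each argument is "pass" or
# "fail" for one combination, in the order listed above:
#   $1 old downstream + old Rspack
#   $2 old downstream + new Rspack
#   $3 new downstream + old Rspack
#   $4 new downstream + new Rspack
# The verdict labels are an illustrative reading, not the skill's vocabulary.
classify_combinations() {
  local old_ds_old_rs=$1 old_ds_new_rs=$2 new_ds_old_rs=$3 new_ds_new_rs=$4

  if [ "$old_ds_old_rs" = fail ]; then
    echo "predates-window"   # the baseline was never green
  elif [ "$new_ds_new_rs" = pass ]; then
    echo "not-reproduced"    # the current pairing does not fail locally
  elif [ "$old_ds_new_rs" = fail ] && [ "$new_ds_old_rs" = pass ]; then
    echo "rspack"            # failure follows the Rspack change
  elif [ "$old_ds_new_rs" = pass ] && [ "$new_ds_old_rs" = fail ]; then
    echo "downstream"        # failure follows the downstream change
  elif [ "$old_ds_new_rs" = pass ] && [ "$new_ds_old_rs" = pass ]; then
    echo "interaction"       # only the combined change fails
  else
    echo "ambiguous"         # both single changes fail on their own
  fi
}
```

A result such as `classify_combinations pass pass fail fail` printing `downstream` is a pointer toward the downstream history, still subject to the failure-signature comparison in the Phase 1 process.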
+
+### Downstream Interaction Check
+
+If the downstream project changed during the same window, test these pairs when practical:
+
+```text
+old downstream + bad-window Rspack
+old downstream + fixed Rspack
+new downstream + bad-window Rspack
+new downstream + fixed Rspack
+```
+
+This prevents wrongly attributing a downstream dependency/snapshot update to a later unrelated Rspack PR.
+
+## Reporting Requirements
+
+Keep reports compact and evidence-based:
+
+- Name every currently failing suite.
+- Include run URL or run id and tested Rspack commit when available.
+- State whether each suite is newly investigated or reused from a matching known signature.
+- Include the first visible start commit when there is a clear green-to-red pivot.
+- Say when a failure predates the visible window.
+- Include short log snippets only when they directly identify the failure.
+- When surface attribution is misleading, explicitly say the surface PR is likely innocent and explain why.
diff --git a/skills/rstack-eco-ci-debug/references/canary-date-bisect.md b/skills/rstack-eco-ci-debug/references/canary-date-bisect.md
new file mode 100644
index 0000000..55f9422
--- /dev/null
+++ b/skills/rstack-eco-ci-debug/references/canary-date-bisect.md
@@ -0,0 +1,108 @@
+# Canary Date Bisect Tool
+
+Use this tool when eco-ci history or release versions are too coarse to locate the Rspack PR or date that introduced or fixed a suite failure.
+
+The goal is to test downstream code against specific `@rspack-canary/core` versions using `pnpm.overrides`, then binary-search by publish time or commit order.
+
+## Preconditions
+
+- Ask the user for:
+  - The local Rspack checkout path.
+  - The downstream project checkout path.
+  - The narrowest failing command.
+- Work in a clean downstream tree or record existing local changes first.
+- Do not leave dependency overrides or lockfile changes behind unless the user asks.
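
To record existing local changes before touching overrides, it can help to list which override-related files are already dirty. The following is a minimal sketch, assuming the input is the output of `git status --porcelain` (version 1 format, where the path starts at column 4). The function name `dirty_override_files` is hypothetical, and only root-level files are matched.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill): reads `git status --porcelain`
# output on stdin and prints the root-level override-related files that
# already have local changes. Renamed entries are not handled.
dirty_override_files() {
  awk '
    {
      # Porcelain v1 lines are "XY <path>": two status characters, a space,
      # then the path.
      path = substr($0, 4)
    }
    path == "pnpm-workspace.yaml" ||
    path == "package.json" ||
    path == "pnpm-lock.yaml" { print path }
  '
}
```

A possible invocation is `git -C <downstream-path> status --porcelain | dirty_override_files`. Any file it prints should be saved or discussed with the user before the bisect starts.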
+
+## Find Candidate Canaries
+
+Fetch available versions and publish times:
+
+```bash
+npm view @rspack-canary/core versions --json
+npm view @rspack-canary/core time --json
+```
+
+Prefer canaries whose embedded short SHA can be resolved in the local Rspack checkout.
+
+Map a canary short SHA to a full commit:
+
+```bash
+git -C <rspack-path> rev-parse <short-sha>
+git -C <rspack-path> log -1 --pretty=format:'%h %s' <sha>
+```
+
+## Apply a Canary With pnpm.overrides
+
+Prefer editing the workspace root `pnpm-workspace.yaml` when it already owns overrides:
+
+```yaml
+overrides:
+  '@rspack/core': 'npm:@rspack-canary/core@<canary-version>'
+```
+
+If the project uses root `package.json` overrides instead, use that location consistently.
+
+Install and verify:
+
+```bash
+pnpm install
+pnpm why @rspack/core --depth 0
+```
+
+The test is invalid if `pnpm why` does not show the intended canary.
+
+## Binary Search Loop
+
+1. Pick a known-good canary and a known-bad canary.
+2. Sort intermediate canaries by publish time, or by Rspack commit ancestry if publish time is misleading.
+3. Test the midpoint canary.
+4. Record:
+
+```text
+canary version | publish time | Rspack commit | result | signature
+```
+
+5. Continue until the first bad canary or first fixed canary is isolated.
+6. If signatures differ, keep bisecting the signature change, not just pass/fail.
+
+## Map the Date to a PR
+
+Use the isolated commit to find the PR:
+
+```bash
+git -C <rspack-path> show <sha>
+git -C <rspack-path> log --pretty=format:'%h %d %s' -n 20 <sha>
+gh pr view <pr-number> --repo web-infra-dev/rspack
+```
+
+If only a PR head ref contains the commit:
+
+```bash
+git -C <rspack-path> ls-remote origin | rg '<short-sha>|<full-sha>'
+```
+
+Then inspect the PR diff and compare it to the failure signature before calling it the source.
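
The binary search loop described earlier in this reference can be sketched as a small driver. This is an illustration under stated assumptions, not the skill's implementation: `bisect_canaries` is a hypothetical name, the version list is assumed to be pre-sorted with a known-good first entry and a known-bad last entry, and the caller supplies the command that decides whether one canary reproduces the failure signature.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the skill).
# Usage: bisect_canaries <is-bad-command> <known-good> <candidate>... <known-bad>
# <is-bad-command> is called with one canary version and must exit 0 when that
# canary reproduces the failure signature. Versions must already be sorted by
# publish time or commit order.
# Prints "<last-good> <first-bad>".
bisect_canaries() {
  local is_bad=$1
  shift
  local -a versions=("$@")
  local lo=0
  local hi=$(( ${#versions[@]} - 1 ))
  local mid

  # Invariant: versions[lo] is good and versions[hi] is bad.
  while (( hi - lo > 1 )); do
    mid=$(( (lo + hi) / 2 ))
    if "$is_bad" "${versions[mid]}"; then
      hi=$mid
    else
      lo=$mid
    fi
  done

  echo "${versions[lo]} ${versions[hi]}"
}
```

In practice the callback would apply the override for the given version, run `pnpm install`, confirm the version with `pnpm why`, and then run the narrowest failing command. Matching on the signature instead of the bare exit code keeps the search aligned with step 6 of the loop.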
+
+## Restore Downstream State
+
+After testing, remove the temporary override and reinstall if needed:
+
+```bash
+git -C <downstream-path> diff -- pnpm-workspace.yaml package.json pnpm-lock.yaml
+git -C <downstream-path> restore pnpm-workspace.yaml package.json pnpm-lock.yaml
+pnpm install
+pnpm why @rspack/core --depth 0
+```
+
+If files had pre-existing local changes, do not restore them blindly. Ask the user how to proceed.
+
+## Output Format
+
+```text
+First bad canary: <version> (<publish-time>, <sha>)
+Previous good canary: <version> (<publish-time>, <sha>)
+Candidate PR: <pr-number> <title>
+Failure signature: <short signature>
+Confidence: high | medium | low
+Reasoning: <why this is or is not enough to attribute>
+```
diff --git a/skills/rstack-eco-ci-debug/references/deep-pr-debug.md b/skills/rstack-eco-ci-debug/references/deep-pr-debug.md
new file mode 100644
index 0000000..98e9645
--- /dev/null
+++ b/skills/rstack-eco-ci-debug/references/deep-pr-debug.md
@@ -0,0 +1,77 @@
+# Deep PR Debug Tool
+
+Use this tool after a candidate source PR is identified and the user needs the technical reason for the eco-ci failure.
+
+The goal is to connect three things:
+
+- The PR code change.
+- The downstream failure signature.
+- The runtime or build behavior that changed.
+
+## Inputs
+
+Collect these before starting:
+
+- Candidate PR number and commit SHA.
+- Failing suite name.
+- Current failure log URL or saved log.
+- First-bad failure log, if different from current.
+- Local Rspack checkout path.
+- Downstream checkout path, if reproduction or source reading is needed.
+
+## Review the PR
+
+Fetch and inspect the PR in the local Rspack checkout:
+
+```bash
+git -C <rspack-path> fetch origin main --tags
+gh pr view <pr-number> --repo web-infra-dev/rspack --json number,title,author,mergedAt,url,headRefOid,mergeCommit
+git -C <rspack-path> show --stat <sha>
+git -C <rspack-path> show --find-renames --find-copies <sha>
+```
+
+Focus on changed code paths that can affect the failure signature. Ignore unrelated cleanup unless it changes behavior near the failing path.
+
+## Connect Logs to Code
+
+1. Extract the terminal failure block from the log.
+2. Identify the failing command and assertion or stack frame.
+3. Locate the downstream code that produced the assertion or runtime path.
+4. Trace from downstream behavior into Rspack APIs, plugin hooks, generated output, source maps, loaders, or runtime modules.
+5. Match the PR diff to the changed behavior.
+
+Use short log snippets only:
+
+```text
+<command or assertion>
+<2-5 key lines of failure>
+```
+
+## When Diff and Logs Are Not Enough
+
+This tool is for analysis: connect the PR diff to the failure signature through code and logs. If the mechanism still cannot be explained from code review and log inspection alone, do not run canary tests here. Instead, return to Phase 1 and use the canary date bisect tool to gather before/after evidence with a clear test plan.
+
+## Diagnosis Rules
+
+- State what changed mechanically, not only which PR changed.
+- Separate confirmed evidence from inference.
+- Use "likely" when the exact internal transition is inferred from diff plus logs.
+- Do not claim root cause from temporal order alone.
+- If the PR only exposed a downstream fragile assertion, say so.
+- If the downstream suite changed independently, include that interaction.
+
+## Output Format
+
+```text
+Candidate PR: <pr-number> <title>
+Suite: <suite>
+Verdict: caused | likely caused | not caused | inconclusive
+Failure signature: <short signature>
+Mechanism: <3-5 sentence explanation of how the PR change caused the observed behavior>
+Evidence:
+- <log URL or run id>
+- <commit or file references>
+- <reproduction result if available>
+Confidence: high | medium | low
+Next action: <fix in Rspack | fix downstream expectation | gather more evidence>
+```
diff --git a/skills/rstack-eco-ci-debug/references/pr-report-comment.md b/skills/rstack-eco-ci-debug/references/pr-report-comment.md
new file mode 100644
index 0000000..ada6288
--- /dev/null
+++ b/skills/rstack-eco-ci-debug/references/pr-report-comment.md
@@ -0,0 +1,76 @@
+# PR Report Comment Tool
+
+Use this tool to comment on a merged Rspack PR only when the eco-ci failure is strictly attributed to that PR.
+
+## Guardrails
+
+- Do not comment if attribution is ambiguous, only temporal, or based only on a surface green-to-red pivot.
+- Do not comment if a downstream PR, dependency bump, release window, flaky network issue, or changed failure signature is still a plausible cause.
+- Verify the PR is merged before commenting.
+- Ask the user for explicit approval before posting. A draft is safe; an actual GitHub comment is not.
+- Include the marker at the beginning of the comment:
+
+```text
+<agent: daily-job rspack eco-ci>
+```
+
+## Required Evidence Before Commenting
+
+Collect and state these facts first:
+
+- The failing suite name.
+- The eco-ci run URL or run id.
+- The tested Rspack commit.
+- The failure signature from GitHub Actions logs.
+- The first bad commit or PR, with a visible success-to-failure pivot or equivalent canary bisect proof.
+- Why other plausible causes were ruled out.
+
+If any item is missing, do not post. Continue investigation or provide a draft-only note.
+
+## Comment Workflow
+
+1. Check PR metadata:
+
+```bash
+gh pr view <pr-number> --repo web-infra-dev/rspack --json number,title,state,mergedAt,author,url,headRefOid,mergeCommit
+```
+
+2. Refuse to post if `mergedAt` is empty.
+3. Prepare a concise English comment with:
+   - The required marker.
+   - A clear statement that the daily AI eco-ci triage found the failure.
+   - The exact suite that failed.
+   - The evidence link or run id.
+   - A short diagnosis, not a full postmortem.
+4. Ask the user to approve posting.
+5. Post only after approval:
+
+```bash
+gh pr comment <pr-number> --repo web-infra-dev/rspack --body-file <comment-file>
+```
+
+## Comment Template
+
+```md
+<agent: daily-job rspack eco-ci>
+
+Daily AI eco-ci triage found that this PR appears to have caused the `<suite>` suite to fail in `rstack-ecosystem-ci`.
+
+Evidence:
+
+- Eco-ci run: <run-url-or-id>
+- Tested Rspack commit: <sha>
+- Failure signature: <short-log-or-assertion-summary>
+
+This attribution is based on <success-to-failure pivot | canary bisect | matching current and first-bad signatures>. Please take a look when you have time.
+```
+
+If the result is a correction rather than a blame comment, say so explicitly:
+
+```md
+<agent: daily-job rspack eco-ci>
+
+Correction from daily AI eco-ci triage: this PR was initially a surface pivot in the eco-ci data, but deeper investigation does not strictly attribute the failure to this PR.
+
+Actual source appears to be <actual-source-summary>. This PR is likely not responsible for the `<suite>` failure.
+```
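
Step 2 of the comment workflow (refuse to post if `mergedAt` is empty) can be expressed as a small guard. This is a minimal sketch with a hypothetical function name, `is_merged`, and it assumes the `mergedAt` value has already been extracted as plain text.

```bash
#!/usr/bin/env bash
# Hypothetical guard (not part of the skill): succeeds only when the given
# mergedAt value is non-empty. The literal "null" is also rejected because an
# unset JSON field is commonly rendered that way as text.
is_merged() {
  local merged_at=$1
  [ -n "$merged_at" ] && [ "$merged_at" != "null" ]
}
```

One way to obtain the value is `gh pr view <pr-number> --repo web-infra-dev/rspack --json mergedAt --jq '.mergedAt'`. A passing guard covers only the merge check, so the explicit user approval in step 4 is still required before posting.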