feat(rstack-eco-ci-debug): add eco ci debugging skill #79
Pull request overview
Adds a new “rstack-eco-ci-debug” skill documenting a signature-first workflow for diagnosing Rspack ecosystem CI failures and avoiding incorrect surface attribution, and exposes it via the repository README skill list.
Changes:
- Added a new `rstack-eco-ci-debug` skill with a structured investigation and attribution workflow.
- Documented canary window bisecting via `@rspack-canary/core` and downstream/Rspack combination testing.
- Added install snippet and short description to the README under “Rspack Skills”.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| skills/rstack-eco-ci-debug/SKILL.md | New skill content documenting the end-to-end eco-ci debugging and attribution workflow. |
| README.md | Adds the new skill to the public install list under Rspack Skills. |
Prefer workspace-level overrides in pnpm workspaces:

```yaml
overrides:
  '@rspack/core': 'npm:@rspack-canary/core@<canary-version>'
```
6. Map the first bad or first fixed canary commit to a PR:

```bash
git -C <rspack-path> show <sha>
git -C <rspack-path> branch -r --contains <sha>
gh pr view <pr-number> --repo web-infra-dev/rspack
```
🧹 Nitpick comments (1)
skills/rstack-eco-ci-debug/SKILL.md (1)
12-24: ⚡ Quick win

Add explicit tool requirements to preconditions.

The workflow steps rely heavily on `git` CLI and GitHub CLI (`gh`) tools (lines 19, 55, 61, 148–150), but these are not listed as preconditions. Following the pattern in `rspack-debugging` (Preparation section), preconditions should explicitly state that these tools must be installed before proceeding.

♻️ Proposed addition to preconditions

 ## Preconditions

+- Ensure the following CLI tools are installed and available in your PATH:
+  - `git` (for repository operations and commit inspection)
+  - `gh` (GitHub CLI, for accessing GitHub Actions logs)
+
 - Require the user to provide the local Rspack checkout path before inspecting Rspack commits or source. Do not assume a machine-specific path.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skills/rstack-eco-ci-debug/SKILL.md` around lines 12 - 24, Add explicit tool requirements to the Preconditions section in the SKILL.md file. The workflow relies on git CLI and GitHub CLI (gh) tools but does not explicitly state these as preconditions. Insert a new bullet point in the Preconditions section listing that git and GitHub CLI (gh) must be installed and available in the user's PATH before proceeding. Follow the pattern used in the rspack-debugging skill's Preparation section for consistency in how tool requirements are documented.

Source: Coding guidelines
📒 Files selected for processing (2)
- README.md
- skills/rstack-eco-ci-debug/SKILL.md
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 16a5b926d9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```yaml
overrides:
  '@rspack/core': 'npm:@rspack-canary/core@<canary-version>'
```
Include peer rules for canary overrides
When the downstream workspace enforces strict peer dependencies, aliasing only @rspack/core to @rspack-canary/core can make pnpm install fail before the reproduction command runs, because canary prerelease versions may not satisfy consumers' @rspack/* peer ranges. Rspack's official canary pnpm example pairs this override with peerDependencyRules.allowAny for @rspack/*; without calling out the equivalent workspace setting here, the canary bisect can misclassify an install-time peer resolution failure as the regression window.
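A minimal sketch of the combined setting this comment describes. It assumes a pnpm version that reads `overrides` and `peerDependencyRules` from `pnpm-workspace.yaml` (older versions keep both under the `pnpm` field of `package.json`). `canary_override_snippet` is a hypothetical helper that only prints the fragment for review; it does not edit any file.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a pnpm-workspace.yaml fragment that aliases
# @rspack/core to a canary build and relaxes peer checks for @rspack/*.
# Assumption: the target pnpm version accepts these keys in pnpm-workspace.yaml.
canary_override_snippet() {
  local version="$1"   # canary version chosen by the caller
  cat <<EOF
overrides:
  '@rspack/core': 'npm:@rspack-canary/core@${version}'
peerDependencyRules:
  allowAny:
    - '@rspack/*'
EOF
}
```

Printing the fragment rather than writing it keeps the edit to the downstream workspace an explicit, reviewable step.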
…, and description
- Make Optional Tools triggers explicit and situation-based
- Add compatibility block and proactively ask for local checkout paths
- Make description pushier to improve skill triggering
- Refactor deep-pr-debug.md into a pure analysis tool; remove canary before/after testing and point back to canary-date-bisect when more evidence is needed
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ad2cc4b89e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
pnpm install
pnpm why @rspack/core --depth 0
Scope pnpm commands to the downstream checkout
When the downstream checkout is not the agent's current working directory, these commands install and verify dependencies in whatever repo the agent happens to be running from, not in the checkout where the temporary override was edited. That can make the bisect record a result for a canary that was never installed in the downstream project; add an explicit cd <downstream-path> or use pnpm's directory flag before running install/why.
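One way to apply this suggestion is sketched here. `run_in` is a hypothetical helper, not part of the skill, and `<downstream-path>` remains a placeholder supplied by the user.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command inside the given directory.
# The cd happens in a subshell, so the caller's working directory is unchanged.
run_in() {
  local dir="$1"
  shift
  ( cd "$dir" && "$@" )
}

# Intended use during a bisect (placeholder path, not executed here):
#   run_in "<downstream-path>" pnpm install
#   run_in "<downstream-path>" pnpm why @rspack/core --depth 0
```

pnpm's own `--dir` (`-C`) option should achieve the same scoping without a helper; the subshell form is shown because it works for any command in the reproduction.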
- Treat GitHub Actions job logs as the source of truth for failure signatures.
- Do not modify project files unless the user explicitly asks for a fix.
Permit approved temporary reproduction edits
This blanket rule conflicts with the canary-bisect workflow: a user can ask for attribution or a canary bisect without asking for a fix, but the bisect requires temporary edits to pnpm-workspace.yaml, package.json, or the lockfile. A cautious agent following this instruction will refuse the main reproduction path unless it misclassifies the investigation as a fix; carve out approved, restored reproduction changes.
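A rough sketch of what an "approved, restored reproduction change" could look like in practice. `with_restored_files` is a hypothetical helper, and it assumes every listed file already exists before the reproduction starts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: snapshot the listed files, run a reproduction command,
# then put the files back whether or not the command succeeded.
# Usage: with_restored_files <file>... -- <command> [args...]
with_restored_files() {
  local backup_dir status i
  local files=()
  backup_dir="$(mktemp -d)" || return 1
  # Collect file names up to the "--" separator.
  while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
    files+=("$1")
    shift
  done
  if [ "$#" -gt 0 ]; then shift; fi   # drop the "--" separator
  for i in "${!files[@]}"; do
    cp -p "${files[$i]}" "$backup_dir/$i" || return 1
  done
  "$@"
  status=$?
  # Restore the snapshots and report the command's own exit status.
  for i in "${!files[@]}"; do
    cp -p "$backup_dir/$i" "${files[$i]}"
  done
  rm -rf "$backup_dir"
  return "$status"
}
```

In a git checkout, restoring the touched files from the index after the run would be an alternative; the copy-based form is used here so the sketch does not depend on repository state.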
…re workspace outputs
- skills-test/rstack-eco-ci-debug/evals/evals.json: 3 real eco-ci failure cases
- skills-test/rstack-eco-ci-debug/report.md: iteration-1 benchmark summary
- .gitignore: ignore .claude/skills and skills-test workspace outputs
- skills.json: link .claude/skills target
Actionable comments posted: 1
📒 Files selected for processing (4)
- .gitignore
- skills-test/rstack-eco-ci-debug/evals/evals.json
- skills-test/rstack-eco-ci-debug/report.md
- skills.json
✅ Files skipped from review due to trivial changes (2)
- skills.json
- skills-test/rstack-eco-ci-debug/report.md
    {
      "id": 1,
      "eval_name": "plugin-suite-empty-lines",
      "prompt": "The plugin suite is failing in rstack-ecosystem-ci and the status data shows it turned red around Rspack PR #14254. Can you help me figure out whether that PR is really the cause? My local Rspack checkout is at /Users/bytedance/Documents/codes/rspack.",
Replace hardcoded local checkout paths with a neutral placeholder.
Using /Users/bytedance/Documents/codes/rspack in committed eval prompts leaks local environment details and makes fixtures less portable. Prefer a placeholder (for example, <RSPACK_CHECKOUT_PATH>) and keep machine-specific values in local runtime inputs.
Suggested change
- "prompt": "... My local Rspack checkout is at /Users/bytedance/Documents/codes/rspack.",
+ "prompt": "... My local Rspack checkout is at <RSPACK_CHECKOUT_PATH>.",

Also applies to: 20-20, 33-33
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@skills-test/rstack-eco-ci-debug/evals/evals.json` at line 7, Replace the
hardcoded local machine path /Users/bytedance/Documents/codes/rspack with a
neutral placeholder such as <RSPACK_CHECKOUT_PATH> in the evals.json file. This
change should be applied to all occurrences of the hardcoded path (at lines 7,
20, and 33) to remove machine-specific environment details and make the test
fixtures portable across different environments without requiring local
environment-specific values.
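A small check along these lines could catch such paths before they are committed. This is only a sketch: `find_local_paths` is a hypothetical helper, and the pattern covers just the common `/Users/<name>` and `/home/<name>` layouts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print numbered lines that contain an absolute
# home-directory path. Exit status is 0 when at least one match is found,
# following grep's convention.
find_local_paths() {
  grep -nE '(/Users/|/home/)[^/"[:space:]]+' "$@"
}

# Intended use (not executed here):
#   find_local_paths skills-test/rstack-eco-ci-debug/evals/evals.json
```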
Three hard-stop checks before deep pivot analysis:
1. Same Rspack commit with green+red outcome → downstream cause
2. Surface PR diff unrelated to signature → exonerate
3. Failure signature maps to surface PR files → likely actual source
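The first hard-stop check can be sketched as a small filter. The input format (one `<rspack-sha> <outcome>` pair per line, with outcome `green` or `red`) is an assumption made for illustration and is not the eco-ci status data schema; `mixed_outcome_shas` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read "<rspack-sha> <outcome>" pairs on stdin and print
# every sha that appears with both a green and a red outcome. Any sha printed
# here suggests the failure is not explained by the Rspack commit alone.
mixed_outcome_shas() {
  awk '
    $2 == "green" { green[$1] = 1 }
    $2 == "red"   { red[$1] = 1 }
    END { for (sha in green) if (sha in red) print sha }
  ' | sort
}
```

An empty result means the check does not fire and the remaining two checks still need to be evaluated.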
Canary date bisect: only when Rspack window >3 commits or crosses release boundary, downstream cause ruled out, and signature is stable.
Deep PR debug: only when user asks for mechanism, actual source is a Rspack PR/version, and Phase 1 evidence exists.
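The canary date bisect condition reads naturally as a predicate. The sketch below is one possible encoding; the function name and the `yes`/`no` flag convention are assumptions, not part of the skill.

```bash
#!/usr/bin/env bash
# Hypothetical predicate for the canary date bisect trigger.
# Usage: should_bisect <commit-count> <crosses-boundary> <downstream-ruled-out> <signature-stable>
# The last three arguments are "yes" or "no". Returns 0 when a bisect is warranted.
should_bisect() {
  local commits="$1" crosses="$2" downstream_ruled_out="$3" stable="$4"
  # Both preconditions must hold regardless of window size.
  [ "$downstream_ruled_out" = "yes" ] || return 1
  [ "$stable" = "yes" ] || return 1
  # The window is either wider than 3 commits or crosses a release boundary.
  [ "$commits" -gt 3 ] || [ "$crosses" = "yes" ]
}
```

For example, `should_bisect 5 no yes yes` succeeds, while a single-commit window that stays inside one release does not.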
🧹 Nitpick comments (1)
skills/rstack-eco-ci-debug/SKILL.md (1)
75-95: ⚡ Quick win

Soften the fast-exit checks to keep them provisional.
The current wording makes Phase 1 sound definitive. Same-SHA green/red or diff/signature overlap can still come from flakiness or environment drift, so the output should stay "likely" until one corroborating log or combo test confirms it.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skills/rstack-eco-ci-debug/SKILL.md` around lines 75 - 95, Soften the language throughout the fast-exit checks section to indicate that these are preliminary findings rather than definitive conclusions. In the "Same Rspack commit, different outcome" check, replace definitive statements (e.g., "the cause is **not**") with provisional language like "is likely not" or "suggests." In the "Surface PR diff is unrelated to the failure signature" check, change "treat the surface PR as innocent" to "provisionally treat" or "suggests the surface PR is likely innocent." In the "Failure signature directly maps to surface PR changed files" check, replace "the surface PR is likely the actual source" with "the surface PR is a likely candidate pending confirmation." Update all three output examples to include qualifiers such as "likely," "suggests," or "probable" instead of stating conclusions as certain, and add a note that these findings require corroboration with logs or combo tests before final confirmation.
📒 Files selected for processing (1)
skills/rstack-eco-ci-debug/SKILL.md
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
skills/rstack-eco-ci-debug/SKILL.md (2)
49-71: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clarify human approval gates for optional tools to address autonomous agency concerns.
The static analysis tool flagged lines 51 and 71 for potential autonomous decision-making (EA2). While line 71 explicitly states "do not post without approval" and line 21 gates file modifications, the guidance at lines 53–58 (canary-date-bisect) and 60–65 (deep-pr-debug) describes automatic trigger conditions that could enable tool invocation without explicit user confirmation at invocation time.
Strengthen the human-in-the-loop guardrails by explicitly stating that even when trigger conditions are met, the skill must request user approval before executing high-impact tools (e.g., "Trigger conditions met. Proceed with canary-date-bisect? [Y/n]").
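For illustration only, the suggested gate could look like the following in a shell context. In the skill itself the confirmation would be a conversational question to the user; `confirm` is a hypothetical helper, and treating empty input as approval follows the `[Y/n]` convention quoted above rather than any stated requirement.

```bash
#!/usr/bin/env bash
# Hypothetical helper: ask for approval and return 0 only when it is given.
# Empty input counts as "yes" to match the [Y/n] convention; end of input
# without any answer counts as "no".
confirm() {
  local reply=""
  printf '%s [Y/n] ' "$1" >&2
  IFS= read -r reply || [ -n "$reply" ] || return 1
  case "$reply" in
    ""|[Yy]|[Yy][Ee][Ss]) return 0 ;;
    *) return 1 ;;
  esac
}

# Intended use (not executed here):
#   confirm "Trigger conditions met. Proceed with canary-date-bisect?" && echo "approved"
```

A stricter variant could default to "no" on empty input, which may suit high-impact tools better.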
Proposed clarification for canary-date-bisect tool section

 - **Canary date bisect** — use in Phase 1 only when the Rspack commit window is too coarse to attribute a PR and downstream causes have already been ruled out. Trigger this when **all** of the following are true:
   - The green-to-red pivot spans **more than 3 Rspack commits** or crosses a release/canary boundary.
   - The failure signature is stable across the red rows in that window.
   - The same Rspack commit does **not** appear in both green and red runs (which would indicate a downstream cause).
   Do **not** trigger when the pivot is a single commit or when the surface PR diff already explains the signature.
-  Read [references/canary-date-bisect.md](references/canary-date-bisect.md) and ask the user for the local downstream checkout path and the narrowest failing command.
+  When trigger conditions are met, inform the user that canary-date-bisect is recommended and request approval before proceeding. Read [references/canary-date-bisect.md](references/canary-date-bisect.md), then ask the user for the local downstream checkout path and the narrowest failing command.

Proposed clarification for deep-pr-debug tool section

 - **Deep PR debug** — use in Phase 2 only after a specific Rspack source PR or version window has been identified and the user wants the technical reason behind the failure. Trigger this when **all** of the following are true:
   - The user asks "why did this PR break it", "what is the mechanism", or "how should we fix it".
   - The actual source is a Rspack PR or Rspack version window (not a downstream test change, snapshot update, or dependency bump).
   - Phase 1 has already produced evidence linking the PR to the failure signature.
   Do **not** run deep PR debug on downstream PRs; in those cases, Phase 1 output plus a short note about the downstream change is enough.
-  Read [references/deep-pr-debug.md](references/deep-pr-debug.md) automatically once a candidate PR is accepted for deep inspection.
+  Read [references/deep-pr-debug.md](references/deep-pr-debug.md) and confirm with the user before proceeding with deep inspection.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skills/rstack-eco-ci-debug/SKILL.md` around lines 49 - 71, The canary-date-bisect and deep-pr-debug tool sections describe automatic trigger conditions but lack explicit user approval gates before execution. Add a statement to each tool section (canary-date-bisect around line 53 and deep-pr-debug around line 60) that requires the skill to request explicit user confirmation before invoking the tool, even when all trigger conditions are met. For example, clarify that after detecting trigger conditions are met, the skill should ask the user for approval (e.g., "Trigger conditions met. Proceed with [tool name]? [Y/n]") before executing, similar to the existing approval gate in the PR report comment section at line 71.

Source: Linters/SAST tools
10-220: ⚠️ Potential issue | 🟠 Major

Add missing "Use Cases" and "Code Examples" sections to meet documentation requirements.
The SKILL.md file lacks two required sections per coding guidelines: "Use Cases" and "Code Examples". The file currently includes YAML front-matter, Workflow (detailed Two-Phase Debug Workflow), and Reference Documentation guidance (Optional Tools section with tool-specific references), but is missing documented use cases and code examples. Add these sections to complete the required documentation structure.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@skills/rstack-eco-ci-debug/SKILL.md` around lines 10 - 220, The SKILL.md file is missing two required documentation sections: "Use Cases" and "Code Examples". Add a "Use Cases" section after the "Reporting Requirements" section that describes typical scenarios and situations when this Rstack Eco CI Debug skill should be applied (such as when a suite suddenly turns red, when investigating canary failures, or when surface attribution is misleading). Add a "Code Examples" section immediately after "Use Cases" that provides practical command examples and workflow demonstrations showing how to actually invoke the tools and processes described in the Two-Phase Debug Workflow, Canary Date Bisect, Deep PR Debug, and other referenced tools.

Source: Coding guidelines
📒 Files selected for processing (1)
skills/rstack-eco-ci-debug/SKILL.md
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 94510be38a
      "prompt": "The plugin suite is failing in rstack-ecosystem-ci and the status data shows it turned red around Rspack PR #14254. Can you help me figure out whether that PR is really the cause? My local Rspack checkout is at /Users/bytedance/Documents/codes/rspack.",
      "expected_output": "The skill identifies PR #14254 as the actual source of the plugin suite failure, explains that the failure signature involves extra empty lines being added to the error stack, and provides supporting evidence such as run URLs, log snippets, or commit inspection.",
      "files": [],
      "expectations": [
Rename expectations to assertions for grading
The new eval criteria are stored under expectations, but the repo's eval definitions consistently use assertions as the grader rubric key (checked the other skills-test/*/evals/evals.json files). With this key, the automated evaluator will not see these four checks, so the reported 12/12 pass rate cannot be reproduced from the committed eval file; rename this field in all three new eval cases to assertions.
Summary
This PR adds a skill for debugging Rstack ecosystem CI failures where the visible Rspack pivot may not be the real source. It documents a signature-first workflow for comparing eco-ci runs, checking downstream changes, and bisecting @rspack-canary/core windows, and adds the skill to the README install list.