From 24824ed7f5b49bd1a25eb12244d628d998e2c18c Mon Sep 17 00:00:00 2001 From: Ruben Romero Montes Date: Fri, 29 May 2026 14:22:36 +0200 Subject: [PATCH 1/3] feat(verify-pr): add file-type scoping to convention upgrades Add file-type applicability check to style-conventions Check 1a so that convention matches are discarded when the convention's scope does not overlap with the PR's changed files. Update the evidence format in Check 1d to include applicability status. Add eval assertions covering the new behavior. Implements TC-4287 Assisted-by: Claude Code --- evals/verify-pr/evals.json | 2 ++ .../skills/verify-pr/style-conventions.md | 20 ++++++++++++++----- 2 files changed, 17 insertions(+), 5 deletions(-) diff --git a/evals/verify-pr/evals.json b/evals/verify-pr/evals.json index 4f518e20..6f83d6f1 100644 --- a/evals/verify-pr/evals.json +++ b/evals/verify-pr/evals.json @@ -57,6 +57,8 @@ "The report does NOT auto-merge the PR (constraint 1.13)", "The report contains a Test Change Classification row with ADDITIVE — only new test files were added (tests/api/sbom_delete.rs is a new file)", "Convention upgrade eligibility is evaluated for review comment 30002 (index suggestion) — the review classification output (review-30002.md) or the report's Style/Conventions analysis explains whether the suggestion matches a documented or demonstrated project convention", + "Convention upgrade analysis for review comment 30002 includes file-type applicability verification per shared/convention-applicability-rules.md — the analysis checks whether the convention's file-type scope overlaps with the PR's changed files (.rs files including migration files) before upgrading", + "The convention upgrade evidence for a CONVENTIONS.md match includes applicability status (e.g., 'Applicable: yes, file types: ...') indicating that the convention's scope was verified against the PR's changed file types", "Review comment 30002 (index suggestion) results in a sub-task regardless of classification path — whether classified directly as code change request based on reviewer language, or upgraded from suggestion via convention analysis", "Eval Quality is N/A because no eval result reviews exist in the PR — the 3-criteria detection (author github-actions[bot], marker ## Eval Results, footer sdlc-workflow/run-evals) found no matches, so Eval Quality does not affect the Test Quality combination" ] diff --git a/plugins/sdlc-workflow/skills/verify-pr/style-conventions.md b/plugins/sdlc-workflow/skills/verify-pr/style-conventions.md index a42a4cbe..e75d3f01 100644 --- a/plugins/sdlc-workflow/skills/verify-pr/style-conventions.md +++ b/plugins/sdlc-workflow/skills/verify-pr/style-conventions.md @@ -41,6 +41,15 @@ documented conventions that match the suggested practice. For example, if the reviewer suggests adding indexes for foreign key columns, check whether CONVENTIONS.md documents index creation patterns. +After finding a matching convention, verify that the convention's file-type scope +overlaps with the PR's changed files, following the rules in +`shared/convention-applicability-rules.md`. Extract the PR's changed file types +from the diff headers (`--- a/path` and `+++ b/path` lines) in the PR Diff input. +If the convention's file-type scope does not overlap with any of the PR's changed +files, discard the convention match — do not use it for upgrade purposes. +Conventions without explicit file-type scope are treated as broadly applicable +(no filtering). + If CONVENTIONS.md is not provided (empty or absent), skip this sub-step. #### 1b — Check Codebase Patterns @@ -59,11 +68,12 @@ index migrations, caching layers, documented performance conventions). #### 1d — Upgrade Decision -If the suggestion matches a documented convention in CONVENTIONS.md **or** is -demonstrated by consistent codebase usage (multiple instances of the same pattern -in similar files), upgrade the classification from **suggestion** to **code change -request**. Record the evidence: -- `"Matches documented convention: [CONVENTIONS.md section or quote]"` +If the suggestion matches a documented convention in CONVENTIONS.md (and passes +the file-type applicability check from 1a) **or** is demonstrated by consistent +codebase usage (multiple instances of the same pattern in similar files), upgrade +the classification from **suggestion** to **code change request**. Record the +evidence: +- `"Matches documented convention: [CONVENTIONS.md section or quote] — Applicable: [yes, file types: match PR's changed ]"` - `"Matches codebase convention: [N occurrences of pattern in similar files]"` Suggestions that do not match any documented or demonstrated convention remain From 3b0f8d772b95960840d1f251962e77784527aa8f Mon Sep 17 00:00:00 2001 From: Ruben Romero Montes Date: Fri, 29 May 2026 15:34:41 +0200 Subject: [PATCH 2/3] fix(verify-pr): add CONVENTIONS.md fixture to eval 3 for applicability testing MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The previous assertions tested file-type applicability verification but eval 3 had no CONVENTIONS.md fixture, making the assertions untestable. Add a conventions-backend.md fixture with a §Migration Patterns convention targeting .rs migration files, update the eval prompt to reference it, and rewrite the assertions to match the actual scenario. Implements TC-4287 Assisted-by: Claude Code --- evals/verify-pr/evals.json | 8 +++--- evals/verify-pr/files/conventions-backend.md | 29 ++++++++++++++++++++ 2 files changed, 33 insertions(+), 4 deletions(-) create mode 100644 evals/verify-pr/files/conventions-backend.md diff --git a/evals/verify-pr/evals.json b/evals/verify-pr/evals.json index 6f83d6f1..72ac41d1 100644 --- a/evals/verify-pr/evals.json +++ b/evals/verify-pr/evals.json @@ -41,9 +41,9 @@ }, { "id": 3, - "prompt": "Verify PR #744 for task TC-9103. The task description is in task-with-reviews.md, the PR diff is in pr-diff-with-reviews.md, the review comments are in pr-review-comments.md, and the target repository structure is in repo-backend.md. All CI checks pass. Write your verification report to outputs/report.md. For each review comment, write your classification reasoning to outputs/review-N.md (where N is the comment id). For each sub-task that would be created, write its full description to outputs/subtask-N.md following the task-description-template.md format.", + "prompt": "Verify PR #744 for task TC-9103. The task description is in task-with-reviews.md, the PR diff is in pr-diff-with-reviews.md, the review comments are in pr-review-comments.md, the target repository structure is in repo-backend.md, and the repository's CONVENTIONS.md content is in conventions-backend.md. All CI checks pass. Write your verification report to outputs/report.md. For each review comment, write your classification reasoning to outputs/review-N.md (where N is the comment id). For each sub-task that would be created, write its full description to outputs/subtask-N.md following the task-description-template.md format.", "expected_output": "A verification report where acceptance criteria mostly pass, and review feedback processing creates sub-tasks for code change requests. The transaction wrapping comment (id 30001) and index comment (id 30002) should be classified as code change requests with sub-tasks created. The nit (id 30003) and question (id 30004) should be classified but NOT trigger sub-tasks. Sub-task descriptions must follow task-description-template.md and include Target PR and Review Context extension sections.", - "files": ["files/task-with-reviews.md", "files/pr-diff-with-reviews.md", "files/pr-review-comments.md", "files/repo-backend.md"], + "files": ["files/task-with-reviews.md", "files/pr-diff-with-reviews.md", "files/pr-review-comments.md", "files/repo-backend.md", "files/conventions-backend.md"], "assertions": [ "The review comment about transaction wrapping (id 30001) is classified as a code change request and triggers sub-task creation", "The review comment about adding an index (id 30002) is classified as a code change request (or upgraded suggestion) and triggers sub-task creation", @@ -57,8 +57,8 @@ "The report does NOT auto-merge the PR (constraint 1.13)", "The report contains a Test Change Classification row with ADDITIVE — only new test files were added (tests/api/sbom_delete.rs is a new file)", "Convention upgrade eligibility is evaluated for review comment 30002 (index suggestion) — the review classification output (review-30002.md) or the report's Style/Conventions analysis explains whether the suggestion matches a documented or demonstrated project convention", - "Convention upgrade analysis for review comment 30002 includes file-type applicability verification per shared/convention-applicability-rules.md — the analysis checks whether the convention's file-type scope overlaps with the PR's changed files (.rs files including migration files) before upgrading", - "The convention upgrade evidence for a CONVENTIONS.md match includes applicability status (e.g., 'Applicable: yes, file types: ...') indicating that the convention's scope was verified against the PR's changed file types", + "Convention upgrade analysis for review comment 30002 includes file-type applicability verification — the §Migration Patterns convention from conventions-backend.md targets .rs migration files, and the PR includes migration/src/m0042_sbom_soft_delete/mod.rs, so the convention's scope overlaps with the PR's changed files and the match is not discarded", + "The convention upgrade evidence for the §Migration Patterns match includes applicability status confirming that the convention's .rs migration file scope was verified against the PR's changed file types before upgrading the suggestion", "Review comment 30002 (index suggestion) results in a sub-task regardless of classification path — whether classified directly as code change request based on reviewer language, or upgraded from suggestion via convention analysis", "Eval Quality is N/A because no eval result reviews exist in the PR — the 3-criteria detection (author github-actions[bot], marker ## Eval Results, footer sdlc-workflow/run-evals) found no matches, so Eval Quality does not affect the Test Quality combination" ] diff --git a/evals/verify-pr/files/conventions-backend.md b/evals/verify-pr/files/conventions-backend.md new file mode 100644 index 00000000..ac91bd7b --- /dev/null +++ b/evals/verify-pr/files/conventions-backend.md @@ -0,0 +1,29 @@ + + +# Coding Conventions + +## Migration Patterns + +When creating migrations that add foreign key columns, always add a corresponding +index using `Index::create()`. For soft-delete columns (`deleted_at`), add a partial +index filtering on `NULL` values to optimize `WHERE deleted_at IS NULL` queries. + +Example from `migration/src/m0001200_source_document_fk_indexes.rs`: + +```rust +manager.create_index( + Index::create() + .table(MyTable::Table) + .name("idx_my_table_fk_col") + .col(MyTable::FkColumn) + .to_owned(), +).await?; +``` + +## Error Handling + +All handlers return `Result` with `.context()` for error wrapping. + +## Commit Messages + +Use Conventional Commits format: `type(scope): description` From 4fcbfc28855a1759e2cf44318f0dfe26212171c2 Mon Sep 17 00:00:00 2001 From: Ruben Romero Montes Date: Fri, 29 May 2026 16:26:20 +0200 Subject: [PATCH 3/3] fix(verify-pr): revert out-of-scope eval fixture and assertions Remove conventions-backend.md fixture and untestable assertions from eval 3. The existing eval scenarios don't provide CONVENTIONS.md content, so the file-type applicability path in Check 1a is not exercised. Eval coverage for this behavior requires a dedicated eval scenario. Implements TC-4287 Assisted-by: Claude Code --- evals/verify-pr/evals.json | 6 ++-- evals/verify-pr/files/conventions-backend.md | 29 -------------------- 2 files changed, 2 insertions(+), 33 deletions(-) delete mode 100644 evals/verify-pr/files/conventions-backend.md diff --git a/evals/verify-pr/evals.json b/evals/verify-pr/evals.json index 72ac41d1..4f518e20 100644 --- a/evals/verify-pr/evals.json +++ b/evals/verify-pr/evals.json @@ -41,9 +41,9 @@ }, { "id": 3, - "prompt": "Verify PR #744 for task TC-9103. The task description is in task-with-reviews.md, the PR diff is in pr-diff-with-reviews.md, the review comments are in pr-review-comments.md, the target repository structure is in repo-backend.md, and the repository's CONVENTIONS.md content is in conventions-backend.md. All CI checks pass. Write your verification report to outputs/report.md. For each review comment, write your classification reasoning to outputs/review-N.md (where N is the comment id). For each sub-task that would be created, write its full description to outputs/subtask-N.md following the task-description-template.md format.", + "prompt": "Verify PR #744 for task TC-9103. The task description is in task-with-reviews.md, the PR diff is in pr-diff-with-reviews.md, the review comments are in pr-review-comments.md, and the target repository structure is in repo-backend.md. All CI checks pass. Write your verification report to outputs/report.md. For each review comment, write your classification reasoning to outputs/review-N.md (where N is the comment id). For each sub-task that would be created, write its full description to outputs/subtask-N.md following the task-description-template.md format.", "expected_output": "A verification report where acceptance criteria mostly pass, and review feedback processing creates sub-tasks for code change requests. The transaction wrapping comment (id 30001) and index comment (id 30002) should be classified as code change requests with sub-tasks created. The nit (id 30003) and question (id 30004) should be classified but NOT trigger sub-tasks. Sub-task descriptions must follow task-description-template.md and include Target PR and Review Context extension sections.", - "files": ["files/task-with-reviews.md", "files/pr-diff-with-reviews.md", "files/pr-review-comments.md", "files/repo-backend.md", "files/conventions-backend.md"], + "files": ["files/task-with-reviews.md", "files/pr-diff-with-reviews.md", "files/pr-review-comments.md", "files/repo-backend.md"], "assertions": [ "The review comment about transaction wrapping (id 30001) is classified as a code change request and triggers sub-task creation", "The review comment about adding an index (id 30002) is classified as a code change request (or upgraded suggestion) and triggers sub-task creation", @@ -57,8 +57,6 @@ "The report does NOT auto-merge the PR (constraint 1.13)", "The report contains a Test Change Classification row with ADDITIVE — only new test files were added (tests/api/sbom_delete.rs is a new file)", "Convention upgrade eligibility is evaluated for review comment 30002 (index suggestion) — the review classification output (review-30002.md) or the report's Style/Conventions analysis explains whether the suggestion matches a documented or demonstrated project convention", - "Convention upgrade analysis for review comment 30002 includes file-type applicability verification — the §Migration Patterns convention from conventions-backend.md targets .rs migration files, and the PR includes migration/src/m0042_sbom_soft_delete/mod.rs, so the convention's scope overlaps with the PR's changed files and the match is not discarded", - "The convention upgrade evidence for the §Migration Patterns match includes applicability status confirming that the convention's .rs migration file scope was verified against the PR's changed file types before upgrading the suggestion", "Review comment 30002 (index suggestion) results in a sub-task regardless of classification path — whether classified directly as code change request based on reviewer language, or upgraded from suggestion via convention analysis", "Eval Quality is N/A because no eval result reviews exist in the PR — the 3-criteria detection (author github-actions[bot], marker ## Eval Results, footer sdlc-workflow/run-evals) found no matches, so Eval Quality does not affect the Test Quality combination" ] diff --git a/evals/verify-pr/files/conventions-backend.md b/evals/verify-pr/files/conventions-backend.md deleted file mode 100644 index ac91bd7b..00000000 --- a/evals/verify-pr/files/conventions-backend.md +++ /dev/null @@ -1,29 +0,0 @@ - - -# Coding Conventions - -## Migration Patterns - -When creating migrations that add foreign key columns, always add a corresponding -index using `Index::create()`. For soft-delete columns (`deleted_at`), add a partial -index filtering on `NULL` values to optimize `WHERE deleted_at IS NULL` queries. - -Example from `migration/src/m0001200_source_document_fk_indexes.rs`: - -```rust -manager.create_index( - Index::create() - .table(MyTable::Table) - .name("idx_my_table_fk_col") - .col(MyTable::FkColumn) - .to_owned(), -).await?; -``` - -## Error Handling - -All handlers return `Result` with `.context()` for error wrapping. - -## Commit Messages - -Use Conventional Commits format: `type(scope): description`