Context
From Feature 072 retrospective (PR #469, workflow rating 3/10). Groups retrospective improvement opportunities #1, #2, #3.
Problem
The coding agent committed code 4+ times without running dotnet test, each time triggering CI failures. When it attempted fixes, the agent claimed the tests were fixed without actually running them. Some fixes also introduced regressions (e.g. a template spacing "fix" increased test failures from 5 to 9).
Evidence from PR #469
- Maintainer at 09:33: "some tests failed in PR validation. fix them."
- Maintainer at 11:02: "now there are 9 failing tests instead of just 5"
- Maintainer at 11:18: "I asked you several times to fix all unit tests AND VALIDATE THAT YOU FIXED THEM BY RUNNING ALL TESTS"
- Maintainer at 11:25: "stop claiming that you fixed the tests"
- 10 consecutive CI failures, PR merged with CI still failing
Proposed Changes
1. Mandatory pre-commit test step (Improvement #1)
Update .github/agents/developer-coding-agent.agent.md to require:
- Run scripts/test-with-timeout.sh -- dotnet test --solution src/tfplan2md.slnx --no-build --configuration Release --verbosity normal before every report_progress call
- If tests fail, fix the issue before committing — never commit with known test failures
2. Evidence-based fix claims (Improvement #2)
Update agent instructions to require:
- Include actual test output (pass count, failure count) in PR comments when claiming a fix
- Never claim "tests are fixed" without including the dotnet test output as proof
- Pattern: "Fixed in commit X — test results: 1,007 passed, 0 failed"
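One way to make the claim pattern above hard to misuse is to build the message from the actual counts and refuse to build it when any test still fails. The sketch below assumes the pass and failure counts have already been read from the dotnet test output; format_fix_claim is a hypothetical helper, and it uses a plain hyphen as the separator.

```python
def format_fix_claim(commit, passed, failed):
    """Build a fix claim from real test counts; refuse if any test still fails."""
    if passed < 0 or failed < 0:
        raise ValueError("test counts cannot be negative")
    if failed > 0:
        # With failures remaining there is nothing to claim yet.
        raise ValueError(f"{failed} tests still failing; do not claim a fix")
    return (
        f"Fixed in commit {commit} - test results: "
        f"{passed:,} passed, {failed} failed"
    )


print(format_fix_claim("abc1234", 1007, 0))
# prints: Fixed in commit abc1234 - test results: 1,007 passed, 0 failed
```

The commit id abc1234 is a placeholder. Because the message can only be produced from counts, a claim without a test run has no way to be written.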
3. Regression prevention for template changes (Improvement #3)
Add specific instruction:
- After modifying Scriban templates (.sbn files), ALWAYS run the full test suite
- If test failures increase after a fix, revert the change immediately
- Use scripts/update-test-snapshots.sh when template changes are intentional
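The revert rule above amounts to comparing failure counts before and after a change. A minimal sketch, assuming both counts come from full test runs; next_action and its return values are hypothetical names chosen for illustration.

```python
def next_action(failures_before, failures_after):
    """Decide what to do with a change based on test failure counts."""
    if failures_after > failures_before:
        # The change made things worse: undo it before trying anything else.
        return "revert"
    if failures_after > 0:
        # No regression, but the suite is still red: do not commit yet.
        return "keep-fixing"
    return "commit"


# The PR #469 case: a template "fix" took failures from 5 to 9.
print(next_action(5, 9))  # prints: revert
```

For an intentional template change, the snapshots would be refreshed with scripts/update-test-snapshots.sh before the second count is taken, so that expected differences are not counted as regressions.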
Files to Update
.github/agents/developer-coding-agent.agent.md
.github/copilot-instructions.md (Terminal Command Guidelines section)
Verification
- No CI test failures caused by untested commits
- Every fix claim includes test pass count
- Template changes never increase test failure count