docs: add Automated QA Testing page#529
Conversation
Add qa-changes.mdx with full QA agent documentation: - Four-phase methodology (Understand → Setup → Exercise → Report) - GitHub Actions and in-conversation quick start - QA report format with examples - Customization and repo-specific QA guidelines - Integration with the Verification Stack Also adds nav entry in docs.json and overview card. Co-authored-by: openhands <openhands@all-hands.dev>
- Replace internal verification-stack links with external URLs (blog post and extensions repo) since that page hasn't landed on main yet - Run sync_use_case_automations.py to update automations/overview.mdx Co-authored-by: openhands <openhands@all-hands.dev>
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable — well-structured documentation with one broken anchor that should be fixed before merge.
Critical Issue
Broken anchor link: automations/overview.mdx links to qa-changes#automate-this, but qa-changes.mdx has no ## Automate This section. Every other use-case page linked from the automations overview (code-review, dependency-upgrades, incident-triage, vulnerability-remediation) has this section, and the overview cards are all wired to scroll to #automate-this. Without it, the anchor silently falls back to the top of the page instead of the automation setup content.
Fix: Add a ## Automate This section to qa-changes.mdx following the same pattern as code-review.mdx (line 368) — a brief description linking to OpenHands Automations and a copy-paste prompt for setting up the QA automation. For example:
## Automate This
You can run QA automatically on every PR using [OpenHands Automations](/openhands/usage/automations/overview).
Copy this prompt into a new conversation to set one up:
Create an automation called "Automated QA" that triggers when a PR is labeled "qa-this".
It should:
- Check out the PR branch
- Run the QA agent to exercise the changed behavior as a real user would
- Post a structured QA report as a PR comment with evidence (commands run, outputs, screenshots)
Learn more at https://docs.openhands.dev/openhands/usage/use-cases/qa-changes
For automated QA on every push, use the [qa-changes plugin](https://github.com/OpenHands/extensions/tree/main/plugins/qa-changes) as a GitHub Action instead.
Everything Else
The documentation quality is solid:
- The four-phase methodology (Understand → Setup → Exercise → Report) is clearly explained and practically useful
- The "What It Does (and Doesn't)" section sets accurate expectations — distinguishing QA from CI and code review is exactly the right framing
- The structured QA report format with evidence is a good concrete example
- The escape-hatch logic ("give up after 3 failed attempts, switch strategy; give up after 2 fundamentally different strategies fail") is important to document; it's good that it's called out explicitly
- Nav placement in
docs.json(between code-review and incident-triage) is appropriate - The custom QA guide mechanism via
.agents/skills/custom-qa-guide.mdis a clean extensibility hook
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW — Documentation-only change. The broken anchor is a UX issue (bad scroll target) but does not break any functionality or data.
VERDICT:
✅ Worth merging once the missing ## Automate This section is added to qa-changes.mdx. The content is high quality and the fix is small.
KEY INSIGHT:
The automations/overview.mdx cards assume a #automate-this anchor exists in every linked use-case page — qa-changes.mdx is the only one currently missing it.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing. See the customization docs for the required frontmatter format.- Re-request a review — the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
- When your PR is merged, the guideline file goes through normal code review by repository maintainers.
Resolve with AI? Install the iterate skill in your agent and run
/iterateto automatically drive this PR through CI, review, and QA until it's merge-ready.Was this review helpful? React with 👍 or 👎 to give feedback.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
| <Card | ||
| title="Automated QA Testing" | ||
| icon="vial" | ||
| href="/openhands/usage/use-cases/qa-changes#automate-this" |
There was a problem hiding this comment.
🔴 Critical: This links to #automate-this but qa-changes.mdx has no ## Automate This section. Every other use-case page linked from this automations overview (code-review, dependency-upgrades, incident-triage, vulnerability-remediation) has this section. Without it, the anchor silently falls back to the top of the page instead of scrolling to the automation setup content.
Fix: Add a ## Automate This section to qa-changes.mdx following the pattern from code-review.mdx line 368.
Addresses review feedback — automations/overview.mdx links to qa-changes#automate-this but the anchor didn't exist. Every other use-case page has this section; now qa-changes does too. Co-authored-by: openhands <openhands@all-hands.dev>
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Well-structured documentation for the Automated QA Testing use case. The four-phase methodology (Understand → Setup → Exercise → Report) is clearly explained, the MDX components are used consistently, and cross-links to the code review page and Verification Stack are helpful context. A few items worth addressing inline.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: Run QA Changes | ||
| uses: OpenHands/extensions/plugins/qa-changes@main |
There was a problem hiding this comment.
🟠 Important: pin the Action to a stable ref, not @main.
Using @main means any commit pushed to the extensions repo will silently change the behavior of every user who follows this quickstart. Docs that ship example workflows should reference a specific tag or commit SHA so users get reproducible builds.
| uses: OpenHands/extensions/plugins/qa-changes@main | |
| uses: OpenHands/extensions/plugins/qa-changes@v1 |
(Update v1 to whatever the current stable tag is. If no release tag exists yet, use the current SHA instead.)
| - name: Run QA Changes | ||
| uses: OpenHands/extensions/plugins/qa-changes@main | ||
| with: | ||
| llm-model: anthropic/claude-sonnet-4-5-20250929 |
There was a problem hiding this comment.
🟡 Suggestion: verify this model ID is valid before shipping.
anthropic/claude-sonnet-4-5-20250929 doesn't match any currently-documented Anthropic model ID format (claude-3-5-sonnet-20241022, claude-3-opus-20240229, etc.). If this is a future/speculative model name, users who copy-paste the snippet will get an API error. Consider:
- Using a currently-valid model ID (e.g.
anthropic/claude-3-5-sonnet-20241022), or - Adding a note like
# Replace with your preferred modelso it's clear this is a placeholder.
| 3. **Exercise** — The core phase: spins up servers, opens browsers, runs CLI commands, makes HTTP requests — testing the changed behavior as a real user would. For bug fixes, it reproduces the bug on the base branch and verifies the fix on the PR branch. | ||
| 4. **Report** — Posts a structured QA report as a PR comment, with evidence (commands run, outputs, screenshots) and a verdict (PASS / FAIL / PARTIAL). | ||
|
|
||
| The QA agent knows when to give up: if an approach fails after three materially different attempts, it switches strategy. If two fundamentally different strategies fail, it reports what it tried and stops — rather than spinning endlessly. |
There was a problem hiding this comment.
🟡 Suggestion: avoid hard-coding implementation-specific retry counts in user-facing docs.
"three materially different attempts" and "two fundamentally different strategies" are specific numbers that will become incorrect if the underlying skill changes its retry logic without a corresponding docs update. Consider softer language:
The QA agent knows when to give up: after exhausting multiple approaches without progress, it reports what it tried and stops — rather than spinning endlessly.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable — well-structured documentation with two minor improvements worth addressing before merge.
Summary
The PR adds a comprehensive qa-changes.mdx page documenting the four-phase QA methodology, a GitHub Actions quick-start, an in-conversation skill usage guide, customization instructions, and integration context. Nav entries in docs.json, automations/overview.mdx, and use-cases/overview.mdx are all correct. The ## Automate This section is present, so the anchor #automate-this targeted by automations/overview.mdx resolves correctly.
Issues
1. Pinned model name looks stale / potentially non-existent
openhands/usage/use-cases/qa-changes.mdx, line 80
llm-model: anthropic/claude-sonnet-4-5-20250929The model identifier claude-sonnet-4-5-20250929 doesn't follow Anthropic's documented naming pattern (claude-{major}-{minor}-{name}-{date}, e.g. claude-3-5-sonnet-20241022). Users who copy this verbatim will either get an invalid-model error or silently fall back to a default — neither of which is a good first-experience outcome. Options:
- Use a real, verified model identifier (e.g.
anthropic/claude-3-7-sonnet-20250219or whatever is the current recommended default at the time of publish). - Or use a generic placeholder comment like
# any supported model, e.g. anthropic/claude-3-5-sonnet-20241022so it's obvious it needs to be filled in.
2. "In a Conversation" section doesn't tell users how to get the skill
openhands/usage/use-cases/qa-changes.mdx, line 89
"You can also trigger QA manually in any OpenHands conversation by invoking the skill:
/qa-changes"
New users will try /qa-changes and get a "skill not found" error. There's no instruction explaining they need to add the skill first (e.g., via /add-skill https://github.com/OpenHands/extensions/... or the equivalent). Even a single sentence — "First, add the skill to your conversation with /add-skill ..." — would prevent frustration. The Related Resources section at the bottom links to the skill but doesn't explain the install step.
Everything Else
- The four-phase methodology (Understand → Setup → Exercise → Report) is clearly articulated and the distinction from CI and code review is exactly right.
- The "give-up" escape-hatch logic (3 failed attempts → switch strategy; 2 fundamentally different strategies fail → stop and report) is important to document and is called out well.
- The change-type table (Frontend/UI → browser, CLI → commands, API → HTTP, Bug fix → before/after, Library → short script) is practical and concrete.
- The custom QA guide via
.agents/skills/custom-qa-guide.mdis a clean extensibility hook that follows established convention. - Nav placement and anchor links are all correct.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW — Documentation-only change. No functional code is modified. The model name issue is a UX concern (bad copy-paste outcome) but causes no harm beyond user confusion.
VERDICT:
✅ Worth merging — consider fixing the model name and adding the skill installation step first; both are one-line changes.
KEY INSIGHT:
The skill invocation section implicitly assumes users already have the qa-changes skill loaded — documenting the install step would make this a complete, self-contained guide.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing. See the customization docs for the required frontmatter format.- Re-request a review — the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
- When your PR is merged, the guideline file goes through normal code review by repository maintainers.
Resolve with AI? Install the iterate skill in your agent and run
/iterateto automatically drive this PR through CI, review, and QA until it's merge-ready.Was this review helpful? React with 👍 or 👎 to give feedback.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
| - name: Run QA Changes | ||
| uses: OpenHands/extensions/plugins/qa-changes@main | ||
| with: | ||
| llm-model: anthropic/claude-sonnet-4-5-20250929 |
There was a problem hiding this comment.
Model name may be invalid or stale. anthropic/claude-sonnet-4-5-20250929 doesn't match Anthropic's documented naming scheme (claude-{major}-{minor}-{name}-{date}). Users who copy this verbatim will hit an invalid-model error. Use a verified model identifier or a clearly-labelled placeholder.
|
|
||
| ### In a Conversation | ||
|
|
||
| You can also trigger QA manually in any OpenHands conversation by invoking the skill: |
There was a problem hiding this comment.
Missing prerequisite step. Users need to add the qa-changes skill before /qa-changes will work. Consider adding something like: "First, add the skill to your workspace with /add-skill https://github.com/OpenHands/extensions/tree/main/skills/qa-changes" before the invocation example. Without it, users will hit a skill-not-found error with no guidance on how to fix it.
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable — prior review issues largely addressed; one unresolved item remains before merge.
Current State vs. Prior Reviews
Three reviews were posted earlier today. The current PR head has addressed most of the issues raised:
- ✅
## Automate Thissection added — the broken#automate-thisanchor fromautomations/overview.mdxnow resolves correctly - ✅ Action pinned to
@v1— no longer using the unpinned@mainref - ✅ Model name updated — changed from the non-existent
claude-sonnet-4-5-20250929toanthropic/claude-sonnet-4-20250514 - ✅ Retry count language generalized — specific attempt counts replaced with "multiple approaches"
- ❌ Missing skill install step — still unresolved (see inline comment)
Only Remaining Issue
The ### In a Conversation section tells users to run /qa-changes but does not explain how to load the skill first. A user who follows the quick-start verbatim will get a "skill not found" error. This was flagged in the earlier inline comment at line 89 and has not yet been addressed.
Suggested one-line fix — add before the invocation example:
First, add the skill to your workspace:
/add-skill https://github.com/OpenHands/extensions/tree/main/skills/qa-changes
Everything Else
The documentation is solid:
- Four-phase methodology (Understand → Setup → Exercise → Report) is clearly explained
- The "does / does not" framing correctly distinguishes QA from CI and code review
- The change-type table is concrete and practical
- The custom QA guide via
.agents/skills/custom-qa-guide.mdfollows established convention - All nav entries and anchor links are correct in the current version
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟢 LOW — Documentation-only change. The missing install step is a UX issue that causes user confusion, not a system failure.
VERDICT:
✅ Worth merging — fix the one-line skill install step first.
KEY INSIGHT:
The quick-start is otherwise complete and well-structured; adding the install step closes the last gap between reading this page and successfully running /qa-changes.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing. See the customization docs for the required frontmatter format.- Re-request a review — the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
- When your PR is merged, the guideline file goes through normal code review by repository maintainers.
Resolve with AI? Install the iterate skill in your agent and run
/iterateto automatically drive this PR through CI, review, and QA until it is merge-ready.Was this review helpful? React with 👍 or 👎 to give feedback.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
|
|
||
| ### In a Conversation | ||
|
|
||
| You can also trigger QA manually in any OpenHands conversation by invoking the skill: |
There was a problem hiding this comment.
Missing prerequisite step (still unresolved from prior review).
Users need to add the qa-changes skill before /qa-changes will work. Without it they get a "skill not found" error the first time they follow this guide.
Add one sentence before the invocation example:
First, add the skill to your workspace:
/add-skill https://github.com/OpenHands/extensions/tree/main/skills/qa-changes
This was flagged in the earlier review inline comment on this line and has not yet been addressed.
Users need to install the skill first via /add-skill before running /qa-changes in a conversation. Without this, they get a skill not found error. Co-authored-by: openhands <openhands@all-hands.dev>
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
Review
This PR adds a new qa-changes.mdx documentation page for the Automated QA Testing use case, along with nav entries in docs.json, automations/overview.mdx, and use-cases/overview.mdx.
Overall Assessment
✅ Ready to merge. All issues flagged in previous reviews are addressed in the current version. The content is high quality, well-structured, and consistent with the rest of the documentation.
Resolution of Prior Issues
All issues flagged across the previous three reviews are resolved in the current PR head:
| Issue | Status |
|---|---|
Missing ## Automate This section (broken anchor from automations/overview.mdx) |
✅ Section present, anchor resolves correctly |
GitHub Action pinned to @main (unpinned, unsafe) |
✅ Now pinned to @v1 |
Invalid model identifier claude-sonnet-4-5-20250929 |
✅ Updated to anthropic/claude-sonnet-4-20250514 |
| Specific retry attempt counts that would go stale | ✅ Generalized to "multiple approaches" |
Missing /add-skill prerequisite step in "In a Conversation" section |
✅ Already present in the current code — the section opens with "First, install the skill:" followed by the /add-skill command. The most recent automated review incorrectly flagged this as still unresolved. |
Fresh Independent Review
docs.json — Nav entry placed correctly between code-review and incident-triage. ✅
automations/overview.mdx — New card uses the correct icon (vial), correct href (/openhands/usage/use-cases/qa-changes#automate-this), and the target anchor now resolves correctly. ✅
use-cases/overview.mdx — Card placement (after code-review, before incident-triage) is consistent with the nav order in docs.json. ✅
qa-changes.mdx — Key observations:
- Four-phase methodology: Clearly articulated. Generalized escape-hatch language ("after exhausting multiple approaches") is the right call — ties the docs to intent rather than to implementation-specific numbers that will drift. ✅
- "Does / Does Not" framing: Correctly distinguishes QA from CI (test suite) and code review (diff analysis). This is exactly the right mental model to establish upfront. ✅
- GitHub Actions quick-start: The
if:condition correctly gates ongithub.event.label.name == 'qa-this'for labeled events, so the job does not fire on every label addition — only the intended one. Model identifier and action ref (@v1) are valid. ✅ - "In a Conversation" section:
/add-skillprerequisite IS present. The—separator for passing inline context to skill invocations follows the established OpenHands convention. ✅ - QA Report format: The structured example (Status, Changes Tested, Evidence, Edge Cases) gives users a concrete expectation of output quality. ✅
- Change-type table: Covers all five key archetypes (Frontend/UI, CLI, API/Backend, Bug fix, Library/SDK) with concrete testing approaches for each. ✅
- Custom QA guide: The
.agents/skills/custom-qa-guide.mdextensibility hook follows the established skill convention used across this repository. ✅ ## Automate This: The copy-paste automation prompt references the correct plugin source (github:OpenHands/extensions) and includes thelearn morelink. ✅- Troubleshooting accordion: Addresses the three most likely first-run failure modes (can't start server, PARTIAL verdict, timeout). ✅
- External link validation: Both
OpenHands/extensions/tree/main/plugins/qa-changesand.../skills/qa-changeswere verified to exist at review time. ✅
No New Issues Found
This is a documentation-only change. No functional code is modified. Risk: 🟢 LOW.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
OpenHands/extensions does not publish versioned tags. Reverts the action reference to @main and adds a note to AGENTS.md so future agents and reviewers don't repeat this mistake. Co-authored-by: openhands <openhands@all-hands.dev>
The extensions README documents the skill file as .agents/skills/qa-guide.md with name: qa-guide, not custom-qa-guide.md. Align docs to match the actual upstream example. Co-authored-by: openhands <openhands@all-hands.dev>
Split out from #501 for independent review and merging.
Adds
qa-changes.mdxwith full QA agent documentation:Also adds nav entry in
docs.jsonand overview card inoverview.mdx.This PR was created by an AI agent (OpenHands) on behalf of the user.