docs: add Automated QA Testing page#529

Open
xingyaoww wants to merge 8 commits into main from add-qa-changes-docs

@xingyaoww
Member

Split out from #501 for independent review and merging.

Adds qa-changes.mdx with full QA agent documentation:

  • Four-phase methodology (Understand → Setup → Exercise → Report)
  • GitHub Actions and in-conversation quick start
  • QA report format with examples
  • Customization and repo-specific QA guidelines
  • Integration with the Verification Stack

Also adds nav entry in docs.json and overview card in overview.mdx.

This PR was created by an AI agent (OpenHands) on behalf of the user.

Add qa-changes.mdx with full QA agent documentation:
- Four-phase methodology (Understand → Setup → Exercise → Report)
- GitHub Actions and in-conversation quick start
- QA report format with examples
- Customization and repo-specific QA guidelines
- Integration with the Verification Stack

Also adds nav entry in docs.json and overview card.

Co-authored-by: openhands <openhands@all-hands.dev>
@xingyaoww xingyaoww requested a review from mamoodi as a code owner May 27, 2026 17:02
- Replace internal verification-stack links with external URLs (blog
  post and extensions repo) since that page hasn't landed on main yet
- Run sync_use_case_automations.py to update automations/overview.mdx

Co-authored-by: openhands <openhands@all-hands.dev>
Contributor

all-hands-bot commented May 27, 2026

Review complete.

This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here.

Contributor

@all-hands-bot all-hands-bot left a comment

🟡 Acceptable — well-structured documentation with one broken anchor that should be fixed before merge.

Critical Issue

Broken anchor link: automations/overview.mdx links to qa-changes#automate-this, but qa-changes.mdx has no ## Automate This section. Every other use-case page linked from the automations overview (code-review, dependency-upgrades, incident-triage, vulnerability-remediation) has this section, and the overview cards are all wired to scroll to #automate-this. Without it, the anchor silently falls back to the top of the page instead of the automation setup content.

Fix: Add a ## Automate This section to qa-changes.mdx following the same pattern as code-review.mdx (line 368) — a brief description linking to OpenHands Automations and a copy-paste prompt for setting up the QA automation. For example:

## Automate This

You can run QA automatically on every PR using [OpenHands Automations](/openhands/usage/automations/overview).
Copy this prompt into a new conversation to set one up:

Create an automation called "Automated QA" that triggers when a PR is labeled "qa-this".

It should:

  1. Check out the PR branch
  2. Run the QA agent to exercise the changed behavior as a real user would
  3. Post a structured QA report as a PR comment with evidence (commands run, outputs, screenshots)

Learn more at https://docs.openhands.dev/openhands/usage/use-cases/qa-changes


For automated QA on every push, use the [qa-changes plugin](https://github.com/OpenHands/extensions/tree/main/plugins/qa-changes) as a GitHub Action instead.

Everything Else

The documentation quality is solid:

  • The four-phase methodology (Understand → Setup → Exercise → Report) is clearly explained and practically useful
  • The "What It Does (and Doesn't)" section sets accurate expectations — distinguishing QA from CI and code review is exactly the right framing
  • The structured QA report format with evidence is a good concrete example
  • The escape-hatch logic ("give up after 3 failed attempts, switch strategy; give up after 2 fundamentally different strategies fail") is important to document; it's good that it's called out explicitly
  • Nav placement in docs.json (between code-review and incident-triage) is appropriate
  • The custom QA guide mechanism via .agents/skills/custom-qa-guide.md is a clean extensibility hook

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW — Documentation-only change. The broken anchor is a UX issue (bad scroll target) but does not break any functionality or data.

VERDICT:
Worth merging once the missing ## Automate This section is added to qa-changes.mdx. The content is high quality and the fix is small.

KEY INSIGHT:
The automations/overview.mdx cards assume a #automate-this anchor exists in every linked use-case page — qa-changes.mdx is the only one currently missing it.
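
A link check along these lines could catch a missing anchor before review. This is a minimal sketch and not part of this PR: it assumes a simple slug rule (lowercase, punctuation dropped, spaces turned into hyphens), which may differ from what the docs renderer actually does, and the helper names `slugify`, `heading_anchors`, and `has_anchor` are hypothetical.

```python
import re


def slugify(heading: str) -> str:
    """Assumed slug rule: lowercase, drop punctuation, spaces -> hyphens."""
    text = heading.strip().lower()
    text = re.sub(r"[^a-z0-9\s-]", "", text)
    return re.sub(r"\s+", "-", text).strip("-")


def heading_anchors(mdx_source: str) -> set[str]:
    """Collect slugs for every Markdown heading (#, ##, ...) in the page."""
    anchors = set()
    for line in mdx_source.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            anchors.add(slugify(match.group(1)))
    return anchors


def has_anchor(mdx_source: str, href: str) -> bool:
    """True if href has no fragment, or its fragment matches a heading slug."""
    _, _, fragment = href.partition("#")
    return fragment == "" or fragment in heading_anchors(mdx_source)


# Illustrative page contents, not the real qa-changes.mdx.
page_before = "# Automated QA Testing\n\n## Quick Start\n"
page_after = page_before + "\n## Automate This\n"
href = "/openhands/usage/use-cases/qa-changes#automate-this"

print(has_anchor(page_before, href))  # False
print(has_anchor(page_after, href))   # True
```

The sketch does not skip fenced code blocks, so a `#` comment line inside a fence would be counted as a heading; a real check would need to handle that.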


Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

  1. Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing. See the customization docs for the required frontmatter format.
  2. Re-request a review — the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
  3. When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

Was this review helpful? React with 👍 or 👎 to give feedback.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation

<Card
title="Automated QA Testing"
icon="vial"
href="/openhands/usage/use-cases/qa-changes#automate-this"
Contributor

🔴 Critical: This links to #automate-this but qa-changes.mdx has no ## Automate This section. Every other use-case page linked from this automations overview (code-review, dependency-upgrades, incident-triage, vulnerability-remediation) has this section. Without it, the anchor silently falls back to the top of the page instead of scrolling to the automation setup content.

Fix: Add a ## Automate This section to qa-changes.mdx following the pattern from code-review.mdx line 368.

Addresses review feedback — automations/overview.mdx links to
qa-changes#automate-this but the anchor didn't exist. Every other
use-case page has this section; now qa-changes does too.

Co-authored-by: openhands <openhands@all-hands.dev>

Contributor

@all-hands-bot all-hands-bot left a comment

Well-structured documentation for the Automated QA Testing use case. The four-phase methodology (Understand → Setup → Exercise → Report) is clearly explained, the MDX components are used consistently, and cross-links to the code review page and Verification Stack are helpful context. A few items worth addressing inline.

    runs-on: ubuntu-latest
    steps:
      - name: Run QA Changes
        uses: OpenHands/extensions/plugins/qa-changes@main
Contributor

🟠 Important: pin the Action to a stable ref, not @main.

Using @main means any commit pushed to the extensions repo will silently change the behavior of every user who follows this quickstart. Docs that ship example workflows should reference a specific tag or commit SHA so users get reproducible builds.

Suggested change
-        uses: OpenHands/extensions/plugins/qa-changes@main
+        uses: OpenHands/extensions/plugins/qa-changes@v1

(Update v1 to whatever the current stable tag is. If no release tag exists yet, use the current SHA instead.)
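
To make this kind of check repeatable, a small script can flag `uses:` lines that point at a branch-like ref. The following is a rough sketch under stated assumptions: the set of "floating" refs is my own choice rather than a GitHub rule, the function name `floating_action_refs` is hypothetical, and the regex only handles simple single-line `uses:` entries. Pinning to a tag only helps if the upstream repository actually publishes one.

```python
import re

# Refs treated as "floating" here; this list is an assumption, not a GitHub rule.
FLOATING_REFS = {"main", "master", "HEAD"}

# Matches `uses: owner/repo/path@ref`, with or without a leading list dash.
USES_LINE = re.compile(r"^\s*-?\s*uses:\s*(?P<action>[^@\s]+)@(?P<ref>\S+)")


def floating_action_refs(workflow_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, action, ref) for each `uses:` on a branch-like ref."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_LINE.match(line)
        if match and match.group("ref") in FLOATING_REFS:
            findings.append((number, match.group("action"), match.group("ref")))
    return findings


workflow = """\
steps:
  - uses: actions/checkout@v4
  - name: Run QA Changes
    uses: OpenHands/extensions/plugins/qa-changes@main
"""

print(floating_action_refs(workflow))
# [(4, 'OpenHands/extensions/plugins/qa-changes', 'main')]
```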

      - name: Run QA Changes
        uses: OpenHands/extensions/plugins/qa-changes@main
        with:
          llm-model: anthropic/claude-sonnet-4-5-20250929
Contributor

🟡 Suggestion: verify this model ID is valid before shipping.

anthropic/claude-sonnet-4-5-20250929 doesn't match any currently-documented Anthropic model ID format (claude-3-5-sonnet-20241022, claude-3-opus-20240229, etc.). If this is a future/speculative model name, users who copy-paste the snippet will get an API error. Consider:

  • Using a currently-valid model ID (e.g. anthropic/claude-3-5-sonnet-20241022), or
  • Adding a note like # Replace with your preferred model so it's clear this is a placeholder.

3. **Exercise** — The core phase: spins up servers, opens browsers, runs CLI commands, makes HTTP requests — testing the changed behavior as a real user would. For bug fixes, it reproduces the bug on the base branch and verifies the fix on the PR branch.
4. **Report** — Posts a structured QA report as a PR comment, with evidence (commands run, outputs, screenshots) and a verdict (PASS / FAIL / PARTIAL).

The QA agent knows when to give up: if an approach fails after three materially different attempts, it switches strategy. If two fundamentally different strategies fail, it reports what it tried and stops — rather than spinning endlessly.
Contributor

🟡 Suggestion: avoid hard-coding implementation-specific retry counts in user-facing docs.

"three materially different attempts" and "two fundamentally different strategies" are specific numbers that will become incorrect if the underlying skill changes its retry logic without a corresponding docs update. Consider softer language:

The QA agent knows when to give up: after exhausting multiple approaches without progress, it reports what it tried and stops — rather than spinning endlessly.
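
For readers who want to picture the behavior being described, the bounded give-up logic might look roughly like this. It is an illustrative sketch only, not the qa-changes skill's implementation: the function name `run_with_escape_hatch`, its parameters, and the default limits (taken from the numbers quoted in the original wording) are assumptions.

```python
from typing import Callable, Sequence


def run_with_escape_hatch(
    strategies: Sequence[tuple[str, Callable[[int], bool]]],
    max_attempts_per_strategy: int = 3,  # illustrative default, may not match the skill
    max_strategies: int = 2,             # illustrative default, may not match the skill
) -> dict:
    """Try each strategy a bounded number of times, then stop and report."""
    tried = []
    for name, attempt in strategies[:max_strategies]:
        for attempt_number in range(1, max_attempts_per_strategy + 1):
            succeeded = attempt(attempt_number)
            tried.append((name, attempt_number, succeeded))
            if succeeded:
                return {"succeeded": True, "tried": tried}
    # Every allowed strategy was exhausted: report what was tried instead of looping.
    return {"succeeded": False, "tried": tried}


# Hypothetical strategies: the first never works, the second works on its 2nd try.
result = run_with_escape_hatch([
    ("start dev server", lambda n: False),
    ("run CLI directly", lambda n: n == 2),
])
print(result["succeeded"], len(result["tried"]))  # True 5
```

Keeping the limits as parameters rather than prose is one reason the softer wording is easier to maintain: the docs describe the intent, and the numbers live in one place.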

Contributor

@all-hands-bot all-hands-bot left a comment

🟡 Acceptable — well-structured documentation with two minor improvements worth addressing before merge.

Summary

The PR adds a comprehensive qa-changes.mdx page documenting the four-phase QA methodology, a GitHub Actions quick-start, an in-conversation skill usage guide, customization instructions, and integration context. Nav entries in docs.json, automations/overview.mdx, and use-cases/overview.mdx are all correct. The ## Automate This section is present, so the anchor #automate-this targeted by automations/overview.mdx resolves correctly.


Issues

1. Pinned model name looks stale / potentially non-existent

openhands/usage/use-cases/qa-changes.mdx, line 80

          llm-model: anthropic/claude-sonnet-4-5-20250929

The model identifier claude-sonnet-4-5-20250929 doesn't follow Anthropic's documented naming pattern (claude-{major}-{minor}-{name}-{date}, e.g. claude-3-5-sonnet-20241022). Users who copy this verbatim will either get an invalid-model error or silently fall back to a default — neither of which is a good first-experience outcome. Options:

  • Use a real, verified model identifier (e.g. anthropic/claude-3-7-sonnet-20250219 or whatever is the current recommended default at the time of publish).
  • Or use a generic placeholder comment like # any supported model, e.g. anthropic/claude-3-5-sonnet-20241022 so it's obvious it needs to be filled in.

2. "In a Conversation" section doesn't tell users how to get the skill

openhands/usage/use-cases/qa-changes.mdx, line 89

"You can also trigger QA manually in any OpenHands conversation by invoking the skill: /qa-changes"

New users will try /qa-changes and get a "skill not found" error. There's no instruction explaining they need to add the skill first (e.g., via /add-skill https://github.com/OpenHands/extensions/... or the equivalent). Even a single sentence — "First, add the skill to your conversation with /add-skill ..." — would prevent frustration. The Related Resources section at the bottom links to the skill but doesn't explain the install step.


Everything Else

  • The four-phase methodology (Understand → Setup → Exercise → Report) is clearly articulated and the distinction from CI and code review is exactly right.
  • The "give-up" escape-hatch logic (3 failed attempts → switch strategy; 2 fundamentally different strategies fail → stop and report) is important to document and is called out well.
  • The change-type table (Frontend/UI → browser, CLI → commands, API → HTTP, Bug fix → before/after, Library → short script) is practical and concrete.
  • The custom QA guide via .agents/skills/custom-qa-guide.md is a clean extensibility hook that follows established convention.
  • Nav placement and anchor links are all correct.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW — Documentation-only change. No functional code is modified. The model name issue is a UX concern (bad copy-paste outcome) but causes no harm beyond user confusion.

VERDICT:
Worth merging — consider fixing the model name and adding the skill installation step first; both are one-line changes.

KEY INSIGHT:
The skill invocation section implicitly assumes users already have the qa-changes skill loaded — documenting the install step would make this a complete, self-contained guide.


      - name: Run QA Changes
        uses: OpenHands/extensions/plugins/qa-changes@main
        with:
          llm-model: anthropic/claude-sonnet-4-5-20250929
Contributor

Model name may be invalid or stale. anthropic/claude-sonnet-4-5-20250929 doesn't match Anthropic's documented naming scheme (claude-{major}-{minor}-{name}-{date}). Users who copy this verbatim will hit an invalid-model error. Use a verified model identifier or a clearly-labelled placeholder.


### In a Conversation

You can also trigger QA manually in any OpenHands conversation by invoking the skill:
Contributor

Missing prerequisite step. Users need to add the qa-changes skill before /qa-changes will work. Consider adding something like: "First, add the skill to your workspace with /add-skill https://github.com/OpenHands/extensions/tree/main/skills/qa-changes" before the invocation example. Without it, users will hit a skill-not-found error with no guidance on how to fix it.

…etry language

- Pin GitHub Action to @v1 instead of @main for reproducible builds
- Use valid model ID anthropic/claude-sonnet-4-20250514
- Soften hard-coded retry count language to avoid docs going stale

Co-authored-by: openhands <openhands@all-hands.dev>

Contributor

@all-hands-bot all-hands-bot left a comment

🟡 Acceptable — prior review issues largely addressed; one unresolved item remains before merge.

Current State vs. Prior Reviews

Three reviews were posted earlier today. The current PR head has addressed most of the issues raised:

  • ## Automate This section added — the broken #automate-this anchor from automations/overview.mdx now resolves correctly
  • Action pinned to @v1 — no longer using the unpinned @main ref
  • Model name updated — changed from the non-existent claude-sonnet-4-5-20250929 to anthropic/claude-sonnet-4-20250514
  • Retry count language generalized — specific attempt counts replaced with "multiple approaches"
  • Missing skill install step — still unresolved (see inline comment)

Only Remaining Issue

The ### In a Conversation section tells users to run /qa-changes but does not explain how to load the skill first. A user who follows the quick-start verbatim will get a "skill not found" error. This was flagged in the earlier inline comment at line 89 and has not yet been addressed.

Suggested one-line fix — add before the invocation example:

First, add the skill to your workspace:
/add-skill https://github.com/OpenHands/extensions/tree/main/skills/qa-changes

Everything Else

The documentation is solid:

  • Four-phase methodology (Understand → Setup → Exercise → Report) is clearly explained
  • The "does / does not" framing correctly distinguishes QA from CI and code review
  • The change-type table is concrete and practical
  • The custom QA guide via .agents/skills/custom-qa-guide.md follows established convention
  • All nav entries and anchor links are correct in the current version

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW — Documentation-only change. The missing install step is a UX issue that causes user confusion, not a system failure.

VERDICT:
Worth merging — fix the one-line skill install step first.

KEY INSIGHT:
The quick-start is otherwise complete and well-structured; adding the install step closes the last gap between reading this page and successfully running /qa-changes.



### In a Conversation

You can also trigger QA manually in any OpenHands conversation by invoking the skill:
Contributor

Missing prerequisite step (still unresolved from prior review).

Users need to add the qa-changes skill before /qa-changes will work. Without it they get a "skill not found" error the first time they follow this guide.

Add one sentence before the invocation example:

First, add the skill to your workspace:
/add-skill https://github.com/OpenHands/extensions/tree/main/skills/qa-changes

This was flagged in the earlier review inline comment on this line and has not yet been addressed.

Users need to install the skill first via /add-skill before running
/qa-changes in a conversation. Without this, they get a skill not found error.

Co-authored-by: openhands <openhands@all-hands.dev>

Contributor

@all-hands-bot all-hands-bot left a comment

Review

This PR adds a new qa-changes.mdx documentation page for the Automated QA Testing use case, along with nav entries in docs.json, automations/overview.mdx, and use-cases/overview.mdx.

Overall Assessment

Ready to merge. All issues flagged in previous reviews are addressed in the current version. The content is high quality, well-structured, and consistent with the rest of the documentation.


Resolution of Prior Issues

All issues flagged across the previous three reviews are resolved in the current PR head:

| Issue | Status |
| --- | --- |
| Missing ## Automate This section (broken anchor from automations/overview.mdx) | ✅ Section present, anchor resolves correctly |
| GitHub Action pinned to @main (unpinned, unsafe) | ✅ Now pinned to @v1 |
| Invalid model identifier claude-sonnet-4-5-20250929 | ✅ Updated to anthropic/claude-sonnet-4-20250514 |
| Specific retry attempt counts that would go stale | ✅ Generalized to "multiple approaches" |
| Missing /add-skill prerequisite step in "In a Conversation" section | Already present in the current code: the section opens with "First, install the skill:" followed by the /add-skill command. The most recent automated review incorrectly flagged this as still unresolved. |

Fresh Independent Review

docs.json — Nav entry placed correctly between code-review and incident-triage. ✅

automations/overview.mdx — New card uses the correct icon (vial), correct href (/openhands/usage/use-cases/qa-changes#automate-this), and the target anchor now resolves correctly. ✅

use-cases/overview.mdx — Card placement (after code-review, before incident-triage) is consistent with the nav order in docs.json. ✅

qa-changes.mdx — Key observations:

  • Four-phase methodology: Clearly articulated. Generalized escape-hatch language ("after exhausting multiple approaches") is the right call — ties the docs to intent rather than to implementation-specific numbers that will drift. ✅
  • "Does / Does Not" framing: Correctly distinguishes QA from CI (test suite) and code review (diff analysis). This is exactly the right mental model to establish upfront. ✅
  • GitHub Actions quick-start: The if: condition correctly gates on github.event.label.name == 'qa-this' for labeled events, so the job does not fire on every label addition — only the intended one. Model identifier and action ref (@v1) are valid. ✅
  • "In a Conversation" section: /add-skill prerequisite IS present. The separator for passing inline context to skill invocations follows the established OpenHands convention. ✅
  • QA Report format: The structured example (Status, Changes Tested, Evidence, Edge Cases) gives users a concrete expectation of output quality. ✅
  • Change-type table: Covers all five key archetypes (Frontend/UI, CLI, API/Backend, Bug fix, Library/SDK) with concrete testing approaches for each. ✅
  • Custom QA guide: The .agents/skills/custom-qa-guide.md extensibility hook follows the established skill convention used across this repository. ✅
  • ## Automate This: The copy-paste automation prompt references the correct plugin source (github:OpenHands/extensions) and includes the learn more link. ✅
  • Troubleshooting accordion: Addresses the three most likely first-run failure modes (can't start server, PARTIAL verdict, timeout). ✅
  • External link validation: Both OpenHands/extensions/tree/main/plugins/qa-changes and .../skills/qa-changes were verified to exist at review time. ✅

No New Issues Found

This is a documentation-only change. No functional code is modified. Risk: 🟢 LOW.


OpenHands/extensions does not publish release tags yet, so @v1 is not
available. Revert to @main for now.

Co-authored-by: openhands <openhands@all-hands.dev>
OpenHands/extensions does not publish versioned tags. Reverts the
action reference to @main and adds a note to AGENTS.md so future
agents and reviewers don't repeat this mistake.

Co-authored-by: openhands <openhands@all-hands.dev>
The extensions README documents the skill file as .agents/skills/qa-guide.md
with name: qa-guide, not custom-qa-guide.md. Align docs to match the actual
upstream example.

Co-authored-by: openhands <openhands@all-hands.dev>