From ccbb3462a2a141afa1ff0849e2045e6153c81152 Mon Sep 17 00:00:00 2001 From: openhands Date: Wed, 27 May 2026 17:01:55 +0000 Subject: [PATCH 1/8] docs: add Automated QA Testing page MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add qa-changes.mdx with full QA agent documentation: - Four-phase methodology (Understand → Setup → Exercise → Report) - GitHub Actions and in-conversation quick start - QA report format with examples - Customization and repo-specific QA guidelines - Integration with the Verification Stack Also adds nav entry in docs.json and overview card. Co-authored-by: openhands --- docs.json | 1 + openhands/usage/use-cases/overview.mdx | 7 + openhands/usage/use-cases/qa-changes.mdx | 195 +++++++++++++++++++++++ 3 files changed, 203 insertions(+) create mode 100644 openhands/usage/use-cases/qa-changes.mdx diff --git a/docs.json b/docs.json index 548487e1..c1d31682 100644 --- a/docs.json +++ b/docs.json @@ -198,6 +198,7 @@ "pages": [ "openhands/usage/use-cases/vulnerability-remediation", "openhands/usage/use-cases/code-review", + "openhands/usage/use-cases/qa-changes", "openhands/usage/use-cases/incident-triage", "openhands/usage/use-cases/cobol-modernization", "openhands/usage/use-cases/dependency-upgrades", diff --git a/openhands/usage/use-cases/overview.mdx b/openhands/usage/use-cases/overview.mdx index 8229b463..da9b0161 100644 --- a/openhands/usage/use-cases/overview.mdx +++ b/openhands/usage/use-cases/overview.mdx @@ -22,6 +22,13 @@ Each use case can be implemented in different ways—as a one-off conversation, > Set up automated PR reviews to maintain code quality and catch bugs early. + + Validate PR changes by actually running the software as a real user would. + - + Functionally test PR changes by exercising the software as a real user would. +--- + + + Check out the complete QA changes plugin with ready-to-use code and configuration. 
+ + +Automated QA testing goes beyond code review and CI: instead of reading diffs or running the test suite, the QA agent actually **runs the software** and verifies that changes work as claimed. It sets up the environment, exercises changed behavior as a real user would (browser, CLI, API requests), and posts a structured report with evidence. + +This is Layer 2 of the [Verification Stack](/openhands/usage/use-cases/verification-stack), complementing the [code review agent](/openhands/usage/use-cases/code-review). + +## Overview + +The QA agent follows a four-phase methodology: + +1. **Understand** — Reads the PR diff, title, and description. Classifies changes (new feature, bug fix, refactor, config) and identifies entry points (CLI commands, API endpoints, UI pages). +2. **Setup** — Bootstraps the repository: installs dependencies, builds the project, notes CI status. +3. **Exercise** — The core phase: spins up servers, opens browsers, runs CLI commands, makes HTTP requests — testing the changed behavior as a real user would. For bug fixes, it reproduces the bug on the base branch and verifies the fix on the PR branch. +4. **Report** — Posts a structured QA report as a PR comment, with evidence (commands run, outputs, screenshots) and a verdict (PASS / FAIL / PARTIAL). + +The QA agent knows when to give up: if an approach fails after three materially different attempts, it switches strategy. If two fundamentally different strategies fail, it reports what it tried and stops — rather than spinning endlessly. 
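
The give-up rule in the paragraph above can be read as a bounded search over strategies and attempts. The following is only a minimal sketch of that reading, not the plugin's implementation: the function name `exercise`, the constant names, and the result shape are all hypothetical, and the limits 3 and 2 are copied from the sentence above rather than from any published configuration.

```python
from typing import Callable, Dict, List, Sequence, Union

# Hypothetical limits taken from the paragraph above; not read from the plugin.
MAX_ATTEMPTS_PER_STRATEGY = 3
MAX_STRATEGIES = 2

# An attempt is any callable that returns True when the check succeeded.
Attempt = Callable[[], bool]


def exercise(strategies: Sequence[Sequence[Attempt]]) -> Dict[str, Union[bool, List[str]]]:
    """Try each strategy in turn and stop once the attempt budget is spent."""
    tried: List[str] = []
    for s_index, attempts in enumerate(strategies[:MAX_STRATEGIES]):
        for a_index, attempt in enumerate(attempts[:MAX_ATTEMPTS_PER_STRATEGY]):
            tried.append(f"strategy {s_index + 1}, attempt {a_index + 1}")
            if attempt():
                return {"succeeded": True, "tried": tried}
    # Budget exhausted: report what was tried instead of looping forever.
    return {"succeeded": False, "tried": tried}
```

The `tried` list is what makes stopping useful: a report that lists the failed attempts tells the PR author where to look, which a silent timeout does not.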
+ +## What It Does (and Doesn't) + + + + - Run the actual application and interact with it + - Make real HTTP requests, run real CLI commands + - Open browsers and verify UI changes + - Reproduce bugs and verify fixes end-to-end + - Report with evidence (commands, outputs, screenshots) + + + - Run the test suite (that's CI's job) + - Analyze code for style or structure (that's code review's job) + - Run linters, formatters, or type checkers + - Substitute `--help` or `--dry-run` for real execution + + + +## Quick Start + +### GitHub Actions + +Create `.github/workflows/qa-changes.yml` in your repository: + +```yaml +name: QA Changes + +on: + pull_request: + types: [opened, ready_for_review, labeled] + +permissions: + contents: read + pull-requests: write + issues: write + +jobs: + qa: + if: | + (github.event.action == 'opened' && github.event.pull_request.draft == false) || + github.event.action == 'ready_for_review' || + github.event.label.name == 'qa-this' + runs-on: ubuntu-latest + steps: + - name: Run QA Changes + uses: OpenHands/extensions/plugins/qa-changes@main + with: + llm-model: anthropic/claude-sonnet-4-5-20250929 + llm-api-key: ${{ secrets.LLM_API_KEY }} + github-token: ${{ secrets.GITHUB_TOKEN }} +``` + +Add your `LLM_API_KEY` to your repository's **Settings → Secrets and variables → Actions**. + +### In a Conversation + +You can also trigger QA manually in any OpenHands conversation by invoking the skill: + +``` +/qa-changes +``` + +The agent will ask for the PR to test, or you can provide context directly: + +``` +/qa-changes — Please QA PR #42 on the my-org/my-repo repository. +Focus on the new dashboard page and verify it renders correctly. +``` + +## QA Report Format + +The QA agent posts a structured report as a PR comment: + +``` +## QA Report + +**Status: PASS** ✅ + +### Changes Tested +- New `/api/health` endpoint returns 200 with version info +- Dashboard page renders at `/dashboard` with correct data + +### Evidence +1. 
Started server with `npm run dev` +2. `curl http://localhost:3000/api/health` → 200 OK, body: {"status":"ok","version":"1.2.0"} +3. Navigated to http://localhost:3000/dashboard — page renders correctly + [screenshot attached] + +### Edge Cases +- Empty database state: dashboard shows "No data" placeholder ✅ +- Invalid auth token: returns 401 as expected ✅ +``` + +## Customization + +### Change Types + +The QA agent adapts its approach based on the type of change: + +| Change Type | QA Approach | +|-------------|-------------| +| **Frontend / UI** | Starts dev server, opens browser, verifies visual changes, tests interactions | +| **CLI** | Runs commands with realistic arguments, verifies output, tests edge cases | +| **API / Backend** | Starts server, makes HTTP requests, verifies responses and side effects | +| **Bug fix** | Reproduces bug on base branch, verifies fix on PR branch (before/after) | +| **Library / SDK** | Writes and runs a short script that imports and calls changed functions | + +### Repository-Specific QA Guidelines + +Add repo-specific QA instructions by creating `.agents/skills/custom-qa-guide.md`: + +```markdown +--- +name: custom-qa-guide +description: Custom QA guidelines for this repository +triggers: +- /qa-changes +--- + +# QA Guidelines for [Your Project] + +## Environment Setup +- Run `make setup` to initialize the development environment +- The dev server runs on port 8080 + +## Key Test Scenarios +- Always verify the admin dashboard at /admin after backend changes +- For API changes, test with both authenticated and unauthenticated requests + +## Known Limitations +- The payment module requires a Stripe test key — skip payment flow testing +``` + +## Integration with the Verification Stack + +The QA agent is most powerful when used alongside the [code review agent](/openhands/usage/use-cases/code-review) and the [iterate skill](/openhands/usage/use-cases/verification-stack#closing-the-loop-the-iterate-skill) as part of the full 
[Verification Stack](/openhands/usage/use-cases/verification-stack): + +1. **Code review** catches issues by reading the diff (style, security, data structures) +2. **QA** catches issues by running the software (behavioral regressions, UI bugs) +3. **Iterate** orchestrates the loop — fixing issues flagged by either verifier and re-polling until the PR is clean + +## Troubleshooting + + + + Ensure your repository's setup instructions are documented in `README.md` or `AGENTS.md`. The agent follows these to bootstrap the environment. If setup requires special steps, add them to a custom QA guide. + + + + PARTIAL means some scenarios passed and others failed or couldn't be tested. Read the report details — it will explain what worked and what didn't. Common causes: missing environment variables, external service dependencies, or insufficient permissions. + + + + For large PRs with many changed entry points, the agent may need more time. Consider splitting large PRs into smaller, focused changes. You can also add a custom QA guide that prioritizes the most important scenarios. 
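
If another tool needs to react to a PARTIAL or FAIL verdict, the status line from the example report can be extracted with a short script. This is a sketch under one loud assumption: that real reports carry the verdict on a `**Status: ...**` line exactly as in the sample report on this page. The helper name `parse_verdict` is hypothetical and not part of the plugin.

```python
import re
from typing import Optional

# Matches the "**Status: PASS**" line from the example report (assumed format).
STATUS_RE = re.compile(r"\*\*Status:\s*(PASS|FAIL|PARTIAL)\*\*")


def parse_verdict(report: str) -> Optional[str]:
    """Return PASS, FAIL, or PARTIAL, or None if no status line is found."""
    match = STATUS_RE.search(report)
    return match.group(1) if match else None
```

Returning `None` rather than guessing keeps a missing or reworded status line visible to the caller.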
+ + + +## Related Resources + +- [QA Changes Plugin](https://github.com/OpenHands/extensions/tree/main/plugins/qa-changes) — GitHub Actions plugin +- [QA Changes Skill](https://github.com/OpenHands/extensions/tree/main/skills/qa-changes) — Detailed skill methodology +- [Verification Stack](/openhands/usage/use-cases/verification-stack) — How QA fits into the full verification pipeline +- [Automated Code Review](/openhands/usage/use-cases/code-review) — The complementary code review agent From a3714cae4d677870863bfc10df0b3158956303d2 Mon Sep 17 00:00:00 2001 From: openhands Date: Wed, 27 May 2026 17:08:28 +0000 Subject: [PATCH 2/8] =?UTF-8?q?fix:=20resolve=20CI=20failures=20=E2=80=94?= =?UTF-8?q?=20broken=20links=20and=20automation=20sync?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Replace internal verification-stack links with external URLs (blog post and extensions repo) since that page hasn't landed on main yet - Run sync_use_case_automations.py to update automations/overview.mdx Co-authored-by: openhands --- openhands/usage/automations/overview.mdx | 7 +++++++ openhands/usage/use-cases/qa-changes.mdx | 6 +++--- 2 files changed, 10 insertions(+), 3 deletions(-) diff --git a/openhands/usage/automations/overview.mdx b/openhands/usage/automations/overview.mdx index 5c76f31e..cb268bcd 100644 --- a/openhands/usage/automations/overview.mdx +++ b/openhands/usage/automations/overview.mdx @@ -115,6 +115,13 @@ Each use case has a ready-to-use automation prompt. Click a card to see the full > Monitor API health, analyze errors, and alert your team automatically. + + Functionally test PR changes by exercising the software as a real user would. 
+ Date: Wed, 27 May 2026 18:04:30 +0000 Subject: [PATCH 3/8] fix: add missing 'Automate This' section to qa-changes.mdx MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses review feedback — automations/overview.mdx links to qa-changes#automate-this but the anchor didn't exist. Every other use-case page has this section; now qa-changes does too. Co-authored-by: openhands --- openhands/usage/use-cases/qa-changes.mdx | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/openhands/usage/use-cases/qa-changes.mdx b/openhands/usage/use-cases/qa-changes.mdx index 1cead2cf..dc71a441 100644 --- a/openhands/usage/use-cases/qa-changes.mdx +++ b/openhands/usage/use-cases/qa-changes.mdx @@ -187,6 +187,25 @@ The QA agent is most powerful when used alongside the [code review agent](/openh +## Automate This + +You can run QA automatically on every PR using [OpenHands Automations](/openhands/usage/automations/overview). +Copy this prompt into a new conversation to set one up: + +``` +Create an automation called "Automated QA" that triggers on pull_request.opened +and pull_request.labeled (with label "qa-this") for my repositories. + +It should use the qa-changes plugin from github:OpenHands/extensions to: +1. Check out the PR branch +2. Run the QA agent to exercise the changed behavior as a real user would +3. Post a structured QA report as a PR comment with evidence (commands run, outputs, screenshots) + +Learn more at https://docs.openhands.dev/openhands/usage/use-cases/qa-changes +``` + +For automated QA on every push, use the [qa-changes plugin](https://github.com/OpenHands/extensions/tree/main/plugins/qa-changes) as a GitHub Action instead. 
+ ## Related Resources - [QA Changes Plugin](https://github.com/OpenHands/extensions/tree/main/plugins/qa-changes) — GitHub Actions plugin From cdb4dba7ace35238c82895d66ac569840f209778 Mon Sep 17 00:00:00 2001 From: openhands Date: Wed, 27 May 2026 18:10:35 +0000 Subject: [PATCH 4/8] =?UTF-8?q?fix:=20address=20review=20feedback=20?= =?UTF-8?q?=E2=80=94=20pin=20action=20ref,=20fix=20model=20ID,=20soften=20?= =?UTF-8?q?retry=20language?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Pin GitHub Action to @v1 instead of @main for reproducible builds - Use valid model ID anthropic/claude-sonnet-4-20250514 - Soften hard-coded retry count language to avoid docs going stale Co-authored-by: openhands --- openhands/usage/use-cases/qa-changes.mdx | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/openhands/usage/use-cases/qa-changes.mdx b/openhands/usage/use-cases/qa-changes.mdx index dc71a441..f0e42b26 100644 --- a/openhands/usage/use-cases/qa-changes.mdx +++ b/openhands/usage/use-cases/qa-changes.mdx @@ -28,7 +28,7 @@ The QA agent follows a four-phase methodology: 3. **Exercise** — The core phase: spins up servers, opens browsers, runs CLI commands, makes HTTP requests — testing the changed behavior as a real user would. For bug fixes, it reproduces the bug on the base branch and verifies the fix on the PR branch. 4. **Report** — Posts a structured QA report as a PR comment, with evidence (commands run, outputs, screenshots) and a verdict (PASS / FAIL / PARTIAL). -The QA agent knows when to give up: if an approach fails after three materially different attempts, it switches strategy. If two fundamentally different strategies fail, it reports what it tried and stops — rather than spinning endlessly. +The QA agent knows when to give up: after exhausting multiple approaches without progress, it reports what it tried and stops — rather than spinning endlessly. 
## What It Does (and Doesn't) @@ -75,9 +75,9 @@ jobs: runs-on: ubuntu-latest steps: - name: Run QA Changes - uses: OpenHands/extensions/plugins/qa-changes@main + uses: OpenHands/extensions/plugins/qa-changes@v1 with: - llm-model: anthropic/claude-sonnet-4-5-20250929 + llm-model: anthropic/claude-sonnet-4-20250514 llm-api-key: ${{ secrets.LLM_API_KEY }} github-token: ${{ secrets.GITHUB_TOKEN }} ``` From 283da1153020e7867d3d511bf89be5b7ebbb5e37 Mon Sep 17 00:00:00 2001 From: openhands Date: Wed, 27 May 2026 18:16:40 +0000 Subject: [PATCH 5/8] fix: add skill install step before /qa-changes invocation Users need to install the skill first via /add-skill before running /qa-changes in a conversation. Without this, they get a skill not found error. Co-authored-by: openhands --- openhands/usage/use-cases/qa-changes.mdx | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/openhands/usage/use-cases/qa-changes.mdx b/openhands/usage/use-cases/qa-changes.mdx index f0e42b26..444a1f0b 100644 --- a/openhands/usage/use-cases/qa-changes.mdx +++ b/openhands/usage/use-cases/qa-changes.mdx @@ -86,7 +86,13 @@ Add your `LLM_API_KEY` to your repository's **Settings → Secrets and variables ### In a Conversation -You can also trigger QA manually in any OpenHands conversation by invoking the skill: +You can also trigger QA manually in any OpenHands conversation. First, install the skill: + +``` +/add-skill https://github.com/OpenHands/extensions/tree/main/skills/qa-changes +``` + +Then invoke it: ``` /qa-changes From e3c84b1f1460f8eb9b4dc65d3b52ce4b1c0027e7 Mon Sep 17 00:00:00 2001 From: openhands Date: Wed, 27 May 2026 18:30:36 +0000 Subject: [PATCH 6/8] =?UTF-8?q?fix:=20revert=20action=20pin=20to=20@main?= =?UTF-8?q?=20=E2=80=94=20@v1=20tag=20does=20not=20exist?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit OpenHands/extensions does not publish release tags yet, so @v1 is not available. Revert to @main for now. 
Co-authored-by: openhands --- openhands/usage/use-cases/qa-changes.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/openhands/usage/use-cases/qa-changes.mdx b/openhands/usage/use-cases/qa-changes.mdx index 444a1f0b..73d620ce 100644 --- a/openhands/usage/use-cases/qa-changes.mdx +++ b/openhands/usage/use-cases/qa-changes.mdx @@ -75,7 +75,7 @@ jobs: runs-on: ubuntu-latest steps: - name: Run QA Changes - uses: OpenHands/extensions/plugins/qa-changes@v1 + uses: OpenHands/extensions/plugins/qa-changes@main with: llm-model: anthropic/claude-sonnet-4-20250514 llm-api-key: ${{ secrets.LLM_API_KEY }} From 117a2deb5a7d496018d9a05dadf2340a5ed87100 Mon Sep 17 00:00:00 2001 From: openhands Date: Wed, 27 May 2026 18:31:35 +0000 Subject: [PATCH 7/8] fix: revert @v1 pin to @main, document extensions has no release tags OpenHands/extensions does not publish versioned tags. Reverts the action reference to @main and adds a note to AGENTS.md so future agents and reviewers don't repeat this mistake. Co-authored-by: openhands --- AGENTS.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/AGENTS.md b/AGENTS.md index de1edc63..9385713b 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -198,10 +198,23 @@ Run locally: uv run --with pytest --with requests pytest -q tests/ ``` +## OpenHands/extensions — No Release Tags + +`OpenHands/extensions` does **not** publish versioned release tags (no `v1`, `v2`, etc.). +All GitHub Action references to plugins in that repo must use `@main`: + +```yaml +uses: OpenHands/extensions/plugins/qa-changes@main +uses: OpenHands/extensions/plugins/pr-review@main +``` + +Do **not** suggest pinning to `@v1` or any other tag — they don't exist and the workflow will fail. 
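
The `@main` rule above is easy to check mechanically. The sketch below scans workflow text for `OpenHands/extensions` action references and lists any that use a ref other than `main`. The function name `find_bad_refs` is hypothetical, and the regular expression assumes `uses:` and the reference sit on one line, as in the examples above.

```python
import re
from typing import List

# Matches "uses: OpenHands/extensions/<path>@<ref>" in workflow YAML text.
USES_RE = re.compile(r"uses:\s*(OpenHands/extensions/[^@\s]+)@(\S+)")


def find_bad_refs(workflow_text: str) -> List[str]:
    """Return every OpenHands/extensions reference not pinned to @main."""
    return [
        f"{path}@{ref}"
        for path, ref in USES_RE.findall(workflow_text)
        if ref != "main"
    ]
```

References to other repositories, such as `actions/checkout@v4`, are ignored because the rule only concerns `OpenHands/extensions`.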
+ ## Related repos (source-of-truth) - OpenHands Agent SDK: https://github.com/OpenHands/software-agent-sdk - OpenHands CLI: https://github.com/OpenHands/OpenHands-CLI - OpenHands (Web/App): https://github.com/OpenHands/OpenHands +- OpenHands Extensions: https://github.com/OpenHands/extensions (plugins, skills, actions — **no release tags**) When updating SDK features or examples, expect to update this repo too (especially under `sdk/`). From 52621d6aa52df81e7f09503d72a13abb8c99c7b3 Mon Sep 17 00:00:00 2001 From: openhands Date: Wed, 27 May 2026 18:35:41 +0000 Subject: [PATCH 8/8] fix: correct custom QA guide filename and name to match extensions repo The extensions README documents the skill file as .agents/skills/qa-guide.md with name: qa-guide, not custom-qa-guide.md. Align docs to match the actual upstream example. Co-authored-by: openhands --- openhands/usage/use-cases/qa-changes.mdx | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/openhands/usage/use-cases/qa-changes.mdx b/openhands/usage/use-cases/qa-changes.mdx index 73d620ce..7d0a2e2e 100644 --- a/openhands/usage/use-cases/qa-changes.mdx +++ b/openhands/usage/use-cases/qa-changes.mdx @@ -145,12 +145,12 @@ The QA agent adapts its approach based on the type of change: ### Repository-Specific QA Guidelines -Add repo-specific QA instructions by creating `.agents/skills/custom-qa-guide.md`: +Add repo-specific QA instructions by creating `.agents/skills/qa-guide.md`: ```markdown --- -name: custom-qa-guide -description: Custom QA guidelines for this repository +name: qa-guide +description: Project-specific QA guidelines triggers: - /qa-changes ---