docs: add Automated QA Testing page #529
Open
xingyaoww wants to merge 8 commits into main from add-qa-changes-docs
Commits
ccbb346
docs: add Automated QA Testing page
openhands-agent a3714ca
fix: resolve CI failures — broken links and automation sync
openhands-agent e75c4f9
fix: add missing 'Automate This' section to qa-changes.mdx
openhands-agent cdb4dba
fix: address review feedback — pin action ref, fix model ID, soften r…
openhands-agent 283da11
fix: add skill install step before /qa-changes invocation
openhands-agent e3c84b1
fix: revert action pin to @main — @v1 tag does not exist
openhands-agent 117a2de
fix: revert @v1 pin to @main, document extensions has no release tags
openhands-agent 52621d6
fix: correct custom QA guide filename and name to match extensions repo
openhands-agent
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,220 @@ | ||
| --- | ||
| title: Automated QA Testing | ||
| description: Validate pull request changes by actually running the software — not just reading code or running tests | ||
| automation: | ||
| icon: vial | ||
| summary: >- | ||
| Functionally test PR changes by exercising the software as a real user would. | ||
| --- | ||
|
|
||
| <Card | ||
| title="View QA Changes Plugin" | ||
| icon="github" | ||
| href="https://github.com/OpenHands/extensions/tree/main/plugins/qa-changes" | ||
| > | ||
| Check out the complete QA changes plugin with ready-to-use code and configuration. | ||
| </Card> | ||
|
|
||
| Automated QA testing goes beyond code review and CI: instead of reading diffs or running the test suite, the QA agent actually **runs the software** and verifies that changes work as claimed. It sets up the environment, exercises changed behavior as a real user would (browser, CLI, API requests), and posts a structured report with evidence. | ||
|
|
||
| This is Layer 2 of the [Verification Stack](https://www.openhands.dev/blog/verification-stack), complementing the [code review agent](/openhands/usage/use-cases/code-review). | ||
|
|
||
| ## Overview | ||
|
|
||
| The QA agent follows a four-phase methodology: | ||
|
|
||
| 1. **Understand** — Reads the PR diff, title, and description. Classifies changes (new feature, bug fix, refactor, config) and identifies entry points (CLI commands, API endpoints, UI pages). | ||
| 2. **Setup** — Bootstraps the repository: installs dependencies, builds the project, notes CI status. | ||
| 3. **Exercise** — The core phase: spins up servers, opens browsers, runs CLI commands, makes HTTP requests — testing the changed behavior as a real user would. For bug fixes, it reproduces the bug on the base branch and verifies the fix on the PR branch. | ||
| 4. **Report** — Posts a structured QA report as a PR comment, with evidence (commands run, outputs, screenshots) and a verdict (PASS / FAIL / PARTIAL). | ||
|
|
||
| The QA agent knows when to give up: after exhausting multiple approaches without progress, it reports what it tried and stops — rather than spinning endlessly. | ||
|
|
||
| ## What It Does (and Doesn't) | ||
|
|
||
| <CardGroup cols={2}> | ||
| <Card title="QA Agent Does" icon="check"> | ||
| - Run the actual application and interact with it | ||
| - Make real HTTP requests, run real CLI commands | ||
| - Open browsers and verify UI changes | ||
| - Reproduce bugs and verify fixes end-to-end | ||
| - Report with evidence (commands, outputs, screenshots) | ||
| </Card> | ||
| <Card title="QA Agent Does NOT" icon="xmark"> | ||
| - Run the test suite (that's CI's job) | ||
| - Analyze code for style or structure (that's code review's job) | ||
| - Run linters, formatters, or type checkers | ||
| - Substitute `--help` or `--dry-run` for real execution | ||
| </Card> | ||
| </CardGroup> | ||
|
|
||
| ## Quick Start | ||
|
|
||
| ### GitHub Actions | ||
|
|
||
| Create `.github/workflows/qa-changes.yml` in your repository: | ||
|
|
||
| ```yaml | ||
| name: QA Changes | ||
|
|
||
| on: | ||
| pull_request: | ||
| types: [opened, ready_for_review, labeled] | ||
|
|
||
| permissions: | ||
| contents: read | ||
| pull-requests: write | ||
| issues: write | ||
|
|
||
| jobs: | ||
| qa: | ||
| if: | | ||
| (github.event.action == 'opened' && github.event.pull_request.draft == false) || | ||
| github.event.action == 'ready_for_review' || | ||
| github.event.label.name == 'qa-this' | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: Run QA Changes | ||
| uses: OpenHands/extensions/plugins/qa-changes@main | ||
| with: | ||
| llm-model: anthropic/claude-sonnet-4-20250514 | ||
| llm-api-key: ${{ secrets.LLM_API_KEY }} | ||
| github-token: ${{ secrets.GITHUB_TOKEN }} | ||
| ``` | ||
|
|
||
| Add your `LLM_API_KEY` to your repository's **Settings → Secrets and variables → Actions**. | ||
|
|
||
| ### In a Conversation | ||
|
|
||
| You can also trigger QA manually in any OpenHands conversation. First, install the skill: | ||
|
|
||
| ``` | ||
| /add-skill https://github.com/OpenHands/extensions/tree/main/skills/qa-changes | ||
| ``` | ||
|
|
||
| Then invoke it: | ||
|
|
||
| ``` | ||
| /qa-changes | ||
| ``` | ||
|
|
||
| The agent will ask for the PR to test, or you can provide context directly: | ||
|
|
||
| ``` | ||
| /qa-changes — Please QA PR #42 on the my-org/my-repo repository. | ||
| Focus on the new dashboard page and verify it renders correctly. | ||
| ``` | ||
|
|
||
| ## QA Report Format | ||
|
|
||
| The QA agent posts a structured report as a PR comment: | ||
|
|
||
| ``` | ||
| ## QA Report | ||
|
|
||
| **Status: PASS** ✅ | ||
|
|
||
| ### Changes Tested | ||
| - New `/api/health` endpoint returns 200 with version info | ||
| - Dashboard page renders at `/dashboard` with correct data | ||
|
|
||
| ### Evidence | ||
| 1. Started server with `npm run dev` | ||
| 2. `curl http://localhost:3000/api/health` → 200 OK, body: {"status":"ok","version":"1.2.0"} | ||
| 3. Navigated to http://localhost:3000/dashboard — page renders correctly | ||
| [screenshot attached] | ||
|
|
||
| ### Edge Cases | ||
| - Empty database state: dashboard shows "No data" placeholder ✅ | ||
| - Invalid auth token: returns 401 as expected ✅ | ||
| ``` | ||
|
|
||
| ## Customization | ||
|
|
||
| ### Change Types | ||
|
|
||
| The QA agent adapts its approach based on the type of change: | ||
|
|
||
| | Change Type | QA Approach | | ||
| |-------------|-------------| | ||
| | **Frontend / UI** | Starts dev server, opens browser, verifies visual changes, tests interactions | | ||
| | **CLI** | Runs commands with realistic arguments, verifies output, tests edge cases | | ||
| | **API / Backend** | Starts server, makes HTTP requests, verifies responses and side effects | | ||
| | **Bug fix** | Reproduces bug on base branch, verifies fix on PR branch (before/after) | | ||
| | **Library / SDK** | Writes and runs a short script that imports and calls changed functions | | ||
|
|
||
| ### Repository-Specific QA Guidelines | ||
|
|
||
| Add repo-specific QA instructions by creating `.agents/skills/qa-guide.md`: | ||
|
|
||
| ```markdown | ||
| --- | ||
| name: qa-guide | ||
| description: Project-specific QA guidelines | ||
| triggers: | ||
| - /qa-changes | ||
| --- | ||
|
|
||
| # QA Guidelines for [Your Project] | ||
|
|
||
| ## Environment Setup | ||
| - Run `make setup` to initialize the development environment | ||
| - The dev server runs on port 8080 | ||
|
|
||
| ## Key Test Scenarios | ||
| - Always verify the admin dashboard at /admin after backend changes | ||
| - For API changes, test with both authenticated and unauthenticated requests | ||
|
|
||
| ## Known Limitations | ||
| - The payment module requires a Stripe test key — skip payment flow testing | ||
| ``` | ||
|
|
||
| ## Integration with the Verification Stack | ||
|
|
||
| The QA agent is most powerful when used alongside the [code review agent](/openhands/usage/use-cases/code-review) and the [iterate skill](https://github.com/OpenHands/extensions/tree/main/skills/iterate) as part of the full [Verification Stack](https://www.openhands.dev/blog/verification-stack): | ||
|
|
||
| 1. **Code review** catches issues by reading the diff (style, security, data structures) | ||
| 2. **QA** catches issues by running the software (behavioral regressions, UI bugs) | ||
| 3. **Iterate** orchestrates the loop — fixing issues flagged by either verifier and re-polling until the PR is clean | ||
|
|
||
| ## Troubleshooting | ||
|
|
||
| <AccordionGroup> | ||
| <Accordion title="QA agent can't start the server"> | ||
| Ensure your repository's setup instructions are documented in `README.md` or `AGENTS.md`. The agent follows these to bootstrap the environment. If setup requires special steps, add them to a custom QA guide. | ||
| </Accordion> | ||
|
|
||
| <Accordion title="QA report says PARTIAL"> | ||
| PARTIAL means some scenarios passed and others failed or couldn't be tested. Read the report details — it will explain what worked and what didn't. Common causes: missing environment variables, external service dependencies, or insufficient permissions. | ||
| </Accordion> | ||
|
|
||
| <Accordion title="QA takes too long"> | ||
| For large PRs with many changed entry points, the agent may need more time. Consider splitting large PRs into smaller, focused changes. You can also add a custom QA guide that prioritizes the most important scenarios. | ||
| </Accordion> | ||
| </AccordionGroup> | ||
|
|
||
| ## Automate This | ||
|
|
||
| You can run QA automatically on every PR using [OpenHands Automations](/openhands/usage/automations/overview). | ||
| Copy this prompt into a new conversation to set one up: | ||
|
|
||
| ``` | ||
| Create an automation called "Automated QA" that triggers on pull_request.opened | ||
| and pull_request.labeled (with label "qa-this") for my repositories. | ||
|
|
||
| It should use the qa-changes plugin from github:OpenHands/extensions to: | ||
| 1. Check out the PR branch | ||
| 2. Run the QA agent to exercise the changed behavior as a real user would | ||
| 3. Post a structured QA report as a PR comment with evidence (commands run, outputs, screenshots) | ||
|
|
||
| Learn more at https://docs.openhands.dev/openhands/usage/use-cases/qa-changes | ||
| ``` | ||
|
|
||
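| To run QA from CI on pull request events instead, use the [qa-changes plugin](https://github.com/OpenHands/extensions/tree/main/plugins/qa-changes) as a GitHub Action (see Quick Start). The Quick Start workflow triggers on `opened`, `ready_for_review`, and `labeled`; add `synchronize` to its `types` if you want QA on every push. | ||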
|
|
||
| ## Related Resources | ||
|
|
||
| - [QA Changes Plugin](https://github.com/OpenHands/extensions/tree/main/plugins/qa-changes) — GitHub Actions plugin | ||
| - [QA Changes Skill](https://github.com/OpenHands/extensions/tree/main/skills/qa-changes) — Detailed skill methodology | ||
| - [Verification Stack](https://www.openhands.dev/blog/verification-stack) — How QA fits into the full verification pipeline | ||
| - [Automated Code Review](/openhands/usage/use-cases/code-review) — The complementary code review agent | ||
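
The Quick Start workflow in this diff triggers only on `opened`, `ready_for_review`, and `labeled`, so it does not re-run when new commits are pushed to the PR branch. If QA on every push is wanted, one possible variant is sketched below. It assumes the action reference and its inputs (`llm-model`, `llm-api-key`, `github-token`) are exactly those shown in the diff. The `synchronize` event type and the `concurrency` block are standard GitHub Actions features added here for illustration and are not part of this PR.

```yaml
name: QA Changes

on:
  pull_request:
    # `synchronize` fires when new commits are pushed to the PR branch (assumed addition).
    types: [opened, ready_for_review, labeled, synchronize]

permissions:
  contents: read
  pull-requests: write
  issues: write

# Assumed addition: cancel an in-flight QA run when a newer push arrives for the same PR.
concurrency:
  group: qa-changes-${{ github.event.pull_request.number }}
  cancel-in-progress: true

jobs:
  qa:
    # Same conditions as the Quick Start workflow, plus non-draft pushes.
    if: |
      (github.event.action == 'opened' && github.event.pull_request.draft == false) ||
      (github.event.action == 'synchronize' && github.event.pull_request.draft == false) ||
      github.event.action == 'ready_for_review' ||
      (github.event.action == 'labeled' && github.event.label.name == 'qa-this')
    runs-on: ubuntu-latest
    steps:
      - name: Run QA Changes
        uses: OpenHands/extensions/plugins/qa-changes@main
        with:
          llm-model: anthropic/claude-sonnet-4-20250514
          llm-api-key: ${{ secrets.LLM_API_KEY }}
          github-token: ${{ secrets.GITHUB_TOKEN }}
```

Each push then starts a full QA run, so the `concurrency` group keeps only the latest run alive per PR. Whether that trade-off is worth the extra LLM usage depends on the repository.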
🔴 Critical: This links to `#automate-this`, but `qa-changes.mdx` has no `## Automate This` section. Every other use-case page linked from this automations overview (code-review, dependency-upgrades, incident-triage, vulnerability-remediation) has this section. Without it, the anchor silently falls back to the top of the page instead of scrolling to the automation setup content.

Fix: Add a `## Automate This` section to `qa-changes.mdx` following the pattern from `code-review.mdx` line 368.