QAMule

An agent-native Android testing framework — the agent itself is the tester.

中文 • Quick Start • How It Works • Practice Reference • MIT License

What Is This?

QAMule is an agent-first Android QA framework. It does not use AI to help you generate scripts first. Instead, it lets an AI agent operate the device directly to complete testing. Scripts still matter, but they are no longer the prerequisite for testing. They become an acceleration cache built after a behavior has already been validated.

The key shift is not "using AI to assist automation." The key shift is moving the test entry point from scripts to agents, then turning long-term reuse into KB entries, pytest assets, and datasets.

Paradigm shift:

	Traditional Test Automation	QAMule
Primary executor	Scripts	AI agent
Role of AI	Generate or maintain scripts	Execute the test directly
Scripts	Required upfront	Optional, used for faster replay
Failure handling	Snapshot only, environment is lost	Failure state is preserved so the agent can diagnose and try to recover
New scenarios	Write scripts first, then validate	Explore and test directly
Knowledge retention	Scattered across scripts and human memory	Continuously written into the KB for reuse

Core Highlights

The agent tests directly. When QAMule sees a new page, flow, or feature, its first response is not to patch scripts. It lets the agent inspect, try, and validate first. That makes it a natural fit for Android teams with fast-changing requirements, lagging regression coverage, and high exploration cost.
QA and Distiller have distinct roles. The QA agent handles exploration, validation, regression, issue reproduction, and eventually pytest script distillation for mature scenarios. The Distiller agent captures real visual interaction trajectories and turns them into VLM training data. Their outputs differ, but they share the same KB.
The KB makes experience reusable. Pages, elements, flows, dependency apps, and abnormal behaviors are continuously written into kb/. One exploration run does not only solve the current task. It also leaves context behind for future QA and Distiller runs.
Scripts are a cache of successful experience. Pytest is not the entry point to testing. It is the artifact of stable, validated scenarios. The agent handles change; scripts provide cheap regression.
Failure state is preserved. QAMule freezes the failing scene with pause-on-failure, so the agent can keep observing, analyzing, and sometimes recovering before teardown runs, instead of relying only on postmortem logs.

Quick Start

Prerequisites

An Android device connected over USB, or an emulator
UV for managing the Python environment and dependencies
ADB, the Android Debug Bridge

Installation

Install QAMule into your project as an agent plugin:

# GitHub Copilot
copilot plugin marketplace add lanbaoshen/agent-plugins
copilot plugin install QAMule@lanbaoshen

# Claude Code
/plugin marketplace add lanbaoshen/agent-plugins
/plugin install QAMule@lanbaoshen

# VS Code
# command + shift + p -> "Chat: Install plugin from Source" -> "lanbaoshen/agent-plugins"

Initialize the project structure:

/qamule:init

This copies config files, creates the project skeleton, and installs the base dependencies.

Usage

Explore a page:

@QAMule QA explore the Settings page

Test a feature:

@QAMule QA test whether the Bluetooth toggle in Settings works correctly

Collect one training trajectory:

@QAMule Distiller open Bluetooth in com.android.settings

Update the knowledge base:

/qamule:kb This app can only be launched through the UI

Query the knowledge base:

/qamule:kb How do I check the system version?

Review existing coverage:

/qamule:testcase Which features are covered right now?

Preview Dataset

/qamule:dataset Preview dataset

How It Works

QA Agent — test by doing, then distill

The QA agent follows a human-like testing loop:

Observe -> Plan -> Execute -> Verify -> Learn -> Record
	 ^                                               |
	 +-----------------------------------------------+

Observe: capture a screenshot and read the current screen
Plan: decide the next step from the goal and the KB
Execute: issue one device command
Verify: confirm whether the action produced the expected effect
Learn: write new findings into the KB
Record: generate pytest scripts for suitable scenarios to support later regression

Scripts are the distilled result of a successful test, not a prerequisite before the test starts.

Distiller Agent — collect real trajectories for vision models

Distiller does one thing only: capture high-quality, replayable trajectories from real device interaction.

It uses screenshots and the current app state as its main observation input, executes actions with absolute coordinates, and logs screenshots, actions, reasoning, foreground app, and results step by step into dataset/. It does not generate pytest scripts. Its focus is preserving real interaction behavior, including missteps, corrections, waiting, and recovery. It also reuses page and flow knowledge from the KB to reduce blind trial and error.

Knowledge Base — shared memory for both agents

kb/ is QAMule's long-term memory layer. It stores page information, element selectors, business flows, dependency apps, and known quirks.

QA Agent uses it to reduce repeated exploration and improve testing efficiency.
Distiller Agent uses it to understand the target app and context faster.

One of QAMule's core design choices is not asking a single agent to produce everything. Different agents collaborate around the same shared knowledge and accumulate it over time, while producing different kinds of testing and data assets.

Agents

Agent	Purpose	Output
QA	Exploratory testing, regression testing, issue reproduction	KB entries + pytest scripts
Distiller	Training data collection	Coordinate-based trajectory data

Skills

Skill	Responsibility
uiautomator2	Internal device operation skill, using `u2cli` for tapping, swiping, input, screenshots, and app management
kb	Read and write persistent app knowledge: pages, flows, selectors, and abnormal behavior
pytest	Internal pytest conventions for script structure, fixture contracts, run modes, and pause-on-failure usage
testcase	Look up existing cases before manual testing to avoid duplicate effort
dataset	Manage VLM training trajectories: naming, schema, and visual browsing
init	One-time project scaffolding

Design Principles

The agent is the tester. AI is not writing tests for another system. It executes the test itself.
Scripts are an acceleration layer, not the prerequisite. Pytest exists to reuse validated behavior, not to limit testing to what already has scripts.
QA and Distiller are separated, not blended. One produces testing assets, the other produces data assets.
The knowledge base is the collaboration layer. QA and Distiller share page understanding, flow knowledge, and exception experience through the KB, reducing repeated exploration.
Failure state matters more than logs. Preserve the state first, let the agent analyze in place, and try recovery when possible.

Who This Fits

Android teams with fast iteration and high script maintenance cost
Teams that want AI to participate in test execution, not just code generation
Teams that want to turn the testing process into long-term knowledge assets
Teams that care about both engineering test coverage and vision-operation model data accumulation

Dependencies

uiautomator2 — Android automation library
uiautomator2-cli — CLI wrapper designed for agent usage
pytest — structured testing framework

Name		Name	Last commit message	Last commit date
Latest commit History 32 Commits
.claude-plugin		.claude-plugin
.github/plugin		.github/plugin
agents		agents
skills		skills
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
README_ZH.md		README_ZH.md
distiller-agent-workflow-bilingual.excalidraw.svg		distiller-agent-workflow-bilingual.excalidraw.svg
logo.svg		logo.svg
plugin.json		plugin.json
qa-agent-workflow-bilingual.excalidraw.svg		qa-agent-workflow-bilingual.excalidraw.svg

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

QAMule

What Is This?

Core Highlights

Quick Start

Prerequisites

Installation

Usage

How It Works

QA Agent — test by doing, then distill

Distiller Agent — collect real trajectories for vision models

Knowledge Base — shared memory for both agents

Agents

Skills

Design Principles

Who This Fits

Dependencies

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

QAMule

What Is This?

Core Highlights

Quick Start

Prerequisites

Installation

Usage

How It Works

QA Agent — test by doing, then distill

Distiller Agent — collect real trajectories for vision models

Knowledge Base — shared memory for both agents

Agents

Skills

Design Principles

Who This Fits

Dependencies

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages