diff --git a/skills/dotagents-qa/Dockerfile b/skills/dotagents-qa/Dockerfile
new file mode 100644
index 0000000..9686cab
--- /dev/null
+++ b/skills/dotagents-qa/Dockerfile
@@ -0,0 +1,24 @@
+FROM node:24-bookworm
+
+ENV CI=1
+ENV PNPM_HOME=/pnpm
+ENV COREPACK_HOME=/corepack
+ENV PATH=/pnpm:$PATH
+
+RUN apt-get update \
+ && apt-get install -y --no-install-recommends \
+ ca-certificates \
+ git \
+ jq \
+ less \
+ openssh-client \
+ ripgrep \
+ && rm -rf /var/lib/apt/lists/*
+
+RUN mkdir -p "$PNPM_HOME" "$COREPACK_HOME" \
+ && corepack enable \
+ && corepack prepare pnpm@10.28.2 --activate
+
+WORKDIR /sandbox
+
+CMD ["bash"]
diff --git a/skills/dotagents-qa/SKILL.md b/skills/dotagents-qa/SKILL.md
index ad8f838..8bcde77 100644
--- a/skills/dotagents-qa/SKILL.md
+++ b/skills/dotagents-qa/SKILL.md
@@ -1,38 +1,121 @@
 ---
 name: dotagents-qa
-description: QA dotagents behavior changes with a local fixture project. Use when changes may affect dotagents install, sync, list, doctor, skill placement, agent symlinks, MCP or hook config generation, user scope, or package/runtime behavior and an agent needs to prove a representative agents.toml setup still installs correctly.
+description: QA dotagents behavior changes in a Docker sandbox. Use when changes may affect dotagents install, sync, list, doctor, skill placement, agent symlinks, MCP or hook config generation, user scope, or package/runtime behavior.
 ---

 # dotagents QA

-Answer the practical question: "With this local dotagents build, does a representative `agents.toml` still install and wire the expected files?"
+Do real QA for the change in front of you. Docker is the safety boundary, not
+the test plan: use it so dotagents cannot write to host agent config, host home
+directories, or host cache state while you build fixtures that prove the
+changed behavior.

-## Workflow
+## 1. Understand The Change

-1. Inspect the changed surface:
- `git status --short`
- `git diff --stat`
- `git diff -- `
-2. Build the local CLI before smoke testing:
+Start from the diff and identify the behavior that could regress:

 ```bash
-pnpm build
+git status --short
+git diff --stat
+git diff -- 
 ```

-3. Create a temp project that exercises the changed behavior. Start here and add only what the change needs.
+Write down the QA target before running commands:
+- Which command path changed: `install`, `sync`, `list`, `doctor`, `add`,
+ `remove`, `mcp`, `trust`, `init`, package runtime, or scope resolution.
+- Which surfaces must be inspected: `.agents/skills`, agent skill symlinks,
+ MCP config, hook config, lockfile, gitignore, CLI output, or user scope.
+- Which fixture shape proves it: local skills, nested skills, wildcard source,
+ specific agents, MCP entries, hooks, existing broken state, or remote source.
+
+Run focused Vitest coverage for logic bugs. Use this skill for end-to-end QA
+evidence, not as a substitute for regression tests.
+
+## 2. Enter A Docker Sandbox
+
+Build the repo-local QA image when it is missing, when this Dockerfile changes,
+or when the repo `packageManager` pnpm version changes:
+
+```bash
+docker build \
+ -f skills/dotagents-qa/Dockerfile \
+ -t dotagents-qa:local \
+ skills/dotagents-qa
+```
+
+Use an interactive container so the QA steps stay change-specific:

 ```bash
-set -euo pipefail
 REPO="$(pwd)"
-TMP="$(mktemp -d)"
-mkdir -p "$TMP/local-skills" "$TMP/home" "$TMP/state"
-for skill in review commit; do
- mkdir -p "$TMP/local-skills/$skill"
- printf -- "---\nname: %s\ndescription: Fixture %s skill.\n---\n\n%s fixture.\n" \
- "$skill" "$skill" "$skill" > "$TMP/local-skills/$skill/SKILL.md"
-done
-
-cat > "$TMP/agents.toml" <<'EOF'
+OUT="$(mktemp -d "${TMPDIR:-/tmp}/dotagents-qa.XXXXXX")"
+docker run --rm -it \
+ -v "$REPO:/host-repo:ro" \
+ -v "$OUT:/qa-out" \
+ dotagents-qa:local
+```
+
+If your tool environment is not attached to a TTY, use `-i` instead of `-it`
+and feed the same commands with a here-doc. Keep `-i`; without stdin attached,
+the container shell will receive no script.
+
+Inside the container:
+
+```bash
+set -euo pipefail
+export CI=1
+export HOME=/sandbox/home
+export DOTAGENTS_STATE_DIR=/sandbox/state
+export DOTAGENTS_HOME=/sandbox/user-agents
+
+mkdir -p "$HOME" "$DOTAGENTS_STATE_DIR" "$DOTAGENTS_HOME" /sandbox/repo
+tar -C /host-repo \
+ --exclude=.git \
+ --exclude=node_modules \
+ --exclude=.turbo \
+ --exclude=coverage \
+ --exclude=core \
+ --exclude='*.tsbuildinfo' \
+ --exclude='packages/*/dist' \
+ -cf - . | tar -C /sandbox/repo -xf -
+
+cd /sandbox/repo
+pnpm install --frozen-lockfile
+pnpm build
+```
+
+Run `pnpm check` inside Docker unless the change requires a narrower
+target or the check is already known to be unrelated. If `build` or `check`
+fails, treat that as a QA finding and stop before fixture work unless you are
+explicitly isolating the playbook mechanics. If skipped or bypassed, report why.
+
+## 3. Build The Fixture
+
+Create the smallest fixture that proves the changed behavior. Start from this
+shape only when it fits; edit it aggressively for the diff you are testing.
+
+```bash
+fixture=/sandbox/fixture
+mkdir -p "$fixture/local-skills/review" "$fixture/local-skills/commit"
+
+cat > "$fixture/local-skills/review/SKILL.md" <<'SKILL'
+---
+name: review
+description: Fixture review skill.
+---
+
+Review fixture.
+SKILL
+
+cat > "$fixture/local-skills/commit/SKILL.md" <<'SKILL'
+---
+name: commit
+description: Fixture commit skill.
+---
+
+Commit fixture.
+SKILL
+
+cat > "$fixture/agents.toml" <<'TOML'
 version = 1
 agents = ["claude", "cursor"]

@@ -52,27 +135,43 @@ args = ["-e", "process.exit(0)"]
 [[hooks]]
 event = "Stop"
 command = "echo fixture"
-EOF
+TOML
 ```

-4. Run the local CLI from inside the temp project, with home/cache isolated:
+Useful fixture changes:
+- Skill resolution: nested `skills/` layouts, wildcard sources, duplicate
+ names, local paths outside `.agents`, or the exact `path:` shape touched.
+- Agent placement: include only affected agents, or include all supported
+ agents when shared config or registry behavior changed.
+- MCP and hooks: use the exact command, URL, headers, env refs, hook event, or
+ matcher affected by the diff.
+- Sync and doctor: pre-create broken or legacy state, then prove repair and
+ diagnostics.
+- User scope: run from outside a project with `--user` and inspect
+ `$DOTAGENTS_HOME`; never use the host home directory.
+- Remote sources: use the real source and trust policy only when source,
+ trust, git, well-known, or network behavior changed.
+
+## 4. Exercise The CLI
+
+Run the built local CLI from the fixture. Capture output and inspect generated
+files, not just exit codes.

 ```bash
-set -euo pipefail
-export HOME="$TMP/home" DOTAGENTS_STATE_DIR="$TMP/state"
-cd "$TMP"
-node "$REPO/packages/dotagents/dist/cli/index.js" install
-node "$REPO/packages/dotagents/dist/cli/index.js" list > "$TMP/list.out"
-node "$REPO/packages/dotagents/dist/cli/index.js" doctor --fix
-node "$REPO/packages/dotagents/dist/cli/index.js" doctor
+cli=(node /sandbox/repo/packages/dotagents/dist/cli/index.js)
+cd "$fixture"
+
+"${cli[@]}" install | tee /qa-out/install.out
+"${cli[@]}" list | tee /qa-out/list.out
+"${cli[@]}" doctor --fix | tee /qa-out/doctor-fix.out
+"${cli[@]}" doctor | tee /qa-out/doctor.out
 ```

-5. Assert the behavior that matters:
+Assert what matters for this change. Examples:

 ```bash
-set -euo pipefail
-grep -q "review" "$TMP/list.out" && grep -q "commit" "$TMP/list.out"
-test -f .agents/skills/review/SKILL.md && test -f .agents/skills/commit/SKILL.md
+grep -q "review" /qa-out/list.out
+test -f .agents/skills/review/SKILL.md
 test -L .claude/skills
 test -f .mcp.json
 test -f .cursor/mcp.json
@@ -80,50 +179,50 @@ test -f .claude/settings.json
 test -f .cursor/hooks.json
 ```

-For `sync` changes, break generated state and assert repair:
+For `sync` changes, break the generated state in the way the diff claims to
+repair, then verify the repair:

 ```bash
-set -euo pipefail
 rm .mcp.json .claude/skills
-node "$REPO/packages/dotagents/dist/cli/index.js" sync
+"${cli[@]}" sync | tee /qa-out/sync.out
 test -f .mcp.json
 test -L .claude/skills
 ```

-## Adjust The Fixture
-
-- Skill resolution changes: use multiple local skills, nested `skills/` directories, or the exact `path:` layout touched by the change.
-- Agent placement changes: edit `agents = [...]` and assert expected symlinks/config files.
-- MCP or hook changes: include representative `[[mcp]]` or `[[hooks]]` entries and inspect generated JSON/TOML.
-- User-scope changes: set `DOTAGENTS_HOME="$TMP/user-home"` and pass `--user`; never write to the real home directory.
-- Package/runtime changes: run the built CLI as above, or pack/install the package only when the packaging path itself changed.
-- Remote source changes: use `getsentry/skills`; avoid remotes for ordinary install-location checks.
-
-## Optional Agent CLI Checks
-
-Use only when discovery/registration changed.
+For user-scope changes:

 ```bash
-set -euo pipefail
-mkdir -p "$TMP/codex-home"
-CODEX_HOME="$TMP/codex-home" codex debug prompt-input "probe skills" > "$TMP/codex-prompt.json"
-grep -q "review" "$TMP/codex-prompt.json"
+cd /sandbox
+"${cli[@]}" --user install | tee /qa-out/user-install.out
+test -f "$DOTAGENTS_HOME/agents.toml"
+test -d "$DOTAGENTS_HOME/skills"
 ```

-Claude: no cheap dry-run skill list. If auth/network/model cost is acceptable, run a minimal non-interactive `/skill-name` prompt from `$TMP`; otherwise report skipped.
+Copy useful evidence before leaving the container:

-## Optional Remote Check
+```bash
+cp -a "$fixture" /qa-out/fixture
+cp -a "$DOTAGENTS_HOME" /qa-out/user-agents 2>/dev/null || true
+```

-Use only when source resolution, trust, git, well-known, or network behavior changed.
-```toml
-[[skills]]
-name = "code-review"
-source = "getsentry/skills"
-```
+## 5. Real Agent Clients
+
+Use real clients only when discovery or registration in Claude, Cursor, Codex,
+VS Code, or OpenCode changed. Docker proves generated files and symlinks; it
+does not prove an installed host client notices them.

-Run `pnpm check` after the smoke test unless there is a concrete reason to run only a targeted package check.
+Keep host-client checks isolated with explicit temp homes/config dirs where
+the client supports it. If a client cannot run without reading host state, say
+so and report the Docker-generated files you inspected instead.

-## Report
+## 6. Report Evidence

-Include the temp path if it remains for debugging, the exact fixture shape, commands run, assertions made, and any skipped checks.
+Report:
+- the changed behavior you targeted
+- Docker image and setup used
+- fixture shape and why it matched the diff
+- commands run
+- generated files or command output inspected
+- assertions that passed
+- `/qa-out` host path if retained for debugging
+- skipped checks and residual risk
diff --git a/skills/dotagents-qa/SOURCES.md b/skills/dotagents-qa/SOURCES.md
index 303cb55..e2c548b 100644
--- a/skills/dotagents-qa/SOURCES.md
+++ b/skills/dotagents-qa/SOURCES.md
@@ -9,17 +9,26 @@
 | `specs/SPEC.md` | canonical `.agents/skills/`, agent symlink, MCP, hook, install, sync, list, and doctor behavior |
 | `packages/dotagents/src/agents/definitions/claude.ts` | Claude project skills and generated config paths |
 | `packages/dotagents/src/agents/definitions/cursor.ts` | Cursor shares `.claude/skills` and writes Cursor-specific MCP/hook config |
+| `packages/dotagents/src/agents/definitions/vscode.ts` | VS Code reads `.agents/skills` natively and writes `.vscode/mcp.json` plus shared Claude-style hooks |
+| `packages/dotagents/src/agents/definitions/codex.ts` | Codex reads `.agents/skills` natively and writes `.codex/config.toml` |
+| `packages/dotagents/src/agents/definitions/opencode.ts` | OpenCode reads `.agents/skills` natively and writes `opencode.json` |
 | `packages/dotagents/src/cli/cache.ts` | `DOTAGENTS_STATE_DIR` cache isolation |
 | `packages/dotagents/src/cli/update-notifier.ts` | `HOME` isolation for update-check cache |
 | `skills/dotagents/SKILL.md` | sibling skill layout |
+| `/Users/dcramer/src/sentry-mcp/.agents/skills/mcp-qa/SKILL.md` | Numbered QA flow, primary path guidance, optional client checks, and pass criteria structure |
 | local `codex debug prompt-input` probe | verifies Codex can expose `.agents/skills` metadata without a model call |

 ## Decisions

-- Keep the runtime skill as an inline smoke-test guide rather than a broad QA matrix.
-- Use local `path:` skills by default so the fixture tests dotagents wiring without network/trust noise.
-- Build first and run `packages/dotagents/dist/cli/index.js` from inside the temp project.
-- Include representative local skills, agents, MCP, and hooks because those cover the main generated outputs.
+- Keep the runtime skill as a guided QA playbook rather than a broad QA matrix or fixed smoke harness.
+- Provide a repo-local Dockerfile for the sandbox/toolchain only; keep fixture design and assertions manual.
+- Keep the Dockerfile's prepared pnpm version aligned with the root `packageManager`.
+- Mount the host checkout read-only and do dependency install/build inside Docker to avoid host system-file writes and host binary assumptions.
+- Exclude `*.tsbuildinfo` with `dist` so TypeScript project references rebuild cleanly instead of reusing stale host incremental state.
+- Document `-i` for non-TTY Docker runs because stdin must stay attached when an agent feeds commands through a here-doc.
+- Make agents derive the fixture from the diff first, then use local `path:` skills, specific agents, MCP entries, hooks, broken state, user scope, or remote sources as needed.
+- Run `pnpm check` and `pnpm build` inside Docker when appropriate, then run `packages/dotagents/dist/cli/index.js` from inside the fixture project.
+- Require inspection of generated files and command output that demonstrate the changed behavior, not only exit codes.
 - Expose the skill through `.agents/skills/dotagents-qa` so existing Claude/Cursor symlinks discover it.
 - Keep agent CLI registration and remote-source checks optional because they add auth, network, or tool-version variance.
diff --git a/skills/dotagents-qa/SPEC.md b/skills/dotagents-qa/SPEC.md
index f588fe2..782c9d5 100644
--- a/skills/dotagents-qa/SPEC.md
+++ b/skills/dotagents-qa/SPEC.md
@@ -2,32 +2,40 @@

 ## Intent

-Provide a concise workflow for proving that local dotagents changes still install a representative `agents.toml` setup into the expected project locations.
+Provide a concise, Docker-sandboxed playbook for performing change-specific dotagents QA without writing to host agent config or host home directories.

 ## Scope

 In scope:
 - building and running the local dotagents CLI
-- creating disposable project fixtures with local skill sources
+- designing disposable project fixtures inside Docker based on the changed behavior
 - verifying `.agents/skills/`, agent skills symlinks, MCP configs, hook configs, `list`, `doctor`, and `sync`
+- verifying `--user` behavior with `DOTAGENTS_HOME` inside Docker
 - optionally checking agent CLI registration when discovery paths changed
 - optionally using `getsentry/skills` for remote source behavior
-- isolating home and cache state during smoke tests
+- isolating home and cache state from the host

 Out of scope:
 - release publishing
 - network-backed source testing for ordinary install-location changes
 - replacing focused Vitest regression coverage for logic bugs
+- treating a fixed smoke script as sufficient QA for behavior-specific changes

 ## Runtime Contract

-- Start from the changed dotagents behavior and customize the fixture only as needed.
-- Run the built CLI from inside the temp project so project scope resolves correctly.
-- Isolate `HOME`, `DOTAGENTS_STATE_DIR`, and `DOTAGENTS_HOME` for user-scope checks.
+- Start from the changed dotagents behavior and state the behavior being proved before running commands.
+- Use the repo-local QA Dockerfile for a stable Node/pnpm/system-tool baseline.
+- Mount the host checkout read-only and copy it into a throwaway container before installing dependencies or building.
+- Exclude host build artifacts and TypeScript build-info files from the container copy.
+- Support both interactive Docker sessions and stdin-attached non-TTY runs.
+- Run the built CLI from inside the container fixture project so project scope resolves correctly.
+- Inspect generated files and command output that demonstrate the changed behavior, not just exit codes.
+- Keep `HOME`, `DOTAGENTS_STATE_DIR`, and `DOTAGENTS_HOME` inside Docker for user-scope checks.
 - Report fixture shape, commands, assertions, skipped checks, and residual risk.

 ## Maintenance

-- Keep `SKILL.md` focused on temp-project smoke testing.
-- Update the fixture when dotagents changes config fields, generated file locations, or supported agents.
-- Add scripts only if the same fixture becomes stable enough to automate.
+- Keep `SKILL.md` focused on guided, change-specific Docker QA.
+- Keep examples editable and illustrative; do not turn the skill into a fixed test harness.
+- Keep the Dockerfile pnpm version aligned with the root `packageManager`.
+- Update examples when dotagents changes config fields, generated file locations, or supported agents.
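
The maintenance rule about keeping the Dockerfile pnpm version aligned with the root `packageManager` could be checked mechanically rather than by eye. The following is only a sketch under stated assumptions: the helper names (`dockerfile_pnpm_version`, `package_manager_pnpm_version`, `check_pnpm_alignment`) are hypothetical and not part of this change, the Dockerfile is assumed to pin pnpm through a `corepack prepare pnpm@<version>` line as in the diff, and the root `package.json` is assumed to carry a Corepack-style `"packageManager": "pnpm@<version>"` field, optionally followed by a `+sha...` suffix.

```bash
# Hypothetical helpers; names and file layout are assumptions, not part of the PR.

dockerfile_pnpm_version() {
  # Print the version from the first "corepack prepare pnpm@X.Y.Z" line, then stop.
  sed -n '/corepack prepare pnpm@[0-9]/{s/.*corepack prepare pnpm@\([0-9][0-9.]*\).*/\1/p;q;}' "$1"
}

package_manager_pnpm_version() {
  # Print the version from the first "packageManager" line; a "+sha..." suffix is dropped.
  sed -n '/"packageManager"/{s/.*"pnpm@\([0-9][0-9.]*\).*/\1/p;q;}' "$1"
}

check_pnpm_alignment() {
  # Usage: check_pnpm_alignment <Dockerfile> <package.json>
  # Returns 0 when both versions match, 1 on mismatch, 2 when either is unreadable.
  local dockerfile="$1" package_json="$2"
  local docker_version package_version
  docker_version="$(dockerfile_pnpm_version "$dockerfile")"
  package_version="$(package_manager_pnpm_version "$package_json")"

  if [ -z "$docker_version" ] || [ -z "$package_version" ]; then
    echo "could not read a pnpm version from $dockerfile or $package_json" >&2
    return 2
  fi
  if [ "$docker_version" != "$package_version" ]; then
    echo "pnpm mismatch: Dockerfile=$docker_version packageManager=$package_version" >&2
    return 1
  fi
  echo "pnpm versions aligned at $docker_version"
}
```

From the repo root this would be called as `check_pnpm_alignment skills/dotagents-qa/Dockerfile package.json`. It deliberately uses `sed` instead of `jq` so it can run on a host where the QA image has not been built yet; inside the container, `jq -r .packageManager package.json` would be a reasonable alternative.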