Add create-test-cases skill — feature-first manual test-case authoring #62

Open
gololdf1sh wants to merge 5 commits into testomatio:master from gololdf1sh:skill/create-test-cases

gololdf1sh commented Apr 21, 2026

Flow at a glance

Main flow on the left, subagents block on the right. The ═══▶ arrows show which step spawns which subagent; the subagent runs in its own isolated context and returns a 1-line summary to the parent conversation.

┌──────────────────────────────────────────────────────────────────┐
│  PRE-STEP — Resume Detector                                      │
│  Runs BEFORE every step. Scans disk without reading file bodies. │
│  Derives $RESUME_FROM from 7 states: empty / feature-partial /   │
│  feature-done / sub-feature-partial / sub-feature-ready /        │
│  sub-feature-done / all-features-done.                           │
└──────────────────────────────────────────────────────────────────┘
                        │
                        ▼
┌──────────────────────────────────────────────────────────────────┐      ╔══════════════════════════════════╗
│  FEATURE PHASE  (ONE conversation — ends with HARD STOP)         │      ║  SUBAGENTS (isolated contexts)   ║
│  Produces shared baselines consumed by every sub-feature.        │      ║                                  ║
│                                                                  │      ║  ┌────────────────────────────┐  ║
│    Step 0  Intake questionnaire (5 questions, one at a time)     │      ║  │ 🧭 ui-explorer             │  ║
│            → intake.md                                           │      ║  │   mode: feature-baseline   │  ║
│                                                                  │      ║  │   — Playwright walkthrough │  ║
│    Step 1  Gather feature data (in parallel)                     │      ║  │   — browser_snapshot ×N    │  ║
│       1.1  UI exploration of the whole feature ══════════════════╪══════╪═▶│   — write _shared-ui.md    │  ║
│            → _shared-ui.md + sub-feature candidates              │      ║  │   ← "Cataloged 8 surfaces" │  ║
│       1.2  Extract ACs from docs (MCP / WebFetch / paste)        │      ║  └────────────────────────────┘  ║
│            → _ac-baseline.md                                     │      ║                                  ║
│       1.3  Existing-steps library (optional)                     │      ║  ┌────────────────────────────┐  ║
│            → _existing-steps.md                                  │      ║  │ 🧭 ui-explorer             │  ║
│       1.4  Destructuring + cross-cutting concerns                │      ║  │   mode: sub-feature-delta  │  ║
│            → destructuring.md                                    │      ║  │   — reads _shared-ui.md    │  ║
│       1.5  Single user-approval gate                ⛔ HARD STOP │      ║  │   — catalog ONLY delta     │  ║
└──────────────────────────────────────────────────────────────────┘      ║  │   — write {S}-ui-delta.md  │  ║
                        │                                                 ║  │   ← "4 delta surfaces"     │  ║
                        ▼   (new conversation per sub-feature)            ║  └────────────────────────────┘  ║
┌──────────────────────────────────────────────────────────────────┐      ║                                  ║
│  SUB-FEATURE PHASE  (ONE conversation per sub-feature, ×N)       │      ║  ┌────────────────────────────┐  ║
│                                                                  │      ║  │ ✅ test-case-reviewer      │  ║
│    Step 2  Thin slice                                            │      ║  │   ONLY if tests ≥ 15       │  ║
│       2.1  AC delta for the sub-feature   → {S}-ac-delta.md      │      ║  │   — 12 Bash gates          │  ║
│       2.2  UI delta                       → {S}-ui-delta.md ═════╪══════╪═▶│   — 11 semantic checks     │  ║
│       2.3  AC↔UI cross-validation         (no file)              │      ║  │   — auto-fix safe items    │  ║
│       2.4  Scope contract + user gate     → {S}-scope.md         │      ║  │   — violations report      │  ║
│                                                                  │      ║  │   ← "Found 4, fixed 3,     │  ║
│    Step 3  Generate                                              │      ║  │      escalated 1"          │  ║
│       Phase 0  Feed cross-cutting concerns from destructuring.md │      ║  └────────────────────────────┘  ║
│       Phase 1  Checklist + flat-vs-nested decision               │      ║                                  ║
│       Phase 2  Full test cases                                   │      ║  ┌────────────────────────────┐  ║
│               (reads _style.md if present)                       │      ║  │ 🔍 ui-validator            │  ║
│       Phase 3a Mechanical checks  ═══════════════════════════════╪══════╪═▶│   — Playwright walkthrough │  ║
│               < 15 tests: inline                                 │      ║  │   — pick 2-3 repr. tests   │  ║
│               ≥ 15 tests: test-case-reviewer subagent            │      ║  │   — verify vs. real UI     │  ║
│       Phase 3b UI reality check  ════════════════════════════════╪══════╪═▶│   — edit MD inline on fix  │  ║
│               (mandatory)                                        │      ║  │   — separate audit log     │  ║
│       Phase 4  Single user-approval gate                         │      ║  │   ← "Walked 3, fixed 2"    │  ║
│                                                                  │      ║  └────────────────────────────┘  ║
│    Step 4  Report + update tracker in destructuring.md           │      ║                                  ║
│            + write _style.md (ONLY on 1st approval) ⛔ HARD STOP │      ║  Principle:                      ║
└──────────────────────────────────────────────────────────────────┘      ║  • parent spawns via Agent()     ║
                        │                                                 ║  • Playwright / heavy dumps      ║
                        ▼  (when ALL sub-features in destructuring.md [x])║    live inside the subagent      ║
┌──────────────────────────────────────────────────────────────────┐      ║  • parent sees 1 summary line    ║
│  HANDOFF                                                         │      ║    — context stays clean         ║
│    /publish-test-cases-batch test-cases/{F}/                     │      ╚══════════════════════════════════╝
│    (separate skill — one branch per sub-feature: tc/{F}/{S})     │
└──────────────────────────────────────────────────────────────────┘
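
As a rough illustration of the Phase 3a / 3b routing in the diagram, the sketch below picks where the mechanical checks run based on the test count. The function and constant names are hypothetical and not taken from the skill files; only the 15-test threshold, the "mandatory" note on Phase 3b, and the subagent names come from the diagram.

```python
REVIEWER_THRESHOLD = 15  # diagram: "< 15 tests: inline", ">= 15 tests: test-case-reviewer"


def plan_review(test_count: int) -> list[tuple[str, str]]:
    """Return (phase, runner) pairs for the Step 3 review of one sub-feature suite."""
    if test_count < 0:
        raise ValueError("test_count must be non-negative")
    mechanical = "test-case-reviewer" if test_count >= REVIEWER_THRESHOLD else "inline"
    # Phase 3b is marked mandatory in the diagram, so it is always scheduled.
    return [("3a", mechanical), ("3b", "ui-validator")]
```

For a 14-test suite this keeps the mechanical checks in the parent conversation; at 15 tests they move to the subagent.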

Key terms

| Term | What it is |
| --- | --- |
| Feature | The top-level area under test (e.g. "Manual Tests Execution"). One folder test-cases/{feature}/. |
| Sub-feature | A self-contained area of the feature that earns its own suite (e.g. run-creation, environment-configuration). Criteria below. |
| AC (acceptance criterion) | One testable behavioral assertion extracted from docs / requirements. Every AC must be cited by source: in ≥ 1 test case (blocking gate). |
| AC baseline (_ac-baseline.md) | Feature-wide AC list produced once in Step 1.2. Consumed by every sub-feature. |
| AC delta ({S}-ac-delta.md) | ACs specific to sub-feature {S} that aren't in the baseline. Produced per sub-feature in Step 2.1. |
| UI catalog baseline (_shared-ui.md) | Shared UI surfaces (screens, modals, forms, buttons, inputs) cataloged once by ui-explorer in mode: feature-baseline. |
| UI delta ({S}-ui-delta.md) | UI elements specific to sub-feature {S} — what exists only inside this suite and not in the shared catalog. Cataloged by ui-explorer in mode: sub-feature-delta. |
| Destructuring (destructuring.md) | The sub-feature map + cross-cutting concerns. Output of Step 1.4 — the single source of truth for "what suites will this feature have". |
| Cross-cutting concern | A requirement that affects multiple sub-features (permissions, multi-env, audit, i18n, a11y). Tracked once in destructuring.md, fed into every sub-feature slice in Step 3 Phase 0 so each suite has ≥ 1 dedicated test. |
| Scope contract ({S}-scope.md) | What the sub-feature generation will / will not cover. Explicit user-approval gate before generation starts. |
| Style carry-over (_style.md) | Naming / formatting decisions captured on the first approved sub-feature, reused by every subsequent sub-feature so the whole feature stays stylistically consistent. |
| Feature phase | Runs once per feature, produces the shared baselines. Ends with HARD STOP → each sub-feature gets its own conversation. |
| Sub-feature phase | Runs per sub-feature: thin slice (ac-delta + ui-delta + scope) → generation → report. Ends with HARD STOP. |
| HARD STOP | A checkpoint that ends the current conversation — forces a fresh context for the next phase so token pressure doesn't leak state between scopes. |
| Resume Detector | Pre-step that runs before every invocation — scans disk (no body reads), derives state from 7 possible, routes to the right step. Makes interrupted runs safe. |
| Subagent | Isolated-context worker spawned via Agent(). Three in this skill: ui-explorer (Playwright), test-case-reviewer (self-review when tests ≥ 15), ui-validator (UI reality walk). Parent sees only a 1-line summary. |
| Flat vs nested | Layout decision in Step 3 Phase 1. Nested (directory of MD files) when ≥ 2 natural sections of ≥ 3 tests each exist; flat (single MD) otherwise. Total test count is not part of this decision. |
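
The flat-vs-nested rule can be sketched in a few lines. This is a minimal illustration, assuming the checklist from Step 3 Phase 1 is already grouped into named sections with a test count each; `choose_layout` and the section names are hypothetical. It reads "≥ 2 natural sections of ≥ 3 tests each" as "at least two sections that each hold at least three tests".

```python
def choose_layout(section_sizes: dict[str, int]) -> str:
    """Return "nested" or "flat" for one sub-feature suite.

    section_sizes maps a section name to its planned number of tests.
    """
    # Only sections with at least 3 tests count toward the nested decision.
    qualifying = [name for name, tests in section_sizes.items() if tests >= 3]
    # Total test count is deliberately ignored, as the rule states.
    return "nested" if len(qualifying) >= 2 else "flat"
```

Under this reading, a single 20-test section still yields a flat file, while two 3-test sections yield a nested directory.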

How we decide what is a sub-feature

During Step 1.4 (destructuring) the skill walks the feature's UI catalog and applies the following rule: an area becomes a sub-feature (= its own suite) when it meets ≥ 2 of:

  1. Has a form with 3+ fields (inputs, dropdowns, toggles).
  2. Has its own CRUD lifecycle (create / read / update / delete).
  3. Has 3+ distinct states or modes (e.g. pending / in-progress / finished).
  4. Has its own entry point or navigation (separate tab, panel, dialog).
  5. Has interactions with 2+ other entities (e.g. environment selection affects run + report).

Edge cases:

  • Area meets exactly 1 criterion → it is a section inside a larger sub-feature, not its own suite.
  • Area meets 4+ criteria → flagged as deep, estimated ≥ 30 tests, may be split into nested sub-suites in Step 3 Phase 1.
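
The rule and its edge cases above can be expressed as a small classifier. This is a sketch only: `AreaSignals`, its field names, and the returned labels are hypothetical, and the description does not say what happens when an area meets none of the criteria, so that branch is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AreaSignals:
    """Observations about one UI area (hypothetical shape, for illustration)."""
    form_fields: int = 0               # criterion 1: form with 3+ fields
    has_crud_lifecycle: bool = False   # criterion 2: own CRUD lifecycle
    distinct_states: int = 0           # criterion 3: 3+ distinct states or modes
    has_own_entry_point: bool = False  # criterion 4: own entry point / navigation
    related_entities: int = 0          # criterion 5: interacts with 2+ other entities


def criteria_met(area: AreaSignals) -> int:
    """Count how many of the five criteria the area satisfies."""
    return sum([
        area.form_fields >= 3,
        area.has_crud_lifecycle,
        area.distinct_states >= 3,
        area.has_own_entry_point,
        area.related_entities >= 2,
    ])


def classify(area: AreaSignals) -> str:
    met = criteria_met(area)
    if met >= 4:
        return "deep-sub-feature"  # flagged as deep, may be split in Step 3 Phase 1
    if met >= 2:
        return "sub-feature"       # earns its own suite
    if met == 1:
        return "section"           # lives inside a larger sub-feature
    return "none"                  # assumption: the zero-criteria case is not specified
```

For example, an area with a four-field form and its own dialog meets two criteria and becomes a sub-feature; the same form without a separate entry point stays a section.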

The resulting sub-feature map is written to destructuring.md, reviewed with the user at the end of the feature phase (Step 1.5 — Single user-approval gate), and then drives the sub-feature phase one suite at a time.
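
To make "one suite at a time" concrete, here is a minimal sketch of picking the next sub-feature from the tracker. It assumes the tracker in destructuring.md uses Markdown task-list lines (`- [ ] name` / `- [x] name`), which matches the `[x]` shown in the flow diagram but is otherwise an assumption; the helper names are hypothetical.

```python
import re
from typing import Optional

# Assumed tracker line shape: "- [ ] run-creation" or "- [x] run-creation".
TRACKER_LINE = re.compile(r"^\s*[-*]\s*\[( |x|X)\]\s+(.+?)\s*$")


def parse_tracker(markdown: str) -> list[tuple[str, bool]]:
    """Return (sub_feature, done) pairs in document order."""
    items = []
    for line in markdown.splitlines():
        match = TRACKER_LINE.match(line)
        if match:
            items.append((match.group(2), match.group(1).lower() == "x"))
    return items


def next_sub_feature(markdown: str) -> Optional[str]:
    """Return the first unchecked sub-feature, or None when all are checked."""
    for name, done in parse_tracker(markdown):
        if not done:
            return name
    return None
```

A `None` result corresponds to the handoff condition in the diagram, where every sub-feature is marked `[x]`.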

Files

  • skills/create-test-cases/SKILL.md — entry point (router).
  • skills/create-test-cases/intake-questionnaire.md + intake-examples.md
  • skills/create-test-cases/steps/ — 5 step files (00-intake, 01-feature-phase, 10-sub-feature-slice, 20-generate, 40-report).
  • skills/create-test-cases/references/ — 8 reference files (artifacts, testing-strategy, test-case-format, self-review-checks, destructuring, product-context, ui-catalog-format, troubleshooting).
  • plugins/test-management/skills/create-test-cases — symlink (plugin wrapper).
  • README.md — skill listed in the Test Management table.

claude added 5 commits April 21, 2026 17:33
Deep, feature-first manual test-case authoring skill. Complements the
existing generate-test-cases skill by offering a heavier workflow:

- Two-phase flow: one feature phase produces shared baselines
  (_shared-ui.md, _ac-baseline.md, _existing-steps.md, destructuring
  map), each sub-feature then runs in its own conversation as a thin
  slice (ac-delta, ui-delta, scope) + generation + report.
- Resume Detector pre-step — scans disk, derives $RESUME_FROM from 7
  states, never re-does finished work.
- Subagent fan-out — Playwright exploration, mechanical self-review
  (>=15 tests), and UI reality check all run in isolated contexts and
  return 1-line summaries so the parent conversation stays clean.
- Artifact-driven contracts — every step declares precondition, input,
  output, postcondition, idempotency, and retry policy.

Includes a full worked example under examples/:
- examples/flow-diagram.md — ASCII diagram of the full flow.
- examples/generatedTests/ — real run output for Testomat.io's Manual
  Tests Execution feature (10 sub-features, feature- and sub-feature-
  level artifacts, nested test-case MDs).
- examples/generatedDocs/ — downstream BA-style product documentation
  derived from the test cases (use cases, business rules, state
  diagrams, traceability matrix) — illustrates what's possible once a
  solid test-case baseline exists.

Publishing is intentionally not part of this skill — approved MDs are
handed off to sync-cases (or local /publish-test-cases-batch, not yet
in this repo).

- SKILL.md gains a 'Flow at a glance' section at the top — the same
  ASCII diagram from examples/flow-diagram.md is now visible to every
  reader landing on the skill, before the 170-line router.
- examples/README.md calls out explicitly that everything under
  examples/ is optional reference material:
  * generatedTests/ is a sample of what the skill produces
  * generatedDocs/ is a sample of downstream docs you can build on top
    of test cases — NOT skill output
  Users can delete or ignore examples/ without affecting the skill.

Replaced the embedded ASCII diagram with a single-line link to
examples/flow-diagram.md. Keeps SKILL.md lean while preserving the
visual reference for anyone who wants it.

Drops ~146 example files (generatedTests/, generatedDocs/,
flow-diagram.md, examples/README.md) to keep the PR focused on the
skill itself. The flow diagram is still available in the PR
description.

Also removes the now-stale link to examples/flow-diagram.md from
SKILL.md.

CLAUDE.md carried project-specific lessons learned that don't belong in
the upstream skill. The three inline references in references/ have
been simplified (the bullet-list summary of parser behaviour stays —
the pointer to CLAUDE.md is dropped).