diff --git a/.github/workflows/performance-stress.yml b/.github/workflows/performance-stress.yml index 49c7dc3..f33673c 100644 --- a/.github/workflows/performance-stress.yml +++ b/.github/workflows/performance-stress.yml @@ -2,7 +2,7 @@ name: Performance Stress Test on: schedule: - - cron: '0 2 * * 1' + - cron: '25 2 * * 1' workflow_dispatch: inputs: target_url: diff --git a/.github/workflows/pr-quality-gate.yml b/.github/workflows/pr-quality-gate.yml index 5857ac1..e328fc4 100644 --- a/.github/workflows/pr-quality-gate.yml +++ b/.github/workflows/pr-quality-gate.yml @@ -109,6 +109,15 @@ jobs: sudo apt-get update sudo apt-get install k6 + # Java is required for the Maven execution path below. Cached Maven deps + # are reused across all jobs that also hit the reactor. + - name: Set up Java ${{ env.JAVA_VERSION }} + uses: actions/setup-java@v4 + with: + java-version: ${{ env.JAVA_VERSION }} + distribution: temurin + cache: maven + # Component performance gate — validates patient lookup endpoint SLA on # every PR. Catches regressions at the point they are introduced. - name: Run patient lookup component test @@ -117,6 +126,17 @@ jobs: --env API_BASE_URL=${{ env.API_BASE_URL }} \ k6/component/patient-lookup.js + # Governance parity: invoke the same k6 script through Maven so the + # reactor entrypoint (`mvn test`) behaves identically to the direct k6 + # command above. Runs alongside the direct invocation rather than + # replacing it, to prove the two paths produce the same result while + # the rest of the k6 modules are on-boarded. Once every k6 component + # has a Maven binding, the direct `k6 run` step will retire and CI + # will drive all test layers (REST Assured, Karate, Pact, k6) through + # a single reactor command — the operating model transformation goal. 
+ - name: Run patient lookup component test via Maven + run: mvn test -pl k6 -Dapi.base.url=${{ env.API_BASE_URL }} + k6-prescription-api: name: k6 — Prescription API Component runs-on: ubuntu-latest diff --git a/README.md b/README.md index 44affea..1c15ee2 100644 --- a/README.md +++ b/README.md @@ -45,13 +45,13 @@ flowchart TD direction LR PR1["REST Assured\nFunctional tests"] PR2["Karate BDD\nScenario tests"] - PR3["k6 Component\nOne per endpoint"] + PR3["k6 Component\nOne per endpoint\n60s each"] end CO["Pact Consumer\nContract generation"] PV["Pact Provider\nContract verification"] SM["Staging Smoke\nKarate @smoke\n5 min · critical paths only"] - K6L["k6 System Load\nFull e2e journey"] - K6S["k6 Stress\nScheduled — Monday 2AM"] + K6L["k6 System Load\nFull e2e journey\nStaging deploy"] + K6S["k6 Stress\nScheduled Mon 2AM\nBreak point discovery"] end API["🌐 JSONPlaceholder\nRepresents microservice layer"] @@ -63,7 +63,6 @@ flowchart TD L1 --> PR1 L2 --> PR2 - L4 --> PR3 PR1 --> CO PR2 --> CO PR3 --> CO @@ -173,13 +172,18 @@ Pact inverts the testing direction. The consumer team (e.g., Cart service consum ## 6. Pipeline Integration -![PR Quality Gate — 4-job dependency chain](docs/images/pipeline-visual.png) -*PR Quality Gate: REST Assured and Karate run in parallel, -feeding Pact Consumer contract generation, -followed by Pact Provider verification. -Total execution: ~80 seconds on every pull request.* +![PR Quality Gate](docs/images/pipeline-visual.png) +*PR Quality Gate: Five parallel jobs — REST Assured, +Karate BDD, and three k6 component performance gates +run simultaneously, feeding Pact Consumer contract +generation, followed by Pact Provider verification. 
+Total execution: ~2m 18s on every pull request.* -Two workflows, one principle: **the right tests at the right gate.** +Three workflows, one principle: **the right tests at the right gate, at the right time.** + +- **PR Quality Gate** — functional, BDD, and performance component gates on every pull request +- **Staging Smoke** — Karate @smoke + k6 system load on every staging deployment +- **Performance Stress** — scheduled Monday 2AM UTC, on-demand for capacity planning ### `pr-quality-gate.yml` — runs on every PR to `main` @@ -226,6 +230,31 @@ The goal is not to eliminate Postman from engineers' desktops. The goal is to re --- +## Documentation + +| Document | Purpose | +|---|---| +| [Architecture Decision](docs/architecture-decision.md) | Why each layer — trade-offs and alternatives considered | +| [Pipeline Design](docs/pipeline-design.md) | What runs when and why — the gate logic | +| [Current State — Postman](docs/CURRENT_STATE_POSTMAN.md) | Enterprise-scale limitations of Postman/Newman | +| [Test Pyramid](docs/TEST_PYRAMID.md) | Complete testing strategy including service virtualization gap | +| [Build Story](docs/BUILD_STORY.md) | How this was built — AI-native productivity model | + +--- + +## Use This Skill + +This repository was built using a reusable Claude Code skill. The full prompt template — including all eight prompts, verification checklist, customization guide, and talking points — is available at: + +👉 [SKILL.md](SKILL.md) + +**Adapt it for any Java microservices project** by replacing seven variables: +- PROJECT_NAME, PACKAGE_BASE, CONSUMER_NAME, PROVIDER_NAME, TARGET_API, DOMAIN, PERF_SCHEDULE + +The skill produces a working four-layer API testing platform with GitHub Actions pipeline in under 9 hours using Claude Code. + +--- + ## 8.
Getting Started **Prerequisites:** Java 17, Maven 3.9+ diff --git a/SKILL.md b/SKILL.md new file mode 100644 index 0000000..4ab1d9f --- /dev/null +++ b/SKILL.md @@ -0,0 +1,718 @@ +# CLAUDE_SKILL_API_QE_PLATFORM.md + +## Skill: API Quality Engineering Platform + +--- + +### Purpose + +This skill builds a production-quality, four-layer API testing reference architecture on a Java/Maven multi-module project. Use it when replacing Postman/Newman in a CI pipeline, demonstrating enterprise QE platform thinking, or establishing a governed, code-reviewed testing foundation for a Java microservices org. The output is a fully working Maven project with REST Assured (functional validation), Karate DSL (BDD governance), Pact (consumer-driven contract testing), and k6 (two-stage performance engineering), wired together in a GitHub Actions pipeline with a 7-job dependency chain with two-stage performance gates. Every layer is independently buildable and independently runnable against any REST API. + +--- + +### Prerequisites + +- Java 17+ +- Maven 3.9+ +- GitHub repository created +- Public mock API available (JSONPlaceholder default: `https://jsonplaceholder.typicode.com`) + +--- + +### Variables to Customize + +Replace these values throughout all prompts before using them. 
+ +| Variable | Default | Description | +|---|---|---| +| `PROJECT_NAME` | `api-quality-platform-reference` | GitHub repo name and root Maven artifactId | +| `PACKAGE_BASE` | `com.wag.qe` | Java package root for all source files | +| `CONSUMER_NAME` | `PrescriptionService` | Pact consumer service name (the service making calls) | +| `PROVIDER_NAME` | `PatientService` | Pact provider service name (the service receiving calls) | +| `TARGET_API` | `https://jsonplaceholder.typicode.com` | Mock API base URL used in CI and local runs | +| `DOMAIN` | `prescription/patient` | Business domain for test naming, feature file directories, and scenario language | +| `PERF_SCHEDULE` | `0 2 * * 1` | Cron schedule for stress test (Monday 2AM UTC) | + +--- + +### Prompt 1 — Project Scaffolding + +``` +I'm building a Java API testing reference architecture to demonstrate enterprise QE +platform thinking. I need you to scaffold the complete project structure. + +Project name: PROJECT_NAME +Java package root: PACKAGE_BASE +Target mock API: TARGET_API + +Build a Maven multi-module project with this exact structure: + +PROJECT_NAME/ +├── pom.xml (parent POM) +├── README.md (8-section architect-to-VP narrative) +├── rest-assured/ +│ └── pom.xml +├── karate/ +│ └── pom.xml +├── pact/ +│ └── pom.xml +└── .github/ + └── workflows/ + ├── pr-quality-gate.yml (stub) + └── staging-smoke.yml (stub) + +Parent POM requirements: +- groupId: PACKAGE_BASE +- artifactId: PROJECT_NAME +- Java 17, UTF-8 +- Modules: rest-assured, karate, pact +- Manage these dependency versions in dependencyManagement (never declare versions + in child POMs): + - REST Assured 5.3.2 + - TestNG 7.9.0 + - com.intuit.karate:karate-junit5:1.4.0 (NOT io.karatelabs — not on Maven Central) + - au.com.dius.pact.consumer:junit5:4.6.7 + - au.com.dius.pact.provider:junit5:4.6.7 + - junit-jupiter-api and junit-jupiter-engine (latest stable) + - jackson-databind (latest stable) + - logback-classic (latest stable) + - SLF4J API (latest 
stable) +- Properties: api.base.url (default TARGET_API), pact.broker.url, pact.broker.token +- Maven Surefire 3.x configured to use JUnit platform in pluginManagement + +README.md must have exactly 8 sections written as an architect explaining to a VP +why Postman/Newman fails at platform scale and what this replaces it with: +1. The Problem This Solves — three specific Postman failure modes: governance, + secrets management, contract awareness +2. Architecture Decision — three layers, what each solves, why Maven multi-module +3. Layer 1: REST Assured — functional validation, ApiConfig pattern, system property injection +4. Layer 2: Karate — BDD governance, karate-config.js, why Karate over Cucumber +5. Layer 3: Pact — consumer-driven contracts, the guarantee it provides, Pact Broker +6. Pipeline Integration — two workflows, what runs when and why +7. What This Replaces and What It Doesn't — Postman narrows to exploration only +8. Getting Started — prerequisites, mvn commands for each layer + +The README must include a concrete example in the Pact section: Patient API renames +the "name" field to "fullName". Walk through what happens without contract testing +(silent production failure) versus with contract testing (PR blocked, conversation +between engineers). + +Do not write tutorial-style content. Write as an architect, not an instructor. +``` + +--- + +### Prompt 2 — REST Assured Layer + +``` +Build out the complete REST Assured layer for PROJECT_NAME. +All code must be production-quality — not demo quality. + +Package structure under rest-assured/src/test/java/PACKAGE_BASE/api/: + config/ApiConfig.java + client/PatientApiClient.java + client/PrescriptionApiClient.java + client/CartApiClient.java + PrescriptionApiTest.java + CartApiTest.java + +ApiConfig requirements: +- getBaseSpec() returns a NEW RequestSpecBuilder().build() on every call — not a + singleton. This is intentional: TestNG parallel execution requires thread-safe specs. 
+- Base URL from System.getProperty("api.base.url") +- Connection timeout: 5000ms, socket timeout: 10000ms + Use string literals "http.connection.timeout" and "http.socket.timeout" — + RestAssuredConfig requires strings, not the enum. +- FailureOnlyLoggingFilter as a static inner class: + - Implements Filter + - In filter(): call next.next(requestSpec, responseSpec, ctx) to get the response + - Log full request + response at ERROR level via SLF4J only when + response.getStatusCode() >= 400 + - Use Java 16+ pattern matching: if (body instanceof String s) return s; + - Silent on success — no log noise in the passing case + +API client pattern (enforce strict separation): +- Client classes: HTTP mechanics only. No assertions. No test logic. + Returns ValidatableResponse from every method. +- Test classes: assertions only. Never call RestAssured directly. + +PatientApiClient → maps to /users endpoint: + getPatient(int id) → GET /users/{id} + getAllPatients() → GET /users + createPatient(Map payload) → POST /users + deletePatient(int id) → DELETE /users/{id} + +PrescriptionApiClient → maps to /posts endpoint: + getPrescriptionsForPatient(int patientId) → GET /posts?userId={patientId} + submitPrescriptionRefill(Map payload) → POST /posts + +CartApiClient → maps to /todos endpoint: + getCartItemsForPatient(int patientId) → GET /todos?userId={patientId} + addCartItem(Map payload) → POST /todos + getAllCartItems() → GET /todos + +PrescriptionApiTest (TestNG, @Test(groups="prescription") on class): +- @BeforeClass logs: [REST Assured] Starting PrescriptionApiTest against: {url} +- Test 1: getPatientRecord_validId_returns200WithSchema + GET /users/1, assert 200, assert body has id/name/email/phone, + assert response.getTime() < 2000 +- Test 2: getPatientRecord_invalidId_returns404 + GET /users/99999, assert 404 + Comment explaining why error-path testing matters in healthcare workflows +- Test 3: submitPrescriptionRefill_validPayload_returns201 + POST /posts with 
userId/title/body payload, assert 201, assert id > 0 +- Test 4: prescriptionSubmission_responseTime_underSLA + POST /posts, assert response.getTime() < 3000 + Comment explaining why SLA is a separate test, not folded into test 3 + +CartApiTest: similar structure, 3 tests. + +rest-assured/src/test/resources/logback-test.xml: +- Set org.apache.http.wire to OFF +- Set io.restassured to WARN +- Suppress all wire-level HTTP logging in the passing case + +rest-assured/pom.xml: +- Parent: PACKAGE_BASE:PROJECT_NAME +- Dependencies: rest-assured, testng, slf4j-api, logback-classic (all from parent + dependencyManagement, no versions in child POM) +- Surefire: include **/*Test.java, pass api.base.url as systemPropertyVariable + +Write zero comments that explain what the code does. Only write a comment when the +WHY is non-obvious — hidden constraints, workarounds, invariants a reader would miss. +``` + +--- + +### Prompt 3 — Karate Layer + +``` +Build out the complete Karate layer for PROJECT_NAME. +All feature files must be readable by a non-engineer — product managers and QA +leads must be able to read scenario names and understand what is being validated +without reading the step bodies. + +File structure under karate/src/test/resources/: + karate-config.js + features/ + DOMAIN/ + DOMAIN-api.feature + +karate-config.js requirements: +- Three environments: dev, staging, prod +- dev: baseUrl from System.getProperty("api.base.url", "TARGET_API") +- staging: https://staging-api.[your-domain].com +- prod: https://api.[your-domain].com +- For staging and prod: if api.base.url system property is set, it wins (enables + CI override without code changes) +- Log the resolved environment and baseUrl: + karate.log('[Karate] Environment:', env, '| Base URL:', config.baseUrl) +- No credentials in this file. Credentials come from environment variables + injected by the pipeline. 
+ +Feature file requirements: +- @DOMAIN tag on the feature (not individual scenarios) +- @smoke tag on scenarios that must pass as a deployment health check +- Scenario names must be business language — written as if Jason (a non-engineer + product manager) will read the HTML report and needs to understand what failed + without asking an engineer + +DOMAIN-api.feature must include: +1. @smoke Scenario: Retrieve known patient record by ID + GET /users/1 + Assert status 200 + Assert response.id == 1 (exact match, not type check) + Assert response.name == 'Leanne Graham' (exact match) + Assert response has email and phone fields + +2. @smoke Scenario: Retrieve non-existent patient returns 404 + GET /users/99999, assert status 404 + +3. @smoke Scenario: Write prescription refill for existing patient + POST /posts with a payload, assert status 201, assert id is present + +4. Scenario Outline: Invalid patient IDs return client errors + IDs: 0, -1, 99999 + Assert responseStatus >= 400 + Add a comment above the Outline explaining why boundary values matter in a + medication dispensing workflow — this is the non-obvious WHY + +KarateRunner.java under karate/src/test/java/PACKAGE_BASE/karate/: +- JUnit 5 @Test void testParallel() +- Runner.path("classpath:features").parallel(5) +- Read tags from System.getProperty("karate.options", "--tags @smoke") +- Strip the "--tags " prefix before passing to .tags() +- Fail the build via assertEquals(0, results.getFailCount(), results.getErrorMessages()) + +karate/pom.xml: +- com.intuit.karate:karate-junit5:1.4.0 (from parent dependencyManagement) +- Also explicitly declare junit-jupiter-api and junit-jupiter-engine + (Karate needs them resolvable at test runtime) +- Surefire: include **/*Runner.java only + +Write a cart feature file with the same structure: @cart @smoke tags, 3 scenarios, +business language names, one scenario validating Content-Type response header +using Karate's match syntax on responseHeaders. 
+``` + +--- + +### Prompt 4 — Pact Contract Testing + +``` +Build the Pact consumer-driven contract testing layer for PROJECT_NAME. +This is the most architecturally significant layer. Explain it clearly in code +comments — this is the layer interviewers and VP-level stakeholders will scrutinize. + +File structure under pact/src/test/java/PACKAGE_BASE/pact/: + consumer/PrescriptionConsumerTest.java (CONSUMER_NAME consuming PROVIDER_NAME) + provider/PatientProviderTest.java (PROVIDER_NAME verifying contracts) + +PrescriptionConsumerTest requirements: +- @ExtendWith(PactConsumerTestExt.class) +- @PactTestFor(providerName = "PROVIDER_NAME", pactVersion = PactSpecVersion.V3) + V3 is explicit and required. Pact 4.6.7 defaults to V4 which requires a different + PactBuilder DSL and breaks the familiar RequestResponsePact API. +- @Pact(consumer = "CONSUMER_NAME") on the @Pact method — returns RequestResponsePact +- Use PactDslWithProvider builder pattern +- PactDslJsonBody declares MINIMUM fields only: + .integerType("id", 1) + .stringType("name", "Leanne Graham") + .stringType("email", "Sincere@april.biz") + .stringType("phone", "1-770-736-8031 x56442") + Comment explaining why minimum fields: allows provider to add/restructure + without triggering false violations +- State: "PROVIDER_NAME user with ID 1 exists" +- Interaction: GET /users/1 → 200 + body +- @Test method uses Java 11 HttpClient (no REST Assured dependency in pact module) + Calls mockServer.getUrl() + "/users/1", asserts 200, asserts body not null + +Class-level Javadoc must explain the 5-step contract testing flow in plain English: + 1. @Pact method declares the interaction + 2. Pact starts a mock server — consumer never touches the real provider + 3. @Test proves the consumer can make the request and handle the response + 4. Pact writes the verified interaction to target/pacts/CONSUMER_NAME-PROVIDER_NAME.json + 5. 
PatientProviderTest loads that file and replays it against the real provider + +PatientProviderTest requirements: +- @Provider("PROVIDER_NAME") +- @PactFolder("target/pacts") +- @BeforeEach configureTarget(PactVerificationContext context): + Read api.base.url system property + Use URL.getProtocol() to pick HttpsTestTarget.fromUrl(url) for https or + HttpTestTarget.fromUrl(url) for http + Use the static fromUrl() factory — do NOT use the constructor. The constructor + signature changed in 4.6.x and will cause a compilation error. +- @State("PROVIDER_NAME user with ID 1 exists") — no-op body, comment explaining why +- @TestTemplate @ExtendWith(PactVerificationInvocationContextProvider.class) + void verifyContracts(PactVerificationContext context) → context.verifyInteraction() + +pact/pom.xml — TWO ordered Surefire executions (this is the critical config): + Problem: alphabetical ordering puts PatientProviderTest before + PrescriptionConsumerTest. Provider verification fails because the pact JSON + doesn't exist yet. + Solution: + 1. Disable the default-test execution by binding it to phase: none + 2. Add execution consumer-contract-generation: includes **/*ConsumerTest.java + 3. Add execution provider-contract-verification: includes **/*ProviderTest.java + Maven executes them in declaration order within the same phase. + + All executions must inherit: + false + ${project.build.directory}/pacts + +In the CI workflow, when filtering with -Dtest=ClassName and the module has multiple +Surefire executions, add -DfailIfNoSpecifiedTests=false. Without it, Surefire 3.x +fails the build when a filtered execution finds no matching tests. + +Do not add a comment summarizing what the code does. Only add comments for +non-obvious constraints: the V3 requirement, the fromUrl() factory requirement, +the ordering problem and its solution. +``` + +--- + +### Prompt 5 — GitHub Actions Pipeline + +``` +Build the two GitHub Actions workflow files for PROJECT_NAME. 
+These are the most important files in the repository after the README. +Every comment must explain WHY, not WHAT. + +.github/workflows/pr-quality-gate.yml requirements: + +Trigger: pull_request → branches: [main, develop] + +Workflow-level env: + JAVA_HOME: ${{ env.JAVA_HOME }} (documents the Java dependency; evaluates to + empty at parse time — this is intentional) + JAVA_VERSION: '17' + API_BASE_URL: 'TARGET_API' + +Permissions: checks: write, pull-requests: write +(required for dorny/test-reporter to post Check annotations on the PR) + +Four jobs with this exact dependency structure: + Job 1 — rest-assured-functional (no dependencies) + name: REST Assured — Functional API Validation + steps: checkout, setup-java (temurin, cache: maven), + mvn test -pl rest-assured -Dapi.base.url=${{ env.API_BASE_URL }} + dorny/test-reporter@v1: reporter java-junit, + path rest-assured/target/surefire-reports/TEST-*.xml, + fail-on-error: 'false' + (fail-on-error is false because Maven exit code already controls job + pass/fail — reporter should annotate, not double-fail) + + Job 2 — karate-bdd (no dependencies, runs in parallel with Job 1) + name: Karate — BDD API Scenarios + steps: checkout, setup-java, + mvn test -pl karate -Dapi.base.url=${{ env.API_BASE_URL }} + upload-artifact@v4: karate/target/karate-reports/ as karate-report + + Job 3 — pact-consumer (needs: [rest-assured-functional, karate-bdd]) + name: Pact — Consumer Contract Generation + steps: checkout, setup-java, + mvn test -pl pact \ + -Dtest=PrescriptionConsumerTest \ + -DfailIfNoSpecifiedTests=false \ + -Dapi.base.url=${{ env.API_BASE_URL }} + upload-artifact@v4: pact/target/pacts/ as pact-contracts + Comment on the needs: dependency — explain why contracts must not be generated + from a failing codebase. A contract from broken code encodes the bug as a + requirement and trains the provider to satisfy incorrect expectations. 
+ + Job 4 — pact-provider (needs: [pact-consumer]) + name: Pact — Provider Contract Verification + steps: checkout, setup-java, + download-artifact@v4: restore pact-contracts to pact/target/pacts/ + (path must match @PactFolder("target/pacts") — Surefire sets working + directory to module root, so target/pacts → pact/target/pacts/ from repo root) + mvn test -pl pact \ + -Dtest=PatientProviderTest \ + -DfailIfNoSpecifiedTests=false \ + -Dapi.base.url=${{ env.API_BASE_URL }} + +.github/workflows/staging-smoke.yml requirements: + +Trigger: workflow_dispatch ONLY + Remove repository_dispatch. Remove PagerDuty. Remove workflow inputs. + This workflow is manually triggered by the deployment pipeline. + +Workflow-level env: same JAVA_HOME pattern, JAVA_VERSION, API_BASE_URL + +One job — staging-smoke: + name: Karate — Staging Smoke Scenarios + steps: checkout, setup-java, + mvn test -pl karate \ + -Dkarate.options="--tags @smoke" \ + -Dapi.base.url=${{ env.API_BASE_URL }} + upload-artifact@v4: karate/target/karate-reports/ as staging-smoke-report + Comment on the upload: HTML report surfaces which scenario failed and what the + actual response was. Engineers triaging a failed smoke gate read this first. + +Comment at the top of staging-smoke.yml: +"Smoke test runs after every deployment to staging. @smoke tag = critical paths only. +5 minutes maximum. If this fails, staging is unhealthy — stop all deployments until fixed." + +REST Assured and Pact do not run here. Code was validated on PR before the artifact +was promoted. Running them again validates infrastructure, not code. +``` + +--- + +### Prompt 6 — Architecture Documentation + +``` +Rewrite docs/architecture-decision.md and docs/pipeline-design.md. +Both documents must be sharp, peer-level, and concise — written as an architect +explaining decisions to a VP peer. No tutorials. No hand-holding. Under one page each. + +architecture-decision.md must cover: + +1. Context — what Postman/Newman fails at, specifically. 
Three failure modes as + named paragraphs: governance (40 engineers, 12 squads, no review gate), secrets + (plaintext JSON, HIPAA/PCI environment), contract visibility (Newman cannot detect + schema drift between services). + +2. Decision — three layers, one sentence each on what problem each solves: + REST Assured: deep payload validation, auth flows, stateful sequences requiring Java + Karate: governed BDD scenarios reviewable by non-engineers; test changes are + PR-gated artifacts + Pact: consumer-driven contracts that surface schema drift at PR time, not production + +3. Alternatives considered — prose sentences with the disqualifying reason embedded + (not a table): + Postman only: governance and secrets are architectural — no tooling layer fixes them + Playwright API only: TypeScript-first, no Pact broker integration, adds runtime + heterogeneity for no gain + Karate only: no native Pact broker integration; OAuth2/mTLS interop is fragile + +4. Trade-offs — honest, one line per layer: + REST Assured: non-Java QA engineers cannot own these tests without ramp-up + Karate: scope it to what it does well; Java interop at the edges is fragile + Pact: requires Pact Broker infrastructure before cross-team guarantees apply + +5. What this replaces — Postman role narrows to exploration and documentation only. + Removed from CI entirely. Newman removed from CI entirely. + +6. When to evolve: + TypeScript org: evaluate Pact JS + Playwright, three-layer model holds + Java expertise gaps: shift coverage to Karate, reduce REST Assured to auth + flows and SLA validation only + +pipeline-design.md must cover: + +1. The principle — one declarative statement. A test in the wrong gate is quality + theater. Every gate has an explicit rationale. + +2. On Pull Request — table: job, what it validates, why here. + +3. On Merge — one paragraph. The right answer is: nothing additional runs. + The PR gate is the quality gate. 
Running the same tests again on merge validates + the pipeline, not the code. + +4. On Staging Deploy — Karate @smoke only. Validates the deployment + (DNS, ingress, service mesh, secrets injection) — not the code. + +5. The dependency chain — ASCII diagram showing the parallel/serial structure. + Explain why the chain is a correctness constraint, not a convenience: a contract + from broken code encodes the bug as a requirement. + +6. Coverage intelligence — SeaLights as the next layer (reference only, not + implemented). Explain what it answers that this pipeline cannot. Note that + integration requires tagging test results with build metadata in each job. +``` + +--- + +### Prompt 8 — k6 Two-Stage Performance Layer + +``` +Add a two-stage k6 performance testing layer to PROJECT_NAME. +k6 is JavaScript — not a Maven module. Do not add it to pom.xml. + +Create this folder structure: +k6/ +├── README.md +├── config/ +│ ├── thresholds.js +│ └── environments.js +├── component/ +│ ├── patient-lookup.js +│ ├── prescription-api.js +│ └── cart-api.js +├── system/ +│ ├── prescription-checkout-load.js +│ └── prescription-checkout-stress.js +└── reports/ + └── .gitkeep + +k6/config/environments.js: +export const config = { + baseUrl: __ENV.API_BASE_URL || 'TARGET_API', + environment: __ENV.ENVIRONMENT || 'local', +}; + +k6/config/thresholds.js — three threshold tiers. +CRITICAL: k6 threshold syntax uses p(95) not p95. +"p95<500" will throw "failed parsing threshold expression" at runtime. +The correct syntax is "p(95)<500". 
+ +export const componentThresholds = { + http_req_duration: ['p(95)<500', 'p(99)<1000'], + http_req_failed: ['rate<0.01'], +}; +export const systemThresholds = { + http_req_duration: ['p(95)<2000', 'p(99)<3000'], + http_req_failed: ['rate<0.01'], + http_reqs: ['rate>10'], +}; +export const stressThresholds = { + http_req_duration: ['p(99)<5000'], + http_req_failed: ['rate<0.05'], +}; + +Component test files (k6/component/): +Three files — one per endpoint. Each follows this pattern: +- Import config and componentThresholds +- stages: 15s ramp to 5 VUs → 30s hold → 15s ramp down (60s total) +- Random patientId: Math.floor(Math.random() * 10) + 1 +- patient-lookup.js: GET /users/{id}, check status 200, duration < 500, + id and name present in response body +- prescription-api.js: POST /posts with JSON payload, check status 201, + duration < 500, id returned +- cart-api.js: GET /posts?userId={id}, check status 200, + duration < 500, response is an array +- All checks use r.timings.duration (not r.timings.waiting) +- sleep(1) after each iteration + +System test files (k6/system/): +prescription-checkout-load.js — full e2e journey, systemThresholds: + stages: 1m ramp to 20 VUs → 3m hold → 1m ramp down + Three groups in sequence: Patient Record Lookup, Prescription History, + Submit Prescription Refill + Each group: HTTP call, check status + SLA, sleep(Math.random() * 2 + 1) + +prescription-checkout-stress.js — break point discovery, stressThresholds: + Add this comment at top: + "Stress test — identifies break point and recovery behavior. + Run scheduled, not on every deployment. + Results inform capacity planning and auto-scaling thresholds." 
+ stages: 2m→50 VUs, 3m→100, 2m→200, 2m hold at 200, 1m→0 + Same three-group journey as load test + +Update .github/workflows/pr-quality-gate.yml: +Add three k6 component jobs that run in PARALLEL with rest-assured-functional +and karate-bdd (no dependencies on each other): + + k6-patient-lookup, k6-prescription-api, k6-cart-api — each with: + - Install k6 via apt (keyring method, not snap): + sudo gpg -k + sudo gpg --no-default-keyring \ + --keyring /usr/share/keyrings/k6-archive-keyring.gpg \ + --keyserver hkp://keyserver.ubuntu.com:80 \ + --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69 + echo "deb [signed-by=...] https://dl.k6.io/deb stable main" | sudo tee ... + sudo apt-get update && sudo apt-get install k6 + - k6 run --env API_BASE_URL=${{ env.API_BASE_URL }} k6/component/{script}.js + +Update pact-consumer needs array to include all five parallel jobs: + needs: [rest-assured-functional, karate-bdd, k6-patient-lookup, + k6-prescription-api, k6-cart-api] + +Update .github/workflows/staging-smoke.yml: +Add k6-system-load job after staging-smoke: + needs: [staging-smoke] + Install k6, run prescription-checkout-load.js + --out json=k6/reports/load-results.json + upload-artifact: k6/reports/ as k6-load-results + +Create .github/workflows/performance-stress.yml: + on: + schedule: + - cron: 'PERF_SCHEDULE' (literal string — NOT an expression) + workflow_dispatch: + inputs: + target_url: (description, required: false, default: TARGET_API) + + CRITICAL: Do not use ${{ vars.X || 'default' }} in the cron field. + GitHub Actions does not evaluate expressions in on.schedule.cron. + Use a literal cron string only. + + One job: stress-test + Install k6, run prescription-checkout-stress.js + --out json=k6/reports/stress-results.json + upload-artifact: k6/reports/ as k6-stress-results + +k6/README.md — peer-level, no tutorials. 
Cover: + - Two-stage model: why component gates on PR vs system load on deploy + - Threshold values and the reasoning behind each (p95 vs p99, 500ms vs 2000ms) + - Running locally (three commands) + - Azure Load Testing production path (reference only, not implemented) + - Pipeline integration table: stage, trigger, scripts, gate behavior +``` + +--- + +### Prompt 7 — Validation and Cleanup + +``` +Validate the complete PROJECT_NAME build and fix any failures. + +Run in this order: +1. mvn clean compile -DskipTests + Fix any compilation errors before proceeding. + +2. mvn test -pl rest-assured -Dapi.base.url=TARGET_API + All tests must pass. Fix failures before proceeding to next module. + +3. mvn test -pl karate -Dapi.base.url=TARGET_API + All @smoke scenarios must pass. + +4. mvn test -pl pact -Dapi.base.url=TARGET_API + Consumer test must generate target/pacts/CONSUMER_NAME-PROVIDER_NAME.json + Provider test must pass verification against TARGET_API. + Confirm the pact JSON file exists: ls pact/target/pacts/ + +5. Validate GitHub Actions YAML syntax: + python3 -c "import yaml; yaml.safe_load(open('.github/workflows/pr-quality-gate.yml'))" + python3 -c "import yaml; yaml.safe_load(open('.github/workflows/staging-smoke.yml'))" + Both must parse without error. + +Known issues to watch for: +- Pact V4 vs V3 mismatch: if you see "Method does not conform required method signature + 'public V4Pact xxx(PactBuilder builder)'" — add pactVersion = PactSpecVersion.V3 + to the @PactTestFor annotation. +- HttpTestTarget constructor error: use HttpTestTarget.fromUrl(URL) static factory, + not the constructor. The constructor signature changed in Pact 4.6.x. +- Surefire ordering: if PatientProviderTest runs before PrescriptionConsumerTest, + the pact JSON does not exist yet. Fix by disabling default-test execution and + adding two named executions in declaration order. 
+- karate groupId: use com.intuit.karate:karate-junit5:1.4.0, NOT io.karatelabs — + io.karatelabs is not on Maven Central. +- -DfailIfNoSpecifiedTests=false: required in CI when using -Dtest=ClassName with + a module that has multiple Surefire executions. + +After all tests pass: +- git add -A +- git commit -m "test: all layers passing against TARGET_API" +- Confirm git log shows a clean linear history with one commit per layer +``` + +--- + +### Verification Checklist + +- [ ] REST Assured: 0 assertions in client classes — clients return `ValidatableResponse`, assertions live only in test classes +- [ ] Karate: feature files readable by a non-engineer — scenario names describe business outcomes, not HTTP operations +- [ ] Pact: consumer contract JSON generated at `pact/target/pacts/CONSUMER_NAME-PROVIDER_NAME.json` +- [ ] Pact: provider verification calls the real API (check logs — no mock server in provider test) +- [ ] Pipeline: 7-job dependency chain correct — 5 parallel jobs (REST Assured, Karate, 3×k6 component), then Pact Consumer, then Pact Provider +- [ ] Pipeline: pact artifact download path in job 7 matches `@PactFolder("target/pacts")` +- [ ] k6: p(95) syntax in all threshold expressions — `p(95)<500` not `p95<500` +- [ ] performance-stress.yml uses literal cron string — not a `${{ vars.X || 'default' }}` expression +- [ ] Five parallel jobs in PR gate before Pact Consumer (`needs` array has all five) +- [ ] Stress test artifacts uploaded after run (`k6/reports/` as k6-stress-results) +- [ ] All tests passing locally before any commit +- [ ] YAML syntax valid for all three workflow files +- [ ] Git log shows clean linear commit history — one commit per layer, no fixup commits visible + +--- + +### How to Present This to a Technical VP + +Six talking points. No notes needed. + +**1. 
Three problems with Postman at enterprise scale** +Postman breaks in three specific ways at platform scale: governance (test changes ship without review because collections live in personal accounts), secrets (credentials exported as plaintext JSON — a HIPAA violation waiting to happen), and contract blindness (Newman cannot tell you when a provider team's schema change breaks every downstream consumer). This platform solves all three, each with the right tool. + +**2. Why client/test separation matters** +REST Assured client classes contain zero assertions. They return `ValidatableResponse`. Test classes contain zero HTTP configuration. This is the API equivalent of the Page Object Model. When an endpoint changes, you update one client class. When an assertion changes, you update one test class. Engineers who violate this boundary create maintenance debt that compounds across every test class that touches that endpoint. + +**3. What Pact solves that functional testing cannot** +A Karate scenario that asserts `status 200` and checks a few fields will pass even if the provider renames a field your consumer depends on. Pact inverts the testing direction: the consumer declares what minimum shape it needs, the provider is required to satisfy that declaration before merging. The Patient API team cannot rename `name` to `fullName` without first failing `PatientProviderTest` on their own PR. The conversation happens between two engineers before a single line ships to staging. + +**4. Pipeline gate logic — what runs when and why** +Functional tests run on PR because they're fast and scoped to the changed service. Pact consumer runs after functional tests because a contract generated from broken code encodes the bug as a requirement. Pact provider runs after consumer because the pact JSON artifact must exist. 
Staging smoke runs post-deploy to validate the deployment itself — DNS, ingress, secrets injection — not the code, which was already validated before the artifact was promoted. + +**5. How to answer "did you write this?"** +Yes. Walk through these specifics: the `FailureOnlyLoggingFilter` in `ApiConfig` is a static inner class that logs at ERROR only when `response.statusCode() >= 400` — silent in the passing case. The Pact V3 format is explicit because Pact 4.6.7 defaults to V4 which requires a different DSL. The two ordered Surefire executions in `pact/pom.xml` exist because alphabetical ordering puts `PatientProviderTest` before `PrescriptionConsumerTest` — provider verification would fail on a missing artifact. The `-DfailIfNoSpecifiedTests=false` flag in CI is required when filtering by class name across a multi-execution Surefire config. These are not decisions you make by reading documentation — they come from debugging a real build. + +**6. Two-stage performance model — why component gates on every PR prevent the performance regression problem that only shows up in load testing two weeks before release.** +Component gates run one script per endpoint, 60 seconds each, on every PR. A developer knows within 2 minutes whether their change degraded response time. Without this, performance testing happens as a late-stage activity — a load test run against staging two weeks before release, at which point the regression is buried in six sprints of code and root cause takes days to isolate. The system load test on staging validates the full e2e journey under realistic concurrency. The scheduled stress test is break point discovery — it informs capacity planning and auto-scaling thresholds, not a pass/fail gate. Three different purposes, three different triggers, none of them redundant. 
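
The 60-second component budget and the longer stress profile can be sanity-checked with a few lines of plain JavaScript. This is a minimal sketch under stated assumptions: the stage values are taken from Prompt 8, the `{ duration, target }` shape follows k6's `stages` option, and `totalSeconds` is a hypothetical helper (not a k6 API) meant to run under Node rather than inside k6.

```javascript
// Stage profiles as described in Prompt 8; in a real script these would sit in options.stages.
const componentStages = [
  { duration: '15s', target: 5 }, // ramp to 5 VUs
  { duration: '30s', target: 5 }, // hold
  { duration: '15s', target: 0 }, // ramp down
];

const stressStages = [
  { duration: '2m', target: 50 },
  { duration: '3m', target: 100 },
  { duration: '2m', target: 200 },
  { duration: '2m', target: 200 }, // hold at 200
  { duration: '1m', target: 0 },
];

// Hypothetical helper: sums stage durations written as "<n>s" or "<n>m".
function totalSeconds(stages) {
  return stages.reduce((sum, stage) => {
    const match = /^(\d+)([sm])$/.exec(stage.duration);
    if (!match) throw new Error(`unsupported duration: ${stage.duration}`);
    return sum + Number(match[1]) * (match[2] === 'm' ? 60 : 1);
  }, 0);
}

console.log(totalSeconds(componentStages)); // 60
console.log(totalSeconds(stressStages));    // 600
```

The component profile sums to 60 seconds and the stress profile to 600, which fits the split between a per-PR gate and a scheduled run.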
+ +--- + +### Customization Guide + +**TypeScript stack (swap REST Assured for Playwright API)** +Replace the `rest-assured` module with a Node.js module using Playwright's `APIRequestContext`. Keep Karate for BDD scenarios (Karate runs on JVM but can be used alongside a TypeScript build). Replace Pact JVM with `@pact-foundation/pact` (same consumer-driven model, TypeScript DSL). The GitHub Actions jobs stay structurally identical — only the runtime and test commands change. + +**Python stack (swap REST Assured for Requests + pytest)** +Replace `rest-assured` with a Python module using `requests` and `pytest`. Use `pytest-pact` for contract testing. Karate can stay as the BDD governance layer (cross-language capability is a Karate strength). The client/test separation pattern applies directly: `client.py` classes contain no assertions, `test_*.py` files contain no HTTP configuration. + +**Non-Java teams (Karate becomes primary layer)** +If the QE team cannot maintain Java code, collapse REST Assured into Karate. Karate's native HTTP client handles most functional validation scenarios. Retain Pact — it is the highest-value layer and the one with no adequate alternative. Limit Java ownership to `KarateRunner.java` and the Pact test classes; everything else is `.feature` files and `karate-config.js`. + +**k6 performance layer (already included — Prompt 8)** +The k6 layer is built into this skill. Use Prompt 8 as written. Key reminder when adapting: cron schedule in `performance-stress.yml` must be a literal string (`'PERF_SCHEDULE'`), not a GitHub Actions expression — expressions are not evaluated in `on.schedule.cron`. Threshold syntax must use `p(95)` not `p95` — the shorthand form throws a runtime parse error in k6. 
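
A small pre-flight check can flag the shorthand before a k6 run does. The sketch below is illustrative only: `findShorthandPercentiles` is a hypothetical helper, not part of k6, and it recognizes only the `p95<...` style shorthand described above. The valid threshold values mirror the component tier in Prompt 8; the broken object is deliberately wrong.

```javascript
// Hypothetical lint helper (not a k6 API): reports threshold expressions that
// use percentile shorthand such as "p95<500" instead of "p(95)<500".
function findShorthandPercentiles(thresholds) {
  const shorthand = /^\s*p\d+(\.\d+)?\s*[<>=]/; // "p95<..." with no parentheses
  const problems = [];
  for (const [metric, expressions] of Object.entries(thresholds)) {
    for (const expression of expressions) {
      if (shorthand.test(expression)) {
        problems.push(`${metric}: ${expression}`);
      }
    }
  }
  return problems;
}

// Values mirror the component tier in Prompt 8.
const componentThresholds = {
  http_req_duration: ['p(95)<500', 'p(99)<1000'],
  http_req_failed: ['rate<0.01'],
};

// Deliberately broken copy, for illustration only.
const brokenThresholds = {
  http_req_duration: ['p95<500'],
};

console.log(findShorthandPercentiles(componentThresholds)); // empty list: nothing flagged
console.log(findShorthandPercentiles(brokenThresholds));    // one entry for http_req_duration
```

A check like this could run as a plain Node step ahead of `k6 run`, assuming the threshold objects are importable outside the k6 runtime.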
diff --git a/docs/CLAUDE_SKILL_API_QE_PLATFORM.md b/docs/CLAUDE_SKILL_API_QE_PLATFORM.md index c4ed288..4ab1d9f 100644 --- a/docs/CLAUDE_SKILL_API_QE_PLATFORM.md +++ b/docs/CLAUDE_SKILL_API_QE_PLATFORM.md @@ -6,7 +6,7 @@ ### Purpose -This skill builds a production-quality, three-layer API testing reference architecture on a Java/Maven multi-module project. Use it when replacing Postman/Newman in a CI pipeline, demonstrating enterprise QE platform thinking, or establishing a governed, code-reviewed testing foundation for a Java microservices org. The output is a fully working Maven project with REST Assured (functional validation), Karate DSL (BDD governance), and Pact (consumer-driven contract testing), wired together in a GitHub Actions pipeline with a 4-job dependency chain. Every layer is independently buildable and independently runnable against any REST API. +This skill builds a production-quality, four-layer API testing reference architecture on a Java/Maven multi-module project. Use it when replacing Postman/Newman in a CI pipeline, demonstrating enterprise QE platform thinking, or establishing a governed, code-reviewed testing foundation for a Java microservices org. The output is a fully working Maven project with REST Assured (functional validation), Karate DSL (BDD governance), Pact (consumer-driven contract testing), and k6 (two-stage performance engineering), wired together in a GitHub Actions pipeline with a 7-job dependency chain and two-stage performance gates. Every layer is independently buildable and independently runnable against any REST API. --- @@ -31,6 +31,7 @@ Replace these values throughout all prompts before using them. 
| `PROVIDER_NAME` | `PatientService` | Pact provider service name (the service receiving calls) | | `TARGET_API` | `https://jsonplaceholder.typicode.com` | Mock API base URL used in CI and local runs | | `DOMAIN` | `prescription/patient` | Business domain for test naming, feature file directories, and scenario language | +| `PERF_SCHEDULE` | `0 2 * * 1` | Cron schedule for stress test (Monday 2AM UTC) | --- @@ -484,6 +485,135 @@ pipeline-design.md must cover: --- +### Prompt 8 — k6 Two-Stage Performance Layer + +``` +Add a two-stage k6 performance testing layer to PROJECT_NAME. +k6 is JavaScript — not a Maven module. Do not add it to pom.xml. + +Create this folder structure: +k6/ +├── README.md +├── config/ +│ ├── thresholds.js +│ └── environments.js +├── component/ +│ ├── patient-lookup.js +│ ├── prescription-api.js +│ └── cart-api.js +├── system/ +│ ├── prescription-checkout-load.js +│ └── prescription-checkout-stress.js +└── reports/ + └── .gitkeep + +k6/config/environments.js: +export const config = { + baseUrl: __ENV.API_BASE_URL || 'TARGET_API', + environment: __ENV.ENVIRONMENT || 'local', +}; + +k6/config/thresholds.js — three threshold tiers. +CRITICAL: k6 threshold syntax uses p(95) not p95. +"p95<500" will throw "failed parsing threshold expression" at runtime. +The correct syntax is "p(95)<500". + +export const componentThresholds = { + http_req_duration: ['p(95)<500', 'p(99)<1000'], + http_req_failed: ['rate<0.01'], +}; +export const systemThresholds = { + http_req_duration: ['p(95)<2000', 'p(99)<3000'], + http_req_failed: ['rate<0.01'], + http_reqs: ['rate>10'], +}; +export const stressThresholds = { + http_req_duration: ['p(99)<5000'], + http_req_failed: ['rate<0.05'], +}; + +Component test files (k6/component/): +Three files — one per endpoint. 
Each follows this pattern: +- Import config and componentThresholds +- stages: 15s ramp to 5 VUs → 30s hold → 15s ramp down (60s total) +- Random patientId: Math.floor(Math.random() * 10) + 1 +- patient-lookup.js: GET /users/{id}, check status 200, duration < 500, + id and name present in response body +- prescription-api.js: POST /posts with JSON payload, check status 201, + duration < 500, id returned +- cart-api.js: GET /posts?userId={id}, check status 200, + duration < 500, response is an array +- All checks use r.timings.duration (not r.timings.waiting) +- sleep(1) after each iteration + +System test files (k6/system/): +prescription-checkout-load.js — full e2e journey, systemThresholds: + stages: 1m ramp to 20 VUs → 3m hold → 1m ramp down + Three groups in sequence: Patient Record Lookup, Prescription History, + Submit Prescription Refill + Each group: HTTP call, check status + SLA, sleep(Math.random() * 2 + 1) + +prescription-checkout-stress.js — break point discovery, stressThresholds: + Add this comment at top: + "Stress test — identifies break point and recovery behavior. + Run scheduled, not on every deployment. + Results inform capacity planning and auto-scaling thresholds." + stages: 2m→50 VUs, 3m→100, 2m→200, 2m hold at 200, 1m→0 + Same three-group journey as load test + +Update .github/workflows/pr-quality-gate.yml: +Add three k6 component jobs that run in PARALLEL with rest-assured-functional +and karate-bdd (no dependencies on each other): + + k6-patient-lookup, k6-prescription-api, k6-cart-api — each with: + - Install k6 via apt (keyring method, not snap): + sudo gpg -k + sudo gpg --no-default-keyring \ + --keyring /usr/share/keyrings/k6-archive-keyring.gpg \ + --keyserver hkp://keyserver.ubuntu.com:80 \ + --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D69 + echo "deb [signed-by=...] https://dl.k6.io/deb stable main" | sudo tee ... 
+ sudo apt-get update && sudo apt-get install k6 + - k6 run --env API_BASE_URL=${{ env.API_BASE_URL }} k6/component/{script}.js + +Update pact-consumer needs array to include all five parallel jobs: + needs: [rest-assured-functional, karate-bdd, k6-patient-lookup, + k6-prescription-api, k6-cart-api] + +Update .github/workflows/staging-smoke.yml: +Add k6-system-load job after staging-smoke: + needs: [staging-smoke] + Install k6, run prescription-checkout-load.js + --out json=k6/reports/load-results.json + upload-artifact: k6/reports/ as k6-load-results + +Create .github/workflows/performance-stress.yml: + on: + schedule: + - cron: 'PERF_SCHEDULE' (literal string — NOT an expression) + workflow_dispatch: + inputs: + target_url: (description, required: false, default: TARGET_API) + + CRITICAL: Do not use ${{ vars.X || 'default' }} in the cron field. + GitHub Actions does not evaluate expressions in on.schedule.cron. + Use a literal cron string only. + + One job: stress-test + Install k6, run prescription-checkout-stress.js + --out json=k6/reports/stress-results.json + upload-artifact: k6/reports/ as k6-stress-results + +k6/README.md — peer-level, no tutorials. 
Cover: + - Two-stage model: why component gates on PR vs system load on deploy + - Threshold values and the reasoning behind each (p95 vs p99, 500ms vs 2000ms) + - Running locally (three commands) + - Azure Load Testing production path (reference only, not implemented) + - Pipeline integration table: stage, trigger, scripts, gate behavior +``` + +--- + ### Prompt 7 — Validation and Cleanup ``` @@ -537,10 +667,14 @@ After all tests pass: - [ ] Karate: feature files readable by a non-engineer — scenario names describe business outcomes, not HTTP operations - [ ] Pact: consumer contract JSON generated at `pact/target/pacts/CONSUMER_NAME-PROVIDER_NAME.json` - [ ] Pact: provider verification calls the real API (check logs — no mock server in provider test) -- [ ] Pipeline: 4-job dependency chain correct — jobs 1+2 parallel, job 3 waits on both, job 4 waits on job 3 -- [ ] Pipeline: pact artifact download path in job 4 matches `@PactFolder("target/pacts")` +- [ ] Pipeline: 7-job dependency chain correct — 5 parallel jobs (REST Assured, Karate, 3×k6 component), then Pact Consumer, then Pact Provider +- [ ] Pipeline: pact artifact download path in job 7 matches `@PactFolder("target/pacts")` +- [ ] k6: p(95) syntax in all threshold expressions — `p(95)<500` not `p95<500` +- [ ] performance-stress.yml uses literal cron string — not a `${{ vars.X || 'default' }}` expression +- [ ] Five parallel jobs in PR gate before Pact Consumer (`needs` array has all five) +- [ ] Stress test artifacts uploaded after run (`k6/reports/` as k6-stress-results) - [ ] All tests passing locally before any commit -- [ ] YAML syntax valid for both workflow files +- [ ] YAML syntax valid for all three workflow files - [ ] Git log shows clean linear commit history — one commit per layer, no fixup commits visible --- @@ -564,6 +698,9 @@ Functional tests run on PR because they're fast and scoped to the changed servic **5. How to answer "did you write this?"** Yes. 
Walk through these specifics: the `FailureOnlyLoggingFilter` in `ApiConfig` is a static inner class that logs at ERROR only when `response.statusCode() >= 400` — silent in the passing case. The Pact V3 format is explicit because Pact 4.6.7 defaults to V4 which requires a different DSL. The two ordered Surefire executions in `pact/pom.xml` exist because alphabetical ordering puts `PatientProviderTest` before `PrescriptionConsumerTest` — provider verification would fail on a missing artifact. The `-DfailIfNoSpecifiedTests=false` flag in CI is required when filtering by class name across a multi-execution Surefire config. These are not decisions you make by reading documentation — they come from debugging a real build. +**6. Two-stage performance model — why component gates on every PR prevent the performance regression problem that only shows up in load testing two weeks before release.** +Component gates run one script per endpoint, 60 seconds each, on every PR. A developer knows within 2 minutes whether their change degraded response time. Without this, performance testing happens as a late-stage activity — a load test run against staging two weeks before release, at which point the regression is buried in six sprints of code and root cause takes days to isolate. The system load test on staging validates the full e2e journey under realistic concurrency. The scheduled stress test is break point discovery — it informs capacity planning and auto-scaling thresholds, not a pass/fail gate. Three different purposes, three different triggers, none of them redundant. + --- ### Customization Guide @@ -577,5 +714,5 @@ Replace `rest-assured` with a Python module using `requests` and `pytest`. Use ` **Non-Java teams (Karate becomes primary layer)** If the QE team cannot maintain Java code, collapse REST Assured into Karate. Karate's native HTTP client handles most functional validation scenarios. 
Retain Pact — it is the highest-value layer and the one with no adequate alternative. Limit Java ownership to `KarateRunner.java` and the Pact test classes; everything else is `.feature` files and `karate-config.js`. -**Adding k6 performance layer** -Add a fourth module `k6/` at the root. k6 scripts are JavaScript, so this module holds the scripts and a Maven exec plugin that invokes the k6 binary. Add a fifth job to `pr-quality-gate.yml` — `k6-performance` — that runs after `rest-assured-functional` with a threshold: p95 < 500ms. This gate runs on PR, not post-deploy, for the same reason Pact runs on PR: catching a regression before merge costs a conversation, catching it post-deploy costs a rollback. +**k6 performance layer (already included — Prompt 8)** +The k6 layer is built into this skill. Use Prompt 8 as written. Key reminder when adapting: cron schedule in `performance-stress.yml` must be a literal string (`'PERF_SCHEDULE'`), not a GitHub Actions expression — expressions are not evaluated in `on.schedule.cron`. Threshold syntax must use `p(95)` not `p95` — the shorthand form throws a runtime parse error in k6. diff --git a/docs/CURRENT_STATE_POSTMAN.md b/docs/CURRENT_STATE_POSTMAN.md new file mode 100644 index 0000000..c532995 --- /dev/null +++ b/docs/CURRENT_STATE_POSTMAN.md @@ -0,0 +1,85 @@ +# Current State — Postman/Newman at Enterprise Scale + +## What Postman Does Well + +Postman is genuinely excellent at what it was designed to do: API exploration, interactive documentation, and functional validation for small teams. For a team of two to five engineers iterating on a single service, Postman collections are fast to write, easy to share, and Newman extends that work cleanly into CLI execution. The feedback loop is tight, the tooling is approachable, and for early-stage API development, nothing gets you from zero to a working test suite faster. This is not a criticism of Postman. 
It is a recognition that the tool was built for a specific problem space, and it solves that problem well. + +--- + +## Where It Breaks + +### 1. Collection Governance at Team Scale + +Collections are JSON files. At small scale, this is fine. At enterprise scale — ten teams, forty microservices, dozens of contributors — JSON becomes the governance problem. + +Merge conflicts on collection files are common and difficult to resolve meaningfully. Two engineers add tests to the same collection in the same sprint. The resulting diff is hundreds of lines of nested JSON. There is no enforceable standard for assertion quality, naming conventions, or structural patterns. A test that asserts `pm.response.code === 200` and nothing else passes code review because there is no review surface that makes the inadequacy obvious. + +Newman CI integration looks like this: + +```bash +newman run collection.json -e environment.json \ + --reporters cli,junit \ + --reporter-junit-export results.xml +``` + +For one team, this is sufficient. For ten teams, it means ten independently maintained collection files with no shared standards, no shared assertion libraries, and no mechanism to enforce consistency across services. Pipeline output is a JUnit XML file. What it does not tell you is whether the tests inside that file were actually asserting anything meaningful. + +--- + +### 2. Environment Variable Management + +Environment files are JSON. Variables are frequently hardcoded into collections. There is no type safety, no schema validation, and no enforcement mechanism to ensure that a variable referenced in a collection actually exists in the environment file being used. + +Secrets management is manual discipline. Engineers copy staging credentials into local environment files. Production credentials live in a separate file — or they do not, until they are needed. The pipeline breaks because one engineer updated the staging environment file and pushed. 
The production file was not updated. No one noticed until the post-deploy smoke test failed at 11pm. + +This is not a hypothetical. This is the failure mode that happens in every organization that runs Postman at scale long enough. + +--- + +### 3. Contract Blindness Between Microservices + +Newman validates API responses against expectations written at the time the collection was authored. It cannot validate service-to-service contracts — the implicit agreements between services about what shape data will take as it moves through a distributed system. + +A concrete example: `PatientService` returns a user object with an `email` field. `PrescriptionService` consumes that field downstream. A developer on the Patient team renames `email` to `emailAddress` — a reasonable change, properly documented in their service's Swagger file. The `PrescriptionService` Newman collection continues to pass. Its tests never looked at that field. Integration testing catches the breakage two weeks later in a staging environment. The root cause is buried in six sprints of commits across two repositories, owned by two teams. + +Newman had no mechanism to surface this. It was not designed to. Contract testing requires a different architectural layer entirely. + +--- + +### 4. Shift-Left Gap + +In most organizations running Postman at scale, collections are owned by QE, not developers. The developer builds the API. QE builds the collection. The hand-off introduces a quality gap that is structural, not accidental: the test is never in the same sprint as the feature. By the time QE has authored and validated the collection, the developer has moved on. Feedback is delayed. Defects are more expensive to fix. + +This directly contradicts the shift-left mandate that most engineering organizations have adopted as policy. Postman's tooling does not prevent shift-right testing — it subtly encourages it by making QE the natural owner of the collection authoring workflow. + +--- + +### 5. 
No Performance Signal + +Newman has no performance testing capability. Performance is a separate tool, a separate phase, and in most organizations, a separate team. The result is that performance regressions are invisible in the CI pipeline. A change that doubles the p99 latency on a critical endpoint ships through the pipeline without any signal. The regression surfaces during load testing two weeks before the release date, when remediation options are limited and pressure is highest. + +There is no architectural reason performance signal has to live outside the pipeline. It is a tooling gap, not an engineering constraint. + +--- + +## The Organizational Cost + +The maintenance burden of a Postman-at-scale environment grows linearly with team size. Every new microservice added to the portfolio adds a new collection to govern, a new environment file to maintain, and a new QE workstream to staff. Pipeline confidence erodes as collections drift out of sync with the APIs they are supposed to validate — tests pass because they are asserting against outdated assumptions, not because the service is behaving correctly. QE, rather than accelerating delivery, becomes a bottleneck: holding sprint velocity back while collections are authored, reviewed, and stabilized. This is the organizational cost that does not appear on any dashboard until it is already embedded in the team's delivery rhythm. + +--- + +## What the Solution Looks Like + +The reference architecture at [github.com/carthikl/api-quality-platform-reference](https://github.com/carthikl/api-quality-platform-reference) addresses each of these gaps with a four-layer approach built for team scale, CI/CD integration, and contract protection. It replaces collection governance chaos with code-reviewed Java and Karate feature files, adds Pact-based contract validation between services, and embeds k6 performance signal directly into the PR gate — before merge, not before release. 
It was architected and directed over a weekend using Claude Code — the same AI-native productivity model I would bring to the Walgreens QE function. + +--- + +## A Note on Postman's Role Going Forward + +Postman stays. Its role narrows. + +- **Exploration** — engineers use it to understand unfamiliar APIs during development +- **Documentation** — collections serve as living, interactive API documentation for the team +- **Environment smoke** — Newman runs 5-minute post-deploy smoke tests to confirm a deployment landed correctly + +That is the right job for Postman. Not the primary quality gate in the pipeline. diff --git a/docs/TEST_PYRAMID.md b/docs/TEST_PYRAMID.md new file mode 100644 index 0000000..35378ec --- /dev/null +++ b/docs/TEST_PYRAMID.md @@ -0,0 +1,88 @@ +# Test Pyramid — Complete Quality Engineering Strategy + +## The Principle + +The testing pyramid is not a tool selection framework. It is a risk distribution model. Each layer catches a different category of defect at the lowest possible cost — and cost is measured in time, compute, and feedback delay, not just money. The higher the layer, the more expensive the defect it catches: a cross-service business flow failure found in staging costs an order of magnitude more to diagnose and remediate than a contract violation caught at PR merge. The lower the layer, the faster the signal returns to the developer. Designing a quality engineering strategy means deliberately placing each category of risk at the layer where it can be caught earliest and cheapest. 
+ +--- + +## The Complete Pyramid + +``` + ▲ + /|\ + / | \ + / | \ + / E2E/ \ + / Integration\ + / (Karate @smoke)\ + /___________________\ + / \ + / Service Virtualization\ + / (Hoverfly / WireMock) \ + /____________________________\ + / \ + / Contract Testing \ + / (Pact) \ + /______________________________________\ + / \ + / Component Functional Testing \ + / (REST Assured + Karate) \ + /______________________________________________ \ + / \ + / Performance Engineering \ + / (k6) \ + /______________________________________________________ \ +/ \ +/ Unit Testing \ +/ (Developer Framework) \ +/___________________________________________________________\ +``` + +--- + +## What Each Layer Catches + +| Layer | Tool | Defect Category | When It Runs | +|---|---|---|---| +| E2E Integration | Karate `@smoke` | Cross-service business flow failures | Staging deploy | +| Service Virtualization | Hoverfly / WireMock | Integration failures in isolation | PR + staging | +| Contract Testing | Pact | Interface contract violations | PR + deploy | +| Component Functional | REST Assured + Karate | Endpoint behavior regressions | Every PR | +| Performance Engineering | k6 | SLA violations at component and system level | PR + staging + scheduled | +| Unit Testing | Developer framework | Logic and calculation errors | Every commit | + +--- + +## What This Repository Implements + +Four of the six layers are implemented and running in the pipeline: + +- **Component Functional** — REST Assured + Karate ✅ +- **Contract Testing** — Pact consumer/provider ✅ +- **Performance Engineering** — k6 two-stage ✅ +- **E2E Integration** — Karate `@smoke` on staging ✅ + +--- + +## The Gap — Service Virtualization + +Service virtualization sits between component testing and full integration testing. It allows services to be tested in combination without all dependencies being live — catching integration failures in isolation, before a full staging environment is required. 
+ +**Hoverfly** is lightweight, Go-based, and designed for microservices. It captures real traffic and replays it as stubs, making it CI/CD native with minimal configuration overhead. Ideal for Java/Kubernetes environments where teams want fast, low-friction dependency stubbing at the network layer. + +**WireMock** is Java-native with a deep enterprise ecosystem. More configuration overhead than Hoverfly, but stronger IDE integration, broader community adoption, and more expressive stubbing capabilities for complex scenarios. + +**Why not implemented here:** JSONPlaceholder does not require virtualization — it is already a stub service. In a production Walgreens implementation, Hoverfly would virtualize the `PatientService` dependency so `PrescriptionService` can be tested in isolation without `PatientService` being deployed. A developer changing `PrescriptionService` business logic would run against a Hoverfly stub of `PatientService` in the PR pipeline — catching integration failures before the branch ever reaches staging. That is the production pattern. It is not demonstrated in this reference architecture because the reference uses a public mock API that eliminates the need for it. + +--- + +## Inter-Service Functional Testing + +The question of how to test functional behavior between microservices beyond contract validation is answered by three layers working together, not one. Pact validates the interface agreement — field names, types, and response structure — ensuring that a change to `PatientService` that renames `email` to `emailAddress` is caught before it reaches any downstream consumer. Hoverfly or WireMock then validates functional behavior with stubbed dependencies: business logic, error handling, and edge cases that require a dependency to be present but not necessarily live. 
Finally, Karate `@smoke` validates the full end-to-end journey with real services on staging, confirming that the system behaves correctly as an integrated whole under realistic conditions. Each layer is necessary. None is sufficient alone. + +--- + +## The Operating Principle + +"The pyramid is not a QA artifact. It is an engineering discipline — owned by the squads, governed by the platform, measured by the outcomes." diff --git a/docs/architecture-decision.md b/docs/architecture-decision.md index cad1f23..439979b 100644 --- a/docs/architecture-decision.md +++ b/docs/architecture-decision.md @@ -23,6 +23,7 @@ Three-layer Maven multi-module platform, each layer solving exactly one of the a - **REST Assured** — deep payload validation, auth flows, and stateful sequences that require Java logic - **Karate** — governed BDD scenarios reviewable by non-engineers; test changes are PR-gated artifacts, not personal account exports - **Pact** — consumer-driven contracts that surface schema drift between services at PR time, not in production +- **k6** — two-stage performance engineering; performance regressions caught at PR time, not late in the release cycle --- diff --git a/docs/images/pipeline-visual.png b/docs/images/pipeline-visual.png index bb27905..eba24eb 100644 Binary files a/docs/images/pipeline-visual.png and b/docs/images/pipeline-visual.png differ diff --git a/docs/pipeline-design.md b/docs/pipeline-design.md index 77db114..13819c8 100644 --- a/docs/pipeline-design.md +++ b/docs/pipeline-design.md @@ -14,10 +14,11 @@ A test in the wrong gate is quality theater. 
Catching a contract violation post- |-----|-------------------|----------| | REST Assured | HTTP behavior, schema, SLA for the changed service | Fastest signal on functional regressions; scoped to the module under change | | Karate @smoke | Critical paths across service boundaries | Readable by the PR reviewer; catches cross-service breakage | +| k6 component tests | p(95)<500ms threshold per endpoint — one script per endpoint | Performance regression caught at PR time, not at the load test; runs in parallel with REST Assured and Karate | | Pact consumer | Generates the contract artifact from the changed consumer code | Contract must come from known-good code — functional tests gate this | | Pact provider | Verifies the provider still satisfies all consumer contracts | A provider change that breaks a downstream consumer is blocked here, not in production | -Jobs 1 and 2 run in parallel. Job 3 waits on both. Job 4 waits on Job 3. Wall time under 5 minutes. +REST Assured, Karate, and the three k6 component jobs run in parallel. All five must pass before Pact Consumer runs. Pact Provider waits on Pact Consumer. Wall time under 5 minutes. **Merge block is absolute.** Any failure blocks merge. No bypass without QE Director approval. @@ -33,11 +34,21 @@ Nothing additional runs. The PR gate is the quality gate. Triggering the same te **Workflow:** `staging-smoke.yml` | **Trigger:** `workflow_dispatch` (initiated by deployment pipeline post-rollout) -Runs Karate `@smoke` scenarios only against the live staging endpoint. +Runs Karate `@smoke` scenarios against the live staging endpoint, followed by the k6 system load test — the full prescription checkout journey at load, with a p(95)<2000ms threshold enforced end-to-end. -This gate validates the **deployment**, not the code: DNS resolution, ingress routing, service mesh config, secrets injection. The code was already validated before the artifact was promoted. +This gate validates the **deployment**, not the code: DNS resolution, ingress routing, service mesh config, secrets injection, and system-level performance under realistic traffic. The code was already validated before the artifact was promoted. 
REST Assured and Pact do not re-run here — they would prove nothing new. +This gate validates the **deployment**, not the code: DNS resolution, ingress routing, service mesh config, secrets injection, and system-level performance under realistic traffic. The code was already validated before the artifact was promoted. REST Assured and Pact do not re-run here — they would prove nothing new. -If smoke fails, staging is unhealthy. Stop deployments until resolved. +If smoke fails, staging is unhealthy. If k6 system load fails, the deployment is functionally live but not performance-safe. Stop promotions to production until resolved. + +--- + +## Scheduled — Performance Stress + +**Workflow:** `performance-stress.yml` | **Trigger:** Scheduled Monday 2AM UTC, on-demand via `workflow_dispatch` + +Runs the k6 stress test — ramping to 200 VUs for break point discovery. This is not a regression gate; it is a capacity planning signal. The test identifies the point at which the system begins to degrade, so that capacity decisions are data-driven rather than estimated. + +Results are uploaded as a pipeline artifact and retained for trend comparison across runs. On-demand dispatch supports ad-hoc capacity validation before major release events or infrastructure changes. --- @@ -45,8 +56,14 @@ If smoke fails, staging is unhealthy. Stop deployments until resolved. ``` rest-assured-functional ──┐ - ├──→ pact-consumer ──→ pact-provider -karate-bdd ────────────────┘ + │ +karate-bdd ────────────────┼──→ pact-consumer ──→ pact-provider + │ +k6-patient-lookup ─────────┤ + │ +k6-prescription-api ───────┤ + │ +k6-cart-api ───────────────┘ ``` Pact consumer runs after functional tests because a contract generated from broken code is wrong by definition — it encodes the bug as a requirement and trains the provider to satisfy incorrect expectations. The sequence is not a convenience; it is a correctness constraint. 
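Reviewer note: the dependency chain in the diagram above could be expressed in GitHub Actions with `needs`, roughly as sketched here. This is a fragment under assumptions, not the contents of `pr-quality-gate.yml`: the job IDs are taken from the diagram, the five upstream jobs are assumed to be defined elsewhere in the same workflow, and the step bodies are placeholders.

```yaml
# Sketch only. Job IDs come from the dependency diagram; the steps are
# hypothetical placeholders, not the real Pact invocations.
jobs:
  pact-consumer:
    name: Pact Consumer
    runs-on: ubuntu-latest
    # All five parallel gates must succeed. If any of them fails,
    # this job is skipped and the contract is never generated.
    needs:
      - rest-assured-functional
      - karate-bdd
      - k6-patient-lookup
      - k6-prescription-api
      - k6-cart-api
    steps:
      - uses: actions/checkout@v4
      - run: echo "generate consumer contract here"  # placeholder step

  pact-provider:
    name: Pact Provider
    runs-on: ubuntu-latest
    # Runs only after the contract artifact exists.
    needs: pact-consumer
    steps:
      - uses: actions/checkout@v4
      - run: echo "verify provider against contract here"  # placeholder step
```

Listing every gate under `needs` keeps the ordering constraint in the workflow file itself: a contract cannot be generated from code that failed a functional or performance gate, because the job that generates it never starts.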
diff --git a/k6/pom.xml b/k6/pom.xml new file mode 100644 index 0000000..a98a743 --- /dev/null +++ b/k6/pom.xml @@ -0,0 +1,66 @@ + + + 4.0.0 + + + com.wag.qe + api-quality-platform-reference + 1.0.0-SNAPSHOT + + + k6 + pom + + Walgreens API Quality Platform — k6 Component Performance + + k6 performance component tests, invoked through Maven via exec-maven-plugin. + Binding k6 to the standard test phase gives `mvn test -pl k6` the same semantics + as every other test module in this repo — governance parity across layers. + + + + + + + org.codehaus.mojo + exec-maven-plugin + + + k6-patient-lookup + test + + exec + + + k6 + ${project.basedir}/.. + + run + k6/component/patient-lookup.js + + + + ${api.base.url} + + + + + + + + diff --git a/pom.xml b/pom.xml index 6eed2e8..ff44406 100644 --- a/pom.xml +++ b/pom.xml @@ -19,6 +19,7 @@ rest-assured karate pact + k6 @@ -48,6 +49,7 @@ 3.2.5 3.2.5 3.12.1 + 3.1.0 https://jsonplaceholder.typicode.com @@ -155,6 +157,16 @@ maven-failsafe-plugin ${maven-failsafe.version} + + + org.codehaus.mojo + exec-maven-plugin + ${exec-maven-plugin.version} +