SELL-362 dogfood: doctor accuracy (R2-B1/B2/B5) + test-order model correction by next-devin · Pull Request #55 · NextCommerceCo/campaigns-os

next-devin · 2026-06-01T13:43:05Z

Addresses the campaigns-os-side findings from the SELL-362 Round 2 dogfood (result comment), plus a corrected model for test-order gating that surfaced during review.

Part 1 — doctor accuracy (the warning-noise findings)

R2-B1 — stop flagging spec-declared refs as starter demo refs

Doctor flagged 21 real low-integer API ref_ids as starter demo refs (57% of warnings were noise). specDeclaredCommerceRefs() now treats a ref matching a package/offer/shipping_method the spec declares as real; dangling refs still flag.

R2-B2 — defer routing-meta + currency to built output

Doctor kept warning on spec/source state the build had already fixed. Routing-meta now defers to the authoritative built-output check; currency partitions source vs. built (warns only when built output has HTML and is dirty).

R2-B5 — under-configured store profile (new readiness warning)

Placeholder store_url and empty available_payment_methods now warn — framed as real-shopper readiness (a real customer can't transact), decoupled from test-order proof.

Part 2 — test-order model correction

The prior model treated typed-card test orders as order-mutating, merchant-risk, approval-gated events requiring --allow-test-orders + --sandbox-test-card-confirmed + packet policy + a domain allowlist + an approved order count. Per the admin-api testing guide, that's wrong: global test cards bypass the gateway, create no transactions, need no merchant setup, and are safe on live stores. The only real control is depth; the only artifact is one reusable customer.

Gating removed: --test-order <mode> is sufficient — no permission flags, no packet-policy gate, no doctor "test orders not enabled / avoid mutations" warning, no orchestration block. (allowed_domains_confirmed is untouched — it governs SDK origin loading, a separate real concern. qa policy set + packet qa.* booleans stay as informational metadata.)
Stable test customer: both code paths minted a fresh undeletable customer every run. Now resolve to ONE stable address: --test-email > CAMPAIGNS_OS_QA_TEST_EMAIL > stable synthetic fallback. The real deliverable inbox stays in the internal skill (the public package can't hardcode that domain — private-string guard — and shouldn't).
common preset (new default): checkout + first-upsell accept/decline + one deeper mixed path on 2+ offers (3-5 shapes). full stays the explicit every-permutation opt-in. --max-test-orders reframed as an accidental-flood guard.
Docs/skills/agents: rewrote the QA doc, bundled skills, and all four agent-instruction flavors (claude/codex/cursor/copilot) to the corrected model.

Tests

npm run check green: 30 tests + fixtures / private-string / template-doctrine gates.
New: doctor-demo-ref (5), doctor-built-output (4), doctor-store-profile (5), stable-email + common-preset coverage in qa-browser.test.

Follow-ups (separate, not this repo)

Internal next-campaigns-ops QA skill/README still carry the old approval/allowlist phrasing and the real --test-email inbox — needs the same model correction (separate PR; that's where the deliverable address lives).
R2-B1 export-side (ref-level provenance on Map export) → Map Builder / nc-campaigns-proxy.
R2-B3 (lint-sdk.mjs olympus-only) → target page-kit / meridian.
R2-B4 (candidate→source handoff) → process / SELL-361.

🤖 Generated with Claude Code

…2-B1) SELL-362 dogfood (Sasa) found doctor flagging 21 real low-integer Campaigns-API ref_ids ("1"/"2") as "starter-looking demo refs" — 57% of warnings were noise on a valid spec, training operators to ignore the very warning meant to catch real placeholders. Root cause: collectDemoRefHits suppressed a hit only when an ancestor carried `_provenance.api`, but the Map exporter does not stamp provenance down to the ref level, so valid refs whose value happened to match a starter demoOnlyValue were flagged. Fix: add a second escape hatch — suppress a hit when its value matches a ref the spec itself declares as a real commerce entity (package / offer / shipping_method), via a new specDeclaredCommerceRefs() built on the existing specPackageRefs/specShippingRefs helpers. A ref that points at a declared entity is legitimate regardless of ref-level provenance stamping; refs that point at nothing the spec defines are still flagged, so real signal is kept. - src/cli.mjs: specDeclaredCommerceRefs(); collectDemoRefHits() suppression + export for testing - src/doctor-demo-ref.test.mjs: 5 unit tests (declared package/shipping suppressed; dangling ref still flagged; provenance escape intact; empty vocab) wired into `npm run check` The export-side half of R2-B1 (stamp ref_id-level provenance on Map export) remains a separate Map Builder / proxy task. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…-B2) SELL-362 dogfood (Sasa): after a correct build, doctor kept emitting "fix before QA" warnings computed from the spec/source even though the build had already fixed them in the output — browser QA later proved the deployed output was right. Same noise class as R2-B1. Two emitters fixed: - Routing meta: validateSpecRoutingMetaTags flagged the spec's unrooted meta-tag *hint* literals, but the page-kit build roots them and validateBuiltSdkMetaTags checks the rendered _site/<slug>/ authoritatively. Now defers to that built-output check once assembly is complete and the rendered site exists. - Hardcoded currency: validateMarketSensitiveCopy scanned source+target lumped together, so raw $ in the source input still warned after the build tokenized it. Now partitions by surface: when the built campaign output contains HTML and is currency-clean, source-only residue is downgraded to an info note; warnings on the built output still fire. Guards on the target actually containing built HTML (an empty output dir means the build has not run, so source warnings stand) — caught by the existing market-copy fixture. - src/doctor-built-output.test.mjs: 4 unit tests (routing defers w/ built site, warns without; currency source-only info vs. built-output warn) wired into `npm run check`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

kilo-code-bot · 2026-06-01T15:37:09Z

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Overview

Severity	Count
CRITICAL	0
WARNING	0
SUGGESTION	0

Review Details

Incremental review of PR #55 (commit 5629e34) covering the test-order model corrections: removal of approval gates, stable test customer email, and common default mode. All changes are additive improvements building on the previously reviewed R2-B1/R2-B2/R2-B5 fixes.

Test-Order Model Changes: The common preset correctly maps to a 3-5 shape sample (checkout baseline + first-upsell accept/decline + one deeper mixed path when depth >= 2). The --test-order boolean-true case correctly falls through to common. Permission flag removal is consistent across browser and legacy API paths.

Stable Test Email: testEmail correctly resolves to a single stable address with precedence: explicit flag > env var > default fallback. No per-run unique emails are generated.

R2-B5 (validateSpecStoreProfile): No regressions from the new commit. Function signature correctly includes warnings parameter at all call sites.

R2-B1/R2-B2: No regressions. Declared commerce refs exclusion and built-output deferral logic remain intact.

Files Reviewed (10 files)

src/cli.mjs - 0 issues
src/qa-browser.mjs - 0 issues
src/qa-browser.test.mjs - 0 issues
src/qa-node.mjs - 0 issues
src/doctor-store-profile.test.mjs - 0 issues
src/doctor-demo-ref.test.mjs - 0 issues
src/doctor-built-output.test.mjs - 0 issues
skills/next-campaigns-os/SKILL.md - 0 issues
skills/next-campaigns-qa/SKILL.md - 0 issues
docs/qa-and-test-orders.md - 0 issues

_{Reviewed by minimax-m2.7 · 1,063,683 tokens}

SELL-362 dogfood: the test store had store_url=https://localhost:3000/ and available_payment_methods: [] — a store that passes doctor's required-field check but cannot complete a real order. Neither doctor nor browser QA surfaced it, so it would only fail at typed-card proof time, for store-config reasons rather than campaign reasons. Extend validateSpecStoreProfile with two readiness *warnings* (not blockers, so routing/visual-only runs are unaffected): - spec.store_profile.placeholder_store_url: store_url is a local/dev host or a reserved test/example domain (conservative match; real merchant domains do not trip it). - spec.store_profile.no_payment_methods: available_payment_methods is an explicit empty array. Absent != empty, so an unconfigured field does not warn. This makes store-config risk visible *before* order-proof approval — directly relevant to the typed-card allowlist decision. - src/doctor-store-profile.test.mjs: 5 unit tests, wired into `npm run check`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… customer, `common` default The prior model treated typed-card test orders as order-mutating, merchant-risk, approval-gated events. They are not: global test cards bypass the payment gateway, create no transactions, need no merchant setup, and are safe on live stores (developer-docs admin-api testing guide). The only real control is depth, and the only artifact is one reusable customer. Execution gating removed: - qa-node maybeRunTestOrders / maybeRunLegacyApiTestOrders no longer require --allow-test-orders / --sandbox-test-card-confirmed or packet qa.* policy booleans. `--test-order <mode>` is sufficient intent. - doctor no longer emits the "test orders not enabled / avoid mutations" warning or the sandbox-confirmation error; buildNextStep no longer blocks the test-orders stage on those codes. (allowed_domains_confirmed is untouched — it governs SDK origin loading, a separate real concern.) - The packet qa.* booleans and `qa policy set` remain as informational metadata (back-compat, fixtures), but no longer gate anything. Stable test customer (the per-run-unique bug): - Both code paths minted a fresh, undeletable customer every run (qa+campaigns-os-<runId>-<ts>@example.com). Now resolve to ONE stable address: --test-email > CAMPAIGNS_OS_QA_TEST_EMAIL > stable synthetic fallback. The real deliverable inbox stays in the internal next-campaigns-ops skill (the public package can't hardcode it — private-string guard bans that domain — and shouldn't). Depth as the control: - New `common` preset (also bare `--test-order`): checkout baseline + first-upsell accept/decline + one deeper mixed path when 2+ offers (3-5 shapes, under the flood cap). `full` stays the explicit every-permutation opt-in. --max-test-orders is reframed as an accidental-flood guard, not a permission gate. Also reframes R2-B5 store-profile warnings to the real-shopper angle (decoupled from test-order proof), and rewrites the docs, bundled skills, and all four agent-instruction flavors (claude/codex/cursor/copilot) to the corrected model. Tests: stable-email + `common`-preset coverage added; help-string fixture updated. `npm run check` green (30 tests + all gates). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

next-devin and others added 2 commits June 1, 2026 20:42

next-devin changed the title ~~fix(doctor): stop flagging spec-declared refs as starter demo refs (R2-B1)~~ fix(doctor): reduce dogfood warning noise — R2-B1 + R2-B2 Jun 1, 2026

next-devin changed the title ~~fix(doctor): reduce dogfood warning noise — R2-B1 + R2-B2~~ doctor: accuracy fixes from SELL-362 dogfood (R2-B1, R2-B2, R2-B5) Jun 1, 2026

next-devin changed the title ~~doctor: accuracy fixes from SELL-362 dogfood (R2-B1, R2-B2, R2-B5)~~ SELL-362 dogfood: doctor accuracy (R2-B1/B2/B5) + test-order model correction Jun 1, 2026

next-devin merged commit 0bb703f into main Jun 1, 2026
2 checks passed

next-devin deleted the fix/sell-362-r2-b1-doctor-demo-ref-provenance branch June 1, 2026 16:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SELL-362 dogfood: doctor accuracy (R2-B1/B2/B5) + test-order model correction#55

SELL-362 dogfood: doctor accuracy (R2-B1/B2/B5) + test-order model correction#55
next-devin merged 4 commits into
mainfrom
fix/sell-362-r2-b1-doctor-demo-ref-provenance

next-devin commented Jun 1, 2026 •

edited

Loading

Uh oh!

kilo-code-bot Bot commented Jun 1, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

next-devin commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Part 1 — doctor accuracy (the warning-noise findings)

R2-B1 — stop flagging spec-declared refs as starter demo refs

R2-B2 — defer routing-meta + currency to built output

R2-B5 — under-configured store profile (new readiness warning)

Part 2 — test-order model correction

Tests

Follow-ups (separate, not this repo)

Uh oh!

kilo-code-bot Bot commented Jun 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Summary

Overview

Review Details

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

next-devin commented Jun 1, 2026 •

edited

Loading

kilo-code-bot Bot commented Jun 1, 2026 •

edited

Loading