SELL-362 dogfood: doctor accuracy (R2-B1/B2/B5) + test-order model correction#55
Conversation
…2-B1)
SELL-362 dogfood (Sasa) found doctor flagging 21 real low-integer
Campaigns-API ref_ids ("1"/"2") as "starter-looking demo refs" — 57% of
warnings were noise on a valid spec, training operators to ignore the very
warning meant to catch real placeholders.
Root cause: collectDemoRefHits suppressed a hit only when an ancestor carried
`_provenance.api`, but the Map exporter does not stamp provenance down to the
ref level, so valid refs whose value happened to match a starter demoOnlyValue
were flagged.
Fix: add a second escape hatch — suppress a hit when its value matches a ref
the spec itself declares as a real commerce entity (package / offer /
shipping_method), via a new specDeclaredCommerceRefs() built on the existing
specPackageRefs/specShippingRefs helpers. A ref that points at a declared
entity is legitimate regardless of ref-level provenance stamping; refs that
point at nothing the spec defines are still flagged, so real signal is kept.
- src/cli.mjs: specDeclaredCommerceRefs(); collectDemoRefHits() suppression +
export for testing
- src/doctor-demo-ref.test.mjs: 5 unit tests (declared package/shipping
suppressed; dangling ref still flagged; provenance escape intact; empty
vocab) wired into `npm run check`
The export-side half of R2-B1 (stamp ref_id-level provenance on Map export)
remains a separate Map Builder / proxy task.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-B2) SELL-362 dogfood (Sasa): after a correct build, doctor kept emitting "fix before QA" warnings computed from the spec/source even though the build had already fixed them in the output — browser QA later proved the deployed output was right. Same noise class as R2-B1. Two emitters fixed: - Routing meta: validateSpecRoutingMetaTags flagged the spec's unrooted meta-tag *hint* literals, but the page-kit build roots them and validateBuiltSdkMetaTags checks the rendered _site/<slug>/ authoritatively. Now defers to that built-output check once assembly is complete and the rendered site exists. - Hardcoded currency: validateMarketSensitiveCopy scanned source+target lumped together, so raw $ in the source input still warned after the build tokenized it. Now partitions by surface: when the built campaign output contains HTML and is currency-clean, source-only residue is downgraded to an info note; warnings on the built output still fire. Guards on the target actually containing built HTML (an empty output dir means the build has not run, so source warnings stand) — caught by the existing market-copy fixture. - src/doctor-built-output.test.mjs: 4 unit tests (routing defers w/ built site, warns without; currency source-only info vs. built-output warn) wired into `npm run check`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Code Review SummaryStatus: No Issues Found | Recommendation: Merge Overview
Review DetailsIncremental review of PR #55 (commit 5629e34) covering the test-order model corrections: removal of approval gates, stable test customer email, and Test-Order Model Changes: The Stable Test Email: R2-B5 ( R2-B1/R2-B2: No regressions. Declared commerce refs exclusion and built-output deferral logic remain intact. Files Reviewed (10 files)
Reviewed by minimax-m2.7 · 1,063,683 tokens |
SELL-362 dogfood: the test store had store_url=https://localhost:3000/ and available_payment_methods: [] — a store that passes doctor's required-field check but cannot complete a real order. Neither doctor nor browser QA surfaced it, so it would only fail at typed-card proof time, for store-config reasons rather than campaign reasons. Extend validateSpecStoreProfile with two readiness *warnings* (not blockers, so routing/visual-only runs are unaffected): - spec.store_profile.placeholder_store_url: store_url is a local/dev host or a reserved test/example domain (conservative match; real merchant domains do not trip it). - spec.store_profile.no_payment_methods: available_payment_methods is an explicit empty array. Absent != empty, so an unconfigured field does not warn. This makes store-config risk visible *before* order-proof approval — directly relevant to the typed-card allowlist decision. - src/doctor-store-profile.test.mjs: 5 unit tests, wired into `npm run check`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… customer, `common` default The prior model treated typed-card test orders as order-mutating, merchant-risk, approval-gated events. They are not: global test cards bypass the payment gateway, create no transactions, need no merchant setup, and are safe on live stores (developer-docs admin-api testing guide). The only real control is depth, and the only artifact is one reusable customer. Execution gating removed: - qa-node maybeRunTestOrders / maybeRunLegacyApiTestOrders no longer require --allow-test-orders / --sandbox-test-card-confirmed or packet qa.* policy booleans. `--test-order <mode>` is sufficient intent. - doctor no longer emits the "test orders not enabled / avoid mutations" warning or the sandbox-confirmation error; buildNextStep no longer blocks the test-orders stage on those codes. (allowed_domains_confirmed is untouched — it governs SDK origin loading, a separate real concern.) - The packet qa.* booleans and `qa policy set` remain as informational metadata (back-compat, fixtures), but no longer gate anything. Stable test customer (the per-run-unique bug): - Both code paths minted a fresh, undeletable customer every run (qa+campaigns-os-<runId>-<ts>@example.com). Now resolve to ONE stable address: --test-email > CAMPAIGNS_OS_QA_TEST_EMAIL > stable synthetic fallback. The real deliverable inbox stays in the internal next-campaigns-ops skill (the public package can't hardcode it — private-string guard bans that domain — and shouldn't). Depth as the control: - New `common` preset (also bare `--test-order`): checkout baseline + first-upsell accept/decline + one deeper mixed path when 2+ offers (3-5 shapes, under the flood cap). `full` stays the explicit every-permutation opt-in. --max-test-orders is reframed as an accidental-flood guard, not a permission gate. Also reframes R2-B5 store-profile warnings to the real-shopper angle (decoupled from test-order proof), and rewrites the docs, bundled skills, and all four agent-instruction flavors (claude/codex/cursor/copilot) to the corrected model. Tests: stable-email + `common`-preset coverage added; help-string fixture updated. `npm run check` green (30 tests + all gates). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addresses the campaigns-os-side findings from the SELL-362 Round 2 dogfood (result comment), plus a corrected model for test-order gating that surfaced during review.
Part 1 — doctor accuracy (the warning-noise findings)
R2-B1 — stop flagging spec-declared refs as starter demo refs
Doctor flagged 21 real low-integer API ref_ids as starter demo refs (57% of warnings were noise).
specDeclaredCommerceRefs()now treats a ref matching a package/offer/shipping_method the spec declares as real; dangling refs still flag.R2-B2 — defer routing-meta + currency to built output
Doctor kept warning on spec/source state the build had already fixed. Routing-meta now defers to the authoritative built-output check; currency partitions source vs. built (warns only when built output has HTML and is dirty).
R2-B5 — under-configured store profile (new readiness warning)
Placeholder
store_urland emptyavailable_payment_methodsnow warn — framed as real-shopper readiness (a real customer can't transact), decoupled from test-order proof.Part 2 — test-order model correction
The prior model treated typed-card test orders as order-mutating, merchant-risk, approval-gated events requiring
--allow-test-orders+--sandbox-test-card-confirmed+ packet policy + a domain allowlist + an approved order count. Per the admin-api testing guide, that's wrong: global test cards bypass the gateway, create no transactions, need no merchant setup, and are safe on live stores. The only real control is depth; the only artifact is one reusable customer.--test-order <mode>is sufficient — no permission flags, no packet-policy gate, no doctor "test orders not enabled / avoid mutations" warning, no orchestration block. (allowed_domains_confirmedis untouched — it governs SDK origin loading, a separate real concern.qa policy set+ packetqa.*booleans stay as informational metadata.)--test-email>CAMPAIGNS_OS_QA_TEST_EMAIL> stable synthetic fallback. The real deliverable inbox stays in the internal skill (the public package can't hardcode that domain — private-string guard — and shouldn't).commonpreset (new default): checkout + first-upsell accept/decline + one deeper mixed path on 2+ offers (3-5 shapes).fullstays the explicit every-permutation opt-in.--max-test-ordersreframed as an accidental-flood guard.Tests
npm run checkgreen: 30 tests + fixtures / private-string / template-doctrine gates.New:
doctor-demo-ref(5),doctor-built-output(4),doctor-store-profile(5), stable-email +common-preset coverage inqa-browser.test.Follow-ups (separate, not this repo)
next-campaigns-opsQA skill/README still carry the old approval/allowlist phrasing and the real--test-emailinbox — needs the same model correction (separate PR; that's where the deliverable address lives).lint-sdk.mjsolympus-only) → target page-kit / meridian.🤖 Generated with Claude Code