Restructure build plans: integrate builder into fix, add hardening by jeduden · Pull Request #235 · jeduden/mdsmith

jeduden · 2026-05-04T06:01:35Z

Summary

This PR restructures the mdsmith build system plans to integrate recipe execution into mdsmith fix rather than a separate mdsmith build subcommand, and reorganizes the security and UX layers. The changes reflect a shift from the original plan 102 architecture to a more integrated approach with explicit hardening and debugging support.

Key Changes

Plan 102 refocus: Changed from "Builder interface and mdsmith build subcommand" to "Multi-output <?build?> directive". Now focuses on directive syntax (outputs: list, inputs: list, body template rendering per output) rather than the execution engine.
Plan 103 refinement: Updated staleness/dependency tracking to use ActionID content-hashing with length-framed fields, storing cache in .mdsmith/build-cache.json. The ActionID covers the recipe spec, sorted input content hashes, and sorted output paths; per-output content hashes are stored separately for tamper detection. Any overlap across two directives' outputs: paths (exact or directory-prefix) is a build error at config load.
Plan 104 updates: Adjusted lifecycle hooks to run within mdsmith fix build pass rather than a separate mdsmith build invocation. Simplified failure semantics and execution order.
New Plan 115: "Builder execution wired into mdsmith fix" — the core execution engine. Removes the separate mdsmith build subcommand and built-in recipe drivers (screenshot, vhs). Recipes run inside mdsmith fix after the lint-fix pass. Introduces Builder interface for custom recipes and atomic multi-output write semantics.
New Plan 116: "Build execution UX" — adds stdout/stderr capture to per-target log files, rich failure diagnostics, --build-explain for staleness debugging, --build-verify for non-determinism detection, and --build-jobs for parallel execution (no extra overlap check needed since plan 103 already enforces disjoint outputs at config load).
New Plan 117: "Build execution hardening" — security layer including trust gate (.mdsmith.yml.trust stores the full trusted .mdsmith.yml contents byte-for-byte so mdsmith trust can produce a real diff), hermetic execution environment with allowlisted PATH, atomic write hardening with os.MkdirTemp random-suffix staging, output post-conditions (all declared outputs must exist; no undeclared writes detected via pre/post snapshots that include sha256 of contents, mode, size, and mtime), and process-group kill on timeout.
Research spikes added: Two new research documents (go-only.md and language-relaxed.md) evaluating whether to outsource the build engine to external orchestrators (make, Bazel, Tup, etc.). Conclusion: keep the internal engine; add mdsmith targets --json side door for power users.

Notable Implementation Details

ActionID hashing: Uses length-framed concatenation of recipe command, params, input contents, and output paths to prevent path-collision attacks. Includes cache.version field for schema evolution. Per-output content hashes are stored next to the ActionID for tamper detection on subsequent runs.
Staging directory strategy: Per-recipe temp dirs under .mdsmith/build-staging/ created via os.MkdirTemp (random suffix; OS guarantees uniqueness). World-writable parent refusal on Unix.
Trust model: .mdsmith.yml.trust stores the full trusted .mdsmith.yml contents (not just a hash), so mdsmith trust can show a real diff when the config drifts. CI environments opt in via MDSMITH_TRUST_BUILD=1 env var.
Rename strategy: Per-output Lstat symlink check followed by os.Rename (atomic per-file replace on POSIX). Multi-output rename is not transactionally atomic across files; the plan documents the partial-failure semantics (log, exit FAIL, next fix reruns the recipe because no cache write happened).
Output overlap rule: Plan 103 rejects overlapping outputs: paths at config load (exact and directory-prefix collisions). This applies in serial and parallel mode; --build-jobs N does not need a separate check.
Log retention: Per-recipe logs stored under .mdsmith/build-logs/<action-id>.log and kept until the cache entry is invalidated, enabling post-hoc debugging.

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

Copilot

Pull request overview

This PR updates mdsmith’s build-system planning docs to reflect a new architecture where <?build?> execution runs inside mdsmith fix, and adds new plans for execution UX and hardening plus supporting research spikes.

Changes:

Refocuses Plan 102 on the multi-output <?build?> directive (outputs:/inputs:) and related rule/doc updates.
Reworks Plans 103/104 around a mdsmith fix build pass (staleness, hooks, flags) and adds new Plans 115–117 for execution wiring, UX, and security hardening.
Adds research docs evaluating external build orchestrators (Go-only and language-relaxed).

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
plan/117_build-execution-hardening.md	New plan defining trust gate, hermetic exec, atomic write hardening, and post-condition checks.
plan/116_build-execution-ux.md	New plan for logs/diagnostics, `--build-explain`, `--build-verify`, and parallel execution UX.
plan/115_builder-execution-in-fix.md	New plan wiring builder execution into `mdsmith fix` and defining core flags/behavior.
plan/104_build-lifecycle-hooks.md	Updates lifecycle hooks plan to run within the `fix` build pass and adjusts semantics/flags.
plan/103_build-staleness-and-deps.md	Updates staleness/cache design and flags to target the `fix` build pass.
plan/102_build-subcommand.md	Refactors Plan 102 to describe the multi-output directive surface and validation rules.
docs/research/build-orchestrator/go-only.md	New research spike on Go-only external orchestrators.
docs/research/build-orchestrator/language-relaxed.md	New research spike allowing non-Go orchestrators; keeps recommendation internal.
PLAN.md	Updates plan index table titles and adds entries for Plans 115–117.

codecov · 2026-05-04T06:05:56Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.27%. Comparing base (024dad8) to head (8fb9ce9).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files

Components	Coverage Δ
Go	`96.23% <ø> (+<0.01%)`	⬆️
TypeScript	`99.35% <ø> (ø)`

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Drop built-in recipe drivers (screenshot via chromedp, vhs) and the `mdsmith build` subcommand. Builds now run as part of `mdsmith fix` with `--no-build` / `--build-only` opt-outs. Remove `build.base-url` config field (its only consumer was the deleted screenshot driver). `<?build?>` directive replaces `output:` (singular) with `outputs:` (list), so one recipe invocation can produce many files. Adds `inputs:` (list of paths or globs) for staleness tracking. The body template renders once per output. New collective placeholders `{outputs}` and `{inputs}` in recipe `command` strings expand to N argv arguments. Plan 102 narrows to the multi-output directive only. Plan 115 (new) covers Builder execution wiring. Plan 103 adopts the `cmd/go/internal/cache` ActionID pattern, hashing `(recipe.command || sorted input contents || output set)` per target. Plan 104 reframes hooks as part of the `fix` build pass. https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

A security review of plans 102/103/104/115 surfaced one critical finding (no trust gate before recipe execution) plus four high findings (atomic-write race, ActionID hash collision via path separators, implicit cache trust, hook param values unvalidated). A parallel review of build-system gotchas from Bazel/Gradle/Buck/ Ninja added five behavioral lessons: non-deterministic outputs defeat caching, missing-output post-conditions are required, stale-cache surprises drive `make clean` muscle memory, parallel jobs collide on undeclared shared state, buffered stdout hides hangs. Findings folded into existing plans: - 102 — path-shape rules: NUL/newline/whitespace/ADS/UNC/reserved device names rejected; symlinks resolved to project root; glob match cap of 10k. - 103 — length-framed ActionID (no path-collision attacks); cache stores per-output content hashes for tamper detection; `built-at` documented as informational. - 104 — hook param value baseline (no NUL/newline/length>4KB); build pass refuses to run when MDS040 errors against config. - 115 — trimmed back to basic execution wiring; hardening lifted out to plan 117. Tokenization pinned to `strings.Fields`. New plans: - 116 — execution UX: per-recipe stdout/stderr capture with buffer-and-dump default plus `--build-stream` live mode, per-target log retention under `.mdsmith/build-logs/`, six-field failure block with last-20-lines tail, `--build-explain TARGET`, `--build-verify` (run-twice diff), `--build-jobs N`. - 117 — execution hardening: `.mdsmith.yml.trust` direnv-style trust gate (with `MDSMITH_TRUST_BUILD=1` for CI), hermetic env (allowlisted PATH and `build.exec.env-pass-through`), atomic write with `O_EXCL` staging and `RENAME_NOREPLACE`, output post-conditions (Bazel #14543: every declared output must exist; no undeclared write may slip), process-group SIGTERM then SIGKILL on `--build-timeout`. https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

Documents the spike that asked whether mdsmith should outsource its build engine to an existing Go-based orchestrator (go-task, mage, goyek, n2, please, etc.) instead of implementing the internal engine in plans 102/103/115/116/117. Verdict: stick with the internal engine. No Go-native orchestrator matches mdsmith's threat model (no trust gate, no hermetic env, no atomic multi-output writes, no output-confinement post-conditions). Recommend adding one new subcommand `mdsmith targets --json` as a side door for users already running cross-tool DAGs through go-task or make. A companion spike with the Go-language constraint relaxed will land in a follow-up commit. https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

Companion to the Go-only spike. Asks: if mdsmith users can be required to install a non-Go binary, what becomes possible? Verdict: even with the language constraint dropped, the internal engine remains the right call. The candidates split: - Strongest correctness model: Shake (Haskell) — content-hash native, early cutoff, lint mode. Distribution requires GHC + Stack, hostile to the tech-writer audience. - Strongest threat model: Tup. Its FUSE / PTRACE tracing literally implements output confinement and undeclared-write detection — exactly what plan 117 reimplements by hand. Distribution requires a kernel filesystem on every platform. - Closest to plan 117 overall: Pants v2 (Python + Rust engine, sandboxed by default). ~150 MB install for a Markdown linter audience inverts the value ratio. Buck2 looks attractive but its docs admit local builds are not sandboxed without remote execution. The spike validates several internal-engine design choices: plan 103's content-hash cache mirrors Shake / Bazel / Buck2 / Pants; plan 117's output-confinement is Tup's headline feature without the FUSE dependency; plan 116's --build-explain maps to Shake's --lint and -VV. https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

Per follow-up request to consider three more candidates: - ejholmes/walk (Go) — added to go-only.md. Interesting no-DSL design (Walkfile is an executable that responds to `deps` and `exec`) and parallel DAG execution via semaphore. But its plan.go does no staleness detection itself; the Walkfile script owns rebuild decisions. Adopting it would replace ~50 lines of errgroup with a third-party dep that hasn't released since Sep 2017 (138 stars, dormant). - GNU make — added to language-relaxed.md. Universal, pre-installed, but mtime-only with the same CI cache- restore failure mode as ninja, and no atomic multi- output. References Miller's "Recursive Make Considered Harmful" as the foundational caveat. - redo — expanded in language-relaxed.md. Apenwarr's Python implementation is the active fork; DJB's original sketch cited; ports in C/C++/Go are mixed. Content-hash native via redo-stamp; integration would let mdsmith absorb most of plan 103. Verdict unchanged in both docs. https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

- Remove unused link reference definitions from plans 103, 104, 116, 117 (MDS053 was added in main after this branch diverged) - Fix split inline code span across lines in plan/104 task 1 - Fix trust-gate design in plan/117: store full config content in .mdsmith.yml.trust instead of just sha256, so mdsmith trust can diff - Fix inaccurate os.MkdirTemp description: remove O_CREAT|O_EXCL claim (those flags apply to files, not directories) - Fix plan/102: remove "always stale" claim for inputs:[] that was inconsistent with plan/103's ActionID algorithm https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

…erlap - plan/117: replace RENAME_NOREPLACE (which would fail every rebuild) with Lstat symlink check + os.Rename (atomic per-file replace); document multi-output rename is not transactionally atomic and specify partial-failure recovery semantics - plan/117: strengthen undeclared-write snapshot to include sha256 of contents and mode bits, not just size+mtime (preserves catch even when a recipe modifies a file while keeping size/mtime stable); document the two known limits (parent-dir-only scope, single pre/post hash) - plan/103: any overlap in declared outputs (exact or directory-prefix) is now a build error at config load, not just on --build-jobs > 1; closes "last writer wins" hole in serial mode - plan/116: drop the now-redundant N>1 overlap check; config validation satisfies the parallel-execution safety contract - plan/102: reword empty-inputs note to match plan/103's algorithm; ActionID covers recipe spec + sorted output paths (not output content hashes — those live in the separate per-output hash store that staleness step 5 uses for tamper detection) https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

…holders - plan/102 (symlink check): EvalSymlinks fails on non-existent paths; spec now walks the longest existing prefix and uses Clean for the rest so missing outputs are not rejected - plan/102 ({outputs}/{inputs} placeholders): require standalone argv tokens after strings.Fields tokenization; embedded use is an MDS040 validation error (no defined semantics for fragment-list expansion) - plan/117 (staging root): add Lstat directory check on .mdsmith/build-staging/ before MkdirTemp; symlink/non-dir is refused (closes the "attacker pre-replaced staging root" gap) - plan/117 (rename strategy): drop RENAME_NOREPLACE from task and AC (it would fail every rebuild); align with the design section's Lstat + os.Rename approach - plan/117 (symlink snapshot): record symlink target via os.Readlink in addition to Lstat metadata (Lstat alone returns no link target) - plan/103 (overlap detection): tasks and AC now explicitly require directory-prefix overlap detection, matching the design section - plan/116 (--build-no-cache logs): clarify that --build-no-cache still writes logs but never associates them with a cache entry; orphan logs are removed at the start of the next fix invocation https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

- plan/116 (--build-explain selector): pick one rule — match TARGET against each directive's first declared output (string equality after path normalization). Plan 103's overlap rule rules out ambiguity at config load - plan/116 (--build-no-cache logs): reconcile two retention rules. --build-no-cache still writes logs but creates no cache entry; orphans are deleted at the start of the next fix invocation. Tasks and AC aligned - plan/117 (world-writable check): align design, task, AC, and test on the staging *root* (.mdsmith/build-staging/) — not its parent — matching the design section - plan/102 (path normalization): pick a single strategy. Build-time symlink check uses filepath.* internally on OS-native separators, then ToSlash for comparison and diagnostic output, preserving the slash-only invariant on Windows - plan/103 (ActionID param canonicalization): add inner length-framing per key/value/path; key=value\0 was ambiguous when keys contained = or NUL. Two-layer length framing removes the ambiguity without needing an extra validation pass https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

…mode - plan/117: PATH and Cmd.Dir are not filesystem confinement (a recipe can still write via absolute paths or ..). Reword as build determinism, not security; note that real confinement needs a sandbox - plan/102: clarify how `output:` actually fails. MDS039 reports unknown params as warnings, not errors. The hard failure comes from `outputs:` being required, so a directive with only `output:` fails on the missing required field; the unknown-param warning is secondary https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

…nputs - plan/116: rename "config validation" to "target-graph load" — the overlap check happens after every <?build?> directive across the project has been collected, not when parsing .mdsmith.yml alone - plan/104: clarify the recipe-pass summary — plan 115 specifies OK | FAIL; SKIP arrives once plan 103's staleness layer lands - plan/102: drop the special-case "** not allowed as first segment" rule for inputs: globs. Build inputs: now use full doublestar syntax (including leading **/) per docs/reference/globs.md, the same as every other config and directive surface - plan/102: replace the misleading remote-URL example for empty inputs:. Empty inputs: is for recipes that are pure functions of their config; recipes depending on remote/time-varying state need a synthetic input file the author touches, or --build-force on a schedule https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

- plan/102: align task and AC with the path-shape rules. Earlier rounds dropped the "** at position 0" restriction in the design section but the task and AC still repeated the old rule. Now all three say full doublestar syntax (including leading **/) is accepted, matching docs/reference/globs.md - plan/115: clarify --build-dry-run scope. Plan 104 says dry-run lists hooks alongside recipes; plan 115 introduces the flag and now states it enumerates targets (and hooks once plan 104 lands) so the two plans don't conflict - plan/115: table auto-formatted by mdsmith fix to fit the new --build-dry-run row width https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

- plan/115: drop the stale O_EXCL claim from the plan 117 hardening summary. Plan 117 specifies os.MkdirTemp + Lstat/symlink checks, not O_EXCL. The summary now just says "hardened staging dir" - plan/116: the overlap check happens at target-graph load (after collecting every <?build?> directive across the project), not at config load (parsing .mdsmith.yml). Three call sites in tasks and AC said "config load"; aligned all three with the design section - docs/research/build-orchestrator/go-only.md: --build-recipe filters by recipe, not by individual target. The side door's smallest unit of invocation is therefore one recipe at a time. A per-target selector would need a path-based selector mdsmith does not yet expose https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

- plan/116: clarify log retention and orphan cleanup. Each cache entry (plan 103) carries an action-id, and the log file lives at build-logs/<action-id>.log. Orphan cleanup removes any log whose id is not the action-id of any cache entry — not "any log without a cache entry" as the prior wording could be misread (plan 103's cache is keyed by sorted output-set, not action-id, so the link needs to be stated explicitly) - plan/117: change "honour" to "honor" to match the project's American-English convention (rest of the repo uses "honor", "honors", "honoring") https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

- plan/103: align the "Pattern borrowed from cmd/go/internal/cache" section with the actual cache-file design. The earlier wording said the cache is a flat ActionID -> outputs map; the design section keys entries by sorted outputs set with the ActionID stored inside each entry. The pattern-borrowed section now states this explicitly (mdsmith borrows the content-addressed input model, not the lookup key) - plan/104: fix subject/verb agreement. "{param} tokens with no matching params entry is a config error" -> "A {param} token with no matching params entry is a config error" - plan/116: drop the "rule rules out" repetition. Now reads "Plan 103's overlap rule eliminates ambiguity at target-graph load." https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

…ordering - plan/103: ActionID paths are project-root-relative, not absolute. Symlinks resolved, separators normalized via filepath.ToSlash. Plan 115's Target carries the absolute Root separately. The ActionID is therefore stable across clone locations and platforms (otherwise cache hits would depend on the checkout directory). Field listing updated: "sorted resolved inputs" -> "sorted relative inputs". - plan/115: Target.Inputs/Outputs change from "resolved absolute paths" to "project-root-relative, slash-normalized". Add Root field (absolute project root); the builder recomposes absolute paths via filepath.Join(target.Root, p) at exec time. - plan/104: document --build-only interaction with the execution order. --no-build skips the entire build pass (and both hook lists). --build-only skips step 1 (lint-fix) but still runs steps 2-4 — hooks bracket the recipe pass either way. https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

…tstrap - plan/103: define the on-disk action-id encoding explicitly. The length-framed sha256 is serialized as "sha256-<64 lowercase hex>". This is what gets used as cache key, log filename (<action-id>.log), and wire form. The encoding contains no path-unsafe characters, so Windows / colon / slash issues are ruled out - plan/117: clarify the staging-root bootstrap. On a first run .mdsmith/build-staging/ does not exist; step 1 now creates it via os.MkdirAll with mode 0o700 if absent, and only refuses if it exists as a symlink or non-directory. ENOENT no longer aborts the first build pass https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

- plan/103: disambiguate cache key vs metadata. The JSON map is keyed by the sorted outputs set; the action-id is stored as entry metadata (used as log filename and on the wire), NOT as the map key. Round 11's "used as cache key" wording was wrong - plan/116: log paths now consistently use the .mdsmith/ prefix. Two call sites (the retention paragraph and task 8) had dropped the prefix and could be misread as repo-root-relative - plan/117: umask filters the mode passed to os.MkdirAll, so the resulting permissions on .mdsmith/build-staging/ may not be 0o700. Step 1 now says mdsmith creates the directory and *Chmods* it to 0o700 explicitly, which is what guarantees the strict perms the hardening relies on https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

… sort - PLAN.md: catalog now sorts numeric:id so plan IDs collate numerically (52 before 100, not lexicographically); regenerated - plan/117: unify the hermetic env model. Cmd.Env = PATH + build.exec.env-pass-through, with env-pass-through as the single inheritance mechanism (default [HOME, LANG, LC_ALL], replaces not appends). Removes the contradiction between the always-included list and the schema example; example now adds SOURCE_DATE_EPOCH - plan/116: failure-diagnostic example now shows absolute paths. Cmd.Dir is the staging dir (plan 117), so {inputs}/{outputs} expand to absolute (project-rooted source, staging output); project-relative argv would not resolve. source: line stays relative for readability - plan/116: "config load" -> "target-graph load" in the overlap integration-test bullet (overlap detection scans directives, so it cannot happen at pure config parse) - plan/115: base-url AC corrected. yamlutil.UnmarshalSafe is non-strict, so a removed struct field is silently ignored, not an "unknown field" error. AC + task now require an explicit top-level-key scan that detects lingering build.base-url and errors https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

- plan/117: specify the cross-platform atomic-replace contract. os.Rename on Windows does not silently replace via plain rename; the plan now states POSIX rename(2) for in-place replace and Windows MoveFileEx(MOVEFILE_REPLACE_EXISTING) (which Go's os.Rename wraps), atomic for same-volume files. The Lstat symlink-refusal check runs on both - plan/103: tighten the ActionID symlink-resolution spec. The hashed path is always the project-root-relative, slash-normalized string; symlink resolution feeds the safety check only and never alters it. Inputs run EvalSymlinks for root-escape protection before hashing; outputs resolve only the longest existing prefix (plan 102) since they need not exist yet https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

 ---
 id: 102
-title: Builder interface and mdsmith build subcommand
+title: Multi-output `<?build?>` directive
 status: "🔲"
 summary: >-


 ---
 id: 103
 title: Build target staleness and dependency tracking
 status: "🔲"
 summary: >-


 ---
 id: 104
 title: Build lifecycle hooks (before/after)
 status: "🔲"
 summary: >-


+  Make the build pass debuggable. Capture
+  per-recipe stdout/stderr to per-target log
+  files (ninja-style buffer-and-dump default;
+  `--build-stream` for live tailing). Print
+  rich failure diagnostics. Add
+  `--build-explain TARGET` to print the
+  ActionID inputs. Add `--build-verify`
+  (run-twice diff) for non-determinism
+  detection. Add `--build-jobs N` for
+  parallel execution.
+model: opus
+---
+# Build execution UX (stdout/stderr, debug, parallel)
+
+## Goal
+
+Make the build pass debuggable. Capture
+every recipe's stdout and stderr. Persist
+the streams under `.mdsmith/build-logs/`.
+Print actionable failure messages. Add
+helpers for staleness explanation, non-
+determinism detection, and parallel
+execution.
+
+## Context
+
+Plan 115 dispatches recipes and prints
+`OK | FAIL`. Plan 117 hardens execution.
+Neither helps debug a hung recipe or
+explain a freshness verdict. Five gotchas
+drive this plan:


+  Layer security on top of plan 115's basic
+  builder execution. Trust gate so a freshly
+  cloned repo cannot run recipes silently.
+  Hermetic env (allowlisted PATH and env
+  pass-through). Atomic-write hardening
+  (random-suffix staging, refuse if the
+  `.mdsmith/build-staging/` root is a
+  symlink or world-writable, symlink-safe
+  rename). Output
+  post-conditions: every declared output
+  must exist; no undeclared write may slip
+  out. Process-group kill on timeout.
+model: opus
+---
+# Build execution hardening
+
+## Goal
+
+Make the build pass safe on an untrusted
+repo. Plan 115 wires the recipe through
+`os/exec`. Plan 117 adds the defenses that
+stop a hostile recipe or config from
+escaping its declared inputs/outputs,
+leaking child processes, or writing where
+it should not.
+
+## Context
+
+The threat model treats both `.mdsmith.yml`
+and `<?build?>` directives as untrusted.
+Plan 115's wiring assumes trusted input;


Copilot AI review requested due to automatic review settings May 4, 2026 06:01

Copilot started reviewing on behalf of jeduden May 4, 2026 06:02 View session

jeduden force-pushed the claude/rework-build-architecture-BfyKY branch from f075381 to 008b7f9 Compare May 4, 2026 06:04