Skip to content

Restructure build plans: integrate builder into fix, add hardening#235

Open
jeduden wants to merge 20 commits into
mainfrom
claude/rework-build-architecture-BfyKY
Open

Restructure build plans: integrate builder into fix, add hardening#235
jeduden wants to merge 20 commits into
mainfrom
claude/rework-build-architecture-BfyKY

Conversation

@jeduden
Copy link
Copy Markdown
Owner

@jeduden jeduden commented May 4, 2026

Summary

This PR restructures the mdsmith build system plans to integrate recipe execution into mdsmith fix rather than a separate mdsmith build subcommand, and reorganizes the security and UX layers. The changes reflect a shift from the original plan 102 architecture to a more integrated approach with explicit hardening and debugging support.

Key Changes

  • Plan 102 refocus: Changed from "Builder interface and mdsmith build subcommand" to "Multi-output <?build?> directive". Now focuses on directive syntax (outputs: list, inputs: list, body template rendering per output) rather than the execution engine.

  • Plan 103 refinement: Updated staleness/dependency tracking to use ActionID content-hashing with length-framed fields, storing cache in .mdsmith/build-cache.json. The ActionID covers the recipe spec, sorted input content hashes, and sorted output paths; per-output content hashes are stored separately for tamper detection. Any overlap across two directives' outputs: paths (exact or directory-prefix) is a build error at config load.

  • Plan 104 updates: Adjusted lifecycle hooks to run within mdsmith fix build pass rather than a separate mdsmith build invocation. Simplified failure semantics and execution order.

  • New Plan 115: "Builder execution wired into mdsmith fix" — the core execution engine. Removes the separate mdsmith build subcommand and built-in recipe drivers (screenshot, vhs). Recipes run inside mdsmith fix after the lint-fix pass. Introduces Builder interface for custom recipes and atomic multi-output write semantics.

  • New Plan 116: "Build execution UX" — adds stdout/stderr capture to per-target log files, rich failure diagnostics, --build-explain for staleness debugging, --build-verify for non-determinism detection, and --build-jobs for parallel execution (no extra overlap check needed since plan 103 already enforces disjoint outputs at config load).

  • New Plan 117: "Build execution hardening" — security layer including trust gate (.mdsmith.yml.trust stores the full trusted .mdsmith.yml contents byte-for-byte so mdsmith trust can produce a real diff), hermetic execution environment with allowlisted PATH, atomic write hardening with os.MkdirTemp random-suffix staging, output post-conditions (all declared outputs must exist; no undeclared writes detected via pre/post snapshots that include sha256 of contents, mode, size, and mtime), and process-group kill on timeout.

  • Research spikes added: Two new research documents (go-only.md and language-relaxed.md) evaluating whether to outsource the build engine to external orchestrators (make, Bazel, Tup, etc.). Conclusion: keep the internal engine; add mdsmith targets --json side door for power users.

Notable Implementation Details

  • ActionID hashing: Uses length-framed concatenation of recipe command, params, input contents, and output paths to prevent path-collision attacks. Includes cache.version field for schema evolution. Per-output content hashes are stored next to the ActionID for tamper detection on subsequent runs.

  • Staging directory strategy: Per-recipe temp dirs under .mdsmith/build-staging/ created via os.MkdirTemp (random suffix; OS guarantees uniqueness). World-writable parent refusal on Unix.

  • Trust model: .mdsmith.yml.trust stores the full trusted .mdsmith.yml contents (not just a hash), so mdsmith trust can show a real diff when the config drifts. CI environments opt in via MDSMITH_TRUST_BUILD=1 env var.

  • Rename strategy: Per-output Lstat symlink check followed by os.Rename (atomic per-file replace on POSIX). Multi-output rename is not transactionally atomic across files; the plan documents the partial-failure semantics (log, exit FAIL, next fix reruns the recipe because no cache write happened).

  • Output overlap rule: Plan 103 rejects overlapping outputs: paths at config load (exact and directory-prefix collisions). This applies in serial and parallel mode; --build-jobs N does not need a separate check.

  • Log retention: Per-recipe logs stored under .mdsmith/build-logs/<action-id>.log and kept until the cache entry is invalidated, enabling post-hoc debugging.

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY

Copilot AI review requested due to automatic review settings May 4, 2026 06:01
@jeduden jeduden force-pushed the claude/rework-build-architecture-BfyKY branch from f075381 to 008b7f9 Compare May 4, 2026 06:04
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates mdsmith’s build-system planning docs to reflect a new architecture where <?build?> execution runs inside mdsmith fix, and adds new plans for execution UX and hardening plus supporting research spikes.

Changes:

  • Refocuses Plan 102 on the multi-output <?build?> directive (outputs:/inputs:) and related rule/doc updates.
  • Reworks Plans 103/104 around a mdsmith fix build pass (staleness, hooks, flags) and adds new Plans 115–117 for execution wiring, UX, and security hardening.
  • Adds research docs evaluating external build orchestrators (Go-only and language-relaxed).

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
plan/117_build-execution-hardening.md New plan defining trust gate, hermetic exec, atomic write hardening, and post-condition checks.
plan/116_build-execution-ux.md New plan for logs/diagnostics, --build-explain, --build-verify, and parallel execution UX.
plan/115_builder-execution-in-fix.md New plan wiring builder execution into mdsmith fix and defining core flags/behavior.
plan/104_build-lifecycle-hooks.md Updates lifecycle hooks plan to run within the fix build pass and adjusts semantics/flags.
plan/103_build-staleness-and-deps.md Updates staleness/cache design and flags to target the fix build pass.
plan/102_build-subcommand.md Refactors Plan 102 to describe the multi-output directive surface and validation rules.
docs/research/build-orchestrator/go-only.md New research spike on Go-only external orchestrators.
docs/research/build-orchestrator/language-relaxed.md New research spike allowing non-Go orchestrators; keeps recommendation internal.
PLAN.md Updates plan index table titles and adds entries for Plans 115–117.

Comment thread plan/104_build-lifecycle-hooks.md Outdated
Comment thread plan/117_build-execution-hardening.md Outdated
Comment thread plan/117_build-execution-hardening.md Outdated
Comment thread plan/117_build-execution-hardening.md Outdated
Comment thread plan/102_build-subcommand.md Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented May 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.27%. Comparing base (024dad8) to head (8fb9ce9).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
Components Coverage Δ
Go 96.23% <ø> (+<0.01%) ⬆️
TypeScript 99.35% <ø> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Comment thread plan/117_build-execution-hardening.md Outdated
Comment thread plan/117_build-execution-hardening.md
Comment thread plan/102_build-subcommand.md Outdated
Comment thread plan/103_build-staleness-and-deps.md Outdated
Comment thread plan/117_build-execution-hardening.md Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

Comment thread plan/102_build-subcommand.md Outdated
Comment thread plan/117_build-execution-hardening.md Outdated
Comment thread plan/117_build-execution-hardening.md Outdated
Comment thread plan/103_build-staleness-and-deps.md Outdated
Comment thread plan/102_build-subcommand.md
Comment thread plan/117_build-execution-hardening.md Outdated
Comment thread plan/116_build-execution-ux.md Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Comment thread plan/116_build-execution-ux.md Outdated
Comment thread plan/116_build-execution-ux.md Outdated
Comment thread plan/117_build-execution-hardening.md Outdated
Comment thread plan/102_build-subcommand.md Outdated
Comment thread plan/103_build-staleness-and-deps.md
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread plan/117_build-execution-hardening.md Outdated
Comment thread plan/102_build-subcommand.md Outdated
@jeduden jeduden requested a review from Copilot May 4, 2026 09:45
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Comment thread plan/116_build-execution-ux.md Outdated
Comment thread plan/104_build-lifecycle-hooks.md Outdated
Comment thread plan/102_build-subcommand.md Outdated
Comment thread plan/102_build-subcommand.md Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread plan/102_build-subcommand.md Outdated
Comment thread plan/115_builder-execution-in-fix.md Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Comment thread plan/115_builder-execution-in-fix.md Outdated
Comment thread plan/116_build-execution-ux.md Outdated
Comment thread docs/research/build-orchestrator/go-only.md Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread plan/116_build-execution-ux.md Outdated
Comment thread plan/117_build-execution-hardening.md Outdated
claude added 18 commits May 16, 2026 10:24
Drop built-in recipe drivers (screenshot via chromedp, vhs) and the
`mdsmith build` subcommand. Builds now run as part of `mdsmith fix`
with `--no-build` / `--build-only` opt-outs. Remove `build.base-url`
config field (its only consumer was the deleted screenshot driver).

`<?build?>` directive replaces `output:` (singular) with `outputs:`
(list), so one recipe invocation can produce many files. Adds
`inputs:` (list of paths or globs) for staleness tracking. The body
template renders once per output. New collective placeholders
`{outputs}` and `{inputs}` in recipe `command` strings expand to N
argv arguments.

Plan 102 narrows to the multi-output directive only. Plan 115 (new)
covers Builder execution wiring. Plan 103 adopts the
`cmd/go/internal/cache` ActionID pattern, hashing
`(recipe.command || sorted input contents || output set)` per
target. Plan 104 reframes hooks as part of the `fix` build pass.

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
A security review of plans 102/103/104/115 surfaced one critical
finding (no trust gate before recipe execution) plus four high
findings (atomic-write race, ActionID hash collision via path
separators, implicit cache trust, hook param values unvalidated).
A parallel review of build-system gotchas from Bazel/Gradle/Buck/
Ninja added five behavioral lessons: non-deterministic outputs
defeat caching, missing-output post-conditions are required,
stale-cache surprises drive `make clean` muscle memory, parallel
jobs collide on undeclared shared state, buffered stdout hides
hangs.

Findings folded into existing plans:

- 102 — path-shape rules: NUL/newline/whitespace/ADS/UNC/reserved
  device names rejected; symlinks resolved to project root; glob
  match cap of 10k.
- 103 — length-framed ActionID (no path-collision attacks); cache
  stores per-output content hashes for tamper detection; `built-at`
  documented as informational.
- 104 — hook param value baseline (no NUL/newline/length>4KB);
  build pass refuses to run when MDS040 errors against config.
- 115 — trimmed back to basic execution wiring; hardening lifted
  out to plan 117. Tokenization pinned to `strings.Fields`.

New plans:

- 116 — execution UX: per-recipe stdout/stderr capture with
  buffer-and-dump default plus `--build-stream` live mode,
  per-target log retention under `.mdsmith/build-logs/`, six-field
  failure block with last-20-lines tail, `--build-explain TARGET`,
  `--build-verify` (run-twice diff), `--build-jobs N`.
- 117 — execution hardening: `.mdsmith.yml.trust` direnv-style
  trust gate (with `MDSMITH_TRUST_BUILD=1` for CI), hermetic env
  (allowlisted PATH and `build.exec.env-pass-through`), atomic
  write with `O_EXCL` staging and `RENAME_NOREPLACE`, output
  post-conditions (Bazel #14543: every declared output must
  exist; no undeclared write may slip), process-group SIGTERM
  then SIGKILL on `--build-timeout`.

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
Documents the spike that asked whether mdsmith should outsource its
build engine to an existing Go-based orchestrator (go-task, mage,
goyek, n2, please, etc.) instead of implementing the internal engine
in plans 102/103/115/116/117.

Verdict: stick with the internal engine. No Go-native orchestrator
matches mdsmith's threat model (no trust gate, no hermetic env, no
atomic multi-output writes, no output-confinement post-conditions).
Recommend adding one new subcommand `mdsmith targets --json` as a
side door for users already running cross-tool DAGs through go-task
or make.

A companion spike with the Go-language constraint relaxed will land
in a follow-up commit.

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
Companion to the Go-only spike. Asks: if mdsmith users can be
required to install a non-Go binary, what becomes possible?

Verdict: even with the language constraint dropped, the internal
engine remains the right call. The candidates split:

- Strongest correctness model: Shake (Haskell) — content-hash
  native, early cutoff, lint mode. Distribution requires GHC +
  Stack, hostile to the tech-writer audience.
- Strongest threat model: Tup. Its FUSE / PTRACE tracing
  literally implements output confinement and undeclared-write
  detection — exactly what plan 117 reimplements by hand.
  Distribution requires a kernel filesystem on every platform.
- Closest to plan 117 overall: Pants v2 (Python + Rust engine,
  sandboxed by default). ~150 MB install for a Markdown linter
  audience inverts the value ratio.

Buck2 looks attractive but its docs admit local builds are not
sandboxed without remote execution.

The spike validates several internal-engine design choices: plan
103's content-hash cache mirrors Shake / Bazel / Buck2 / Pants;
plan 117's output-confinement is Tup's headline feature without
the FUSE dependency; plan 116's --build-explain maps to Shake's
--lint and -VV.

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
Per follow-up request to consider three more candidates:

- ejholmes/walk (Go) — added to go-only.md. Interesting
  no-DSL design (Walkfile is an executable that responds to
  `deps` and `exec`) and parallel DAG execution via
  semaphore. But its plan.go does no staleness detection
  itself; the Walkfile script owns rebuild decisions.
  Adopting it would replace ~50 lines of errgroup with a
  third-party dep that hasn't released since Sep 2017
  (138 stars, dormant).
- GNU make — added to language-relaxed.md. Universal,
  pre-installed, but mtime-only with the same CI cache-
  restore failure mode as ninja, and no atomic multi-
  output. References Miller's "Recursive Make Considered
  Harmful" as the foundational caveat.
- redo — expanded in language-relaxed.md. Apenwarr's
  Python implementation is the active fork; DJB's
  original sketch cited; ports in C/C++/Go are mixed.
  Content-hash native via redo-stamp; integration would
  let mdsmith absorb most of plan 103.

Verdict unchanged in both docs.

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
- Remove unused link reference definitions from plans 103, 104, 116, 117
  (MDS053 was added in main after this branch diverged)
- Fix split inline code span across lines in plan/104 task 1
- Fix trust-gate design in plan/117: store full config content in
  .mdsmith.yml.trust instead of just sha256, so mdsmith trust can diff
- Fix inaccurate os.MkdirTemp description: remove O_CREAT|O_EXCL claim
  (those flags apply to files, not directories)
- Fix plan/102: remove "always stale" claim for inputs:[] that was
  inconsistent with plan/103's ActionID algorithm

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
…erlap

- plan/117: replace RENAME_NOREPLACE (which would fail every rebuild)
  with Lstat symlink check + os.Rename (atomic per-file replace);
  document multi-output rename is not transactionally atomic and
  specify partial-failure recovery semantics
- plan/117: strengthen undeclared-write snapshot to include sha256 of
  contents and mode bits, not just size+mtime (preserves catch even
  when a recipe modifies a file while keeping size/mtime stable);
  document the two known limits (parent-dir-only scope, single
  pre/post hash)
- plan/103: any overlap in declared outputs (exact or directory-prefix)
  is now a build error at config load, not just on --build-jobs > 1;
  closes "last writer wins" hole in serial mode
- plan/116: drop the now-redundant N>1 overlap check; config validation
  satisfies the parallel-execution safety contract
- plan/102: reword empty-inputs note to match plan/103's algorithm;
  ActionID covers recipe spec + sorted output paths (not output
  content hashes — those live in the separate per-output hash store
  that staleness step 5 uses for tamper detection)

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
…holders

- plan/102 (symlink check): EvalSymlinks fails on non-existent paths;
  spec now walks the longest existing prefix and uses Clean for the rest
  so missing outputs are not rejected
- plan/102 ({outputs}/{inputs} placeholders): require standalone argv
  tokens after strings.Fields tokenization; embedded use is an MDS040
  validation error (no defined semantics for fragment-list expansion)
- plan/117 (staging root): add Lstat directory check on
  .mdsmith/build-staging/ before MkdirTemp; symlink/non-dir is refused
  (closes the "attacker pre-replaced staging root" gap)
- plan/117 (rename strategy): drop RENAME_NOREPLACE from task and AC
  (it would fail every rebuild); align with the design section's
  Lstat + os.Rename approach
- plan/117 (symlink snapshot): record symlink target via os.Readlink
  in addition to Lstat metadata (Lstat alone returns no link target)
- plan/103 (overlap detection): tasks and AC now explicitly require
  directory-prefix overlap detection, matching the design section
- plan/116 (--build-no-cache logs): clarify that --build-no-cache
  still writes logs but never associates them with a cache entry;
  orphan logs are removed at the start of the next fix invocation

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
- plan/116 (--build-explain selector): pick one rule — match TARGET
  against each directive's first declared output (string equality
  after path normalization). Plan 103's overlap rule rules out
  ambiguity at config load
- plan/116 (--build-no-cache logs): reconcile two retention rules.
  --build-no-cache still writes logs but creates no cache entry;
  orphans are deleted at the start of the next fix invocation. Tasks
  and AC aligned
- plan/117 (world-writable check): align design, task, AC, and test
  on the staging *root* (.mdsmith/build-staging/) — not its parent —
  matching the design section
- plan/102 (path normalization): pick a single strategy. Build-time
  symlink check uses filepath.* internally on OS-native separators,
  then ToSlash for comparison and diagnostic output, preserving the
  slash-only invariant on Windows
- plan/103 (ActionID param canonicalization): add inner length-framing
  per key/value/path; key=value\0 was ambiguous when keys contained =
  or NUL. Two-layer length framing removes the ambiguity without
  needing an extra validation pass

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
…mode

- plan/117: PATH and Cmd.Dir are not filesystem confinement (a recipe
  can still write via absolute paths or ..). Reword as build
  determinism, not security; note that real confinement needs a
  sandbox
- plan/102: clarify how `output:` actually fails. MDS039 reports
  unknown params as warnings, not errors. The hard failure comes from
  `outputs:` being required, so a directive with only `output:` fails
  on the missing required field; the unknown-param warning is
  secondary

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
…nputs

- plan/116: rename "config validation" to "target-graph load" — the
  overlap check happens after every <?build?> directive across the
  project has been collected, not when parsing .mdsmith.yml alone
- plan/104: clarify the recipe-pass summary — plan 115 specifies
  OK | FAIL; SKIP arrives once plan 103's staleness layer lands
- plan/102: drop the special-case "** not allowed as first segment"
  rule for inputs: globs. Build inputs: now use full doublestar
  syntax (including leading **/) per docs/reference/globs.md, the
  same as every other config and directive surface
- plan/102: replace the misleading remote-URL example for empty
  inputs:. Empty inputs: is for recipes that are pure functions of
  their config; recipes depending on remote/time-varying state need
  a synthetic input file the author touches, or --build-force on a
  schedule

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
- plan/102: align task and AC with the path-shape rules. Earlier
  rounds dropped the "** at position 0" restriction in the design
  section but the task and AC still repeated the old rule. Now all
  three say full doublestar syntax (including leading **/) is
  accepted, matching docs/reference/globs.md
- plan/115: clarify --build-dry-run scope. Plan 104 says dry-run
  lists hooks alongside recipes; plan 115 introduces the flag and
  now states it enumerates targets (and hooks once plan 104 lands)
  so the two plans don't conflict
- plan/115: table auto-formatted by mdsmith fix to fit the new
  --build-dry-run row width

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
- plan/115: drop the stale O_EXCL claim from the plan 117 hardening
  summary. Plan 117 specifies os.MkdirTemp + Lstat/symlink checks,
  not O_EXCL. The summary now just says "hardened staging dir"
- plan/116: the overlap check happens at target-graph load (after
  collecting every <?build?> directive across the project), not at
  config load (parsing .mdsmith.yml). Three call sites in tasks and
  AC said "config load"; aligned all three with the design section
- docs/research/build-orchestrator/go-only.md: --build-recipe filters
  by recipe, not by individual target. The side door's smallest unit
  of invocation is therefore one recipe at a time. A per-target
  selector would need a path-based selector mdsmith does not yet
  expose

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
- plan/116: clarify log retention and orphan cleanup. Each cache
  entry (plan 103) carries an action-id, and the log file lives at
  build-logs/<action-id>.log. Orphan cleanup removes any log whose
  id is not the action-id of any cache entry — not "any log without
  a cache entry" as the prior wording could be misread (plan 103's
  cache is keyed by sorted output-set, not action-id, so the link
  needs to be stated explicitly)
- plan/117: change "honour" to "honor" to match the project's
  American-English convention (rest of the repo uses "honor",
  "honors", "honoring")

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
- plan/103: align the "Pattern borrowed from cmd/go/internal/cache"
  section with the actual cache-file design. The earlier wording
  said the cache is a flat ActionID -> outputs map; the design
  section keys entries by sorted outputs set with the ActionID
  stored inside each entry. The pattern-borrowed section now states
  this explicitly (mdsmith borrows the content-addressed input
  model, not the lookup key)
- plan/104: fix subject/verb agreement. "{param} tokens with no
  matching params entry is a config error" -> "A {param} token with
  no matching params entry is a config error"
- plan/116: drop the "rule rules out" repetition. Now reads
  "Plan 103's overlap rule eliminates ambiguity at target-graph
  load."

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
…ordering

- plan/103: ActionID paths are project-root-relative, not absolute.
  Symlinks resolved, separators normalized via filepath.ToSlash. Plan
  115's Target carries the absolute Root separately. The ActionID is
  therefore stable across clone locations and platforms (otherwise
  cache hits would depend on the checkout directory). Field listing
  updated: "sorted resolved inputs" -> "sorted relative inputs".
- plan/115: Target.Inputs/Outputs change from "resolved absolute
  paths" to "project-root-relative, slash-normalized". Add Root
  field (absolute project root); the builder recomposes absolute
  paths via filepath.Join(target.Root, p) at exec time.
- plan/104: document --build-only interaction with the execution
  order. --no-build skips the entire build pass (and both hook
  lists). --build-only skips step 1 (lint-fix) but still runs steps
  2-4 — hooks bracket the recipe pass either way.

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
…tstrap

- plan/103: define the on-disk action-id encoding explicitly. The
  length-framed sha256 is serialized as "sha256-<64 lowercase hex>".
  This is what gets used as cache key, log filename
  (<action-id>.log), and wire form. The encoding contains no
  path-unsafe characters, so Windows / colon / slash issues are
  ruled out
- plan/117: clarify the staging-root bootstrap. On a first run
  .mdsmith/build-staging/ does not exist; step 1 now creates it via
  os.MkdirAll with mode 0o700 if absent, and only refuses if it
  exists as a symlink or non-directory. ENOENT no longer aborts
  the first build pass

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
- plan/103: disambiguate cache key vs metadata. The JSON map is keyed
  by the sorted outputs set; the action-id is stored as entry
  metadata (used as log filename and on the wire), NOT as the map
  key. Round 11's "used as cache key" wording was wrong
- plan/116: log paths now consistently use the .mdsmith/ prefix.
  Two call sites (the retention paragraph and task 8) had dropped
  the prefix and could be misread as repo-root-relative
- plan/117: umask filters the mode passed to os.MkdirAll, so the
  resulting permissions on .mdsmith/build-staging/ may not be 0o700.
  Step 1 now says mdsmith creates the directory and *Chmods* it to
  0o700 explicitly, which is what guarantees the strict perms the
  hardening relies on

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
@jeduden jeduden force-pushed the claude/rework-build-architecture-BfyKY branch from 2d58423 to 3ee9daf Compare May 16, 2026 10:25
@jeduden jeduden requested a review from Copilot May 16, 2026 10:25
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread PLAN.md Outdated
Comment thread plan/115_builder-execution-in-fix.md Outdated
… sort

- PLAN.md: catalog now sorts numeric:id so plan IDs collate
  numerically (52 before 100, not lexicographically); regenerated
- plan/117: unify the hermetic env model. Cmd.Env = PATH +
  build.exec.env-pass-through, with env-pass-through as the single
  inheritance mechanism (default [HOME, LANG, LC_ALL], replaces not
  appends). Removes the contradiction between the always-included
  list and the schema example; example now adds SOURCE_DATE_EPOCH
- plan/116: failure-diagnostic example now shows absolute paths.
  Cmd.Dir is the staging dir (plan 117), so {inputs}/{outputs}
  expand to absolute (project-rooted source, staging output);
  project-relative argv would not resolve. source: line stays
  relative for readability
- plan/116: "config load" -> "target-graph load" in the overlap
  integration-test bullet (overlap detection scans directives, so
  it cannot happen at pure config parse)
- plan/115: base-url AC corrected. yamlutil.UnmarshalSafe is
  non-strict, so a removed struct field is silently ignored, not an
  "unknown field" error. AC + task now require an explicit
  top-level-key scan that detects lingering build.base-url and errors

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread plan/117_build-execution-hardening.md Outdated
Comment thread plan/103_build-staleness-and-deps.md Outdated
- plan/117: specify the cross-platform atomic-replace contract.
  os.Rename on Windows does not silently replace via plain rename;
  the plan now states POSIX rename(2) for in-place replace and
  Windows MoveFileEx(MOVEFILE_REPLACE_EXISTING) (which Go's
  os.Rename wraps), atomic for same-volume files. The Lstat
  symlink-refusal check runs on both
- plan/103: tighten the ActionID symlink-resolution spec. The
  hashed path is always the project-root-relative, slash-normalized
  string; symlink resolution feeds the safety check only and never
  alters it. Inputs run EvalSymlinks for root-escape protection
  before hashing; outputs resolve only the longest existing prefix
  (plan 102) since they need not exist yet

https://claude.ai/code/session_01VFEMANvrBU45Hduy8kkxSY
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

Comment on lines 1 to 5
---
id: 102
title: Builder interface and mdsmith build subcommand
title: Multi-output `<?build?>` directive
status: "🔲"
summary: >-
Comment on lines 1 to 5
---
id: 103
title: Build target staleness and dependency tracking
status: "🔲"
summary: >-
Comment on lines 1 to 5
---
id: 104
title: Build lifecycle hooks (before/after)
status: "🔲"
summary: >-
Comment on lines +6 to +36
Make the build pass debuggable. Capture
per-recipe stdout/stderr to per-target log
files (ninja-style buffer-and-dump default;
`--build-stream` for live tailing). Print
rich failure diagnostics. Add
`--build-explain TARGET` to print the
ActionID inputs. Add `--build-verify`
(run-twice diff) for non-determinism
detection. Add `--build-jobs N` for
parallel execution.
model: opus
---
# Build execution UX (stdout/stderr, debug, parallel)

## Goal

Make the build pass debuggable. Capture
every recipe's stdout and stderr. Persist
the streams under `.mdsmith/build-logs/`.
Print actionable failure messages. Add
helpers for staleness explanation, non-
determinism detection, and parallel
execution.

## Context

Plan 115 dispatches recipes and prints
`OK | FAIL`. Plan 117 hardens execution.
Neither helps debug a hung recipe or
explain a freshness verdict. Five gotchas
drive this plan:
Comment on lines +6 to +36
Layer security on top of plan 115's basic
builder execution. Trust gate so a freshly
cloned repo cannot run recipes silently.
Hermetic env (allowlisted PATH and env
pass-through). Atomic-write hardening
(random-suffix staging, refuse if the
`.mdsmith/build-staging/` root is a
symlink or world-writable, symlink-safe
rename). Output
post-conditions: every declared output
must exist; no undeclared write may slip
out. Process-group kill on timeout.
model: opus
---
# Build execution hardening

## Goal

Make the build pass safe on an untrusted
repo. Plan 115 wires the recipe through
`os/exec`. Plan 117 adds the defenses that
stop a hostile recipe or config from
escaping its declared inputs/outputs,
leaking child processes, or writing where
it should not.

## Context

The threat model treats both `.mdsmith.yml`
and `<?build?>` directives as untrusted.
Plan 115's wiring assumes trusted input;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants