diff --git a/.claude/skills/lading-optimize-review/SKILL.md b/.claude/skills/lading-optimize-review/SKILL.md index d9dcdee56..f5e92998b 100644 --- a/.claude/skills/lading-optimize-review/SKILL.md +++ b/.claude/skills/lading-optimize-review/SKILL.md @@ -1,16 +1,18 @@ --- name: lading-optimize-review description: Reviews optimization patches for lading using a 5-persona peer review system. Requires unanimous approval backed by benchmarks. Bugs discovered during review are valuable - invoke /lading-optimize-validate to validate them. -allowed-tools: Bash(cat:*) Bash(ci/*:*) Read Write Edit Glob Grep Skill +allowed-tools: Bash(cat:*) Bash(sample:*) Bash(samply:*) Bash(cargo:*) Bash(ci/*:*) Bash(hyperfine:*) Bash(*/payloadtool:*) Bash(tee:*) Read Write Edit Glob Grep Skill --- # Optimization Patch Review A rigorous 5-persona peer review system for optimization patches in lading. Requires unanimous approval backed by concrete benchmark data. **Duplicate Hunter persona prevents redundant work.** -## CRITICAL: Recording is MANDATORY +## Role: Judge -**EVERY review outcome MUST be recorded in db.yaml.** +**Review is the decision-maker. It does NOT record results.** + +Review judges using benchmarks and 5-persona review, then returns a structured report. ## Valuable Outcomes @@ -24,62 +26,43 @@ A rigorous 5-persona peer review system for optimization patches in lading. Requ --- -## Phase 0: Pre-flight - -Run `/lading-preflight` first. 
Then check for duplicate work: - -```bash -# Check previous reviews -cat .claude/skills/lading-optimize-review/assets/db.yaml +## Phase 1: Benchmark Execution -# Check previous hunts -cat .claude/skills/lading-optimize-hunt/assets/db.yaml -``` +### Step 1: Read Baseline Data -**Check for:** -- Same optimization already reviewed -> REJECT as "DUPLICATE" -- Same file + technique already approved -> REJECT as "DUPLICATE" +Read the baseline benchmark files captured by the caller: ---- +- `/tmp/criterion-baseline.log` — micro-benchmark baseline +- `/tmp/baseline.json` — macro-benchmark timing baseline +- `/tmp/baseline-mem.txt` — macro-benchmark memory baseline -## Phase 1: Validation Gate +**If baseline data is missing -> REJECT. Baselines must be captured before any code change and before this skill is invoked.** -**MANDATORY: ci/validate must pass before any review proceeds.** ```bash -ci/validate +cargo criterion 2>&1 | tee /tmp/criterion-optimized.log ``` -**If ci/validate fails -> REJECT immediately. No exceptions.** - --- +### Step 2: Run Post-Change Micro-benchmarks -## Phase 2: Measurement +Note: Criterion automatically compares against the last run and reports percentage changes. -**Review existing benchmark data.** The optimization should already have benchmark results. If missing, request them or REJECT. +Compare results — look for "change:" lines showing improvement/regression. -**The optimization being reviewed MUST provide benchmark data.** Review should focus on analyzing existing data, not generating new benchmarks. 
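The percent-change and 5% time-threshold arithmetic applied to the hyperfine exports (`/tmp/baseline.json` vs the post-change export) can be sketched as follows — a minimal Python illustration, not part of the skill's tooling. It assumes hyperfine's JSON export shape (`results[0].mean`, in seconds); the inline dicts and their numbers are hypothetical stand-ins for the real files:

```python
def pct_change(baseline_mean: float, optimized_mean: float) -> float:
    """Percent change from baseline to optimized; negative means faster."""
    return (optimized_mean - baseline_mean) / baseline_mean * 100.0

def meets_time_threshold(baseline: dict, optimized: dict, threshold: float = -5.0) -> bool:
    """Apply the >=5% time-improvement rule to two hyperfine JSON exports."""
    return pct_change(baseline["results"][0]["mean"],
                      optimized["results"][0]["mean"]) <= threshold

# Inline stand-ins for the two hyperfine JSON files (hypothetical values):
baseline = {"results": [{"mean": 0.0083}]}   # 8.3 ms
optimized = {"results": [{"mean": 0.0071}]}  # 7.1 ms

print(round(pct_change(0.0083, 0.0071), 1))       # -14.5 (% time change)
print(meets_time_threshold(baseline, optimized))  # True: clears the 5% bar
```

A 1% improvement would return `False` here, which is exactly the "no improvement but correct output" REJECT case later in this skill.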
+### Step 3: Run Post-Change Macro-benchmarks -### Expected Benchmark Data +Use the config provided by the caller (stored in `$PAYLOADTOOL_CONFIG`): -The optimization should include: - -**Micro-benchmarks (criterion):** -- Benchmark name and throughput change -- Example: `syslog_100MiB: +42.0% throughput (481 -> 683 MiB/s)` - -**Macro-benchmarks (payloadtool with hyperfine):** -- Time change with absolute values -- Memory change with absolute values -- Allocation count change with absolute values -- Example: - ``` - Time: -14.5% (8.3 ms -> 7.1 ms) - Memory: -35.8% (6.17 MiB -> 3.96 MiB) - Allocations: -49.3% (67,688 -> 34,331) - ``` +```bash +cargo build --release --bin payloadtool +hyperfine --warmup 3 --runs 30 --export-json /tmp/optimized.json \ + "./target/release/payloadtool $PAYLOADTOOL_CONFIG" +./target/release/payloadtool "$PAYLOADTOOL_CONFIG" --memory-stats 2>&1 | tee /tmp/optimized-mem.txt +``` ### Statistical Requirements + - Minimum 30 runs for hyperfine (`--runs 30`) - Criterion handles statistical significance internally - Time improvement >= 5% for significance @@ -88,31 +71,33 @@ The optimization should include: ### NO EXCEPTIONS -**If benchmark data is missing -> REJECT. Period.** - -- "Test dependencies don't work" -> REJECT. Fix deps first. +- "Test dependencies don't work" -> REJECT. Fix dependencies first. - "Theoretically better" -> REJECT. Prove it with numbers. - "Obviously an improvement" -> REJECT. Obvious is not measured. - "Will benchmark later" -> REJECT. Benchmark now. --- -## Phase 3: Five-Persona Review +## Phase 2: Five-Persona Review ### 1. Duplicate Hunter (Checks for Redundant Work) -- [ ] Optimization not already in reviews.yaml +- [ ] Optimization not already in db.yaml - [ ] File + technique combo not already approved - [ ] No substantially similar optimization exists - [ ] If duplicate found -> REJECT with "DUPLICATE: see " ### 2. 
Skeptic (Demands Proof) +- [ ] Baseline data verified present +- [ ] Post-change benchmarks executed with identical methodology - [ ] Hot path verified via profiling (not just guessed) -- [ ] Benchmark improvement meets thresholds (5% time, 10% mem, 20% allocs) +- [ ] Micro threshold met (>=5% time) +- [ ] Macro threshold met (>=5% time OR >=10% mem OR >=20% allocs) - [ ] Statistical significance confirmed (p<0.05 or criterion "faster") - [ ] Improvement is real, not measurement noise +- [ ] Benchmark methodology sound (same config, same machine) ### 3. Conservative (Guards Correctness) -- [ ] `ci/validate` passes completely +- [ ] Run `ci/validate` and confirm that it passes completely - [ ] No semantic changes to output - [ ] **Determinism preserved** (same seed -> same output) - [ ] No `.unwrap()` or `.expect()` added (lading MUST NOT panic) @@ -136,7 +121,7 @@ The optimization should include: --- -## Phase 4: Kani/Property Test Check +## Phase 3: Kani/Property Test Check If the optimization touches critical code: @@ -162,14 +147,26 @@ ci/kani lading_payload --- -## Phase 5: Decision +## Phase 4: Decision | Outcome | Votes | Action | |---------|-------|--------| -| **APPROVED** | 5/5 APPROVE | Record success | -| **REJECTED** | Any REJECT | Record lesson | -| **DUPLICATE** | Duplicate Hunter REJECT | Record as DUPLICATE | -| **BUG FOUND** | Correctness issue | Invoke `/lading-optimize-validate` | +| **APPROVED** | 5/5 APPROVE | Return APPROVED report | +| **REJECTED** | Any REJECT | Return REJECTED report | +| **DUPLICATE** | Duplicate Hunter REJECT | Return DUPLICATE report | +| **BUG FOUND** | Correctness issue | Invoke `/lading-optimize-validate`, return BUG_FOUND report | + +### Bug Triage + +When issues are discovered, determine the correct action: + +| Signal | Is It a Bug? 
| Action | +| --------------------------------------- | ------------ | ---------------------------------- | +| ci/validate fails after optimization | Possible bug | Invoke `/lading-optimize-validate` | +| Determinism broken | Possible bug | Invoke `/lading-optimize-validate` | +| Conservative finds correctness issue | Possible bug | Invoke `/lading-optimize-validate` | +| Benchmark regression but correct output | NOT a bug | REJECT optimization | +| No improvement but correct output | NOT a bug | REJECT optimization | ### When Bug Is Found @@ -180,101 +177,31 @@ If review discovers a bug instead of an optimization: ``` The validate skill will: -1. Attempt Kani proof (if feasible) -2. Create property test reproducing the bug -3. Verify the fix works -4. Record in its db.yaml +1. Confirm this is actually a bug (not just a failed optimization) +2. Attempt Kani proof (if feasible) +3. Create property test reproducing the bug +4. Verify the fix works +5. Record in its db.yaml -Then return here to record the finding as BUG_FOUND in Phase 6. +Then return here and produce a BUG_FOUND report in Phase 5. --- -## Phase 6: MANDATORY Recording - -### Update db.yaml - -1. Add entry to `assets/db.yaml` index -2. Create detailed file in `assets/db/` directory +## Phase 5: Return Report -**assets/db.yaml entry:** -```yaml -entries: - - id: - verdict: - file: assets/db/.yaml -``` - -**assets/db/.yaml** for APPROVED: -```yaml -id: -target: -technique: -verdict: approved -date: -votes: - duplicate_hunter: approve - skeptic: approve - conservative: approve - rust_expert: approve - greybeard: approve -measurements: - time: <-X%> - memory: <-X%> - allocations: <-X%> -reason: | - -lessons: | - -``` +**Review does NOT record results and does NOT create files. 
Return a structured YAML report to the caller.** -**assets/db/.yaml** for REJECTED: -```yaml -id: -target: -technique: -verdict: rejected -date: -votes: - duplicate_hunter: - skeptic: - conservative: - rust_expert: - greybeard: -reason: | - -lessons: | - -``` +Fill in the appropriate template and return the completed YAML: -**assets/db/.yaml** for DUPLICATE: -```yaml -id: -target: -technique: -verdict: duplicate -date: -duplicate_of: -reason: | - -``` +| Verdict | Template | +| --------- | -------------------------------- | +| approved | `assets/approved.template.yaml` | +| rejected | `assets/rejected.template.yaml` | +| duplicate | `assets/duplicate.template.yaml` | +| bug_found | `assets/bug-found.template.yaml` | -**assets/db/.yaml** for BUG_FOUND: -```yaml -id: -target: -verdict: bug_found -date: -validation_file: -reason: | - -lessons: | - -``` +1. Read the appropriate template from the `assets/` directory +2. Fill in all placeholder values with actual data from the review +3. Return the filled-in report --- - -## Usage - -``` -/lading-optimize-review -``` diff --git a/.claude/skills/lading-optimize-review/assets/approved.template.yaml b/.claude/skills/lading-optimize-review/assets/approved.template.yaml new file mode 100644 index 000000000..443159e2b --- /dev/null +++ b/.claude/skills/lading-optimize-review/assets/approved.template.yaml @@ -0,0 +1,30 @@ +id: +target: +technique: +date: +status: success +verdict: approved +votes: + duplicate_hunter: approve + skeptic: approve + conservative: approve + rust_expert: approve + greybeard: approve +measurements: + benchmarks: + micro: + # Flat keys: {name}_{size}_{metric} + # With raw values: {name}_{size}_{metric}: -> () + # Without raw values: {name}_{size}_{metric}: + macro: + # Required: time, memory, allocations + # Optional: peak_live + # With raw values: : -> () + # Without raw values: : + time: + memory: + allocations: +reason: | + +lessons: | + diff --git 
a/.claude/skills/lading-optimize-review/assets/bug-found.template.yaml b/.claude/skills/lading-optimize-review/assets/bug-found.template.yaml new file mode 100644 index 000000000..6212082e7 --- /dev/null +++ b/.claude/skills/lading-optimize-review/assets/bug-found.template.yaml @@ -0,0 +1,8 @@ +id: +target: +technique: +date: +status: failure +verdict: bug_found +reason: | + diff --git a/.claude/skills/lading-optimize-review/assets/db.yaml b/.claude/skills/lading-optimize-review/assets/db.yaml deleted file mode 100644 index c4b6256e3..000000000 --- a/.claude/skills/lading-optimize-review/assets/db.yaml +++ /dev/null @@ -1,7 +0,0 @@ -entries: - - id: fluent-on-demand-serialization - verdict: approved - file: assets/db/opt-fluent-on-demand-serialization.yaml - - id: payload-cache-inline - verdict: approved - file: assets/db/opt-payload-cache-inline.yaml diff --git a/.claude/skills/lading-optimize-review/assets/db/opt-fluent-on-demand-serialization.yaml b/.claude/skills/lading-optimize-review/assets/db/opt-fluent-on-demand-serialization.yaml deleted file mode 100644 index ae337afdc..000000000 --- a/.claude/skills/lading-optimize-review/assets/db/opt-fluent-on-demand-serialization.yaml +++ /dev/null @@ -1,107 +0,0 @@ -id: fluent-on-demand-serialization -verdict: approved -date: 2026-01-22 -target: lading_payload/src/fluent.rs::to_bytes -technique: on-demand-serialization - -votes: - duplicate_hunter: approve - skeptic: approve - conservative: approve - rust_expert: approve - greybeard: approve - -measurements: - time: -82.6% - memory: -90.7% - allocations: -82.6% - peak_live: -53.6% - - micro_benchmarks: - fluent_1MiB: +132% throughput (34.27 -> 79.65 MiB/s) - fluent_10MiB: +21% throughput (97.57 -> 117.97 MiB/s) - fluent_100MiB: +3% throughput (118.68 -> 122.22 MiB/s) - - macro_benchmarks: - time_baseline: 2347 ms - time_optimized: 409.8 ms - allocations_baseline: 3,622,824 - allocations_optimized: 629,596 - memory_baseline: 1.499 GiB - memory_optimized: 143.1 MiB - 
-reason: | - Unanimous approval from all five personas. - - Duplicate Hunter: No duplicate work detected. This is a unique optimization of fluent.rs - using on-demand serialization technique. - - Skeptic: Outstanding benchmark data exceeding all thresholds by wide margins: - - Time improvement: -82.6% (threshold: 5%) - 16x better than required - - Memory improvement: -90.7% (threshold: 10%) - 9x better than required - - Allocation reduction: -82.6% (threshold: 20%) - 4x better than required - Micro-benchmarks show 132% throughput improvement at 1 MiB. Statistical significance - confirmed with Criterion (100 samples) and hyperfine (10 runs). - - Conservative: All correctness requirements met: - - ci/validate passes completely (60/60 tests) - - No semantic changes to output format - - Determinism preserved (same seed produces same output) - - No unwrap/expect added (uses ? for error propagation) - - Property test payload_not_exceed_max_bytes validates core invariant - - Rust Expert: Excellent Rust patterns demonstrated: - - Use statements properly at file top - - Pre-computation in initialization (buffer with 4096 capacity) - - Proper borrow checker handling with std::mem::take idiom - - No unnecessary allocations in hot path - - Eliminated Vec> pattern completely - - Greybeard: Simplicity achieved: - - Actually simpler than original (one loop vs two-loop binary search) - - Clear comments explain non-obvious borrow checker pattern - - Minimal scope (41 lines removed, 35 added) - - Classic streaming serialization pattern ("obviously fast") - - Complexity fully justified by 90.7% memory reduction - -validation: - ci_validate: passed - tests_passed: 60/60 - cargo_fmt: passed - cargo_clippy: passed - shellcheck: passed - kani: not_applicable (lading_payload has no Kani proofs, property tests sufficient) - property_tests: payload_not_exceed_max_bytes validates core invariant - -lessons: | - This is a textbook example of a successful optimization: - - 1. 
**Clear hot path identification**: Vec> pre-serialization was obvious bottleneck - - 2. **Classic optimization pattern**: Stream serialization instead of batch-then-serialize - - 3. **Overwhelming performance gains**: - - 5.7x speedup (2347ms -> 409ms) - - 10x memory reduction (1.5 GiB -> 143 MiB) - - Eliminated 3 million allocations per run - - 4. **No correctness compromises**: All tests pass, determinism preserved - - 5. **Proper Rust idioms**: std::mem::take for borrow checker, ? for errors - - 6. **Actually simpler**: Replaced complex two-loop algorithm with single streaming loop - - Pattern to replicate: - - Identify Vec> pre-collection patterns - - Replace with on-demand processing + reusable buffer - - Use std::mem::take when buffer and data source share ownership - - Stream output incrementally rather than batch processing - - This optimization should be considered the gold standard for lading payload optimizations. - It demonstrates that the biggest wins come from eliminating algorithmic inefficiencies - (batch processing) rather than micro-optimizing hot loops. 
- - Future work: - - Survey other payload formats for similar Vec> patterns - - Consider if buffer size (4096) is optimal or should be tuned per format - - Document this pattern in contribution guidelines for future optimizers diff --git a/.claude/skills/lading-optimize-review/assets/db/opt-payload-cache-inline.yaml b/.claude/skills/lading-optimize-review/assets/db/opt-payload-cache-inline.yaml deleted file mode 100644 index f744a7b14..000000000 --- a/.claude/skills/lading-optimize-review/assets/db/opt-payload-cache-inline.yaml +++ /dev/null @@ -1,47 +0,0 @@ -id: payload-cache-inline -verdict: approved -date: 2026-01-20 -target: lading_payload/src/block.rs:Cache methods -technique: inline -votes: - duplicate_hunter: approve - skeptic: approve - conservative: approve - rust_expert: approve - greybeard: approve -measurements: - time: -7.2% - memory: ~0% - allocations: ~0% -reason: | - Added #[inline] attributes to five Cache methods (handle, total_size, peek_next_size, - peek_next_metadata, advance) to enable cross-crate inlining. These are small, hot-path - methods called 23+ times across 11 generator files. - - Without #[inline], Rust cannot inline public methods across crate boundaries, forcing - function call overhead on every generator iteration. The advance() method is particularly - critical, called for every block transmitted. - - All five personas approved unanimously: - - Duplicate Hunter: Confirmed new optimization, no duplicates - - Skeptic: 7.2% time improvement with statistical significance (6.914ms → 6.413ms) - - Conservative: ci/validate passed, determinism preserved, no semantic changes - - Rust Expert: Standard Rust performance practice, follows lading patterns - - Greybeard: Minimal change (5 one-line additions), justified by 7.2% improvement - - Kani proofs ran successfully (no harnesses exist, which is acceptable). -lessons: | - Cross-crate inlining in Rust requires explicit #[inline] attributes on public methods. 
- The compiler cannot inline across crate boundaries without this hint, even for trivial - methods. This optimization pattern is particularly valuable for: - - 1. Small accessor methods (1-3 lines) - 2. Methods called in hot loops - 3. Public APIs in library crates consumed by application crates - - The 7.2% improvement came from eliminating just 5 function call overheads, demonstrating - that cross-crate call overhead is non-trivial. The 58% reduction in variance suggests - more predictable performance. - - Key principle: When optimizing library crates, always consider cross-crate call patterns, - not just within-crate optimizations. diff --git a/.claude/skills/lading-optimize-review/assets/duplicate.template.yaml b/.claude/skills/lading-optimize-review/assets/duplicate.template.yaml new file mode 100644 index 000000000..4da4dd981 --- /dev/null +++ b/.claude/skills/lading-optimize-review/assets/duplicate.template.yaml @@ -0,0 +1,8 @@ +id: +target: +technique: +date: +status: failure +verdict: duplicate +reason: | + diff --git a/.claude/skills/lading-optimize-review/assets/rejected.template.yaml b/.claude/skills/lading-optimize-review/assets/rejected.template.yaml new file mode 100644 index 000000000..abd39eb38 --- /dev/null +++ b/.claude/skills/lading-optimize-review/assets/rejected.template.yaml @@ -0,0 +1,30 @@ +id: +target: +technique: +date: +status: failure +verdict: rejected +votes: + duplicate_hunter: approve|reject + skeptic: approve|reject + conservative: approve|reject + rust_expert: approve|reject + greybeard: approve|reject +measurements: + benchmarks: + micro: + # Flat keys: {name}_{size}_{metric} + # With raw values: {name}_{size}_{metric}: -> () + # Without raw values: {name}_{size}_{metric}: + macro: + # Required: time, memory, allocations + # Optional: peak_live + # With raw values: : -> () + # Without raw values: : + time: + memory: + allocations: +reason: | + +lessons: | +
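
For concreteness, the flat-key comment conventions in the `measurements.benchmarks` sections of the templates above might be filled in as follows — all names and figures are hypothetical, echoing the example numbers used earlier in this skill:

```yaml
measurements:
  benchmarks:
    micro:
      syslog_100MiB_throughput: +42.0% (481 -> 683 MiB/s)
    macro:
      time: -14.5% (8.3 ms -> 7.1 ms)
      memory: -35.8% (6.17 MiB -> 3.96 MiB)
      allocations: -49.3% (67,688 -> 34,331)
```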