diff --git a/.agents/skills/autoreview/SKILL.md b/.agents/skills/autoreview/SKILL.md
new file mode 100644
index 0000000..85d1939
--- /dev/null
+++ b/.agents/skills/autoreview/SKILL.md
@@ -0,0 +1,150 @@
+---
+name: autoreview
+description: "Autoreview closeout: local dirty changes, PR branch vs main, parallel tests."
+---
+
+# Autoreview
+
+Run Codex's built-in code review as a closeout check. This is code review (`codex review`), not Guardian `auto_review` approval routing.
+
+Codex native review mode performs best and is recommended. Non-Codex reviewers are fallback/second-opinion paths that receive a generated diff prompt, not the full Codex review-mode runtime.
+
+Use when:
+
+- user asks for Codex review / autoreview / second-model review
+- after non-trivial code edits, before final/commit/ship
+- reviewing a local branch or PR branch after fixes
+
+## Contract
+
+- Treat review output as advisory. Never blindly apply it.
+- Verify every finding by reading the real code path and adjacent files.
+- Read dependency docs/source/types when the finding depends on external behavior.
+- Reject unrealistic edge cases, speculative risks, broad rewrites, and fixes that over-complicate the codebase.
+- Prefer small fixes at the right ownership boundary; no refactor unless it clearly improves the bug class.
+- Keep going until the selected review path returns no accepted/actionable findings.
+- If a review-triggered fix changes code, rerun focused tests and rerun the review helper.
+- Default to Codex review with no fallback. Prefer Codex for final closeout because it uses native review mode; non-Codex reviewers use a Codex-inspired generated diff prompt. Use `--fallback-reviewer auto|claude|pi|opencode|droid|copilot` only when a second-model fallback is explicitly wanted and authenticated. The helper runs nested Codex review in yolo/full-access mode by default; use `--no-yolo` only when intentionally testing sandbox behavior.
+- Stop as soon as the review command/helper exits 0 with no accepted/actionable findings. Do not run an extra direct `codex review` just to get a nicer "clean" line, a second opinion, or clearer closeout wording.
+- Treat the helper's successful exit plus absence of actionable findings as the clean review result, even if the underlying Codex CLI output is terse.
+- If rejecting a finding as intentional/not worth fixing, add a brief inline code comment only when it explains a real invariant or ownership decision that future reviewers should know.
+- If creating or updating a PR while rejecting any autoreview finding, record the rejected finding and reason in the PR description so later reviewers can distinguish intentional design decisions from missed review output.
+- Do not push just to review. Push only when the user requested push/ship/PR update.
+- For OpenClaw maintainers, keep autoreview validation Crabbox/Testbox-aware when maintainer validation mode is enabled (`OPENCLAW_TESTBOX=1` or `AUTOREVIEW_OPENCLAW_MAINTAINER_VALIDATION=1`). A review pass may inspect files and run cheap non-Node probes, but it must not start local `pnpm`, Vitest, `tsgo`, `npm test`, or `node scripts/run-vitest.mjs` from a Codex/worktree review unless the operator explicitly requested local proof. For runtime proof, use existing evidence or route through Crabbox/Testbox and report the id. Do not apply this rule to ordinary contributors who do not have maintainer Testbox access.
+
+## Pick Target
+
+Dirty local work:
+
+```bash
+codex review --uncommitted
+```
+
+Use this only when the patch is actually unstaged/staged/untracked in the
+current checkout. For committed, pushed, or PR work, point Codex at the commit
+or branch diff instead; do not force `--mode local` / `--uncommitted` just
+because the helper docs mention dirty work first. A clean `--uncommitted` review
+only proves there is no local patch.
+
+Branch/PR work:
+
+```bash
+git fetch origin
+codex review --base origin/main
+```
+
+Do not pass any prompt with `--base`, `--commit`, or `--uncommitted`. Codex CLI
+review targets and custom review prompts are mutually exclusive: target modes
+generate their own review prompt internally. Use plain target review for native
+Codex closeout, or use custom prompt review (`codex review -`) only when you
+intentionally want a generated diff prompt instead of native target review.
+
+If an open PR exists, use its actual base:
+
+```bash
+base=$(gh pr view --json baseRefName --jq .baseRefName)
+codex review --base "origin/$base"
+```
+
+Committed single change:
+
+```bash
+codex review --commit HEAD
+```
+
+or with the helper:
+
+```bash
+.agents/skills/autoreview/scripts/autoreview --mode commit --commit HEAD
+```
+
+Use commit review for already-landed or already-pushed work on `main`. Reviewing
+clean `main` against `origin/main` is usually an empty diff after push. For a
+small stack, review each commit explicitly or review the branch before merging
+with `--base`.
+
+## Parallel Closeout
+
+Format first if formatting can change line locations. Then it is OK to run tests and review in parallel:
+
+```bash
+.agents/skills/autoreview/scripts/autoreview --parallel-tests "<focused test command>"
+```
+
+Tradeoff: tests may force code changes that stale the review. If tests or review lead to code edits, rerun the affected tests and rerun review until no accepted/actionable findings remain. Once that rerun exits cleanly, stop; do not spend another long review cycle on redundant confirmation.
+
+## Context Efficiency
+
+Codex review is usually noisy. Default to a subagent filter when subagents are available. Ask it to run the review and return only:
+
+- actionable findings it accepts
+- findings it rejects, with one-line reason
+- exact files/tests to rerun
+
+Run inline only for tiny changes or when subagents are unavailable.
+
+## Helper
+
+Bundled helper:
+
+```bash
+.agents/skills/autoreview/scripts/autoreview --help
+```
+
+The helper:
+
+- chooses dirty `--uncommitted` first
+- otherwise uses current PR base if `gh pr view` works
+- otherwise uses `origin/main` for non-main branches
+- auto-runs `PNPM_CONFIG_PM_ON_FAIL=ignore PNPM_CONFIG_VERIFY_DEPS_BEFORE_RUN=false PNPM_CONFIG_OFFLINE=true pnpm run check` in parallel when a repo has `package.json`, `pnpm-lock.yaml`, `node_modules`, and a `check` script; disable with `AUTOREVIEW_AUTO_TESTS=0`
+- use `--mode commit --commit <ref>` for already-committed work, especially clean `main` after landing
+- should be left in `--mode auto` or forced to `--mode branch` for PR/branch work; do not force `--mode local` after committing
+- supports `--reviewer codex|claude|pi|opencode|droid|copilot|auto`; `auto` means Codex first
+- supports `--fallback-reviewer auto|claude|pi|opencode|droid|copilot|none`; default is `none`
+- falls back only when Codex is unavailable or exits nonzero, not when Codex reports findings
+- writes only to stdout unless `--output` or `AUTOREVIEW_OUTPUT` is set
+- supports `--dry-run`, `--parallel-tests`, and commit refs
+- runs nested review with `--dangerously-bypass-approvals-and-sandbox --sandbox danger-full-access` by default
+- with `OPENCLAW_TESTBOX=1` or `AUTOREVIEW_OPENCLAW_MAINTAINER_VALIDATION=1`, disables auto local `pnpm run check` and routes Codex through generated prompt review (`codex review -`) so the no-local-heavy-tests policy is included; native Codex target review cannot accept extra prompt text
+- non-Codex reviewers receive the generated diff prompt and maintainer validation policy text when maintainer validation is active
+- keeps accepting `--full-access`; use `--no-yolo` or `AUTOREVIEW_YOLO=0` to opt out
+- still accepts legacy `CODEX_REVIEW_*` env vars when the matching `AUTOREVIEW_*` var is unset
+- prints `autoreview clean: no accepted/actionable findings reported` when the selected review command exits 0
+
+## Final Report
+
+Include:
+
+- review command used
+- tests/proof run
+- findings accepted/rejected, briefly why
+- the clean review result from the final helper/review run, or why a remaining finding was consciously rejected
+
+Do not run another Codex review solely to improve the final report wording. If the final helper run exited 0 and produced no accepted/actionable findings, report that exact run as clean.
+
+## PR / CI Closeout
+
+- Prefer direct run/job APIs after CI starts: `gh run view <run-id> --json jobs`; use PR rollup only for final mergeability.
+- After rebase, compare `origin/main..HEAD`; drop CI-fix commits already upstream before pushing.
+- For prompt snapshot CI failures, prove/generate with Linux Node 24 before rerunning the failed job.
+- Update PR body once near the final head unless proof labels are missing or stale enough to block CI.
diff --git a/.agents/skills/autoreview/scripts/autoreview b/.agents/skills/autoreview/scripts/autoreview
new file mode 100755
index 0000000..92aaac7
--- /dev/null
+++ b/.agents/skills/autoreview/scripts/autoreview
@@ -0,0 +1,697 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+usage() {
+  cat <<'EOF'
+Usage: autoreview [options]
+
+Options:
+  --mode auto|local|branch|commit
+                              Target selection. Default: auto.
+  --base REF                 Base ref for branch review. Default: PR base or origin/main.
+  --commit REF               Commit ref for commit review. Default: HEAD.
+  --reviewer codex|claude|pi|opencode|droid|copilot|auto
+                              Review engine. Default: Codex.
+  --fallback-reviewer auto|claude|pi|opencode|droid|copilot|none
+                              Fallback when Codex is unavailable or exits nonzero. Default: none.
+  --codex-bin PATH           Codex binary. Default: codex.
+  --claude-bin PATH          Claude binary. Default: claude.
+  --pi-bin PATH              Pi binary. Default: pi.
+  --opencode-bin PATH        OpenCode binary. Default: opencode.
+  --droid-bin PATH           Droid binary. Default: droid.
+  --copilot-bin PATH         GitHub Copilot binary. Default: copilot.
+  --full-access              Keep yolo/full-access mode enabled. Default.
+  --no-yolo                  Run nested Codex review with normal sandbox/approval prompts.
+  --output FILE              Also save output to file.
+  --parallel-tests CMD       Run review and test command concurrently.
+                              Default: PNPM_CONFIG_PM_ON_FAIL=ignore PNPM_CONFIG_VERIFY_DEPS_BEFORE_RUN=false PNPM_CONFIG_OFFLINE=true pnpm run check when available.
+  --dry-run                  Print selected commands, do not run.
+  -h, --help                 Show help.
+
+Environment:
+  OPENCLAW_TESTBOX=1 or AUTOREVIEW_OPENCLAW_MAINTAINER_VALIDATION=1
+                              Enable maintainer-only OpenClaw Crabbox/Testbox validation policy.
+
+Modes:
+  local   codex review --uncommitted
+  branch  codex review --base <base>
+  commit  codex review --commit <commit>
+  auto    dirty tree -> local, else PR/current branch -> branch
+EOF
+}
+
+mode=auto
+base_ref=
+commit_ref=HEAD
+reviewer=${AUTOREVIEW_REVIEWER:-${CODEX_REVIEW_REVIEWER:-auto}}
+fallback_reviewer=${AUTOREVIEW_FALLBACK_REVIEWER:-${CODEX_REVIEW_FALLBACK_REVIEWER:-none}}
+codex_bin=${CODEX_BIN:-codex}
+claude_bin=${CLAUDE_BIN:-claude}
+pi_bin=${PI_BIN:-pi}
+opencode_bin=${OPENCODE_BIN:-opencode}
+droid_bin=${DROID_BIN:-droid}
+copilot_bin=${COPILOT_BIN:-copilot}
+codex_args=()
+yolo=${AUTOREVIEW_YOLO:-${CODEX_REVIEW_YOLO:-1}}
+output=${AUTOREVIEW_OUTPUT:-${CODEX_REVIEW_OUTPUT:-}}
+parallel_tests=
+parallel_tests_auto=false
+dry_run=false
+codex_review_prompt=
+codex_review_prompt_file=false
+openclaw_maintainer_validation=${AUTOREVIEW_OPENCLAW_MAINTAINER_VALIDATION:-${OPENCLAW_TESTBOX:-0}}
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --mode)
+      mode=${2:-}
+      shift 2
+      ;;
+    --base)
+      base_ref=${2:-}
+      shift 2
+      ;;
+    --commit)
+      commit_ref=${2:-}
+      shift 2
+      ;;
+    --reviewer)
+      reviewer=${2:-}
+      shift 2
+      ;;
+    --fallback-reviewer)
+      fallback_reviewer=${2:-}
+      shift 2
+      ;;
+    --codex-bin)
+      codex_bin=${2:-}
+      shift 2
+      ;;
+    --claude-bin)
+      claude_bin=${2:-}
+      shift 2
+      ;;
+    --pi-bin)
+      pi_bin=${2:-}
+      shift 2
+      ;;
+    --opencode-bin)
+      opencode_bin=${2:-}
+      shift 2
+      ;;
+    --droid-bin)
+      droid_bin=${2:-}
+      shift 2
+      ;;
+    --copilot-bin)
+      copilot_bin=${2:-}
+      shift 2
+      ;;
+    --full-access)
+      yolo=1
+      shift
+      ;;
+    --no-yolo)
+      yolo=0
+      shift
+      ;;
+    --output)
+      output=${2:-}
+      shift 2
+      ;;
+    --parallel-tests)
+      parallel_tests=${2:-}
+      shift 2
+      ;;
+    --dry-run)
+      dry_run=true
+      shift
+      ;;
+    -h|--help)
+      usage
+      exit 0
+      ;;
+    *)
+      usage >&2
+      exit 2
+      ;;
+  esac
+done
+
+case "$yolo" in
+  0|false|False|FALSE|no|No|NO|off|Off|OFF) ;;
+  *) codex_args+=(--dangerously-bypass-approvals-and-sandbox --sandbox danger-full-access) ;;
+esac
+
+case "$mode" in
+  auto|local|branch|commit) ;;
+  *)
+    echo "invalid --mode: $mode" >&2
+    exit 2
+    ;;
+esac
+
+case "$reviewer" in
+  auto|codex|claude|pi|opencode|droid|copilot) ;;
+  *)
+    echo "invalid --reviewer: $reviewer" >&2
+    exit 2
+    ;;
+esac
+
+case "$fallback_reviewer" in
+  auto|claude|pi|opencode|droid|copilot|none) ;;
+  *)
+    echo "invalid --fallback-reviewer: $fallback_reviewer" >&2
+    exit 2
+    ;;
+esac
+
+repo_root=$(git rev-parse --show-toplevel)
+printf -v quoted_repo_root '%q' "$repo_root"
+
+has_package_check_script() {
+  command -v node >/dev/null 2>&1 || return 1
+  node -e 'const { readFileSync } = require("node:fs"); const p = JSON.parse(readFileSync(process.argv[1], "utf8")); process.exit(p.scripts?.check ? 0 : 1)' \
+    "$repo_root/package.json" \
+    >/dev/null 2>&1
+}
+
+auto_tests_disabled() {
+  case "${AUTOREVIEW_AUTO_TESTS:-${CODEX_REVIEW_AUTO_TESTS:-1}}" in
+    0|false|False|FALSE|no|No|NO|off|Off|OFF) return 0 ;;
+    *) return 1 ;;
+  esac
+}
+
+current_branch=$(git branch --show-current 2>/dev/null || true)
+dirty=false
+if [[ -n "$(git status --porcelain)" ]]; then
+  dirty=true
+fi
+
+pr_url=
+if [[ -z "$base_ref" && "$mode" != local ]] && command -v gh >/dev/null 2>&1; then
+  if pr_lines=$(gh pr view --json baseRefName,url --jq '[.baseRefName, .url] | @tsv' 2>/dev/null); then
+    base_name=${pr_lines%%$'\t'*}
+    pr_url=${pr_lines#*$'\t'}
+    if [[ -n "$base_name" ]]; then
+      base_ref="origin/$base_name"
+    fi
+  fi
+fi
+
+if [[ -z "$base_ref" ]]; then
+  base_ref=origin/main
+fi
+
+review_kind=
+if [[ "$mode" == local || ( "$mode" == auto && "$dirty" == true ) ]]; then
+  review_kind=local
+elif [[ "$mode" == commit ]]; then
+  review_kind=commit
+elif [[ "$mode" == branch || ( "$mode" == auto && -n "$current_branch" && "$current_branch" != "main" ) ]]; then
+  review_kind=branch
+else
+  echo "no review target: clean main checkout and no forced mode" >&2
+  exit 1
+fi
+
+if [[ "$review_kind" == local ]]; then
+  review_cmd=("$codex_bin" "${codex_args[@]}" review --uncommitted)
+elif [[ "$review_kind" == commit ]]; then
+  review_cmd=("$codex_bin" "${codex_args[@]}" review --commit "$commit_ref")
+else
+  review_cmd=("$codex_bin" "${codex_args[@]}" review --base "$base_ref")
+fi
+
+case "$openclaw_maintainer_validation" in
+  1|true|True|TRUE|yes|Yes|YES|on|On|ON) openclaw_maintainer_validation=1 ;;
+  *) openclaw_maintainer_validation=0 ;;
+esac
+if [[ -z "$parallel_tests" && "$openclaw_maintainer_validation" != 1 ]] &&
+  ! auto_tests_disabled; then
+  if [[ -f "$repo_root/package.json" && -f "$repo_root/pnpm-lock.yaml" && -d "$repo_root/node_modules" ]] &&
+    command -v pnpm >/dev/null 2>&1 &&
+    has_package_check_script; then
+    parallel_tests="cd $quoted_repo_root && PNPM_CONFIG_PM_ON_FAIL=ignore PNPM_CONFIG_VERIFY_DEPS_BEFORE_RUN=false PNPM_CONFIG_OFFLINE=true pnpm run check"
+    parallel_tests_auto=true
+  fi
+fi
+if [[ "$openclaw_maintainer_validation" == 1 ]]; then
+  codex_review_prompt=$(cat <<'EOF'
+OpenClaw maintainer autoreview validation policy:
+- Review the diff by reading code, tests, and dependency contracts.
+- Do not run local memory-heavy Node validation from review mode. This includes local pnpm checks/tests, Vitest, tsgo, npm test, and node scripts/run-vitest.mjs.
+- If runtime proof is needed, use existing proof or route validation through Crabbox / Blacksmith Testbox and report the exact provider and id.
+- If remote validation is not necessary for the finding, state the targeted proof that should be run instead of starting local tests.
+EOF
+)
+  review_cmd=("$codex_bin" "${codex_args[@]}" review -)
+  codex_review_prompt_file=true
+fi
+
+printf 'autoreview target: %s\n' "$review_kind"
+printf 'branch: %s\n' "${current_branch:-detached}"
+if [[ -n "$pr_url" ]]; then
+  printf 'pr: %s\n' "$pr_url"
+fi
+if [[ "$reviewer" == auto ]]; then
+  printf 'reviewer: codex\n'
+else
+  printf 'reviewer: %s\n' "$reviewer"
+fi
+case "$reviewer" in
+  codex|auto) ;;
+  *)
+    printf 'note: Codex native review mode is the recommended and best-supported review path; %s uses a generated diff prompt.\n' "$reviewer"
+    ;;
+esac
+if [[ "$reviewer" == auto || "$reviewer" == codex ]]; then
+  printf 'review:'
+  printf ' %q' "${review_cmd[@]}"
+  printf '\n'
+  if [[ "$codex_review_prompt_file" == true ]]; then
+    printf 'review policy: OpenClaw maintainer validation active; using generated prompt review because Codex target review cannot accept extra prompt text\n'
+  fi
+else
+  printf 'review: %s prompt review\n' "$reviewer"
+fi
+if [[ -n "$parallel_tests" ]]; then
+  printf 'tests: %s' "$parallel_tests"
+  if [[ "$parallel_tests_auto" == true ]]; then
+    printf ' (auto)'
+  fi
+  printf '\n'
+fi
+if [[ "$review_kind" == branch ]]; then
+  printf 'fetch: git fetch origin --quiet\n'
+fi
+if [[ -n "$output" ]]; then
+  printf 'output: %s\n' "$output"
+fi
+
+if [[ "$dry_run" == true ]]; then
+  exit 0
+fi
+
+if [[ "$review_kind" == branch ]]; then
+  git fetch origin --quiet || {
+    echo "warning: git fetch origin failed; reviewing with existing refs" >&2
+  }
+fi
+
+review_output=$output
+review_output_is_temp=false
+if [[ -z "$review_output" ]]; then
+  review_output=$(mktemp)
+  review_output_is_temp=true
+fi
+mkdir -p "$(dirname "$review_output")"
+: > "$review_output"
+
+cleanup() {
+  if [[ "${review_output_is_temp:-false}" == true && -n "${review_output:-}" ]]; then
+    rm -f "$review_output"
+  fi
+  if [[ -n "${prompt_file:-}" ]]; then
+    rm -f "$prompt_file"
+  fi
+}
+trap cleanup EXIT
+
+run_review() {
+  local status=0
+  mkdir -p "$(dirname "$review_output")"
+  if [[ "$codex_review_prompt_file" == true ]]; then
+    build_prompt_file || return
+    "${review_cmd[@]}" < "$prompt_file" 2>&1 | tee "$review_output"
+    status=${PIPESTATUS[0]}
+    rm -f "$prompt_file"
+    prompt_file=
+    return "$status"
+  fi
+  "${review_cmd[@]}" 2>&1 | tee "$review_output"
+}
+
+diff_for_review() {
+  case "$review_kind" in
+    local)
+      git -C "$repo_root" diff --stat
+      git -C "$repo_root" diff --cached --stat
+      git -C "$repo_root" diff --find-renames
+      git -C "$repo_root" diff --cached --find-renames
+      while IFS= read -r untracked_file; do
+        [[ -n "$untracked_file" ]] || continue
+        git -C "$repo_root" diff --no-index -- /dev/null "$untracked_file" || true
+      done < <(git -C "$repo_root" ls-files --others --exclude-standard)
+      ;;
+    commit)
+      git -C "$repo_root" show --find-renames --stat --format=fuller "$commit_ref"
+      git -C "$repo_root" show --find-renames --format=medium "$commit_ref"
+      ;;
+    branch)
+      git -C "$repo_root" diff --find-renames --stat "$base_ref"...HEAD
+      git -C "$repo_root" diff --find-renames "$base_ref"...HEAD
+      ;;
+  esac
+}
+
+build_prompt_file() {
+  prompt_file=$(mktemp)
+  {
+    cat <<EOF
+You are performing a closeout code review for the current repository.
+
+Review target: $review_kind
+Branch: ${current_branch:-detached}
+Base: ${base_ref:-}
+Commit: ${commit_ref:-}
+
+Rules:
+- Review the proposed code change as a closeout reviewer.
+- Focus on the diff below. If your CLI exposes read-only repository tools, inspect surrounding code and tests to verify findings; never modify files.
+- Do not modify files.
+${codex_review_prompt}
+- Report only discrete, actionable issues introduced by this change.
+- Prioritize correctness, regressions, security, data loss, performance cliffs, and missing tests that would catch a real bug.
+- Do not report pre-existing issues, speculative risks, broad rewrites, style nits, changelog gaps, or findings that depend on unstated assumptions.
+- Identify the concrete scenario where the issue appears, and keep the line reference as small as possible.
+- A finding should overlap changed code or clearly cite changed code as the cause.
+- For each accepted/actionable finding, use exactly this format:
+  [P<0-3>] Short title
+  File: path:line
+  Why: one sentence
+  Fix: one sentence
+- If no accepted/actionable findings, output exactly:
+  autoreview clean: no accepted/actionable findings reported
+
+Diff:
+EOF
+    diff_for_review
+  } > "$prompt_file" || return
+}
+
+reviewer_output_has_clean_marker() {
+  local path=$1
+  grep -Eq '^[^[:alnum:]]*autoreview clean: no accepted/actionable findings reported[[:space:]]*$' "$path"
+}
+
+run_prompt_reviewer() {
+  local selected=$1
+  local copilot_prompt=
+  local prompt_bytes=0
+  local reviewer_output
+  local status=0
+
+  if ! build_prompt_file; then
+    rm -f "${prompt_file:-}"
+    prompt_file=
+    return 1
+  fi
+  reviewer_output=$(mktemp)
+  mkdir -p "$(dirname "$review_output")"
+
+  case "$selected" in
+    claude)
+      if ! command -v "$claude_bin" >/dev/null 2>&1; then
+        echo "fallback reviewer unavailable: $claude_bin" >&2
+        status=127
+      elif printf 'fallback: claude -p\n' | tee -a "$review_output"; then
+        "$claude_bin" --tools "" --no-session-persistence -p < "$prompt_file" 2>&1 | tee -a "$review_output" "$reviewer_output"
+        status=$?
+      else
+        status=$?
+      fi
+      ;;
+    pi)
+      if ! command -v "$pi_bin" >/dev/null 2>&1; then
+        echo "fallback reviewer unavailable: $pi_bin" >&2
+        status=127
+      elif printf 'fallback: pi -p\n' | tee -a "$review_output"; then
+        "$pi_bin" --no-tools --no-session -p < "$prompt_file" 2>&1 | tee -a "$review_output" "$reviewer_output"
+        status=$?
+      else
+        status=$?
+      fi
+      ;;
+    opencode)
+      if ! command -v "$opencode_bin" >/dev/null 2>&1; then
+        echo "fallback reviewer unavailable: $opencode_bin" >&2
+        status=127
+      elif printf 'fallback: opencode run\n' | tee -a "$review_output"; then
+        "$opencode_bin" run --pure --dir "$repo_root" \
+          "Review the attached prompt file. Do not modify files." \
+          --file "$prompt_file" 2>&1 | tee -a "$review_output" "$reviewer_output"
+        status=$?
+      else
+        status=$?
+      fi
+      ;;
+    droid)
+      if ! command -v "$droid_bin" >/dev/null 2>&1; then
+        echo "fallback reviewer unavailable: $droid_bin" >&2
+        status=127
+      elif printf 'fallback: droid exec\n' | tee -a "$review_output"; then
+        "$droid_bin" exec --cwd "$repo_root" -f "$prompt_file" 2>&1 | tee -a "$review_output" "$reviewer_output"
+        status=$?
+      else
+        status=$?
+      fi
+      ;;
+    copilot)
+      if ! command -v "$copilot_bin" >/dev/null 2>&1; then
+        echo "fallback reviewer unavailable: $copilot_bin" >&2
+        status=127
+      elif printf 'fallback: copilot\n' | tee -a "$review_output"; then
+        prompt_bytes=$(wc -c < "$prompt_file" | tr -d '[:space:]')
+        if (( prompt_bytes > 120000 )); then
+          echo "copilot reviewer unavailable: generated prompt is too large for copilot -p; use codex, droid, or another file/stdin-capable reviewer" \
+            2>&1 | tee -a "$review_output" "$reviewer_output"
+          status=1
+        else
+          copilot_prompt=$(< "$prompt_file")
+          "$copilot_bin" -C "$repo_root" --available-tools=none --stream off --output-format text --silent \
+            -p "$copilot_prompt" \
+            2>&1 | tee -a "$review_output" "$reviewer_output"
+          status=$?
+        fi
+      else
+        status=$?
+      fi
+      ;;
+    *)
+      echo "unsupported prompt reviewer: $selected" >&2
+      status=2
+      ;;
+  esac
+  if [[ "$status" == 0 ]]; then
+    if grep -Eq '\[P[0-3]\]' "$reviewer_output"; then
+      status=1
+    elif ! grep -q '[^[:space:]]' "$reviewer_output"; then
+      status=1
+    elif ! reviewer_output_has_clean_marker "$reviewer_output"; then
+      status=1
+    fi
+  fi
+  rm -f "$reviewer_output"
+  rm -f "$prompt_file"
+  prompt_file=
+  return "$status"
+}
+
+run_selected_review() {
+  local selected=$1
+  case "$selected" in
+    codex)
+      if ! command -v "$codex_bin" >/dev/null 2>&1; then
+        echo "codex reviewer unavailable: $codex_bin" >&2
+        return 127
+      fi
+      run_review
+      ;;
+    claude|pi|opencode|droid|copilot)
+      run_prompt_reviewer "$selected"
+      ;;
+    *)
+      echo "unsupported reviewer: $selected" >&2
+      return 2
+      ;;
+  esac
+}
+
+fallback_reviewer_is_available() {
+  local selected=$1
+  case "$selected" in
+    claude) command -v "$claude_bin" >/dev/null 2>&1 ;;
+    pi) command -v "$pi_bin" >/dev/null 2>&1 ;;
+    opencode) command -v "$opencode_bin" >/dev/null 2>&1 ;;
+    droid) command -v "$droid_bin" >/dev/null 2>&1 ;;
+    copilot) command -v "$copilot_bin" >/dev/null 2>&1 ;;
+    *) return 1 ;;
+  esac
+}
+
+run_auto_fallback_review() {
+  local selected
+  if [[ "$fallback_reviewer" != auto ]]; then
+    run_selected_review "$fallback_reviewer"
+    return $?
+  fi
+
+  for selected in claude pi opencode droid copilot; do
+    if fallback_reviewer_is_available "$selected"; then
+      run_selected_review "$selected"
+      return $?
+    fi
+  done
+
+  echo "fallback reviewer unavailable: no configured fallback CLI found" >&2
+  return 127
+}
+
+run_auto_review() {
+  run_selected_review codex
+  local status=$?
+  if [[ "$status" == 0 ]]; then
+    return 0
+  fi
+  if (( status > 128 && status < 192 )); then
+    return "$status"
+  fi
+  if review_output_has_findings; then
+    return "$status"
+  fi
+  if [[ "$fallback_reviewer" == none ]]; then
+    return "$status"
+  fi
+  if [[ "$fallback_reviewer" == auto ]]; then
+    printf 'autoreview warning: codex exited %s; trying configured fallback reviewers\n' "$status" >&2
+  else
+    printf 'autoreview warning: codex exited %s; falling back to %s\n' "$status" "$fallback_reviewer" >&2
+  fi
+  run_auto_fallback_review
+}
+
+elapsed_since() {
+  local started_at=$1
+  local finished_at
+  finished_at=$(date +%s)
+  printf '%s\n' "$((finished_at - started_at))"
+}
+
+format_elapsed() {
+  local seconds=$1
+  if (( seconds < 60 )); then
+    printf '%ss\n' "$seconds"
+  else
+    printf '%sm%ss\n' "$((seconds / 60))" "$((seconds % 60))"
+  fi
+}
+
+review_output_empty() {
+  [[ ! -s "$review_output" ]] || ! grep -q '[^[:space:]]' "$review_output"
+}
+
+review_findings_text() {
+  if grep -Fxq 'codex' "$review_output"; then
+    awk '
+      $0 == "codex" {
+        capture = 1
+        output = $0 ORS
+        next
+      }
+      capture {
+        output = output $0 ORS
+      }
+      END {
+        printf "%s", output
+      }
+    ' "$review_output"
+    return
+  fi
+  cat "$review_output"
+}
+
+review_output_has_findings() {
+  review_findings_text | grep -Eq '\[P[0-3]\]'
+}
+
+report_clean_review_or_fail() {
+  local elapsed_text
+  elapsed_text=$(format_elapsed "${review_elapsed_seconds:-0}")
+
+  if review_output_has_findings; then
+    printf 'autoreview complete after %s\n' "$elapsed_text"
+    printf 'autoreview findings: accepted/actionable findings reported\n'
+    return 1
+  fi
+  if review_output_empty; then
+    printf 'autoreview complete after %s; no output\n' "$elapsed_text"
+    return 1
+  fi
+  printf 'autoreview complete after %s\n' "$elapsed_text"
+  printf 'autoreview clean: no accepted/actionable findings reported\n'
+}
+
+if [[ -z "$parallel_tests" ]]; then
+  review_started_at=$(date +%s)
+  set +e
+  if [[ "$reviewer" == auto ]]; then
+    run_auto_review
+  else
+    run_selected_review "$reviewer"
+  fi
+  review_status=$?
+  review_elapsed_seconds=$(elapsed_since "$review_started_at")
+  set -e
+  if [[ "$review_status" == 0 ]]; then
+    report_clean_review_or_fail
+    exit $?
+  fi
+  exit "$review_status"
+fi
+
+review_status_file=$(mktemp)
+review_elapsed_file=$(mktemp)
+tests_status_file=$(mktemp)
+
+(
+  set +e
+  review_started_at=$(date +%s)
+  if [[ "$reviewer" == auto ]]; then
+    run_auto_review
+  else
+    run_selected_review "$reviewer"
+  fi
+  status=$?
+  elapsed=$(elapsed_since "$review_started_at")
+  printf '%s\n' "$status" > "$review_status_file"
+  printf '%s\n' "$elapsed" > "$review_elapsed_file"
+) &
+review_pid=$!
+
+(
+  set +e
+  bash -lc "$parallel_tests"
+  status=$?
+  printf '%s\n' "$status" > "$tests_status_file"
+) &
+tests_pid=$!
+
+wait "$review_pid" || true
+wait "$tests_pid" || true
+
+review_status=$(cat "$review_status_file")
+review_elapsed_seconds=$(cat "$review_elapsed_file")
+tests_status=$(cat "$tests_status_file")
+rm -f "$review_status_file" "$review_elapsed_file" "$tests_status_file"
+
+printf 'autoreview exit: %s\n' "$review_status"
+printf 'tests exit: %s\n' "$tests_status"
+
+if [[ "$review_status" != 0 || "$tests_status" != 0 ]]; then
+  exit 1
+fi
+
+report_clean_review_or_fail
diff --git a/.agents/skills/crabbox/SKILL.md b/.agents/skills/crabbox/SKILL.md
new file mode 100644
index 0000000..f81c0b5
--- /dev/null
+++ b/.agents/skills/crabbox/SKILL.md
@@ -0,0 +1,711 @@
+---
+name: crabbox
+description: Use the Crabbox wrapper for OpenClaw remote validation across Linux, macOS, Windows, and WSL2, including delegated Blacksmith Testbox proof. Report the actual provider and id.
+---
+
+# Crabbox
+
+Use the Crabbox wrapper when OpenClaw needs remote Linux proof for broad tests,
+CI-parity checks, secrets, hosted services, Docker/E2E/package lanes, warmed
+reusable boxes, sync timing, logs/results, cache inspection, or lease cleanup.
+
+Crabbox is the transport/orchestration surface. The actual backend can be:
+
+- brokered AWS Crabbox: direct provider, `provider=aws`, lease ids like
+  `cbx_...`, `syncDelegated=false`
+- Blacksmith Testbox through Crabbox: delegated provider,
+  `provider=blacksmith-testbox`, ids like `tbx_...`, `syncDelegated=true`
+
+For OpenClaw maintainer broad `pnpm` gates, Blacksmith Testbox through the
+Crabbox wrapper is acceptable and often preferred when the standing Testbox
+rules apply. Do not describe those runs as "AWS Crabbox"; report them as
+Testbox-through-Crabbox with the `tbx_...` id and Actions run.
+
+Use the repo `.crabbox.yaml` brokered AWS path when the task specifically needs
+direct AWS Crabbox behavior, persistent direct-provider leases, `--fresh-pr`,
+`--full-resync`, environment forwarding, capture/download support, or provider
+comparison. Use `--provider blacksmith-testbox` when the task needs OpenClaw
+maintainer Testbox proof, prepared CI environment, broad/heavy pnpm gates, or
+the user asks for Testbox/Blacksmith.
+
+## First Checks
+
+- Run from the repo root. Crabbox sync mirrors the current checkout.
+- Check the wrapper and providers before remote work:
+
+```sh
+command -v crabbox
+../crabbox/bin/crabbox --version
+../crabbox/bin/crabbox run --help | sed -n '1,120p'
+../crabbox/bin/crabbox desktop launch --help
+../crabbox/bin/crabbox webvnc --help
+```
+
+- OpenClaw scripts prefer `../crabbox/bin/crabbox` when present. The user PATH
+  shim can be stale.
+- Check `.crabbox.yaml` for direct-provider defaults. Omitting `--provider`
+  means brokered AWS today.
+- The brokered AWS default is a Linux developer image in `eu-west-1`; the repo
+  config pins hot `eu-west-1a/b/c` placement so Fast Snapshot Restore can apply.
+  If warmup drifts well past the minute-scale path, verify image promotion,
+  region/AZ placement, and FSR state before blaming OpenClaw.
+- For broad OpenClaw maintainer `pnpm` gates, prefer the repo wrapper with
+  `--provider blacksmith-testbox` or the repo Testbox helpers when the standing
+  Testbox policy applies.
+- Always report the actual provider and id. `cbx_...` means AWS Crabbox;
+  `tbx_...` means Blacksmith Testbox through Crabbox. If the output only says
+  `blacksmith testbox list`, use `blacksmith testbox list --all` before
+  concluding no box exists.
+- If a warm direct-provider lease smells stale, retry with `--full-resync`
+  (alias `--fresh-sync`) before replacing the lease. This resets the remote
+  workdir, skips the fingerprint fast path, reseeds Git when possible, and
+  uploads the checkout from scratch.
+- For live/provider bugs, use the configured secret workflow before downgrading
+  to mocks. Copy only the exact needed key into the remote process environment
+  for that one command. Do not print it, do not sync it as a repo file, and do
+  not leave it in remote shell history or logs. If no secret-safe injection path
+  is available, say true live provider auth is blocked instead of silently using
+  a fake key.
+- Prefer local targeted tests for tight edit loops. Broad gates belong remote.
+- Do not treat inherited shell env as operator intent. In particular,
+  `OPENCLAW_LOCAL_CHECK_MODE=throttled` from the local shell is not permission
+  to move broad `pnpm check:changed`, `pnpm test:changed`, full `pnpm test`, or
+  lint/typecheck fan-out onto the laptop.
+- Only use `OPENCLAW_LOCAL_CHECK_MODE=throttled|full` when the user explicitly
+  asks for local proof in the current task. If Testbox is queued or capacity is
+  constrained, report the blocker and keep only targeted local edit-loop checks
+  running.
+
+## macOS And Windows Targets
+
+Use these only when the task needs an existing non-Linux host. OpenClaw broad
+Linux validation uses the repo Crabbox config unless a provider is explicitly
+requested.
+
+Native brokered Windows is available for Windows-specific proof. Use the AWS
+developer image in `us-west-2` on demand; it has the expected OpenClaw developer
+toolchain and Docker image cache. Keep broad Linux gates on Linux/Testbox unless
+the bug is Windows-specific:
+
+```sh
+../crabbox/bin/crabbox warmup \
+  --provider aws \
+  --target windows \
+  --windows-mode normal \
+  --region us-west-2 \
+  --market on-demand \
+  --timing-json
+```
+
+The hydrate workflow assumes Docker should already be baked into Linux images
+and only installs it as a fallback. Do not add per-run Docker installs to proof
+commands unless the image probe shows Docker is actually missing.
+
+When the user explicitly asks for brokered macOS runners, use Crabbox AWS
+macOS only after confirming the deployed coordinator supports EC2 Mac host
+lifecycle/image routes and the operator has AWS EC2 Mac Dedicated Host quota
+and IAM. Prefer `CRABBOX_HOST_ID` for a known Crabbox-managed Dedicated Host,
+or run the no-spend preflight first:
+
+```sh
+crabbox admin hosts quota --provider aws --target macos --region eu-west-1 --type mac2.metal --json
+crabbox admin hosts allocate --provider aws --target macos --region eu-west-1 --type mac2.metal --dry-run --json
+CRABBOX_MACOS_TYPES=all scripts/macos-host-region-preflight.sh
+```
+
+Do not silently substitute AWS macOS for normal OpenClaw Linux proof. Report
+paid-host blockers as quota, IAM, coordinator deployment, or host availability
+instead of falling back to local macOS.
+
+Crabbox supports static SSH targets:
+
+```sh
+../crabbox/bin/crabbox run --provider ssh --target macos --static-host mac-studio.local -- xcodebuild test
+../crabbox/bin/crabbox run --provider ssh --target windows --windows-mode normal --static-host win-dev.local -- pwsh -NoProfile -Command "dotnet test"
+../crabbox/bin/crabbox run --provider ssh --target windows --windows-mode wsl2 --static-host win-dev.local -- pnpm test
+```
+
+- `target=macos` and `target=windows --windows-mode wsl2` use the POSIX SSH,
+  bash, Git, rsync, and tar contract.
+- Native Windows uses OpenSSH, PowerShell, Git, and tar; sync is manifest tar
+  archive transfer into `static.workRoot`. Direct native Windows runs support
+  `--script*`, `--env-from-profile`, `--preflight`, and PowerShell `--shell`.
+- `crabbox actions hydrate/register` are Linux-only today; use plain
+  `crabbox run` loops for static macOS and Windows hosts.
+- Live proof needs a reachable, operator-managed SSH host. Without one, verify
+  with `../crabbox/bin/crabbox run --help`, config/flag tests, and the Crabbox
+  Go test suite.
+
+## Direct Brokered AWS Backend
+
+Use this when the task needs direct AWS Crabbox semantics rather than the
+prepared Blacksmith Testbox CI environment.
+
+Changed gate:
+
+```sh
+../crabbox/bin/crabbox run \
+  --idle-timeout 90m \
+  --ttl 240m \
+  --timing-json \
+  --shell -- \
+  "env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed"
+```
+
+Full suite:
+
+```sh
+../crabbox/bin/crabbox run \
+  --idle-timeout 90m \
+  --ttl 240m \
+  --timing-json \
+  --shell -- \
+  "env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test"
+```
+
+Focused rerun:
+
+```sh
+../crabbox/bin/crabbox run \
+  --idle-timeout 90m \
+  --ttl 240m \
+  --timing-json \
+  --shell -- \
+  "env CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test <path-or-filter>"
+```
+
+Read the JSON summary. Useful fields:
+
+- `provider`: `aws`
+- `leaseId`: `cbx_...`
+- `syncDelegated`: `false`
+- `commandPhases`: populated when the command prints `CRABBOX_PHASE:<name>`
+- `commandMs` / `totalMs`
+- `exitCode`
+
+Crabbox should stop one-shot AWS leases automatically after the run. Verify
+cleanup when a run fails, is interrupted, or the command output is unclear:
+
+```sh
+../crabbox/bin/crabbox list --provider aws
+```
+
+## Blacksmith Testbox Through Crabbox
+
+Use this for OpenClaw maintainer broad/heavy `pnpm` gates when the prepared CI
+environment is the right proof surface:
+
+```sh
+node scripts/crabbox-wrapper.mjs run \
+  --provider blacksmith-testbox \
+  --blacksmith-org openclaw \
+  --blacksmith-workflow .github/workflows/ci-check-testbox.yml \
+  --blacksmith-job check \
+  --blacksmith-ref main \
+  --idle-timeout 90m \
+  --ttl 240m \
+  --timing-json \
+  -- \
+  CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 OPENCLAW_TESTBOX=1 OPENCLAW_TESTBOX_REMOTE_RUN=1 pnpm check:changed
+```
+
+Read the JSON summary and the Testbox line. Useful fields:
+
+- `provider`: `blacksmith-testbox`
+- `leaseId`: `tbx_...`
+- `syncDelegated`: `true`
+- `syncPhases`: delegated/skipped because Blacksmith owns checkout/sync
+- Actions run URL/id from the Testbox output
+- `exitCode`
+
+`blacksmith testbox list` may hide hydrating or ready boxes. Use:
+
+```sh
+blacksmith testbox list --all
+blacksmith testbox status <tbx_id>
+```
+
+## Observability Flags
+
+Use these on debugging runs before inventing ad hoc logging:
+
+- `--preflight`: prints run context, workspace mode, SSH target, remote user/cwd,
+  and target-specific tool probes. Defaults cover `git`, `tar`, `node`, `npm`,
+  `corepack`, `pnpm`, `yarn`, `bun`, `docker`, plus POSIX
+  `sudo`/`apt`/`bubblewrap` and native Windows
+  `powershell`/`execution_policy`/`longpaths`/`temp`/`pwsh`. Add
+  `--preflight-tools node,bun,docker`, `CRABBOX_PREFLIGHT_TOOLS`, or repo
+  `run.preflightTools` to replace the list. `default` expands built-ins; `none`
+  prints only the workspace summary. Preflight is diagnostic only; install
+  toolchains through Actions hydration, images, devcontainer/Nix/mise/asdf, or
+  the run script. On `blacksmith-testbox`, this prints a delegated-unsupported
+  note because the workflow owns setup.
+- `CRABBOX_ENV_ALLOW=NAME,...`: forwards only listed local env vars for direct
+  providers and prints `set len=N secret=true` style summaries. On
+  `blacksmith-testbox`, env forwarding is unsupported; put secrets in the
+  Testbox workflow instead.
+- `--env-from-profile <file>` plus `--allow-env NAME`: loads simple
+  `export NAME=value` / `NAME=value` lines from a local profile without
+  executing it, then forwards only allowlisted names. `--allow-env` is
+  repeatable and comma-separated. Profile values override ambient allowlisted
+  env values for that run. Direct POSIX, WSL2, and native Windows runs are
+  supported; delegated providers are not. Crabbox probes the uploaded profile
+  remotely and prints redacted presence/length metadata before the command.
+- `--env-helper <name>`: with `--env-from-profile` on POSIX SSH targets,
+  persists `.crabbox/env/<name>` and `.crabbox/env/<name>.env` so follow-up
+  commands on the same lease can run through `./.crabbox/env/<name> <command>`.
+  Use only on leases you control; the profile stays until cleanup, lease reset,
+  or `--full-resync`.
+- `--script <file>` / `--script-stdin`: upload a local script into
+  `.crabbox/scripts/` and execute it on the remote box. Shebang scripts execute
+  directly on POSIX; scripts without a shebang run through `bash`. Native
+  Windows uploads run through Windows PowerShell, and Crabbox appends `.ps1`
+  when needed. Arguments after `--` become script args.
+- `--fresh-pr owner/repo#123|URL|number`: skip dirty local sync and create a
+  fresh remote checkout of the GitHub PR. Bare numbers use the current repo's
+  GitHub origin. Add `--apply-local-patch` only when the current local
+  `git diff --binary HEAD` should be applied on top of that PR checkout.
+- `--full-resync` / `--fresh-sync`: reset a stale direct-provider workdir
+  before syncing. Use after sync fingerprints look wrong, SSH times out before
+  sync, or rsync watchdog output suggests it. It is redundant with
+  `--fresh-pr`, incompatible with `--no-sync`, and unsupported by delegated
+  providers.
+- `--capture-stdout <path>` / `--capture-stderr <path>`: write remote streams to
+  local files and keep binary/noisy output out of retained logs. Parent
+  directories must already exist. These are direct-provider only.
+- `--capture-on-fail`: on non-zero direct-provider exits, downloads
+  `.crabbox/captures/*.tar.gz` with `test-results`, `playwright-report`,
+  `coverage`, JUnit XML, and nearby logs. Treat as secret-bearing until reviewed.
+- `--keep-on-failure`: leave a failed one-shot lease alive for live debugging
+  until idle/TTL expiry. Useful on direct providers and delegated one-shots.
+- `--timing-json`: final machine-readable timing. Add
+  `echo CRABBOX_PHASE:install`, `CRABBOX_PHASE:test`, etc. in long shell
+  commands; direct providers and Blacksmith Testbox both report them as
+  `commandPhases`.
+
+Live-provider debug template for direct AWS/Hetzner leases:
+
+```sh
+mkdir -p .crabbox/logs
+../crabbox/bin/crabbox run --provider aws \
+  --preflight \
+  --allow-env OPENAI_API_KEY,OPENAI_BASE_URL \
+  --timing-json \
+  --capture-stdout .crabbox/logs/live-provider.stdout.log \
+  --capture-stderr .crabbox/logs/live-provider.stderr.log \
+  --capture-on-fail \
+  --shell -- \
+  "echo CRABBOX_PHASE:install; pnpm install --frozen-lockfile; echo CRABBOX_PHASE:test; pnpm test:live"
+```
+
+Do not pass `--capture-*`, `--download`, `--checksum`, `--force-sync-large`, or
+`--sync-only` to delegated providers. Also do not pass `--script*`,
+`--fresh-pr`, `--full-resync`, or `--env-helper` there. Crabbox rejects these
+because the provider owns sync or command transport. `--keep-on-failure` is OK
+for delegated one-shots when you need to inspect a failed lease.
+
+## Efficient Bug E2E Verification
+
+Use the smallest Crabbox lane that proves the reported user path, not just the
+touched code. Aim for one after-fix E2E proof before commenting, closing, or
+opening a PR for a user-visible bug.
+
+When the user says "test in Crabbox", do not simply copy tests to the remote
+box and run them there. Crabbox is for remote real-scenario proof: copy or
+install OpenClaw as the user would, run the same setup/update/CLI/Gateway/API
+call that failed, and capture behavior from that entrypoint. For regressions or
+bug reports, prove the broken state first when feasible, then run the same
+scenario after the fix.
+
+Pick the lane by symptom:
+
+- Docker/setup/install bug: build a package tarball and run the matching
+  `scripts/e2e/*-docker.sh` or package script. This proves npm packaging,
+  install paths, runtime deps, config writes, and container behavior.
+- Provider/model/auth bug: prefer true live E2E. Use the configured secret
+  workflow, then inject the single needed key into Crabbox if needed. Scrub
+  unrelated provider env vars in the child command so interactive defaults do
+  not drift to another provider. If only a dummy key is used, label the proof
+  narrowly, e.g. "UI/install path only; live provider auth not exercised."
+- Channel delivery bug: use the channel Docker/live lane when available; include
+  setup, config, gateway start, send/receive or agent-turn proof, and redacted
+  logs.
+- Gateway/session/tool bug: prefer an end-to-end CLI or Gateway RPC command that
+  creates real state and inspects the resulting files/API output.
+- Pure parser/config bug: targeted tests may be enough, but still run a
+  Crabbox command when OS, package, Docker, secrets, or service lifecycle could
+  change behavior.
+
+Efficient flow:
+
+1. Reproduce or prove the pre-fix symptom from the real user-facing entrypoint
+   when feasible. If the issue cannot be reproduced, capture the exact command
+   and observed behavior instead.
+2. Patch locally and run narrow local tests for edit speed.
+3. Run one Crabbox E2E command that starts from the user-facing entrypoint:
+   package install, Docker setup, onboarding, channel add, gateway start, or
+   agent turn as appropriate.
+4. Record proof as: Testbox id, command, environment shape, redacted secret
+   source, and copied success/failure output.
+5. If the issue says "cannot reproduce", ask for the missing config/log fields
+   that would distinguish the tested path from the reporter's path.
+
+Keep it efficient:
+
+- Reuse existing E2E scripts and helper assertions before writing ad hoc shell.
+- Use `--script <file>` or `--script-stdin` for multi-line E2E commands instead
+  of quote-heavy `--shell` strings on direct SSH providers.
+- Use `--fresh-pr <pr>` when validating an upstream PR in isolation from the
+  local dirty tree. Add `--apply-local-patch` only when testing a local fixup on
+  top of that PR.
+- Use `--full-resync` before replacing a warmed direct-provider lease when the
+  remote workdir or sync fingerprint appears stale.
+- Use one-shot Crabbox for a single proof; use a reusable Testbox only when
+  several commands must share built images, installed packages, or live state.
+- Prefer `OPENCLAW_CURRENT_PACKAGE_TGZ` with Docker/package lanes when testing a
+  candidate tarball; prefer the repo's package helper instead of direct source
+  execution when the bug might be packaging/install related.
+- Keep secrets redacted. It is fine to report key presence, source, and length;
+  never print secret values.
+- Include `--timing-json` on broad or flaky runs when command duration or sync
+  behavior matters.
+
+Before/after PR proof on delegated Testbox:
+
+- For PRs that should prove "broken before, fixed after", compare base and PR
+  on the same Testbox when practical. Fetch both refs, create detached temp
+  worktrees under `/tmp`, install in each, then run the same harness twice.
+- Do not checkout base/PR refs in the synced repo root. Delegated Testbox sync
+  may leave the root dirty with local files; `git checkout` can abort or mix
+  proof state.
+- Temp harness files under `/tmp` do not resolve repo packages by default. Put
+  the harness inside the worktree, or in ESM use
+  `createRequire(path.join(process.cwd(), "package.json"))` before requiring
+  workspace deps such as `@lydell/node-pty`.
+- For full-screen TUI/CLI bugs, a PTY harness is stronger than helper-only
+  assertions. Use a real PTY, wait for visible lifecycle markers, send input,
+  then send control keys and assert process exit/stuck behavior.
+- When validating a rebased local branch before push, remember delegated sync
+  usually validates synced file content on a detached dirty checkout, not a
+  remote commit object. Record the local head SHA, changed files, Testbox id,
+  and final success markers; after pushing, ensure the pushed SHA has the same
+  file content.
+- If GitHub CI is still queued but the exact changed content passed Testbox
+  `pnpm check:changed`, `pnpm check:test-types`, and the real E2E proof, it is
+  reasonable to merge once required checks allow it. Note any still-running
+  unrelated shards in the proof comment instead of waiting forever.
+
+Interactive CLI/onboarding:
+
+- For full-screen or prompt-heavy CLI flows, run the target command inside tmux
+  on the Crabbox and drive it with `tmux send-keys`; capture proof with
+  `tmux capture-pane`, redacted through `sed`.
+- Prefer deterministic arrow navigation over search typing for Clack-style
+  searchable selects. Raw `send-keys -l openai` may not trigger filtering in a
+  tmux pane; inspect option order locally or on-box and send exact Down/Enter
+  sequences.
+- Isolate mutable state with `OPENCLAW_STATE_DIR=$(mktemp -d)`. Plugin npm
+  installs live under that state dir (`npm/node_modules/...`), not under
+  `OPENCLAW_CONFIG_DIR`. Verify downloads by checking the state dir, package
+  lock, and installed package metadata.
+- To test automatic setup installs against local package artifacts, use
+  `OPENCLAW_ALLOW_PLUGIN_INSTALL_OVERRIDES=1` plus
+  `OPENCLAW_PLUGIN_INSTALL_OVERRIDES='{"plugin-id":"npm-pack:/tmp/plugin.tgz"}'`.
+  Pack with `npm pack`, set an isolated `OPENCLAW_STATE_DIR`, and verify the
+  package under `npm/node_modules`. Overrides are test-only and must not be
+  treated as official/trusted-source installs.
+- For OpenAI/Codex onboarding proof, the useful markers are the UI line
+  `Installed Codex plugin`, `npm/node_modules/@openclaw/codex`, and the
+  package-lock entry showing the bundled `@openai/codex` dependency. A dummy
+  OpenAI-shaped key can prove only UI/install behavior; it is not live auth.
+
+## Reuse And Keepalive
+
+For most Crabbox calls, one-shot is enough. Use reuse only when you need
+multiple manual commands on the same hydrated box.
+
+If Crabbox returns a reusable id or you intentionally keep a lease:
+
+```sh
+../crabbox/bin/crabbox run --id <cbx_id-or-slug> --no-sync --timing-json --shell -- "pnpm test <path>"
+```
+
+Stop boxes you created before handoff:
+
+```sh
+../crabbox/bin/crabbox stop <id-or-slug>
+blacksmith testbox stop --id <tbx_id>
+```
+
+## Interactive Desktop And WebVNC
+
+Prefer WebVNC for human inspection because the browser portal can preload the
+lease VNC password and avoids a native VNC client's copy/paste/password dance.
+Use native `crabbox vnc` only when WebVNC is unavailable, the browser portal is
+broken, or the user explicitly wants a local VNC client.
+
+Common desktop flow:
+
+```sh
+../crabbox/bin/crabbox warmup --provider hetzner --desktop --browser --class standard --idle-timeout 60m --ttl 240m
+../crabbox/bin/crabbox desktop launch --provider hetzner --id <cbx_id-or-slug> --browser --url https://example.com --webvnc --open --take-control
+```
+
+Useful WebVNC commands:
+
+```sh
+../crabbox/bin/crabbox webvnc --provider hetzner --id <cbx_id-or-slug> --open --take-control
+../crabbox/bin/crabbox webvnc daemon start --provider hetzner --id <cbx_id-or-slug> --open --take-control
+../crabbox/bin/crabbox webvnc daemon status --provider hetzner --id <cbx_id-or-slug>
+../crabbox/bin/crabbox webvnc daemon stop --provider hetzner --id <cbx_id-or-slug>
+../crabbox/bin/crabbox webvnc status --provider hetzner --id <cbx_id-or-slug>
+../crabbox/bin/crabbox webvnc reset --provider hetzner --id <cbx_id-or-slug> --open --take-control
+../crabbox/bin/crabbox desktop doctor --provider hetzner --id <cbx_id-or-slug>
+../crabbox/bin/crabbox desktop click --provider hetzner --id <cbx_id-or-slug> --x 640 --y 420
+../crabbox/bin/crabbox desktop paste --provider hetzner --id <cbx_id-or-slug> --text "user@example.com"
+../crabbox/bin/crabbox desktop key --provider hetzner --id <cbx_id-or-slug> ctrl+l
+../crabbox/bin/crabbox artifacts collect --id <cbx_id-or-slug> --all --output artifacts/<slug>
+../crabbox/bin/crabbox artifacts publish --dir artifacts/<slug> --pr <number>
+```
+
+`desktop launch --webvnc --open` is usually the nicest one-shot: it starts the
+browser/app inside the visible session, bridges the lease into the authenticated
+WebVNC portal, and opens the portal. Keep browsers windowed for human QA; use
+`--fullscreen` only for capture/video workflows.
+For human handoff, include `--take-control` so the opened portal viewer gets
+keyboard/mouse control automatically instead of landing as an observer.
+
+Human handoff preflight:
+
+- Do not assume a visible desktop or launched browser means the repo CLI/app is
+  installed, built, or on the interactive terminal's `PATH`.
+- Before handing WebVNC to a human tester, prove the expected command from the
+  same kept lease and from a neutral directory such as `~`.
+- If the handoff needs repo-local code, sync/build/link it explicitly on that
+  lease. Source-tree CLIs often need build output before a symlink works.
+- Prefer a real `command -v <expected-command> && <expected-command> --version`
+  check over a repo-root-only `pnpm ...` command.
+
+Generic handoff repair pattern:
+
+```sh
+../crabbox/bin/crabbox run --id <cbx_id-or-slug> --full-resync --shell -- \
+  "set -euo pipefail
+   pnpm install --frozen-lockfile
+   pnpm build
+   sudo ln -sf \"\$PWD/<cli-entry>\" /usr/local/bin/<expected-command>
+   cd ~
+   command -v <expected-command>
+   <expected-command> --version"
+```
+
+## If Crabbox Fails
+
+Keep the fallback narrow. First decide whether the failure is Crabbox itself,
+the brokered AWS lease, Blacksmith/Testbox, repo hydration, sync, or the test
+command.
+
+Fast checks:
+
+```sh
+command -v crabbox
+../crabbox/bin/crabbox --version
+../crabbox/bin/crabbox run --help | sed -n '1,140p'
+../crabbox/bin/crabbox doctor
+command -v blacksmith
+blacksmith --version
+blacksmith testbox list
+```
+
+Common Crabbox-only failures:
+
+- Provider missing or old CLI: use `../crabbox/bin/crabbox` from the sibling
+  repo, or update/install Crabbox before retrying.
+- Bad local config: inspect `.crabbox.yaml`, `crabbox config show`, and
+  `crabbox whoami`; normal OpenClaw proof should use brokered AWS without
+  asking for cloud keys.
+- Slug/claim confusion: use the raw `cbx_...` / `tbx_...` id, or run one-shot
+  without `--id`.
+- Sync/timing bug: add `--debug --timing-json`; capture the final JSON and the
+  printed Actions URL. Large sync warnings now include top source directories
+  by file count and a hint to update `.crabboxignore` / `sync.exclude`; inspect
+  those before reaching for `--force-sync-large`. Quiet rsync watchdogs and SSH
+  timeouts now print `next_action=` hints; follow them, usually `--full-resync`
+  first and a fresh lease second.
+- Cleanup uncertainty: run `crabbox list --provider aws`; for explicit
+  Blacksmith runs, use `blacksmith testbox list` and stop only boxes you
+  created.
+- Testbox queued/capacity pressure: do not retry Blacksmith repeatedly. Rerun
+  once without `--provider` so `.crabbox.yaml` routes to brokered AWS, or report
+  the Blacksmith blocker if Testbox itself is the requested proof.
+
+If brokered AWS cannot dispatch, sync, attach, or stop, retry once with
+`--debug` and `--timing-json`:
+
+```sh
+../crabbox/bin/crabbox run --debug --timing-json -- \
+  CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed
+```
+
+Full suite:
+
+```sh
+../crabbox/bin/crabbox run --debug --timing-json -- \
+  CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test
+```
+
+Auth fallback, only when `blacksmith` says auth is missing:
+
+```sh
+blacksmith auth login --non-interactive --organization openclaw
+```
+
+Raw Blacksmith footguns:
+
+- Run from repo root. The CLI syncs the current directory.
+- Save the returned `tbx_...` id in the session.
+- Reuse that id for focused reruns; stop it before handoff.
+- Raw commit SHAs are not reliable `warmup --ref` refs; use a branch or tag.
+- Treat `blacksmith testbox list` as cleanup diagnostics, not a shared reusable
+  queue.
+
+Use Blacksmith only when the task is specifically about Testbox, brokered AWS
+is unavailable, or an explicit comparison is needed. If Blacksmith is down or
+quota-limited, do not keep probing it; stay on brokered AWS and note the
+delegated-provider outage.
+
+## Blacksmith Backend Notes
+
+Crabbox Blacksmith backend delegates setup to:
+
+- org: `openclaw`
+- workflow: `.github/workflows/ci-check-testbox.yml`
+- job: `check`
+- ref: `main` unless testing a branch/tag intentionally
+
+The hydration workflow owns checkout, Node/pnpm setup, dependency install,
+secrets, ready marker, and keepalive. Crabbox owns dispatch, sync, SSH command
+execution, timing, logs/results, and cleanup.
+
+Minimal Blacksmith-backed Crabbox run, from repo root:
+
+```sh
+../crabbox/bin/crabbox run --provider blacksmith-testbox --timing-json -- \
+  CI=1 NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test:changed
+```
+
+Use direct Blacksmith only when Crabbox is the broken layer and you are
+isolating a Crabbox bug. Prefer direct `blacksmith testbox list` for cleanup
+diagnostics, not as a reusable work queue.
+
+Important Blacksmith footguns:
+
+- Always run from repo root. The CLI syncs the current directory.
+- Raw commit SHAs are not reliable `warmup --ref` refs; use a branch or tag.
+- If auth is missing and browser auth is acceptable:
+
+```sh
+blacksmith auth login --non-interactive --organization openclaw
+```
+
+## Brokered AWS
+
+Use AWS for normal OpenClaw remote proof. The repo `.crabbox.yaml` already
+selects brokered AWS, so omit `--provider` unless you are testing a different
+provider deliberately.
+
+```sh
+../crabbox/bin/crabbox warmup --class beast --market on-demand --idle-timeout 90m
+../crabbox/bin/crabbox hydrate --id <cbx_id-or-slug>
+../crabbox/bin/crabbox run --id <cbx_id-or-slug> --timing-json --shell -- "env NODE_OPTIONS=--max-old-space-size=4096 OPENCLAW_TEST_PROJECTS_PARALLEL=6 OPENCLAW_VITEST_MAX_WORKERS=1 OPENCLAW_VITEST_NO_OUTPUT_TIMEOUT_MS=900000 pnpm test:changed"
+../crabbox/bin/crabbox stop <cbx_id-or-slug>
+```
+
+Install/auth for owned Crabbox if needed:
+
+```sh
+brew install openclaw/tap/crabbox
+crabbox login --url https://crabbox.openclaw.ai --provider aws
+```
+
+New users should self-resolve broker auth before anyone asks for AWS keys:
+
+```sh
+crabbox config show
+crabbox doctor
+crabbox whoami
+```
+
+- If broker auth is missing, run `crabbox login --url https://crabbox.openclaw.ai --provider aws`.
+- If the CLI asks for `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, or AWS
+  profile setup during normal OpenClaw validation, assume the agent selected
+  the wrong path. Use brokered `crabbox login` or an existing brokered lease
+  before asking the user for cloud credentials.
+- Ask for AWS keys only for explicit direct-provider/account administration,
+  not for normal brokered OpenClaw proof.
+- Trusted automation may still use
+  `printf '%s' "$CRABBOX_COORDINATOR_TOKEN" | crabbox login --url https://crabbox.openclaw.ai --provider aws --token-stdin`.
+
+macOS config lives at:
+
+```text
+~/Library/Application Support/crabbox/config.yaml
+```
+
+It should include `broker.url`, `broker.token`, and usually `provider: aws`
+for OpenClaw lanes. Let that config drive normal validation.
+
+### Interactive Desktop / WebVNC
+
+For human desktop demos, prefer `webvnc` over native `vnc` and keep the remote
+desktop visible/windowed. Do not fullscreen the remote browser or hide the XFCE
+panel/window chrome unless the explicit goal is video/capture output. After
+launch, verify a screenshot shows the desktop panel plus browser title bar. If
+Chrome is fullscreen, toggle it back with:
+
+```sh
+crabbox run --id <lease> --shell -- 'DISPLAY=:99 xdotool search --onlyvisible --class google-chrome windowactivate key F11'
+```
+
+## Diagnostics
+
+```sh
+crabbox status --id <id-or-slug> --wait
+crabbox inspect --id <id-or-slug> --json
+crabbox sync-plan
+crabbox history --limit 20
+crabbox history --lease <id-or-slug>
+crabbox attach <run_id>
+crabbox events <run_id> --json
+crabbox logs <run_id>
+crabbox results <run_id>
+crabbox cache stats --id <id-or-slug>
+crabbox ssh --id <id-or-slug>
+blacksmith testbox list
+```
+
+Use `--debug` on `run` when measuring sync timing.
+Use `--timing-json` on warmup, hydrate, and run when comparing backends.
+Use `--market spot|on-demand` only on AWS warmup/one-shot runs.
+
+## Failure Triage
+
+- Crabbox cannot find provider: verify `../crabbox/bin/crabbox --help` lists
+  the provider selected by `.crabbox.yaml`; update Crabbox before falling back.
+- Hydration stuck or failed: open the printed GitHub Actions run URL and inspect
+  the hydration step.
+- Sync failed: rerun with `--debug`; check changed-file count and whether the
+  checkout is dirty.
+- Command failed: rerun only the failing shard/file first. Do not rerun a full
+  suite until the focused failure is understood.
+- Cleanup uncertain: `crabbox list --provider aws`; for explicit Blacksmith
+  runs, use `blacksmith testbox list` and stop owned `tbx_...` leases you
+  created.
+- Crabbox broken but Blacksmith works: use the direct Blacksmith fallback above,
+  then file/fix the Crabbox issue.
+
+## Boundary
+
+Do not add OpenClaw-specific setup to Crabbox itself. Put repo setup in the
+hydration workflow and keep Crabbox generic around lease, sync, command
+execution, logs/results, timing, and cleanup.
diff --git a/.crabbox.yaml b/.crabbox.yaml
new file mode 100644
index 0000000..b9070dc
--- /dev/null
+++ b/.crabbox.yaml
@@ -0,0 +1,49 @@
+profile: clawpatch-check
+provider: aws
+class: standard
+capacity:
+  market: spot
+  strategy: most-available
+  fallback: on-demand-after-120s
+  hints: true
+  regions:
+    - eu-west-1
+    - eu-west-2
+    - eu-central-1
+    - us-east-1
+    - us-west-2
+actions:
+  workflow: .github/workflows/crabbox-hydrate.yml
+  job: hydrate
+  ref: main
+  runnerLabels:
+    - crabbox
+    - openclaw
+    - clawpatch
+  runnerVersion: latest
+  ephemeral: true
+aws:
+  region: eu-west-1
+  rootGB: 160
+sync:
+  delete: true
+  checksum: false
+  gitSeed: true
+  fingerprint: true
+  baseRef: main
+  exclude:
+    - .artifacts
+    - .codex
+    - .DS_Store
+    - coverage
+    - dist
+    - dist-test
+    - node_modules
+env:
+  allow:
+    - CI
+    - NODE_OPTIONS
+    - PNPM_*
+ssh:
+  user: crabbox
+  port: "2222"
diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index 7dabfaa..fbc569a 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -7,6 +7,9 @@
 
 # Security automation and disclosure surfaces.
 /SECURITY.md @openclaw/openclaw-secops
+/.agents/skills/ @openclaw/openclaw-secops
+/.crabbox.yaml @openclaw/openclaw-secops
+/.github/actionlint.yaml @openclaw/openclaw-secops
 /.github/codeql/ @openclaw/openclaw-secops
 /.github/dependabot.yml @openclaw/openclaw-secops
 /.github/workflows/ @openclaw/openclaw-secops
diff --git a/.github/actionlint.yaml b/.github/actionlint.yaml
new file mode 100644
index 0000000..fa23916
--- /dev/null
+++ b/.github/actionlint.yaml
@@ -0,0 +1,5 @@
+self-hosted-runner:
+  labels:
+    - crabbox
+    - openclaw
+    - clawpatch
diff --git a/.github/workflows/crabbox-hydrate.yml b/.github/workflows/crabbox-hydrate.yml
new file mode 100644
index 0000000..2daa12d
--- /dev/null
+++ b/.github/workflows/crabbox-hydrate.yml
@@ -0,0 +1,118 @@
+name: Crabbox Hydrate
+
+on:
+  workflow_dispatch:
+    inputs:
+      crabbox_id:
+        description: "Crabbox lease ID"
+        required: true
+        type: string
+      ref:
+        description: "Git ref to hydrate"
+        required: false
+        type: string
+      crabbox_runner_label:
+        description: "Dynamic Crabbox runner label"
+        required: true
+        type: string
+      crabbox_job:
+        description: "Hydration job identifier expected by Crabbox"
+        required: false
+        default: "hydrate"
+        type: string
+      crabbox_keep_alive_minutes:
+        description: "Minutes to keep the hydrated job alive"
+        required: false
+        default: "90"
+        type: string
+
+permissions:
+  contents: read
+
+jobs:
+  hydrate:
+    name: hydrate
+    runs-on: [self-hosted, crabbox, openclaw, clawpatch, "${{ inputs.crabbox_runner_label }}"]
+    timeout-minutes: 120
+    steps:
+      - uses: actions/checkout@v6
+        with:
+          ref: ${{ inputs.ref || github.ref }}
+
+      - uses: actions/setup-node@v6
+        with:
+          node-version: 24
+
+      - name: Prepare pnpm workspace
+        shell: bash
+        run: |
+          set -euo pipefail
+          git fetch --no-tags --depth=50 origin "+refs/heads/main:refs/remotes/origin/main"
+          corepack enable
+          corepack prepare "$(node -p "require('./package.json').packageManager")" --activate
+          pnpm install --frozen-lockfile
+          node --version
+          pnpm --version
+
+      - name: Mark Crabbox ready
+        shell: bash
+        env:
+          CRABBOX_ID: ${{ inputs.crabbox_id }}
+          CRABBOX_JOB: ${{ inputs.crabbox_job }}
+        run: |
+          set -euo pipefail
+          job="${CRABBOX_JOB}"
+          if [ -z "$job" ]; then job=hydrate; fi
+          case "$CRABBOX_ID" in
+            ''|*[!A-Za-z0-9._-]*)
+              echo "Invalid crabbox_id" >&2
+              exit 2
+              ;;
+          esac
+          mkdir -p "$HOME/.crabbox/actions"
+          state="$HOME/.crabbox/actions/${CRABBOX_ID}.env"
+          env_file="$HOME/.crabbox/actions/${CRABBOX_ID}.env.sh"
+          {
+            for key in CI GITHUB_ACTIONS GITHUB_WORKSPACE GITHUB_REPOSITORY GITHUB_RUN_ID GITHUB_RUN_NUMBER GITHUB_RUN_ATTEMPT GITHUB_REF GITHUB_REF_NAME GITHUB_SHA GITHUB_EVENT_NAME GITHUB_ACTOR RUNNER_OS RUNNER_ARCH RUNNER_TEMP RUNNER_TOOL_CACHE PATH; do
+              value="${!key-}"
+              if [ -n "$value" ]; then
+                printf 'export %s=%q\n' "$key" "$value"
+              fi
+            done
+          } > "${env_file}.tmp"
+          mv "${env_file}.tmp" "$env_file"
+          tmp="${state}.tmp"
+          {
+            echo "WORKSPACE=${GITHUB_WORKSPACE}"
+            echo "RUN_ID=${GITHUB_RUN_ID}"
+            echo "JOB=${job}"
+            echo "ENV_FILE=${env_file}"
+            echo "READY_AT=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
+          } > "$tmp"
+          mv "$tmp" "$state"
+
+      - name: Keep Crabbox job alive
+        shell: bash
+        env:
+          CRABBOX_ID: ${{ inputs.crabbox_id }}
+          CRABBOX_KEEP_ALIVE_MINUTES: ${{ inputs.crabbox_keep_alive_minutes }}
+        run: |
+          set -euo pipefail
+          case "$CRABBOX_ID" in
+            ''|*[!A-Za-z0-9._-]*)
+              echo "Invalid crabbox_id" >&2
+              exit 2
+              ;;
+          esac
+          minutes="${CRABBOX_KEEP_ALIVE_MINUTES}"
+          case "$minutes" in
+            ''|*[!0-9]*) minutes=90 ;;
+          esac
+          stop="$HOME/.crabbox/actions/${CRABBOX_ID}.stop"
+          deadline=$(( $(date +%s) + minutes * 60 ))
+          while [ "$(date +%s)" -lt "$deadline" ]; do
+            if [ -f "$stop" ]; then
+              exit 0
+            fi
+            sleep 15
+          done
diff --git a/.github/workflows/stale.yml b/.github/workflows/stale.yml
new file mode 100644
index 0000000..46276bb
--- /dev/null
+++ b/.github/workflows/stale.yml
@@ -0,0 +1,91 @@
+name: Stale
+
+on:
+  schedule:
+    - cron: "17 4 * * *"
+  workflow_dispatch:
+
+permissions: {}
+
+jobs:
+  stale:
+    permissions:
+      issues: write
+      pull-requests: write
+    runs-on: ubuntu-latest
+    steps:
+      - name: Ensure stale label
+        env:
+          GH_TOKEN: ${{ github.token }}
+        run: gh label create stale --description "Inactive issue or pull request" --color ededed --force
+
+      - name: Mark stale unassigned issues and pull requests
+        uses: actions/stale@v10
+        with:
+          days-before-issue-stale: 14
+          days-before-issue-close: 7
+          days-before-pr-stale: 14
+          days-before-pr-close: 7
+          stale-issue-label: stale
+          stale-pr-label: stale
+          exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale
+          exempt-pr-labels: maintainer,no-stale
+          operations-per-run: 1000
+          ascending: true
+          exempt-all-assignees: true
+          remove-stale-when-updated: true
+          stale-issue-message: |
+            This issue has been automatically marked as stale due to inactivity.
+            Please add updated clawpatch details or it will be closed.
+          stale-pr-message: |
+            This pull request has been automatically marked as stale due to inactivity.
+            Please update it or it will be closed.
+          close-issue-message: |
+            Closing due to inactivity.
+            If this still affects clawpatch, open a new issue with current reproduction details.
+          close-issue-reason: not_planned
+          close-pr-message: |
+            Closing due to inactivity.
+            If this PR should be revived, reopen it with current context and validation.
+
+      - name: Mark stale assigned issues
+        uses: actions/stale@v10
+        with:
+          days-before-issue-stale: 30
+          days-before-issue-close: 10
+          days-before-pr-stale: -1
+          days-before-pr-close: -1
+          stale-issue-label: stale
+          exempt-issue-labels: enhancement,maintainer,pinned,security,no-stale
+          operations-per-run: 1000
+          ascending: true
+          include-only-assigned: true
+          remove-stale-when-updated: true
+          stale-issue-message: |
+            This assigned issue has been automatically marked as stale after 30 days of inactivity.
+            Please add an update or it will be closed.
+          close-issue-message: |
+            Closing due to inactivity.
+            If this still affects clawpatch, reopen or file a new issue with current evidence.
+          close-issue-reason: not_planned
+
+      - name: Mark stale assigned pull requests
+        uses: actions/stale@v10
+        with:
+          days-before-issue-stale: -1
+          days-before-issue-close: -1
+          days-before-pr-stale: 27
+          days-before-pr-close: 7
+          stale-pr-label: stale
+          exempt-pr-labels: maintainer,no-stale
+          operations-per-run: 1000
+          ascending: true
+          include-only-assigned: true
+          ignore-pr-updates: true
+          remove-stale-when-updated: true
+          stale-pr-message: |
+            This assigned pull request has been automatically marked as stale after being open for 27 days.
+            Please add an update or it will be closed.
+          close-pr-message: |
+            Closing due to inactivity.
+            If this PR should be revived, reopen it with current context and validation.