Add skill: monitor-claude-goal (read-only overseer for /goal sessions)

Copying our `monitor-claude-goal` skill here for reference / integration into the humanize skill set.

Below is the full `SKILL.md`:

````markdown
---
name: monitor-claude-goal
description: Read-only third-party overseer for a SEPARATE Claude Code session that is running a long /goal task. Audits progress from the transcript + git + artifacts, and under a strict gate injects steering into that session via tmux. Use when the user wants to watch, supervise, audit, or steer another running Claude Code /goal session without taking it over. Never edits code, builds, merges, or drives the target autonomously — it is an auditor with an injection protocol.
argument-hint: <claude-session-id> [<tmux-target>] [--discover] [--cadence 1h] [--notify-only] [--no-schedule] [--principles "<extra rules>"]
allowed-tools: Bash, Read, Grep, Glob, Agent, ToolSearch
---

# monitor-claude-goal

A dedicated Claude session acts as a **read-only third-party overseer** of a *separate* Claude Code session that is running a long-horizon `/goal` task. It audits progress and, under a strict gate, injects steering into that session via `tmux`.

**It is an auditor with an injection protocol — not an automation agent.**

## 1. Purpose & non-goals

- **Goal:** independently judge whether the monitored `/goal` session is on track, and steer it only when justified.
- **Non-goals (hard):** never edits code, never builds/merges, never drives the target session autonomously, never acts as a general automation agent. The *only* outward action it ever takes is a gated `tmux` injection.

## 2. Invocation

```
/monitor-claude-goal <claude-session-id> [<tmux-target>]
    [--discover] [--cadence 1h] [--notify-only] [--no-schedule] [--principles "<extra rules>"]
```

- `<claude-session-id>` — resolves the transcript at `~/.claude/projects/<cwd-slug>/<session-id>.jsonl`, where `<cwd-slug>` is the target's working directory with `/` and `.` replaced by `-`. Handle **not-found / multiple matches / rotated** gracefully (glob across all project dirs for `<session-id>.jsonl`).
- `<tmux-target>` — e.g. `loom:claude-dev`. **Optional**: when omitted, resolve it from the session id via §2.1. Must be **verified to exist** and to **look like a live Claude Code TUI** (via `tmux capture-pane`) before any injection. The monitor must share the same tmux server as the target.
- `--discover` — list candidate active `/goal` sessions (recently-modified transcripts whose tail shows an in-flight `/goal`) + live tmux windows; **the human confirms the exact (session, window) pair** before anything runs.
- `--cadence 1h` — cron interval (default hourly).
- `--notify-only` — **kill-switch**: forces always-approve, disables all auto-injection. Every steer goes through PushNotification only.
- `--no-schedule` — escape hatch: run a single tick now and **do not** self-schedule a cron; the human re-runs (e.g. wraps in `/loop`) instead. Mutually exclusive with cron self-management in §6.
- `--principles "<extra rules>"` — additional steering criteria merged with whatever the human has stated inside the target transcript.

### 2.1 Resolving session id → tmux pane (in priority order)

**Do not reverse-engineer pane contents.** Claude Code natively registers every live session in `~/.claude/sessions/<pid>.json` (`{pid, sessionId, cwd, ...}`), so the chain is deterministic: session id → pid → tty (`ps -o tty= -p <pid>`) → tmux pane (`tmux list-panes -a -F '#{pane_id} #{pane_tty}'` match). Resolve in this order — first hit wins, but **always verify before trusting**:

1. **One command (preferred):** `~/path/to/claude-fleet/scripts/locate-session.sh <session-id-or-≥8-char-prefix>` → JSON with `{pid, cwd, transcript_path, tty, tmux_pane, tmux_target}`. Standalone (bash+jq+tmux, no server). Equivalent if the fleet server is up: `GET http://127.0.0.1:7878/api/locate/<session-id>`, which also covers live **Codex** sessions.
2. **Manual native chain:** grep `~/.claude/sessions/*.json` for the `sessionId`, take `pid`, then pid → tty → pane as above. (This is exactly what the script does.)
3. **Hook map (hint):** `~/.claude/session-map/<session-id>.json`, maintained by the global `SessionStart`/`SessionEnd` hook `~/.claude/hooks/session-tmux-map.sh` — covers ids whose `sessions/<pid>.json` is gone, and stamps `claude:<sid:0:8>` pane titles. Hints only: confirm the pane exists before use.
4. **Content matching (last resort):** narrow candidates by `pane_current_path` == transcript cwd, then match distinctive recent transcript strings against `capture-pane` output. Require a decisive match — **never guess**.

Whatever resolved it, before any injection still confirm via `capture-pane` that the pane shows a live Claude Code TUI. If nothing resolves decisively → `--discover` + human confirms the pair.

## 3. Read-only safety contract (load-bearing)

**Allowed (read-only):** `git status/diff/log/show`, `rg`/`grep`, `jq`, `tail`/`head`/`sed`(read), `ls`/`find`/`stat`/`wc`, `python` for *parsing only*, `tmux capture-pane`/`list-windows`/`list-panes`.

**Banned:**
- any repo write or output redirect (`>`, `>>`, `tee`)
- the `Edit`/`Write`/`NotebookEdit` tools and any editor
- `git add`/`commit`/`reset`/`checkout`/`clean`/`stash`/`push`
- builds/tests/installs that mutate state
- deletes of any kind
- **modifying the target transcript**
- `tmux send-keys` **except inside the approved injection protocol** (§5)

**Subagents are `Explore`-type only** (physically cannot Edit/Write/build); their prompts explicitly forbid mutation and instruct them to return conclusions, not file dumps.

**The only writes this skill itself performs:** its own cron (`CronCreate`/`CronDelete`) and `PushNotification`. Neither touches a repo or a transcript.

## 4. One tick (the primitive)

Each invocation runs exactly one tick:

1. **Re-locate target** (transcript + tmux window, via §2.1 — re-resolve every tick; panes move and sessions get resumed under new ids). If gone/ended → notify in terminal, offer `CronDelete`, stop.
2. **Refresh criteria** — re-read the human's *latest* steering from the transcript since the last checkpoint. Newer overrides older; merge `--principles`. Each derived rule **cites its source turn**. Genuine uncertainty → ask the human, do not steer.
3. **Dispatch 3 read-only `Explore` subagents in parallel:**
   - (a) transcript: latest progress since last checkpoint
   - (b) git: new commits + worktree diff since last checkpoint `HEAD`
   - (c) artifacts: reasonableness against invariants **derived from the transcript/repo, not hardcoded**
4. **Synthesize against four judgments:**
   1. **drift** — violates the human's stated principles. (Deep-diving or optimizing the fundamental design is **NOT** drift.)
   2. **fabricated data**
   3. **fake/stub** passed off as real work
   4. **project-specific invariant breaks**
5. **Decide:** clean → one-line conclusion in this session, **no notification**. Any finding → intervention gate (§5).

Record a lightweight checkpoint (latest transcript offset + git `HEAD`) for the next tick — held in the tick's own reasoning/output, never written to the target's files.

## 5. Intervention gate (tiered)

Classify the needed steer → **auto-inject-eligible** (the ①②③ judgment whitelist: drift, fabricated data, fake/stub) vs **approval-required** (everything else, incl. ④ invariant breaks and anything ambiguous).

**Auto-inject path** (rails always on):
1. build the steer text from the *fresh* inspection
2. `tmux send-keys -l "<text>"` (literal, **no Enter**)
3. `tmux capture-pane` to **verify the text landed**
4. send `Enter`
5. capture again to **confirm submitted**
6. **post-hoc `PushNotification`** stating exactly what was injected
- Enforce a cap on `tmux send-keys` per window + a cooldown between injections.

**Approval path:**
1. `PushNotification` with the **drafted prompt + a draft-timestamp**
2. wait for the human
3. if approval arrives **≤10 min** from the draft-timestamp → inject (type → confirm landed → Enter)
4. if **>10 min** → **re-inspect, redraft, re-notify, re-time**. Never inject a stale draft.

Under `--notify-only`, *all* steers take the approval path.

## 6. Loop (self-managed cron)

Unless `--no-schedule` is set: the first invocation runs **one tick immediately**, then `CronCreate` a recurring job (default hourly, on an **off-the-hour minute** to dodge congestion). Report the cron id.

Each tick **re-enters fresh** — the Explore subagents do the heavy transcript/git reading and return only conclusions, so the overseer's own context stays lean over long runs. The cron **auto-expires**, and the skill calls `CronDelete` on termination or when `/goal` completes.

> **Caveat (document to the user):** the monitor session and its tmux window must stay alive — closing the monitor kills the cron.

With `--no-schedule`: run the single tick, report findings, and tell the user to re-run (e.g. via `/loop <cadence> /monitor-claude-goal …`).

## 7. Failure modes (explicit)

- transcript missing / ambiguous / rotated → resolve via discovery+confirm, never guess
- session ended, or `/goal` blocked/complete → notify, offer `CronDelete`
- tmux window missing / renamed / reused, or pane doesn't look like Claude Code → **never inject**, notify
- multiple `/goal` candidates → `--discover` + human confirms the pair
- criteria conflict → **human wins**; ambiguous → **ask**
- notification fatigue → notify **only** on actionable steers, terminals (target gone/done), or post-hoc auto-injects

## Design insight (why this is safe)

- The safety rests on a **capability split**: heavy *reading* is delegated to `Explore` subagents that *cannot mutate*, while the one *mutating* power (tmux injection) is funneled through a *single gated protocol*. That separation is what lets "read-only overseer" be a real guarantee rather than a hope.
- Keeping each tick **stateless/fresh** (re-derive criteria, re-locate target every tick) is what makes the skill both reusable across *any* `/goal` session and resilient to the overseer's own multi-hour context growth — the reusability choice and the longevity property are the same choice.
````


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add skill: monitor-claude-goal (read-only overseer for /goal sessions) #213

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Add skill: monitor-claude-goal (read-only overseer for /goal sessions) #213

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions