Dashboard v2 PR3: hooks, subagents, curated settings, live install, mcp aliases#59

Open
mcheemaa wants to merge 5 commits into main from feat/pr3-hooks-settings-subagents

@mcheemaa

Summary

Closes Project 3 Dashboard v2. Every meaningful Claude Agent SDK extensibility surface is now either editable from the dashboard or read-only with an honest reason.

Five commits, each independently revertable:

  1. runtime plugin init snapshot. When the SDK returns an init system message, publish a typed plugin_init_snapshot event on the dashboard SSE bus so optimistically-installing plugin cards can settle to their real state. Wrapped in try/catch inside the agent main loop so telemetry can never abort message processing.

  2. MCP tool aliases. Register phantom_list_sessions and phantom_memory_search as new tools alongside the original phantom_history and phantom_memory_query. Old names keep their existing schemas unchanged. New names accept optional channel and days_back filters. Zero breaking change for any external MCP client.

  3. Subagents tab. New dashboard tab for ~/.claude/agents/<name>.md flat files with YAML frontmatter matching AgentDefinition (tools, model, effort, color, memory). Storage layer is a deliberate duplicate of src/skills/ rather than a refactor so existing skills tests pass byte-for-byte.

  4. Hooks tab. The breakthrough surface. A visual rule builder for the 26 Claude Agent SDK hook events. Three-column Linear-Automations-style flow (trigger, matcher, action) with type-specific forms for command, prompt, agent, and http hook types. Live JSON preview pane. Trust modal on first install. Every install and uninstall writes an audit row with the full previous and new slice as JSON. All writes route through the atomic src/plugins/settings-io.ts helper and only touch the hooks slice; a test asserts enabledPlugins stays byte-for-byte identical pre- and post-install.

  5. Curated settings form + self-awareness updates. Whitelist Zod schema over Settings that rejects unknown fields at parse time, diff-based writes that only touch the top-level keys the form actually changed, per-field audit log. Deny-list includes apiKeyHelper, modelOverrides, hooks, enabledPlugins, autoMemoryDirectory, and every other risky or double-write field. Dashboard-awareness prompt block and show-my-tools skill extended additively so the agent knows about subagents, hooks, and settings alongside the PR1/PR2 surfaces.

Safety floor

  • Every settings.json write routes through src/plugins/settings-io.ts atomic tmp+rename. No direct writeFileSync.
  • Hooks editor writes only the hooks slice. Settings form writes only touched keys. Plugin editor writes only enabledPlugins. No editor touches another's data.
  • Whitelist Zod schemas with .strict() on every form entry point. Unknown fields rejected at parse time.
  • allowedHttpHookUrls allowlist enforced on http hook install.
  • Trust modal on first hook install, recorded to the audit log so it persists across sessions.
  • PR1 skills tests pass byte-for-byte (39/39) before and after commit 3.
  • PR2 plugins tests pass byte-for-byte (84/84) before and after commit 4.
  • No new TypeScript guardrail in src/agent/hooks.ts against any settings field write. Cardinal Rule.
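
The tmp+rename pattern and the "write only your own slice" rule from the list above can be sketched as follows. This is a minimal illustration, not the contents of src/plugins/settings-io.ts: the helper names `writeJsonAtomic` and `writeSlice` and the temp-file naming are assumptions made for the example.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: serialize to a temp file in the same directory, then
// rename it over the target. A same-directory rename replaces the file in one
// step on POSIX filesystems, so readers never observe a half-written file.
function writeJsonAtomic(targetPath, value) {
  const dir = path.dirname(targetPath);
  const tmpPath = path.join(dir, "." + path.basename(targetPath) + "." + process.pid + ".tmp");
  fs.writeFileSync(tmpPath, JSON.stringify(value, null, 2) + "\n", "utf8");
  try {
    fs.renameSync(tmpPath, targetPath);
  } catch (err) {
    fs.rmSync(tmpPath, { force: true }); // do not leave the temp file behind
    throw err;
  }
}

// Hypothetical helper: replace exactly one top-level key and carry every
// other key over untouched.
function writeSlice(targetPath, key, slice) {
  const current = fs.existsSync(targetPath)
    ? JSON.parse(fs.readFileSync(targetPath, "utf8"))
    : {};
  const next = { ...current, [key]: slice };
  writeJsonAtomic(targetPath, next);
  return next;
}
```

With helpers shaped like these, the hooks editor would call `writeSlice(path, "hooks", ...)` and the plugin editor `writeSlice(path, "enabledPlugins", ...)`, which is one way to keep each editor out of the other's data.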

Test plan

  • bun test green (1279 pass, 10 skip, 0 fail; baseline 1141 pass; delta +138)
  • bun run lint clean
  • bun run typecheck clean
  • PR1 bun test src/skills/__tests__/ byte-for-byte identical (39/39)
  • PR2 bun test src/plugins/__tests__/ byte-for-byte identical (84/84)
  • Open the dashboard, walk through each of the six live tabs as a user would. Skills, memory files, plugins (with trust modal), subagents (new), hooks (the visual builder), settings (the curated form).
  • Install a hook via the dashboard, verify enabledPlugins in settings.json is byte-for-byte identical before and after.
  • Create a subagent, trigger the Task tool from the agent, verify the subagent is invoked.
  • Change a field in the settings form, save, verify the field is written and every other key is byte-for-byte identical.

When the Claude Agent SDK returns an init system message we now publish a
typed plugin_init_snapshot event on the existing /ui/api/events bus so the
dashboard plugins tab can flip optimistically-installing cards to their
real settled state. The helper is wrapped in try/catch inside the agent
main loop so a telemetry bug can never abort message processing.

Dashboard-side, dashboard.js opens an EventSource on mount, registers a
plugin_init_snapshot listener, and dispatches to a new
PhantomPluginsModule.onInitSnapshot hook on plugins.js that walks the
rendered card grid and flips matching keys.
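
The "telemetry can never abort message processing" guarantee can be illustrated with a small sketch. The bus object, the `publishInitSnapshot` name, and the `plugins` field on the init message are assumptions for the example; only the `plugin_init_snapshot` event name comes from this commit.

```javascript
// Hypothetical in-memory stand-in for the dashboard SSE bus.
function createBus() {
  const listeners = [];
  return {
    subscribe(fn) { listeners.push(fn); },
    publish(event) { listeners.forEach((fn) => fn(event)); },
  };
}

// Publishes a snapshot for init system messages. Any failure, including a
// throwing subscriber, is swallowed so the caller's message loop keeps going.
function publishInitSnapshot(bus, message) {
  try {
    if (!message || message.type !== "system" || message.subtype !== "init") {
      return false;
    }
    // Assumed message shape: an optional array of { name } plugin entries.
    const plugins = Array.isArray(message.plugins) ? message.plugins : [];
    bus.publish({
      type: "plugin_init_snapshot",
      plugins: plugins.map((p) => ({ name: p.name })),
    });
    return true;
  } catch (err) {
    return false; // telemetry is best-effort by design
  }
}
```

The return value is only there to make the behavior observable in tests; the agent loop can ignore it.
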

PR3 follow-up on the PR2 rename question. Both old names and new names
register on the external MCP server. The originals phantom_history and
phantom_memory_query keep their current schemas unchanged so any external
client that already adopted them continues to work.

The new aliases accept a richer parameter set:
- phantom_list_sessions adds optional channel and days_back filters
- phantom_memory_search adds optional days_back recency filter applied
  client-side via started_at timestamps so the underlying memory recall
  APIs stay untouched

Both old and new names route to shared core helpers so future fixes land
in one place. Tool count on the SWE server moves from 17 to 19.
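
A rough sketch of how an old name and its alias might share one core helper, with the recency filter applied client-side on `started_at`. The session record shape and the `listSessionsCore` name are assumptions; the tool names and the `channel` / `days_back` parameters come from this commit.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical shared core: both tool names call this, so a fix lands once.
// `now` is injectable to keep the recency filter testable.
function listSessionsCore(sessions, opts = {}, now = Date.now()) {
  const { channel, days_back } = opts;
  return sessions.filter((s) => {
    if (channel !== undefined && s.channel !== channel) return false;
    if (days_back !== undefined) {
      const cutoff = now - days_back * DAY_MS;
      if (Date.parse(s.started_at) < cutoff) return false;
    }
    return true;
  });
}

const tools = {
  // Original name: no new parameters, so its schema stays unchanged.
  phantom_history: (sessions) => listSessionsCore(sessions),
  // Alias: same core, plus the optional filters.
  phantom_list_sessions: (sessions, args, now) => listSessionsCore(sessions, args, now),
};
```

Because the filters are optional, calling the alias with no arguments returns the same rows as the original name.
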

Adds a new dashboard tab for managing Claude Agent SDK subagents: flat
markdown files at ~/.claude/agents/<name>.md with YAML frontmatter
matching AgentDefinition (tools, model, effort, color, memory, etc.).

The storage layer is a deliberate duplicate of src/skills/ rather than a
refactor into a shared generic. Q4 research decided this: skills tests
must pass byte-for-byte, and the directory layouts are different enough
(directory-per vs flat file) that a premature abstraction would need
4-5 parameters and risk regressing PR1. Both src/skills/__tests__/ and
src/plugins/__tests__/ pass unchanged after this commit.

Shipping:
- src/subagents/{paths,frontmatter,linter,storage,audit}.ts with Zod
  .strict() frontmatter, atomic tmp+rename writes, reserved-stem
  rejection, 50 KB body cap, shell red-list lint
- src/ui/api/subagents.ts with six cookie-auth-gated routes
- public/dashboard/subagents.js inherits the skills.js list+editor
  pattern with subagent-specific fields (model, effort, color dropdowns)
- subagent_audit_log migration
- index.html sidebar entry, route div, script tag

45 new subagents tests. PR1 skills tests still pass at 39.
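
The reserved-stem rejection and the 50 KB body cap could look roughly like the sketch below. The reserved list, the name pattern, and the reading of 50 KB as 50 * 1024 bytes are all assumptions for illustration; the real rules live in src/subagents/.

```javascript
const MAX_BODY_BYTES = 50 * 1024; // assumption: "50 KB" means 51200 bytes
const RESERVED_STEMS = new Set(["index", "default"]); // hypothetical list
const NAME_PATTERN = /^[a-z0-9][a-z0-9-]*$/; // hypothetical: blocks path separators and dots

// Returns a list of human-readable problems; an empty list means the
// name/body pair is acceptable for ~/.claude/agents/<name>.md.
function validateSubagentFile(name, body) {
  const errors = [];
  if (!NAME_PATTERN.test(name)) errors.push("invalid name");
  if (RESERVED_STEMS.has(name)) errors.push("reserved name");
  // Measure bytes, not characters, so multi-byte text cannot slip past the cap.
  if (Buffer.byteLength(body, "utf8") > MAX_BODY_BYTES) errors.push("body too large");
  return errors;
}
```

Validating the stem before building the path is what keeps a name like `../x` from ever reaching the filesystem layer.
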

The breakthrough surface of PR3. The first visual hooks editor in the
Claude Code ecosystem. Three-column Linear-Automations-style builder:
pick a trigger (any of 26 hook events), pick a matcher (tool name,
subagent name, MCP server name, filename glob - depending on the
event), pick an action (command, prompt, agent, or http with
type-specific forms). Live preview pane renders the JSON slice.

Safety floor is strict:
- Zod .strict() discriminated union per hook type, timeouts bounded
  1-3600 seconds, http URLs URL-validated, env var names matching
  [A-Z_][A-Z0-9_]*
- All writes route through src/plugins/settings-io.ts for atomic
  tmp+rename. The hooks editor writes ONLY the Settings.hooks slice;
  enabledPlugins and every other field stay byte-for-byte identical.
  A regression test locks this in.
- allowedHttpHookUrls allowlist is enforced on http hook install; bad
  URLs return 403 with a clear error.
- Trust modal on first install before any rule lands. Acceptance is
  recorded to hook_audit_log so it persists across sessions.
- Every install, update, uninstall, and trust acceptance writes an
  audit row with the full previous and new slice as JSON so a human
  can diff and recover.
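
The validation rules in this list can be approximated without Zod as a plain function. This is a sketch under assumptions: the `env` field name, exact-match allowlist semantics, and the use of 400 for malformed input are illustrative choices; only the 1-3600 second bound, the env-name pattern, and the 403 for a disallowed URL come from the commit message.

```javascript
const ENV_NAME = /^[A-Z_][A-Z0-9_]*$/;

// Returns { ok: true } or { ok: false, status, error }.
function validateHook(def, allowedHttpHookUrls = []) {
  if (def.timeout !== undefined) {
    if (!Number.isInteger(def.timeout) || def.timeout < 1 || def.timeout > 3600) {
      return { ok: false, status: 400, error: "timeout must be 1-3600 seconds" };
    }
  }
  // Hypothetical `env` map: every key must look like a shell variable name.
  for (const name of Object.keys(def.env || {})) {
    if (!ENV_NAME.test(name)) {
      return { ok: false, status: 400, error: "bad env var name: " + name };
    }
  }
  if (def.type === "http") {
    try {
      new URL(def.url); // throws on anything that is not a parseable URL
    } catch {
      return { ok: false, status: 400, error: "url is not a valid URL" };
    }
    // Assumption: allowlist entries are compared by exact string match.
    if (!allowedHttpHookUrls.includes(def.url)) {
      return { ok: false, status: 403, error: "url is not in allowedHttpHookUrls" };
    }
  }
  return { ok: true };
}
```

The real schema is a discriminated union per hook type, so each type would also reject fields that belong to the other types; this sketch skips that part.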

Shipping:
- src/hooks/{paths,schema,storage,audit}.ts
- src/ui/api/hooks.ts with six cookie-auth-gated routes
- public/dashboard/hooks.js with the builder, list view, trust modal,
  and audit timeline
- dashboard.css tokens for the three-column builder
- hook_audit_log migration
- Sidebar entry and route div in index.html

42 new hooks tests. PR1 skills (39) and PR2 plugins (84) tests pass
byte-for-byte.
… updates

Closes the dashboard v2 loop. Every meaningful Claude Agent SDK
extensibility surface is now either editable from Phantom's dashboard
or read-only with an honest reason.

The settings form is the last surface. It exposes ~50 whitelisted
fields from Settings (sdk.d.ts:2576-3792) grouped into eight sections:
permissions, model, MCP, hooks security, memory, session, UI, and
updates. Every field has a tooltip; requires-review fields carry a
visible badge.

Safety floor:
- Zod .strict() whitelist. Unknown fields in a request body are
  REJECTED at parse time. That is how the deny-list is enforced.
  apiKeyHelper, modelOverrides, hooks, enabledPlugins,
  autoMemoryDirectory, extraKnownMarketplaces, and every other
  risky field are deliberately NOT in the schema.
- Diff-based writes: compute which top-level keys actually changed,
  write only those keys through src/plugins/settings-io.ts atomic
  tmp+rename. Every untouched field, including enabledPlugins and
  hooks owned by their dedicated editors, stays byte-for-byte
  identical. A test locks this in.
- Per-field audit log rows capture previous and new JSON values.
- The form NEVER touches hooks or enabledPlugins even if the operator
  somehow submits them; those are double-write surfaces owned by the
  Hooks and Plugins tabs.
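
The whitelist-then-diff flow described above can be sketched as follows. The three allowed keys are a hypothetical subset of the real whitelist, and `applySettingsPatch` is an illustrative name; the actual implementation uses a Zod `.strict()` schema rather than a hand-rolled key check.

```javascript
// Hypothetical subset of the whitelist. hooks and enabledPlugins are
// deliberately absent, which is what makes them unwritable from this form.
const ALLOWED_KEYS = new Set(["permissions", "model", "cleanupPeriodDays"]);

// Mirrors strict parsing: any key outside the whitelist fails the request.
function parseStrict(body) {
  const unknown = Object.keys(body).filter((k) => !ALLOWED_KEYS.has(k));
  if (unknown.length > 0) {
    throw new Error("Unknown or denied fields: " + unknown.join(", "));
  }
  return body;
}

// Top-level keys whose serialized value differs. Note that JSON.stringify is
// sensitive to key order, which is acceptable for a sketch.
function changedKeys(current, submitted) {
  return Object.keys(submitted).filter(
    (k) => JSON.stringify(current[k]) !== JSON.stringify(submitted[k])
  );
}

// Returns the next settings object plus one audit row per changed field.
function applySettingsPatch(current, body) {
  const submitted = parseStrict(body);
  const next = { ...current };
  const audit = [];
  for (const k of changedKeys(current, submitted)) {
    audit.push({ field: k, previous: current[k], next: submitted[k] });
    next[k] = submitted[k];
  }
  return { next, audit };
}
```

Because untouched keys are copied over rather than rebuilt, a request that changes only `model` leaves `hooks` and `enabledPlugins` exactly as they were.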

Self-awareness:
- dashboard-awareness prompt block extended additively to cover
  subagents, hooks, and settings alongside the PR1/PR2 sections.
  Every previous line preserved.
- show-my-tools built-in skill extended with steps 2.6/2.7/2.8 that
  read hook counts, subagent list, and a settings summary directly
  via Read and Glob. No new MCP tool.

28 new settings-editor tests. PR1 skills (39) and PR2 plugins (84)
tests still byte-for-byte. Baseline 1141 -> 1279, delta +138.
@mcheemaa mcheemaa marked this pull request as ready for review April 14, 2026 16:00

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8e8e4fa671


var dirty = dirtyKeys();
if (dirty.length === 0) return;
var payload = {};
dirty.forEach(function (k) { payload[k] = state.draft[k]; });

P1: Preserve untouched nested settings slices when saving

The save path sends each dirty top-level key from state.draft as-is, but state.draft only contains the nested fields the user edited (for example just permissions.allow). Because the backend write path does shallow top-level replacement, this overwrites the entire object and drops untouched siblings like permissions.deny, permissions.ask, or permissions.defaultMode. A single nested edit can therefore silently remove existing safety settings from settings.json.
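
One possible client-side fix is to merge each dirty slice onto the last saved value before sending it, so untouched siblings ride along. This is a sketch, not the change the author made: `mergeSlice` and `buildPayload` are hypothetical names, and the saved settings are assumed to be available in client state.

```javascript
function isPlainObject(v) {
  return v !== null && typeof v === "object" && !Array.isArray(v);
}

// Recursively overlays draft onto saved. Objects merge key by key; arrays
// and scalars from the draft replace the saved value wholesale.
function mergeSlice(saved, draft) {
  if (!isPlainObject(saved) || !isPlainObject(draft)) return draft;
  const out = { ...saved };
  for (const key of Object.keys(draft)) {
    out[key] = mergeSlice(saved[key], draft[key]);
  }
  return out;
}

// Builds the request body: one fully merged slice per dirty top-level key.
function buildPayload(savedSettings, draft, dirtyKeys) {
  const payload = {};
  dirtyKeys.forEach((k) => {
    payload[k] = mergeSlice(savedSettings[k], draft[k]);
  });
  return payload;
}
```

A merge like this cannot express "delete this nested key", so a form that needs removals would have to send an explicit marker or perform the merge server-side instead.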


}
var flipped = [];
state.catalog.plugins.forEach(function (p) {
if (seen[p.key] && !p.enabled) {

P2: Match plugin init snapshots against actual plugin identifiers

The snapshot handler checks seen[p.key], but plugin cards are built from marketplace entries that carry name/marketplace fields, not a key property. In practice p.key is undefined, so plugin_init_snapshot events never flip cards from installing to installed, and the new live-settle behavior does not work unless the tab fully reloads.
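
A fix along these lines would derive the lookup key from the fields the cards actually carry. The `name@marketplace` key format and the shape of the snapshot (a list of such keys) are assumptions here; the handler would need to use whatever identifier the snapshot event really publishes.

```javascript
// Hypothetical key derivation from the marketplace entry fields.
function pluginKey(p) {
  return p.name + "@" + p.marketplace;
}

// Flips catalog entries that appear in the snapshot and returns the keys
// that changed, so the caller can re-render only those cards.
function applyInitSnapshot(catalogPlugins, snapshotKeys) {
  const seen = new Set(snapshotKeys);
  const flipped = [];
  catalogPlugins.forEach((p) => {
    const key = pluginKey(p);
    if (seen.has(key) && !p.enabled) {
      p.enabled = true;
      flipped.push(key);
    }
  });
  return flipped;
}
```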


Comment on lines +514 to +516
promise = ctx.api("PUT", "/ui/api/hooks/" + encodeURIComponent(state.editing.event) + "/" + state.editing.groupIndex + "/" + state.editing.hookIndex, {
definition: cleaned,
});

P2: Persist edited hook trigger fields on update

In edit mode, the UI lets users change trigger/matcher, but the update request is always sent to the original event/groupIndex/hookIndex and only includes definition. Any changed event or matcher in the form is silently discarded, so users can believe they moved a rule while it still fires on the old trigger.
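
One way to address this on the client is to detect a changed trigger and turn the save into a create followed by a delete. The sketch below only plans the requests. The PUT path matches the snippet under review; the POST and DELETE routes, their bodies, and the `matcher` field on `state.editing` are assumptions.

```javascript
// Returns the ordered list of requests needed to save an edited hook.
function planHookSave(editing, form) {
  const base = "/ui/api/hooks";
  const original =
    base + "/" + encodeURIComponent(editing.event) +
    "/" + editing.groupIndex + "/" + editing.hookIndex;
  const moved =
    form.event !== editing.event ||
    (form.matcher || "") !== (editing.matcher || "");
  if (!moved) {
    // Same trigger: an in-place update is enough.
    return [{ method: "PUT", path: original, body: { definition: form.definition } }];
  }
  // Trigger changed: create under the new trigger first, then remove the old
  // rule, so a failed create never loses the existing hook.
  return [
    {
      method: "POST",
      path: base,
      body: { event: form.event, matcher: form.matcher, definition: form.definition },
    },
    { method: "DELETE", path: original },
  ];
}
```

Two requests are not atomic, and deleting by index assumes the create did not shift the original rule's position, so a dedicated server-side move that writes the hooks slice once would be the more robust option.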

