Add the on-disk storage layers for the PR1 dashboard's skills and memory files tabs. Both subsystems validate paths, parse and serialize content atomically via tmp-then-rename, cap content size, and record every edit in a new SQLite audit log. Skills at /home/phantom/.claude/skills/<name>/SKILL.md are parsed with a Zod-validated strict YAML frontmatter schema (name, description, when_to_use required; allowed-tools, argument-hint, arguments, context, disable-model-invocation optional) and linted for missing fields, body size, and shell red-list patterns. Memory files are any markdown under /home/phantom/.claude/ excluding skills/, plugins/, agents/, and the settings.json pair. Adds skill_audit_log and memory_file_audit_log tables with indices. Updates the migration test for the four new migrations.
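The tmp-then-rename write with a size cap described above can be sketched as follows. This is a minimal illustration under assumptions, not the code in this PR: `writeAtomic`, `MAX_BYTES`, and the temp-file naming are hypothetical, and the real cap is not stated in this message.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical cap; the real limit is not stated in the source.
const MAX_BYTES = 64 * 1024;

// Write `content` to `target` by writing a sibling temp file and renaming it
// over the target. A rename within one filesystem is atomic, so readers see
// either the old file or the new one, never a partial write.
function writeAtomic(target: string, content: string, maxBytes: number = MAX_BYTES): void {
  const size = Buffer.byteLength(content, "utf8");
  if (size > maxBytes) {
    throw new Error(`content is ${size} bytes, over the ${maxBytes} byte cap`);
  }
  const dir = path.dirname(target);
  fs.mkdirSync(dir, { recursive: true });
  // The temp file lives in the same directory so the rename never crosses filesystems.
  const tmp = path.join(dir, `.${path.basename(target)}.${process.pid}.${Date.now()}.tmp`);
  try {
    fs.writeFileSync(tmp, content, "utf8");
    fs.renameSync(tmp, target);
  } catch (err) {
    fs.rmSync(tmp, { force: true }); // best-effort cleanup of the temp file
    throw err;
  }
}

// Usage: write twice to a scratch directory; the second write replaces the first.
const scratch = fs.mkdtempSync(path.join(os.tmpdir(), "atomic-demo-"));
const demoFile = path.join(scratch, "demo", "SKILL.md");
writeAtomic(demoFile, "first draft");
writeAtomic(demoFile, "second draft");
```

Rejecting oversized content before any file is touched means a failed write leaves the existing file unchanged.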
Wire /ui/api/skills and /ui/api/memory-files into the existing serve.ts request pipeline. Both route sets live behind the phantom_session cookie check and return JSON bodies. Every mutating call records a row in the appropriate audit log table.

Routes:
- GET /ui/api/skills list
- POST /ui/api/skills create
- GET /ui/api/skills/:name read
- PUT /ui/api/skills/:name update
- DELETE /ui/api/skills/:name delete
- GET /ui/api/memory-files list
- POST /ui/api/memory-files create (body includes path)
- GET /ui/api/memory-files/<encoded-path> read
- PUT /ui/api/memory-files/<encoded-path> update
- DELETE /ui/api/memory-files/<encoded-path> delete

Adds setDashboardDb() to hand the database into the api dispatch so the audit log writes go through the already-open connection used elsewhere.
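As a rough illustration of how the ten routes above map onto actions, here is a dependency-free matcher. `matchRoute` and `RouteMatch` are hypothetical names for this sketch; the actual dispatch in serve.ts may be structured differently, and cookie auth and audit logging are left out.

```typescript
type Resource = "skills" | "memory-files";
type Action = "list" | "create" | "read" | "update" | "delete";

interface RouteMatch {
  resource: Resource;
  action: Action;
  // Skill name or decoded memory-file path; absent for collection routes.
  id?: string;
}

const PREFIX = "/ui/api/";

// Map (method, pathname) onto one of the ten routes, or null when nothing matches.
function matchRoute(method: string, pathname: string): RouteMatch | null {
  if (!pathname.startsWith(PREFIX)) return null;
  const rest = pathname.slice(PREFIX.length);
  const slash = rest.indexOf("/");
  const head = slash === -1 ? rest : rest.slice(0, slash);
  const tail = slash === -1 ? "" : rest.slice(slash + 1);

  const resource: Resource | null =
    head === "skills" ? "skills" : head === "memory-files" ? "memory-files" : null;
  if (resource === null) return null;

  if (tail === "") {
    if (method === "GET") return { resource, action: "list" };
    if (method === "POST") return { resource, action: "create" };
    return null;
  }

  let id: string;
  try {
    id = decodeURIComponent(tail); // memory-file paths arrive percent-encoded
  } catch {
    return null; // malformed percent-encoding
  }

  switch (method) {
    case "GET":
      return { resource, action: "read", id };
    case "PUT":
      return { resource, action: "update", id };
    case "DELETE":
      return { resource, action: "delete", id };
    default:
      return null;
  }
}
```

A caller would treat a null result as "not one of ours" and let the rest of the pipeline answer the request.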
Expose phantom_memory_search and phantom_list_sessions to the agent itself as a new in-process MCP server (phantom-reflective). This is what makes the five reflective built-in skills actually fireable: they can query memory with temporal filters and enumerate recent sessions without having to round-trip through the external MCP endpoint at /mcp. phantom_memory_search wraps MemorySystem.recallEpisodes and recallFacts with an optional days_back filter that maps to RecallOptions.timeRange and the 'temporal' strategy. phantom_list_sessions reads the sessions SQLite table directly with channel and days_back filters. Also adds a small 'dashboard awareness' prompt block wired into the environment section of the system prompt. The agent now knows the dashboard exists at /ui/dashboard/, that its skills and memory files are editable there, and that it should point the operator at those URLs when asked 'what can I edit' or similar.
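The days_back mapping can be pictured with a small helper. The `RecallOptions` shape below is an assumption: the message names `timeRange` and the 'temporal' strategy but does not show the type, and `buildRecallOptions` is a hypothetical name.

```typescript
// Assumed shape, for illustration only.
interface RecallOptions {
  limit: number;
  strategy?: "temporal";
  timeRange?: { from: Date; to: Date };
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Translate an optional days_back input into recall options.
// `now` is injectable so the window is deterministic in tests.
function buildRecallOptions(limit: number, daysBack?: number, now: Date = new Date()): RecallOptions {
  if (daysBack === undefined) return { limit };
  if (!Number.isFinite(daysBack) || daysBack <= 0) {
    throw new RangeError("days_back must be a positive number");
  }
  return {
    limit,
    strategy: "temporal",
    timeRange: { from: new Date(now.getTime() - daysBack * MS_PER_DAY), to: now },
  };
}
```

Without days_back the options carry only the limit, so an unfiltered recall is unaffected.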
The hand-crafted operator dashboard at /ui/dashboard/. A single static HTML shell with a sticky nav, a sidebar of eight tabs, and a main content area that hash-routes between live tabs and Coming Soon placeholders.

Two tabs are live in PR1:
- Skills: left column of skill cards grouped by source (built-in vs yours) with substring search. Right column is a full-fidelity editor with a structured YAML frontmatter form (name, description, when_to_use, allowed-tools chip input with autocomplete, argument hint, context select, user-invoke-only toggle) and a large JetBrains Mono body textarea. Tab inserts two spaces, Cmd/Ctrl+S saves, dirty-state dot pulses next to the title, lint hints render under the body, delete has a confirm modal. New skill modal offers blank or duplicate-from-mirror templates.
- Memory files: same split layout over any .md file under /home/phantom/.claude/. New file modal accepts any path ending in .md, creates nested directories automatically, and opens the editor on the new file. CLAUDE.md gets a small info banner noting it is the top-level memory loaded every session.

Six Coming Soon placeholders (sessions, cost, scheduler, evolution, memory explorer, settings) render a serif headline, the expected PR, and a link back to skills.

No React, no build step. Vanilla JS with one namespaced helper per tab. Tailwind and DaisyUI tokens are not loaded on this shell; the dashboard inherits the phantom- vocabulary spiritually by declaring its own phantom-nav, phantom-chip, phantom-mono, phantom-meta, and phantom-muted classes from the same token values as the base template. Light and dark themes share the same primary indigo with cream or warm-deep-dark surfaces.

Also adds a Dashboard quick-link to the existing /ui/ landing page. beforeunload guards unsaved edits and the router respects the dirty state on navigation.
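The "Tab inserts two spaces" behavior in the body textarea comes down to a small string edit. The sketch below keeps it as a pure function so it can be checked without a DOM; `insertSoftTab` and `EditResult` are hypothetical names, and the shipped dashboard JS may do this differently.

```typescript
interface EditResult {
  value: string; // new textarea contents
  caret: number; // where the caret should sit afterwards
}

// Replace the current selection with two spaces and place the caret after them.
function insertSoftTab(value: string, selStart: number, selEnd: number): EditResult {
  const indent = "  ";
  return {
    value: value.slice(0, selStart) + indent + value.slice(selEnd),
    caret: selStart + indent.length,
  };
}

// In a browser this would be wired to the textarea's keydown event, roughly:
//   if (event.key === "Tab") {
//     event.preventDefault();
//     const r = insertSoftTab(el.value, el.selectionStart, el.selectionEnd);
//     el.value = r.value;
//     el.setSelectionRange(r.caret, r.caret);
//   }
```

Calling preventDefault in the handler is what stops the browser from moving focus out of the textarea.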
Ship a small catalog of genuinely novel reflective skills that make a fresh phantom feel alive from message one. Each is a full SKILL.md with a strict YAML frontmatter, a Goal, numbered Steps with per-step success criteria, and Rules.

- mirror: weekly self-audit playback. Pulls the last 7 days of memory, anchors observations to specific episodes, renders three sections (what I noticed, what I am unsure about, one question for you).
- thread: evolution of thinking on a topic. Takes a topic, clusters mentions chronologically, identifies turning points, renders a short narrative with callouts.
- echo: prior-answer surfacer. Before deriving a new answer to a substantive question, checks memory for semantically similar prior questions and surfaces the conclusion if the match is strong.
- overheard: promises audit. Scans the last 14 days for commitment phrases, checks for follow-through, surfaces the top 3-5 open promises with draft followup offers.
- ritual: latent patterns to scheduled jobs. Finds recurring behaviors in 60 days of sessions, verifies them against memory, proposes formalization as phantom_schedule jobs.
- show-my-tools: utility skill that lists current skills, memory files, and the dashboard URLs. The discovery path for everything the operator can edit.

All five reflective skills list the new in-process MCP tools (mcp__phantom-reflective__phantom_memory_search, mcp__phantom-reflective__phantom_list_sessions, mcp__phantom-scheduler__phantom_schedule) in their allowed-tools field so they can actually fire.

The skills ship in /app/skills-builtin/ inside the image. The docker entrypoint copies each directory to /home/phantom/.claude/skills/ on first boot only. Existing directories are preserved, so operator edits survive container rebuilds. Dockerfile copies the skills-builtin tree in both the builder and runtime stages.
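The seeding rule (copy a built-in skill directory only when the target does not already exist) can be modeled as a pure function. This is a TypeScript model of the decision, not the entrypoint script itself; `planSeed` is a hypothetical name.

```typescript
// Decide which built-in skill directories still need to be copied.
// Anything already present in the skills volume is skipped, so operator
// edits to a seeded skill survive later boots and image rebuilds.
function planSeed(builtin: string[], existing: Iterable<string>): string[] {
  const present = new Set(existing);
  return builtin.filter((name) => !present.has(name));
}

const BUILTIN_SKILLS = ["mirror", "thread", "echo", "overheard", "ritual", "show-my-tools"];

// First boot: the volume is empty, so every built-in is copied.
const firstBoot = planSeed(BUILTIN_SKILLS, []);

// Later boot: existing directories (edited or not) are left alone.
const laterBoot = planSeed(BUILTIN_SKILLS, ["mirror", "echo", "my-custom-skill"]);
```

Under this rule, a built-in that the operator deleted would be copied again on the next boot; whether the real entrypoint behaves that way is not stated in this message.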
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 96a018f9f7
        results.episodes = await memory.recallEpisodes(input.query, opts).catch(() => []);
    }
    if (input.memory_type === "semantic" || input.memory_type === "all") {
        results.facts = await memory.recallFacts(input.query, { limit }).catch(() => []);
Pass temporal filters to semantic fact recall
When days_back is provided, phantom_memory_search builds a temporal RecallOptions for episodes, but semantic facts are still fetched with { limit } only, so facts are not time-bounded. In reflective skills that request recent windows (for example weekly reviews), this leaks older facts into the result set and contradicts the tool contract that days_back limits returned memory items.
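A self-contained sketch of the suggested change: build the options once and hand the same object to both recall calls. The `Memory` interface, `makeMemory`, and `inWindow` are hypothetical stand-ins written for this example; only the method names `recallEpisodes` and `recallFacts` and the `.catch(() => [])` pattern come from the diff above.

```typescript
interface TimeRange { from: Date; to: Date }
interface RecallOptions { limit: number; timeRange?: TimeRange }
interface Item { text: string; at: Date }

interface Memory {
  recallEpisodes(query: string, opts: RecallOptions): Promise<Item[]>;
  recallFacts(query: string, opts: RecallOptions): Promise<Item[]>;
}

// Stand-in filter: keep items inside the window (if any), then apply the limit.
function inWindow(items: Item[], opts: RecallOptions): Item[] {
  const range = opts.timeRange;
  const kept = range
    ? items.filter((i) => i.at.getTime() >= range.from.getTime() && i.at.getTime() <= range.to.getTime())
    : items;
  return kept.slice(0, opts.limit);
}

// In-memory fake whose recall methods both honor timeRange.
function makeMemory(episodes: Item[], facts: Item[]): Memory {
  return {
    recallEpisodes: async (_query, opts) => inWindow(episodes, opts),
    recallFacts: async (_query, opts) => inWindow(facts, opts),
  };
}

async function search(memory: Memory, query: string, opts: RecallOptions) {
  // The same opts object goes to both calls, so the window bounds facts
  // as well as episodes. Passing { limit } alone to recallFacts drops it.
  const episodes = await memory.recallEpisodes(query, opts).catch(() => []);
  const facts = await memory.recallFacts(query, opts).catch(() => []);
  return { episodes, facts };
}
```

This only helps if the fact store applies the window; the fake here does so by construction.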
        context: SkillContextSchema.optional(),
        "disable-model-invocation": z.boolean().optional(),
    })
    .strict();
Permit built-in source marker in frontmatter schema
The schema is strict, so any extra key is rejected at parse time, but source detection later relies on an optional x-phantom-source marker (detectSource in src/skills/storage.ts) to classify built-in/agent skills. As written, that marker can never be present without causing a parse failure, so skills are always treated as user-sourced and built-in-specific UI behavior (grouping/guardrails) cannot work.
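To make the interaction concrete, here is a dependency-free stand-in for a strict schema that declares the marker. It is not the Zod schema from the PR: `validateFrontmatter` is a hypothetical hand-rolled check and only tests which keys are present, and `detectSource` mirrors the name mentioned above, but its body here is an assumption.

```typescript
type SkillSource = "built-in" | "agent" | "user";

const REQUIRED_KEYS = ["name", "description", "when_to_use"];
const OPTIONAL_KEYS = [
  "allowed-tools",
  "argument-hint",
  "arguments",
  "context",
  "disable-model-invocation",
  "x-phantom-source", // declared, so strict validation no longer rejects it
];
const SOURCES: readonly string[] = ["built-in", "agent", "user"];

// Strict check: required keys must exist and unknown keys are errors.
function validateFrontmatter(fm: Record<string, unknown>): string[] {
  const errors: string[] = [];
  const present = new Set(Object.keys(fm));
  const allowed = new Set([...REQUIRED_KEYS, ...OPTIONAL_KEYS]);
  for (const key of REQUIRED_KEYS) {
    if (!present.has(key)) errors.push(`missing required key: ${key}`);
  }
  for (const key of present) {
    if (!allowed.has(key)) errors.push(`unrecognized key: ${key}`);
  }
  const marker = fm["x-phantom-source"];
  if (marker !== undefined && !(typeof marker === "string" && SOURCES.includes(marker))) {
    errors.push("x-phantom-source must be one of: built-in, agent, user");
  }
  return errors;
}

// Classify a skill by its marker; anything unmarked counts as user-sourced.
function detectSource(fm: Record<string, unknown>): SkillSource {
  const marker = fm["x-phantom-source"];
  return marker === "built-in" || marker === "agent" ? marker : "user";
}
```

With the marker removed from `OPTIONAL_KEYS`, any frontmatter carrying it fails validation, which is the failure mode the comment describes.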
1. Pass temporal filters to semantic fact recall in the in-process
reflective MCP server. phantom_memory_search was building a temporal
RecallOptions for episodes when days_back was set, but recallFacts
was being called with only { limit }. Reflective skills like mirror
that ask for a 7-day window were leaking older facts into the result
set. Fix is one word: pass the same opts object to recallFacts. The
downstream semantic.recall already honors timeRange via a Qdrant
range filter on valid_from.
2. Permit the x-phantom-source provenance marker in the SKILL.md
frontmatter schema. The Zod schema was strict and detectSource() in
src/skills/storage.ts read frontmatter['x-phantom-source'] without
the field being declared, so any built-in skill that set the marker
would have been rejected at parse time. Added the field to the
schema as an optional enum of "built-in" | "agent" | "user", added
the marker to all six built-in SKILL.md files (echo, mirror,
overheard, ritual, show-my-tools, thread), and added tests for both
the schema acceptance and the source classification.
Quality gates: bun test 1044 pass / 0 fail (+4 new tests), bun run
lint clean, bun run typecheck clean.
Summary
Ships PR1 of Project 3: the operator dashboard for Phantom.
Two tabs are live and production-grade in this PR:
- Skills: the .claude/skills/ tree. Structured YAML frontmatter form plus a Monaco-quality body textarea with keyboard save, dirty-state tracking, atomic writes, and every edit audited in SQLite.
- Memory files: .md files under the user-scope .claude/ tree (excluding skills, plugins, agents, and settings JSON). CLAUDE.md, rules, and free-form memory all live here.

Six additional tabs (sessions, cost, scheduler, evolution, memory explorer, settings) ship as Coming Soon placeholders in the same dashboard shell.
Architecture
- src/skills/, src/memory-files/: path validation, Zod-validated YAML frontmatter, linter, atomic tmp-then-rename writes, audit log tables.
- src/ui/api/skills.ts, src/ui/api/memory-files.ts: JSON CRUD routes wired behind the existing cookie auth check in src/ui/serve.ts. Every mutating call records a row in skill_audit_log or memory_file_audit_log.
- src/agent/in-process-reflective-tools.ts: a new in-process MCP server (phantom-reflective) that exposes phantom_memory_search (semantic + temporal) and phantom_list_sessions directly to the agent, so the built-in reflective skills can actually fire.
- src/agent/prompt-blocks/dashboard-awareness.ts: a short block added to the environment section of the system prompt so the agent knows the dashboard exists and can direct the operator to it.
- public/dashboard/: a single static HTML shell with a sidebar, hash router, and two JS modules. Vanilla JS, no React, no build step. Tailwind v4 tokens inherited from the existing phantom design vocabulary.
- skills-builtin/: mirror, thread, echo, overheard, ritual, show-my-tools. Seeded into the user-scope skills volume on container first boot; existing edits are preserved.

Test plan
- bun test passes (1040 pass, 0 fail, +62 new tests vs main)
- bun run lint clean
- bun run typecheck clean
- built-in skills seeded into ~/.claude/skills/ on first boot

Rollback
Single commit-range revert on the branch. No schema rollback is needed because the two new migrations are additive tables with indices; leaving them in an inactive deployment is safe. Existing functionality in the dashboard has no coupling to the pre-existing /ui/ surface, so removing the new /ui/dashboard/ tree and the new API routes is a clean undo.