feat(v3): daemon + WS protocol + client SDK (Phases 0-3) #26
Open
Codename-11 wants to merge 17 commits into
…ndow drag
- Add `instructions`/`instructionsFile` fields on Profile with `resolveInstructions()` helper; injected as ARC_AGENT_INSTRUCTIONS env var at launch time
- Add `arc instructions` CLI (show/set/edit/clear) for managing per-profile system prompts
- Add `openai-compat` auth type and `ProviderConfig` (baseUrl, model, apiKeyEnvVar) on Profile
- Add OpenAI Compatible adapter with full lifecycle (spawn, terminate, health, output)
- Add `arc provider` CLI (set/show/clear/presets) with 7 presets: OpenRouter, Ollama, LM Studio, Together AI, Groq, MiniMax, DeepSeek
- Fix TUI window drag on Windows by removing unused mouse tracking ANSI sequences
- Update docs: CLAUDE.md, FEATURES.md, getting-started.md, authentication.md
- Update tests: adapter count 5→6, resolveInstructions tests, openai-compat adapter test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… queue
Ship four concurrent workstreams developed via parallel agent worktrees.
- **Dashboard enhancements**: launch history store at ~/.arc/history.json (recordLaunch/getRecentLaunches), live DashView right column polling recent launches + activity log every 4s, ToastProvider + ToastContainer with auto-dismiss for in-app notifications.
- **Backup/export/import**: `arc backup create/restore/list` with a custom gzipped archive format (ARCBAK01, no new deps), `arc profile export` and `arc profile import-file` for single-profile JSON transport with inlined instructions. Credentials excluded by default; path traversal validated on restore.
- **Profile cloning**: `cloneProfile()` core fn (deep copy + configDir recursive copy), `arc profile clone <src> <dst> [--no-copy-dir]` CLI, Shift+C inline clone keybind in ProfilesView.
- **Interactive sidebar queue**: combined nav+profile selection in Sidebar with ↑/↓ navigation and Enter-to-launch on profile rows; owner of all sidebar input moved to Dashboard.
Also restores ProviderConfig export + resolveInstructions fn + openai-compat auth type that were lost via older-base worktrees.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Check off profile cloning, launch history on Dash, toast notifications, interactive sidebar queue, and backup/export/import. Remove the same items from the Remaining UX Backlog section. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Architecture decisions (approved):
- Permission default: supervised
- Backend: CLI-spawn (not HTTP LLM client); orchestrate claude/codex/gemini CLIs via Agent-Forge pattern; MCP is the tool-use interop surface
- Session storage: per profile
- Roundtable composition: both real-profile and virtual-agent modes
- Dangerous tools allowed with confirm modal, always logged
Plan covers 10 phases (0, 0.5, 1-9) through Phase 9 docs + 0.4.0 release. Phase 0.5 adds launchMode toggle (native/worker) so Claude's native TUI chrome (statusLine, etc.) renders when not orchestrating.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 0.5 — Launch mode toggle (native vs worker):
- Profile.launchMode ('native' | 'worker', default native) lets users pick
between full TTY handoff (spawnSync + inherit) or ARC-supervised worker
mode (spawnManagedProcess). Native mode lets Claude paint its own TUI
(statusLine, slash commands, etc.); worker mode is for orchestration.
- arc launch <profile> --native / --worker CLI flags override the profile.
- Doctor warns on deprecated CLAUDE_CODE_NO_FLICKER env var.
- ProfilesView 'm' keybind toggles mode inline.
- LaunchOptions extended so Phase 5+ orchestrators can force worker mode.
Phase 0+1 — Agent client (CLI-spawn) foundation:
- New packages/core/src/agent-client/: AgentClient interface, per-tool
clients (Claude claude -p --output-format stream-json, Codex
codex exec --json, Gemini gemini -p), registry ported from
agent-forge's agents.json with mcpMode variants.
- MCP config injection per mode: config-file (Claude), config-args
(Codex TOML-literal), mcp-add pre-launch (Gemini).
- Stream parsers tolerate version drift (Codex kind/type discriminators,
Claude event envelope unwrap).
- 48 unit tests covering parsers, registry, MCP injection, dispatcher.
Unblocks Phase 2 (tool registry + agent loop), Phase 4 (arc chat),
Phase 5 (roundtable orchestrator), and the Phase 7 dashboard chat.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…γ/δ)
Four streams merged via parallel agent worktrees.
Stream α (Phase 0.7): Bare launch + clearable active profile
- ArcConfig.activeProfile: string | null (new installs start null)
- arc run <tool> and arc launch --bare <tool> skip ARC overlay entirely
- Tool-name inference: arc launch claude with no matching profile infers bare
- arc profile switch none / arc profile clear-active clear the active pointer
- TUI: x key clears active in ProfilesView; Dash/Session empty-state copy
- Doctor handles null activeProfile gracefully
- New unit tests: null-active-profile.test.ts
Stream β (Phase 2): Tool registry + agent loop
- packages/core/src/agent/: Tool/ToolRegistry/PermissionMode/runAgent
- 16 ARC tools wired to core fns: 11 read, 4 write, 1 dangerous (list_profiles, clone_profile, switch_active_profile, delete_profile, ...)
- Three permission modes (read-only / supervised / autonomous) with confirm callback for writes in supervised mode
- zod added as first runtime dep of core for tool schema validation
- 43 new unit tests
Stream γ (Phase 3): Knowledge endowment
- packages/core/src/knowledge/: static ARC catalog (architecture, 52 command entries across 6 categories, 16-term glossary)
- FEATURES_INDEX: 33 curated entries with status/summary/links
- buildSystemPrompt() composes 6-section prompt under 4K tokens (identity / capabilities / architecture / glossary / live state / behavior rules per permission mode); 5135 chars ~1284 tokens typical
- 27 new unit tests
Stream δ: Docs + user-site
- user-docs: launch modes section, launch-without-profile section, getting-started fast vs full path, architecture overview of agent-client/agent/knowledge modules, configuration reference
- FEATURES.md: shipped items checked, roadmap entries added for Phases 2-8 with plan doc references
- CLAUDE.md: architecture bullets for all three new core modules, launch modes, bare launch
- README: quickstart split into fast path (arc run) and full path
- site/: Features.tsx copy updated for native vs worker + bare
Merge conflict resolutions: cli.ts (combined --native/--worker/--bare flags and run command), launch.ts (both launchMode and bare options in LaunchOptions), ProfilesView.tsx (both m-toggle and x-clear keybinds), CLAUDE.md (preserved existing bullets and added new ones).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
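The three permission modes listed for the tool registry can be read as a small decision table. The sketch below is only an illustration under assumptions: `ToolKind`, `Decision`, and `decidePermission` are hypothetical names, not the API in packages/core/src/agent/, and the rule that dangerous tools always require confirmation is inferred from the earlier "confirm modal" decision rather than stated for every mode.

```typescript
// Hypothetical names; the real ToolRegistry/runAgent signatures are not shown in this PR text.
type ToolKind = "read" | "write" | "dangerous";
type PermissionMode = "read-only" | "supervised" | "autonomous";
type Decision = "allow" | "confirm" | "deny";

function decidePermission(kind: ToolKind, mode: PermissionMode): Decision {
  // Read tools are assumed safe in every mode.
  if (kind === "read") return "allow";
  // read-only mode rejects anything that mutates state.
  if (mode === "read-only") return "deny";
  // ASSUMPTION: dangerous tools always go through a confirm step, even in autonomous mode.
  if (kind === "dangerous") return "confirm";
  // Plain writes: confirm in supervised mode, run directly in autonomous mode.
  return mode === "supervised" ? "confirm" : "allow";
}
```

A caller would invoke its confirm callback only when the result is `"confirm"`, which keeps the policy itself free of I/O and easy to unit-test.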
Phase 4 — arc chat interactive REPL
- packages/core/src/chat/: ChatSession with append/serialize/load,
per-profile session store at ~/.arc/profiles/<name>/chat-sessions/,
list/load/delete/save helpers with atomic writes
- packages/cli/src/commands/chat.ts: arc chat with --profile, --mode,
--once, --no-tools, --session, --new flags. REPL supports /exit,
/save, /new, /mode, /clear, /sessions, /resume, /help
- constructPromptFromSession collapses full transcript per turn
(v1 O(n^2) limit documented; soft-truncates at 15k tokens via
estimateTokens from context-manager)
- Supervised mode blocks writes with [y/N] readline confirm;
--once auto-denies writes to avoid non-TTY hangs
- 22 unit + integration tests (mocked agent client with scripted
chunk stream + real ToolRegistry/runAgent wiring)
Phase 5 — Roundtable orchestrator
- packages/core/src/orchestration/:
- delivery-policy.ts: AgentDeliveryPolicy with per-model profiles
(Gemini 18s/1.6x, Claude 12s/1.45x, Codex 8s/1.2x), EMA latency
tracker, MessagePriorityQueue for coalescing
- staged-workflow.ts: PLAN/EXEC/VERIFY state machine, cursor-based
StagedMessageBus + InMemoryMessageBus, DEFAULT_COMPLETION_PATTERNS
- watchdog.ts: pure tick() nudge-at-3min / stall-at-5min protocol,
injected deps for testability
- roundtable.ts: RoundtableOrchestrator drives the existing hook
(dedicated HookBus + HookStateStore per run); forces launchMode
worker; adaptive pacing between turns; synthesizer JSON parsing
with graceful fallback; virtual agents throw (Phase 5.1)
- All three Agent-Forge ports attributed via top-of-file comments
- 59 new unit tests (delivery / staged / watchdog / roundtable) +
10 first tests for the roundtable hook itself (prior coverage: 0)
Build 554 KB; 1231/1232 tests pass (same tui-interactive flake that
passes in isolation). Typecheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
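The watchdog entry above describes a pure `tick()` that nudges after 3 minutes of silence and declares a stall after 5. A minimal sketch of that idea follows; the state shape, the `watchdogTick` name, and the "nudge only once per quiet period" rule are assumptions, and the real watchdog.ts (with injected deps) may differ.

```typescript
// Hypothetical shape; only the 3 min / 5 min thresholds come from the commit text.
type WatchdogAction = "ok" | "nudge" | "stall";

interface WatchdogState {
  lastActivityMs: number; // timestamp of the agent's last observed output
  nudged: boolean; // true once a nudge was issued for the current quiet period
}

const NUDGE_AFTER_MS = 3 * 60 * 1000;
const STALL_AFTER_MS = 5 * 60 * 1000;

// Pure function: no timers or clocks inside, the caller passes the current time in.
function watchdogTick(
  state: WatchdogState,
  nowMs: number,
): { action: WatchdogAction; state: WatchdogState } {
  const idleMs = nowMs - state.lastActivityMs;
  if (idleMs >= STALL_AFTER_MS) return { action: "stall", state };
  if (idleMs >= NUDGE_AFTER_MS && !state.nudged) {
    return { action: "nudge", state: { ...state, nudged: true } };
  }
  return { action: "ok", state };
}
```

Because the clock is an argument, tests can step through a simulated timeline without waiting in real time.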
…6+7+9)
Phase 6 — arc roundtable CLI + MCP team contract
- arc roundtable <topic> --agents a,b,c --rounds 2 --synthesizer
--roles --format plain|json: streaming colored transcript +
synthesis + consensus score
- MCP: arc_chat (one-shot chat on active profile, read-only by default)
- MCP: arc_roundtable (headless roundtable, returns transcript +
synthesis + consensusScore + keyPoints + roundtableId + durationMs)
- MCP team_say / team_read / team_status / team_done / team_plan /
team_ask — shared in-memory TeamSessionStore (process-wide; per-
team isolation requires separate MCP server processes; documented)
- Existing 5 supervision MCP tools unchanged; integration tests
relaxed from exact-tool-count to arrayContaining
Phase 7 — Dashboard chat view
- packages/dashboard/src/ws.ts: session routing added on top of
legacy broadcast. Clients send { type: "hello", sessionId } on
connect; ws.broadcastTo(sessionId, event, data) targets one
client. broadcast() preserved for fan-out (roundtable view).
- 5 new routes: POST /api/chat/message (streams chat-chunk events
over WS), POST /api/chat/confirm (answer chat-confirm-needed
events; 60s auto-deny timeout), GET/DELETE /api/chat/sessions[/id]
- Chat core loaded via dynamic import(@axiom-labs/arc-core) with
503 fallback if exports are missing — defensive, safe now that
Phase 4/5 exports are in core
- public/components/chat.js: session list sidebar, streaming
message thread, expandable tool-call cards, confirmation modal,
bearer-token bootstrap
- ~240 lines of chat-view CSS using existing tokens
- vitest.config.ts + tsconfig.json: include packages/*/tests/**
- 13 new dashboard tests (ws-session + api-chat)
Phase 9 — Docs + 0.4.0 release
- New user docs: chat.md, roundtable.md, multi-agent-pipelines.md
- Extended architecture page; updated VitePress sidebar
- FEATURES.md: Phases 2-7 checked off; Phase 8 remains [ ]
- CLAUDE.md: orchestration-layer bullet extended with arc chat +
RoundtableOrchestrator + StagedWorkflowManager + AgentWatchdog
- README quickstart: chat path added alongside fast/full paths
- CHANGELOG.md: 0.4.0 entry
- Version bump 0.3.0 -> 0.4.0 via scripts/version.js (syncs
packages/cli/src/version.ts, root package.json, site/package.json)
Build 573 KB; typecheck clean; 1263/1263 tests pass; web:build
succeeds in 10.83s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Completes all 10 phases of the AI chat + roundtable plan for 0.4.0.
Phase 8 — Dashboard roundtable + pipelines
- packages/dashboard/public/components/roundtable.js: configure form
(topic, agents with role picker, rounds, synthesizer), live turn
blocks, synthesis block with consensus score bar, history sidebar
- packages/dashboard/public/components/pipelines.js: PLAN/EXEC/VERIFY
phase toggles + per-phase timeout inputs, live phase tracker,
phase-scoped transcript, history sidebar
- packages/dashboard/src/api.ts: 6 new routes
- POST /api/roundtable/run: validates topic + agents + roles,
runs RoundtableOrchestrator in background, broadcasts
roundtable-event via ws.broadcast (everyone sees progress),
persists to ~/.arc/roundtables/<uuid>.json
- POST /api/pipeline/run: StagedWorkflowManager + InMemoryMessageBus,
broadcasts pipeline-event per onPhaseChange, persists to
~/.arc/pipelines/<uuid>.json
- GET /api/{roundtable,pipeline}/history: reads dir, sorts desc
- GET /api/{roundtable,pipeline}/:id: strict UUID regex gate
blocks path traversal
- packages/core/src/paths.ts: getRoundtablesDir() + getPipelinesDir()
- Dynamic import of core orchestration exports (mirrors chat loader)
for testability + graceful degradation
- Atomic writes (tmp + rename), bearer-token auth via existing
MUTATION_METHODS gate
- 11 new dashboard integration tests (api-roundtable + api-pipeline)
All 10 phases complete:
0 — Scaffolding
0.5 — Launch modes
0.7 — Bare launch + clearable active profile
1 — Agent client foundation
2 — Tool registry + agent loop
3 — Knowledge endowment
4 — arc chat REPL
5 — Roundtable orchestrator + adaptive delivery + staged workflow
6 — arc roundtable CLI + 8 MCP tools (arc_chat, arc_roundtable,
6 team_*)
7 — Dashboard chat view with per-session WS streaming
8 — Dashboard roundtable + pipelines views with live WS progress
9 — Docs + 0.4.0 version bump
Build 579 KB; 1273/1274 tests pass (1 pre-existing tui-interactive
parallelism flake, passes in isolation); web:build clean in 12.20s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
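Two of the Phase 8 safeguards, the strict UUID gate on `:id` routes and the tmp + rename atomic write, can be sketched together. This is an illustration under assumptions, not the code in api.ts: the regex, the helper names (`isSafeRunId`, `writeJsonAtomic`, `saveRun`), and the temp-directory default (used so the sketch never touches a real ~/.arc) are all hypothetical.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// ASSUMPTION: canonical 8-4-4-4-12 hex form; the real gate may be stricter (e.g. version bits).
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Anything containing "/", "\" or ".." cannot match, so it never reaches path.join.
function isSafeRunId(id: string): boolean {
  return UUID_RE.test(id);
}

// Write to a sibling temp file, then rename over the target so readers
// see either the old file or the complete new one.
function writeJsonAtomic(filePath: string, value: unknown): void {
  const tmpPath = `${filePath}.${process.pid}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(value, null, 2), "utf8");
  fs.renameSync(tmpPath, filePath);
}

// Hypothetical persistence helper; the default directory is a stand-in for getRoundtablesDir().
function saveRun(
  id: string,
  record: unknown,
  dir: string = path.join(os.tmpdir(), "arc-sketch-roundtables"),
): string {
  if (!isSafeRunId(id)) throw new Error(`invalid run id: ${id}`);
  fs.mkdirSync(dir, { recursive: true });
  const target = path.join(dir, `${id}.json`);
  writeJsonAtomic(target, record);
  return target;
}
```

Validating the id before building any path means a request such as `GET /api/roundtable/../../secrets` is rejected on shape alone, without needing to normalize or compare resolved paths.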
Stream 1: wiring fixes
Root cause: hello-race silenced the chat stream. chat.js sendMessage called ensureHello() synchronously; if the WS wasn't yet readyState=1, hello was a no-op, and the server's broadcastTo(sessionId, ...) silently dropped every chunk because the session was unregistered.
- public/components/chat.js: added waitForHello(timeoutMs=5000); sendMessage now awaits it before POSTing /api/chat/message, closing the race.
- src/ws.ts: broadcastTo now logs a warning and falls back to broadcast when the target sessionId is unknown. Makes silent failures observable and recoverable.
- public/scripts/api.js: postJson / deleteJson / patchJson now attach the bearer token via cached /api/auth/token bootstrap, matching chat.js's authFetch pattern. Fixes 401s on existing sidebar actions (switchProfile, removeAgent, etc.) outside the new views.
Stream 2: visual polish
- New public/styles/orchestration.css (~330 lines): toasts, skeleton loaders, CSS-only button spinner ([data-loading="true"]), role pill badges (advocate/critic/neutral/synthesizer), phase-bar animation, empty-state glyphs, help-icon tooltips, kbd styling, focus-visible outlines, .is-hidden utility.
- public/index.html: linked orchestration.css.
- public/components/chat.js: empty-state glyphs, sidebar skeleton, help icon on MODE, role=log / aria-live on transcript, proper role="dialog" on confirmation modal, #chat-banner placeholder for error surfaces, improved button copy.
- public/components/roundtable.js: colored role pill badges in turn headers, synthesizer pill on synthesis block, role="progressbar" on consensus meter, empty states, sidebar skeleton, help icons, #rt-banner placeholder.
- public/components/pipelines.js: empty log/history states with glyphs, sidebar skeleton, #pipe-banner placeholder, role=log / aria-label on tracker.
Accessibility basics: all interactive elements focusable, aria labels on icon-only buttons, Enter sends in chat, Esc dismisses modals, focus-visible outlines on dark theme.
All IDs + data-* hooks preserved. 1274/1274 tests pass; typecheck clean; build 580 KB.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
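The hello race fixed in Stream 1 comes down to sending `hello` only once the socket is actually open. The following is a rough sketch of that pattern, not the code in chat.js: `SocketLike` is a hypothetical structural type (a browser WebSocket should satisfy it), and the signature differs from the real `waitForHello(timeoutMs=5000)` by taking the socket and session id as explicit parameters.

```typescript
// Minimal structural type so the sketch runs outside a browser.
interface SocketLike {
  readyState: number; // 1 means OPEN, as with WebSocket.readyState
  send(data: string): void;
  addEventListener(type: "open", listener: () => void): void;
}

// Resolves after the hello frame has been handed to an open socket;
// rejects if the socket has not opened within timeoutMs.
function waitForHello(socket: SocketLike, sessionId: string, timeoutMs = 5000): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    const sendHello = (): void => {
      socket.send(JSON.stringify({ type: "hello", sessionId }));
      resolve();
    };
    if (socket.readyState === 1) {
      sendHello();
      return;
    }
    const timer = setTimeout(
      () => reject(new Error(`WebSocket not open after ${timeoutMs}ms`)),
      timeoutMs,
    );
    socket.addEventListener("open", () => {
      clearTimeout(timer);
      sendHello();
    });
  });
}
```

A caller would `await waitForHello(ws, sessionId)` before issuing the POST, so the server has a registered session by the time it starts streaming chunks.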
Silent crashes were making the dashboard look alive when dead, and lack of request logs made it impossible to confirm the server was actually receiving anything. Firefox "can't establish connection to ws://" was the symptom of a silently-died process.
Server-side:
- src/server.ts: access log per HTTP request (method, status, path, duration). Noisy static assets (/scripts, /styles, /components, favicon) suppressed unless ARC_DASHBOARD_LOG=verbose. Color-coded 2xx/4xx/5xx. Set ARC_DASHBOARD_LOG=off to disable entirely.
- src/server.ts: log every WS upgrade attempt with pathname; wrap handleUpgrade in try/catch so invalid upgrades produce a clear stderr line instead of a silent socket destroy.
- src/server.ts: clientError handler on the HTTP server surfaces TCP-level protocol errors.
- src/ws.ts: log WS session-registration when a client sends hello.
Dev-mode crash visibility:
- src/dev.ts: process-level uncaughtException + unhandledRejection handlers. EADDRINUSE, unhandled async errors, and anything else that would silently kill the process now print a full stack trace so the user can see why dev:dashboard died.
Client-side:
- public/scripts/ws.js: after 3 failed WS reconnects, render a fixed banner at top of the page reading "Disconnected from ws://... — is pnpm dev:dashboard still running?" Auto-removes on successful reconnect. Stops users from wondering why the UI is stale.
Verified via live test: HTTP, upgrade, and hello frames all log to stderr in dev mode.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
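The access-log rules described above (quiet static assets by default, `verbose` to see everything, `off` to disable) can be expressed as a small filter. This is a sketch under assumptions: the function names and the prefix-matching rule are hypothetical, and color coding is left out.

```typescript
type LogMode = "default" | "verbose" | "off";

// Prefixes taken from the commit text; how server.ts actually matches them is assumed.
const NOISY_PREFIXES = ["/scripts", "/styles", "/components", "/favicon"];

// Maps the ARC_DASHBOARD_LOG value to a mode; anything unrecognized means "default".
function parseLogMode(value: string | undefined): LogMode {
  return value === "verbose" || value === "off" ? value : "default";
}

function shouldLogRequest(pathname: string, mode: LogMode): boolean {
  if (mode === "off") return false;
  if (mode === "verbose") return true;
  return !NOISY_PREFIXES.some((prefix) => pathname.startsWith(prefix));
}

// Field order follows the commit text: method, status, path, duration.
function formatAccessLine(method: string, status: number, pathname: string, durationMs: number): string {
  return `${method} ${status} ${pathname} ${durationMs}ms`;
}
```

In a request handler this would be called once the response finishes, with `parseLogMode(process.env.ARC_DASHBOARD_LOG)` evaluated at startup.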
Root cause of the EADDRINUSE error on hot-reload: server.close() only stops *new* connections; keep-alive sockets (from the browser) linger seconds or minutes, keeping the listening socket bound. tsx --watch spawns the replacement child before the OS releases the port, and it crashes with EADDRINUSE.
Fixes:
- src/server.ts stop(): call server.closeAllConnections() + closeIdleConnections() (Node 18.2+) before server.close() to drop lingering browser/client sockets immediately.
- src/dev.ts SIGTERM/SIGINT handler: hard-exit fallback after 1.5s so graceful close can't hang the process. tsx's spawn-next-child races the OS port release; faster exit = smaller window.
- scripts/kill-stale.js: add --port <n> flag that kills any PID holding the given TCP port, not just tsx/vite/vitepress by command line. Catches zombie processes from prior dev runs. Uses netstat on Windows and lsof on unix.
- package.json: predev:dashboard now runs `kill-stale.js tsx --port 3700` so the dashboard always starts from a clean :3700 even after a crashed previous session.
Verified via manual restart race: old process SIGTERM → new process binds cleanly → HTTP 200 on /api/health.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
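The shutdown order described for `stop()` can be sketched as follows. `ClosableServer` is a hypothetical structural subset of Node's `http.Server` (which provides `closeAllConnections()` and `closeIdleConnections()` from Node 18.2), used here so the sketch can be exercised without binding a port; it is not the code in src/server.ts.

```typescript
// Structural subset of node:http's Server; a real http.Server can be passed directly.
interface ClosableServer {
  closeAllConnections(): void;
  closeIdleConnections(): void;
  close(callback: (err?: Error) => void): unknown;
}

// Drops lingering keep-alive sockets first, then closes the listener,
// following the order given in the commit message.
function stopServer(server: ClosableServer): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    server.closeAllConnections();
    server.closeIdleConnections();
    server.close((err) => (err ? reject(err) : resolve()));
  });
}
```

A dev entry point could pair this with the 1.5s hard-exit fallback mentioned above, for example by starting a timer before awaiting `stopServer(server)` and calling `process.exit()` if it fires first.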
Phase 0–3 of the v3 daemon pivot (see docs/plans/arc-v3-daemon.md):
- packages/daemon: long-running Node service on :7272 with SQLite canonical store, auth.json + pair-ready client table, PID/signal handling, /health endpoint, and `arc daemon start|stop|status|restart|logs` CLI.
- packages/client: @axiom-labs/arc-client SDK, with binary-mux frame codec, Zod-validated control envelopes, RPC + subscribe/unsubscribe + terminal channel, auto-reconnect with resubscribe.
- packages/relay: placeholder (Phase 10).
- Protocol v1: channel 0 control (Zod envelope), channel 1 terminal bytes, host-header DNS-rebinding defense, token auth.
- Version bump to 1.0.0-alpha.0. Archive tag archive/v0.4.x for pre-daemon tree.
End-to-end smoke + 4 Vitest tests cover connect/auth/health/agents.list/subscribe. Phase 4 (port TUI+CLI+dashboard to the SDK) is the next batch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
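To make the channel split concrete, here is one possible shape for a binary-mux frame. The wire layout is NOT given in this PR text: the single leading channel byte, the `ControlRequest` fields (`id`, `method`, `params`), and all function names are assumptions for illustration, and the hand-rolled envelope check stands in for the Zod schema the real SDK uses.

```typescript
// ASSUMPTION: byte 0 is the channel id, the rest is payload. The real codec may differ.
const CH_CONTROL = 0;
const CH_TERMINAL = 1;

interface Frame {
  channel: number;
  payload: Uint8Array;
}

function encodeFrame(channel: number, payload: Uint8Array): Uint8Array {
  if (!Number.isInteger(channel) || channel < 0 || channel > 255) {
    throw new RangeError(`channel out of range: ${channel}`);
  }
  const out = new Uint8Array(1 + payload.length);
  out[0] = channel;
  out.set(payload, 1);
  return out;
}

function decodeFrame(bytes: Uint8Array): Frame {
  const channel = bytes[0];
  if (channel === undefined) throw new Error("empty frame");
  return { channel, payload: bytes.subarray(1) };
}

// Hypothetical control envelope; field names are illustrative only.
interface ControlRequest {
  id: number;
  method: string;
  params?: unknown;
}

function encodeControl(req: ControlRequest): Uint8Array {
  return encodeFrame(CH_CONTROL, new TextEncoder().encode(JSON.stringify(req)));
}

function decodeControl(frame: Frame): ControlRequest {
  if (frame.channel !== CH_CONTROL) throw new Error("not a control frame");
  const parsed: unknown = JSON.parse(new TextDecoder().decode(frame.payload));
  if (typeof parsed !== "object" || parsed === null) throw new Error("malformed control envelope");
  const { id, method, params } = parsed as Record<string, unknown>;
  if (typeof id !== "number" || typeof method !== "string") throw new Error("malformed control envelope");
  return { id, method, params };
}
```

Keeping terminal bytes on their own channel means they bypass JSON parsing entirely; only channel 0 traffic pays the validation cost.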
SQLite WAL/SHM files and the daemon log stream briefly hold Windows file handles after close(), so the afterEach cleanup races with the file-system and fails with ENOTEMPTY on CI. Adding maxRetries + delay lets those handles release before we try again. Linux/macOS unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Retries alone aren't enough — SQLite WAL handles can linger past any reasonable retry budget on Windows runners. Cleanup is best-effort; OS reclaims tmp on its own. Tests themselves pass (1249 green); this just stops the afterEach race from marking them red. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
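The two cleanup commits above combine into a simple pattern: let `fs.rmSync` retry on transient errors, and treat a final failure as non-fatal. A minimal sketch, assuming illustrative retry values (5 attempts, 100 ms) and hypothetical helper names; the actual test file may use different numbers or the async API.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Creates a unique scratch directory under the OS temp dir.
function makeTempDir(prefix = "arc-daemon-test-"): string {
  return fs.mkdtempSync(path.join(os.tmpdir(), prefix));
}

// Best-effort removal: retries cover short-lived handle locks (e.g. SQLite WAL/SHM on Windows),
// and a final failure is swallowed so it cannot turn a passing test red.
function cleanupTempDir(dir: string): boolean {
  try {
    fs.rmSync(dir, { recursive: true, force: true, maxRetries: 5, retryDelay: 100 });
    return true;
  } catch {
    return false; // the OS reclaims its temp directory eventually
  }
}
```

In a Vitest suite this would be called from `afterEach`, with the boolean result ignored or logged.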
Summary
Stage A of the v3 daemon pivot (see docs/plans/arc-v3-daemon.md): three packages that become the spine of every future v3 surface:
- `packages/daemon`: long-running Node service on `:7272`. SQLite canonical store, signal-handled PID lifecycle, `/health` HTTP endpoint, binary-mux WebSocket server, host-header DNS-rebinding defense, per-client token auth, and `arc daemon start|stop|status|restart|logs` CLI group.
- `packages/client` (`@axiom-labs/arc-client`): the SDK every client will use. Binary-mux frame codec, Zod-validated control envelopes, `call`/`subscribe`/`unsubscribe`/`attachTerminal`, auto-reconnect with resubscribe.
- `packages/relay`: placeholder for Phase 10.
Version bumped to `1.0.0-alpha.0`; `archive/v0.4.x` tag preserves the pre-daemon tree.

Protocol v1 at a glance
- `ch=0`: JSON control envelopes (Zod schema in `packages/client/src/protocol.ts`)
- `ch=1`: raw terminal bytes (Phase 4+ producer)
- `ch=2/3`: reserved (file transfer, audio)
Methods shipped: `auth.login`, `health.get`, `profile.list/get`, `agent.list` (read-only). `agent.run/stop/send` are registered but stub to `unimplemented`; they light up in Phase 4 when adapters move behind the daemon.

What's NOT in this PR (intentionally)
- `arc run`, `arc ls`, `arc attach`, Docker-style verbs (Phase 4)
- `~/.arc/history.json` → SQLite (Phase 14)

Test plan
- `npx tsc --noEmit`: clean
- `npx vitest run packages/daemon/tests/daemon.test.ts`: 4/4 pass (health, empty agent list, bad-token reject, subscribe/unsubscribe)
- packages/daemon/tests/smoke.ts): daemon starts, client auths, RPCs round-trip
- `arc daemon start --foreground` → `/health` returns JSON → signal-stop clean

Notes
- 5BBB5DC85B11→C5AB0DC85B11, correct per RFC 6455).
- `better-sqlite3` added to `pnpm-workspace.yaml:onlyBuiltDependencies` so its native build runs without interactive approval.

🤖 Generated with Claude Code
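The host-header DNS-rebinding defense mentioned in the summary is typically an allowlist check on the `Host` header before any request or upgrade is served. The sketch below shows that general technique only: the loopback allowlist, the port rule, and the name `isAllowedHost` are assumptions, since the daemon's actual policy is not spelled out in this PR text.

```typescript
// ASSUMPTION: only loopback names are accepted; the real daemon may allow configured hosts too.
const LOOPBACK_HOSTS = new Set<string>(["localhost", "127.0.0.1", "[::1]"]);

// Returns true only when the Host header names a loopback host on the expected port.
// A rebinding attack arrives with the attacker's domain in Host, so it fails this check.
function isAllowedHost(hostHeader: string | undefined, expectedPort: number): boolean {
  if (!hostHeader) return false;
  const value = hostHeader.trim().toLowerCase();
  // Either a bracketed IPv6 literal or a name/IPv4 address, with an optional :port suffix.
  const match = /^(\[[0-9a-f:]+\]|[^:]+)(?::(\d+))?$/.exec(value);
  if (!match) return false;
  const host = match[1];
  const port = match[2];
  if (host === undefined || !LOOPBACK_HOSTS.has(host)) return false;
  // A missing port implies the HTTP default (80), which only passes if that is the expected port.
  return port === undefined ? expectedPort === 80 : Number(port) === expectedPort;
}
```

With the daemon on `:7272`, a request carrying `Host: localhost:7272` passes, while one carrying `Host: evil.example.com:7272` is rejected even though both resolve to the same local socket.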