98 changes: 98 additions & 0 deletions openspec/changes/restructure-agent-platform/design.md
@@ -0,0 +1,98 @@
# Design — restructure-agent-platform

## Context

archmax currently presents itself as a database-semantics manager: connections are federated through DuckDB, semantic models are authored by a Deep Agents builder, and an arbitrary number of "test agents" (each with its own LLM credentials) can be exercised in a playground and batch test runs. The product vision is a **semantic process layer**: a project's output is an **agent scaffold** — a plugin-style filesystem consumed by an agent harness — and the project has exactly one agent whose quality is measured by the test suite. This change is mostly an information-architecture and configuration-model restructure; the existing LangChain Deep Agents playground/test-runner already provides the required test harness.

Constraints:

- Single-user system, MongoDB + YAML-files-on-disk storage, typed Hono RPC client (no raw fetch in frontend).
- The active change `add-llm-prompt-caching` touches `agent.ts` / `playground-agent.ts`; implementation must be sequenced against it.
- Spec conventions: settings pages use inline label+input grids; popups use `--popover` (page-grey) backgrounds; filters use ghost styling.

## Goals / Non-Goals

- **Goals**
  - Navigation that mirrors the process: Connections → Builder → Agent → Testing → MCP Access → Settings.
  - One agent per project, configured in Settings, used by playground and test runs.
  - Per-project LLM configuration for the builder with env fallback.
  - Formalize the project directory as an exportable, agent-authored plugin scaffold.
  - Surface failing tests where improvement work happens (Builder panel).
- **Non-Goals**
  - API connections and API Models (both ship as visible-but-disabled "soon" placeholders only).
  - A scaffold *generation pipeline*: the builder agent authors scaffold files directly with its existing filesystem tools.
  - Hosting or executing external harnesses; the export is a downloadable artifact.
  - Multi-agent support of any kind.

## Decisions

### D1 — Agent and builder LLM config live on the `Project` document

Two optional subdocuments: `builderLlm { baseUrl?, encryptedApiKey?, model? }` and `agentLlm { baseUrl, encryptedApiKey, model, systemPrompt }`. API keys reuse the AES-256-GCM encryption already used by `TestAgent`/`github.encryptedToken`, with the same SSRF validation rules for base URLs. A dedicated `llm-settings` route family handles GET (masked) / PUT (re-encrypt on new key) / test-connection, rather than overloading `PUT /api/projects/:id`, so key masking and partial updates stay isolated.
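
The "re-encrypt on new key" rule for PUT can be sketched as a small merge function. This is a minimal illustration, not the actual route handler: `AgentLlmUpdate`, `applyAgentLlmUpdate`, and the injected `encrypt` callback are hypothetical names, and the real AES-256-GCM helper is assumed to be passed in from elsewhere.

```typescript
// Field names follow D1; everything else here is an illustrative assumption.
interface AgentLlm {
  baseUrl: string;
  encryptedApiKey: string;
  model: string;
  systemPrompt: string;
}

// Hypothetical PUT payload: the plaintext key is only present when the user typed a new one.
interface AgentLlmUpdate {
  baseUrl: string;
  model: string;
  systemPrompt: string;
  apiKey?: string;
}

// Stand-in for the existing AES-256-GCM helper (assumed, not shown in this spec).
type Encrypt = (plaintext: string) => string;

function applyAgentLlmUpdate(
  existing: AgentLlm | undefined,
  update: AgentLlmUpdate,
  encrypt: Encrypt,
): AgentLlm {
  const newKey = update.apiKey?.trim();
  // Re-encrypt only when a new key arrives; otherwise keep the stored ciphertext.
  const encryptedApiKey = newKey ? encrypt(newKey) : existing?.encryptedApiKey;
  if (!encryptedApiKey) {
    throw new Error("apiKey is required when no key is stored yet");
  }
  return {
    baseUrl: update.baseUrl,
    model: update.model,
    systemPrompt: update.systemPrompt,
    encryptedApiKey,
  };
}
```

Keeping this merge out of `PUT /api/projects/:id` means a project rename can never accidentally clear or overwrite the stored key.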

- *Alternative considered:* a singleton `ProjectAgent` collection — rejected; it resurrects the TestAgent shape and adds a join for no benefit in a single-user system.

### D2 — Builder resolution is per-field project → env; the agent requires explicit config

The builder keeps working out of the box via `AGENT_*` env vars; project values override field-by-field. The **agent** has no env fallback: it is the project's deliverable and its credentials are an explicit choice. Playground input and run-creation are blocked with a pointer to Settings → Agent until configured. This also makes the migration story honest (see D6).
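
A minimal sketch of the per-field resolution, assuming the env vars are named `AGENT_BASE_URL`, `AGENT_MODEL`, and `AGENT_API_KEY` (only `AGENT_API_KEY` is named in this document; the other two are guesses at the `AGENT_*` family). `resolveBuilderLlm` and `isAgentConfigured` are hypothetical helper names.

```typescript
type Source = "project" | "env" | "none";

interface BuilderLlm {
  baseUrl?: string;
  encryptedApiKey?: string;
  model?: string;
}

interface ResolvedField {
  value?: string;
  source: Source;
}

type Env = Record<string, string | undefined>;

// Project value wins; env is the fallback; otherwise the field is unset.
function pick(projectValue: string | undefined, envValue: string | undefined): ResolvedField {
  if (projectValue) return { value: projectValue, source: "project" };
  if (envValue) return { value: envValue, source: "env" };
  return { source: "none" };
}

function resolveBuilderLlm(project: BuilderLlm | undefined, env: Env) {
  const apiKey = pick(project?.encryptedApiKey, env.AGENT_API_KEY);
  return {
    baseUrl: pick(project?.baseUrl, env.AGENT_BASE_URL),
    model: pick(project?.model, env.AGENT_MODEL),
    // Only presence and origin of the key are reported, never the key itself.
    apiKeySet: apiKey.source !== "none",
    apiKeySource: apiKey.source,
  };
}

// The agent has no env fallback: it counts as configured only with explicit project values.
// Assumption: an empty system prompt is allowed, so it is not checked here.
function isAgentConfigured(
  agent: { baseUrl?: string; encryptedApiKey?: string; model?: string } | undefined,
): boolean {
  return Boolean(agent?.baseUrl && agent.encryptedApiKey && agent.model);
}
```

Playground input and run creation would check `isAgentConfigured` and show the pointer to Settings → Agent when it returns `false`.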

### D3 — Scaffold lives at the project-directory root; data models move to `data_models/`

The existing project dir (`<ARCHMAX_DATA_DIR>/projects/<projectId>/`) *is* the agent filesystem and remains the `FilesystemBackend` root, so the builder can author scaffold files with no new tooling. The scaffold entries (`commands/`, `agents/`, `skills/`, `hooks/`, `scripts/`, `.mcp.json`) and `AGENTS.md` live at the root. The semantic model YAML files move out of the current `src/` directory into a dedicated **`data_models/`** subdirectory, matching the new "Data Models" product vocabulary and reserving room for a future `api_models/` sibling. Export excludes internal entries (`.git/`, `large_tool_results/`, `uploads/`, `duckdb.db*`, temp files) and includes `data_models/` plus the scaffold dirs.

- *Why rename `src/` → `data_models/`:* "src" is opaque and conflicts with the scaffold's plugin vocabulary; "Data Models" is the user-facing label in both the sidebar (Agent Scaffold → Data Models) and the dashboard card, so the on-disk directory should match. A sibling `api_models/` slot can later hold the "API Models (soon)" content without overloading `src/`.
- *Alternative considered:* keeping `src/` on disk and only relabeling it "Data Models" in the UI — rejected; the path leaks into the agent prompt, exports, and docs, so a mismatch between the on-disk name and the product term is a lasting source of confusion.
- *Alternative considered:* a single `scaffold/` subdirectory holding everything — rejected; it splits the agent filesystem from the project root and complicates the Git story which already versions the whole project dir.
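
The export exclusions in D3 could be expressed as a path predicate like the one below. This is a sketch under assumptions: `isExportable` is a hypothetical name, paths are project-relative with `/` separators, and "temp files" is interpreted as names ending in `.tmp` or `~`, which this document does not specify.

```typescript
// Internal top-level directories named in D3.
const EXCLUDED_DIRS = new Set([".git", "large_tool_results", "uploads"]);

function isExportable(relativePath: string): boolean {
  const segments = relativePath.split("/").filter(Boolean);
  if (segments.length === 0) return false;
  const top = segments[0] ?? "";
  const name = segments[segments.length - 1] ?? "";

  if (EXCLUDED_DIRS.has(top)) return false;
  // Covers duckdb.db and sidecar files such as duckdb.db.wal.
  if (top.startsWith("duckdb.db")) return false;
  // Assumed temp-file patterns; adjust to whatever the platform actually writes.
  if (name.endsWith(".tmp") || name.endsWith("~")) return false;
  return true;
}
```

For example, `data_models/orders.yaml`, `skills/report/SKILL.md`, and `.mcp.json` pass the filter, while `.git/HEAD`, `uploads/contract.pdf`, and `duckdb.db.wal` do not.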

### D4 — `.mcp.json` is platform-seeded with a token placeholder

Seeded on project creation and refreshed on slug change: an `archmax` MCP server entry pointing at the project's MCP endpoint, with `Authorization: Bearer ${ARCHMAX_MCP_TOKEN}`. Real tokens never reach the file (it is Git-versioned and exported). The builder may edit the file to add further servers; JSON validation on write protects `hooks/hooks.json` and `.mcp.json` the same way YAML validation protects models.
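
A sketch of the seeding and the placeholder-only check. The `mcpServers` / `type` / `url` / `headers` layout is an assumption about the file shape (this document only fixes the `archmax` entry name and the `Authorization` value), and `seedMcpConfig` / `findLiteralCredentials` are hypothetical names.

```typescript
// A plain string, so "${...}" is kept literally rather than interpolated.
const TOKEN_PLACEHOLDER = "${ARCHMAX_MCP_TOKEN}";

interface McpServer {
  type?: string;
  url: string;
  headers?: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, McpServer>;
}

function seedMcpConfig(mcpEndpointUrl: string): McpConfig {
  return {
    mcpServers: {
      archmax: {
        type: "http",
        url: mcpEndpointUrl,
        headers: { Authorization: `Bearer ${TOKEN_PLACEHOLDER}` },
      },
    },
  };
}

// Accepts "${VAR}" optionally preceded by a scheme word such as "Bearer".
const PLACEHOLDER_ONLY = /^(?:[A-Za-z]+ )?\$\{[A-Z_][A-Z0-9_]*\}$/;

// Returns the names of servers whose Authorization header holds a literal value.
function findLiteralCredentials(config: McpConfig): string[] {
  const offenders: string[] = [];
  for (const [name, server] of Object.entries(config.mcpServers)) {
    const auth = server.headers?.Authorization;
    if (auth !== undefined && !PLACEHOLDER_ONLY.test(auth)) offenders.push(name);
  }
  return offenders;
}
```

A write would be rejected whenever `findLiteralCredentials` returns a non-empty list, so a builder edit that pastes a real token never reaches Git or the export.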

### D5 — Browser and Console become full-size overlay dialogs with `?tool=` deep links

The sidebar loses both entries; the Data Sources header hosts Browser (icon+text), Console (icon-only), Re-initialize schemas (icon-only). The dialogs are near-viewport-size with shadow, reusing the existing page components. A `tool=browser|console` search param on `/connections` opens the corresponding dialog so old routes can redirect losslessly and the dialogs stay deep-linkable.
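
The deep-link handling can be sketched as a tiny parser plus a redirect table. `toolFromSearch` and `LEGACY_TOOL_REDIRECTS` are hypothetical names; the redirect targets are the ones listed in the migration plan.

```typescript
type Tool = "browser" | "console";

// Reads the `tool` search param and ignores anything that is not a known dialog.
function toolFromSearch(search: string): Tool | null {
  const value = new URLSearchParams(search).get("tool");
  return value === "browser" || value === "console" ? value : null;
}

// Old routes that now resolve to a dialog on /connections.
const LEGACY_TOOL_REDIRECTS: Record<string, string> = {
  "/connections/data": "/connections?tool=browser",
  "/connections/console": "/connections?tool=console",
  "/data": "/connections?tool=browser",
};
```

Treating unknown values as `null` keeps a stale or mistyped link on the plain Data Sources page instead of opening an empty dialog.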

### D6 — Migration drops all TestAgents (user decision: manual reconfiguration)

A schema migration **first backfills `TestRun.testAgentName`** from each run's `testAgent.name`, then soft-deletes every `TestAgent` document and unsets `TestCase.testAgent`. `TestRun.testAgent` becomes optional and is never populated afterwards (the soft-deleted ref would return nothing); run lists/detail render the `llmModel` snapshot for new runs and the `testAgentName` snapshot for legacy runs, falling back to a neutral "Legacy agent" label. Conversations: playground conversations are identified by a `playground: true` flag going forward; legacy `testAgent` references remain readable. Builder and Agent conversation histories are partitioned by the `playground` flag (Builder filters `playground: { $ne: true }`, replacing the old `testAgent: null` filter, so playground chats cannot leak into Build history).

Per-project agent/builder *configured* state is exposed on project-scoped, authenticated endpoints (the llm-settings GETs and the project payload), **not** on the global unauthenticated `/api/config` route, which has no `projectId` and would be wrong under multiple projects. API key material (including env secrets like `AGENT_API_KEY`) is never returned in any form — masked or otherwise; the UI shows only presence/source via `apiKeySet`/`apiKeySource`.
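
The run-label fallback and the history partition from D6 can be sketched as two pure helpers. The interfaces are trimmed, hypothetical views of `TestRun` and the conversation document, and the sketch assumes only migrated legacy runs ever carry `testAgentName`.

```typescript
interface TestRunSummary {
  llmModel?: string;
  testAgentName?: string;
}

// Legacy runs show the backfilled agent name, new runs show the model snapshot.
function runAgentLabel(run: TestRunSummary): string {
  return run.testAgentName || run.llmModel || "Legacy agent";
}

interface ConversationSummary {
  id: string;
  playground?: boolean;
}

// In-memory equivalent of the Builder filter { playground: { $ne: true } }.
function isBuilderConversation(conversation: ConversationSummary): boolean {
  return conversation.playground !== true;
}
```

Using "not true" rather than "equals false" matters: conversations written before the flag existed have no `playground` field at all and must still show up in Build history.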

### D7 — "Failing tests" = latest result per test case

A test case is *failing* when the most recent `TestRun` embedded result for it has status `failed` or `error`. A new endpoint aggregates this (`latest-results`), powering the **Improvements & Testing** panel. No new persistent state is introduced — the registry is derived from existing `TestRun` data, so it can never drift.
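
A minimal in-memory sketch of the `latest-results` derivation. The `createdAt`, `results`, and `testCaseId` field names are assumptions about the `TestRun` shape, and the real endpoint would more likely use a database aggregation than a loop.

```typescript
interface EmbeddedResult {
  testCaseId: string;
  status: string; // "failed" and "error" count as failing; other values are assumed
}

interface TestRunDoc {
  createdAt: Date;
  results: EmbeddedResult[];
}

// Keeps, per test case, the status from the most recent run that contains it.
function latestResults(runs: TestRunDoc[]): Map<string, string> {
  const latest = new Map<string, { at: number; status: string }>();
  for (const run of runs) {
    const at = run.createdAt.getTime();
    for (const result of run.results) {
      const seen = latest.get(result.testCaseId);
      if (!seen || at > seen.at) latest.set(result.testCaseId, { at, status: result.status });
    }
  }
  const statuses = new Map<string, string>();
  latest.forEach((entry, testCaseId) => statuses.set(testCaseId, entry.status));
  return statuses;
}

function failingTestCases(runs: TestRunDoc[]): string[] {
  const failing: string[] = [];
  latestResults(runs).forEach((status, testCaseId) => {
    if (status === "failed" || status === "error") failing.push(testCaseId);
  });
  return failing.sort();
}
```

Because the list is recomputed from runs on every request, a test case that passes in a newer run drops out of the failing set without any cleanup step.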

### D8 — Remove the build artifact, but keep the publish gate via Git HEAD

The disk **build step** (`PublishService.assemble()` writing fully-inlined single-file YAMLs to `build/`) is removed entirely. It only ever materialized a derived snapshot; MCP model tools surface compact **markdown** via `SemanticModelDigest`, and `execute_query`/`get` assemble a model in memory on demand — none of them need a materialized full-YAML artifact. **Production MCP remains gated by publishing**, but the gate is the Git commit rather than a `build/` directory:

- **Production MCP serves the last published state** = the project repo's latest commit (HEAD). Tools assemble models in memory from the committed `data_models/` tree (read via `isomorphic-git`), so uncommitted working-directory edits are NOT exposed. There is no `build/` artifact.
- **Testing MCP serves the live working directory** `data_models/` (in-memory assembly), so it reflects the latest unpublished edits. The tool registration, digest generation, and scope filtering code stay shared between both modes; only the source (committed tree vs working dir) differs.
- **Publish = Git commit (the gate)**: ensure repo → pull → stage `data_models/`+scaffold → commit → record `PublishEvent` → optional push. Committing is what makes models visible to production MCP. `hasUnpublishedChanges` means "working-directory `data_models/` differs from HEAD/last `PublishEvent`" — i.e. there are models not yet published to MCP. Publish UI copy keeps the "make available via MCP" meaning.
- `PublishService.assemble()`/`cleanStaleFiles()` and the `build/` read in the MCP route are deleted; production MCP instead reads the committed tree. `computeSourceHash()` is retained but hashes `data_models/` (+ scaffold) source only.
- A **startup migration** removes any existing `build/` directory from project dirs (mirrors the existing startup `AGENTS.md` cleanup). `.gitignore` no longer needs to exclude `build/`.

- *Alternative considered:* serve production MCP from the live working dir (no gate) — rejected by product requirement; production must only expose explicitly published (committed) models. Reading the committed tree preserves that gate while still eliminating the derived `build/` artifact.
- *Alternative considered:* keep writing a `build/` snapshot on publish — rejected; the materialized full-YAML artifact is redundant now that tools consume markdown/in-memory assembly, and the Git commit already is the publish boundary.
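
The two serving modes and the `hasUnpublishedChanges` check can be illustrated with in-memory trees. This is a sketch only: `Tree`, `modelSource`, and the map-based comparison are hypothetical, and in the real system the committed tree would be read from HEAD via `isomorphic-git` rather than passed in as a map.

```typescript
// Path -> file content, for either the working directory or the HEAD commit.
type Tree = Map<string, string>;

function dataModels(tree: Tree): Tree {
  const out: Tree = new Map();
  tree.forEach((content, path) => {
    if (path.startsWith("data_models/")) out.set(path, content);
  });
  return out;
}

type McpMode = "production" | "testing";

// Only the source differs between modes; everything downstream is shared.
function modelSource(mode: McpMode, workingDir: Tree, head: Tree): Tree {
  return dataModels(mode === "production" ? head : workingDir);
}

// True when working-directory data_models/ differs from the committed tree.
function hasUnpublishedChanges(workingDir: Tree, head: Tree): boolean {
  const working = dataModels(workingDir);
  const committed = dataModels(head);
  if (working.size !== committed.size) return true;
  let changed = false;
  working.forEach((content, path) => {
    if (committed.get(path) !== content) changed = true;
  });
  return changed;
}
```

An edit to `data_models/orders.yaml` is visible to the testing MCP immediately, while the production MCP keeps serving the committed version until the next publish.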

## Risks / Trade-offs

- **Large rename surface** (routes, labels, docs, e2e selectors) → mitigated by keeping all backend route prefixes except `test-agents` stable, and adding redirects for moved frontend routes.
- **Removing TestAgent breaks the builder's `list_test_agents` tool and prompt flow** → tool removed and prompt updated in the same change; `create_test_case` is simplified rather than left referencing dead concepts.
- **Concurrent edit conflict with `add-llm-prompt-caching`** → no overlapping spec requirements; sequence implementation (rebase whichever lands second).
- **Export could leak secrets** → export uses a runtime denylist *and* secret-file exclusions (`.env*`, key/credential files) *and* a content scan that fails the export closed on secret patterns; `.mcp.json` is placeholder-only by construction, and `.mcp.json` writes reject literal credential values (only `${VAR}` placeholders allowed). Tests assert the fail-closed path and absence of `encryptedApiKey`/token material.
- **Blocking playground/test-runs until the agent is configured adds first-run friction** → mitigated by prominent empty states deep-linking to Settings → Agent.
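
The fail-closed content scan mentioned under the export risk might look like the sketch below. The patterns are illustrative assumptions, not the real rule set, and `assertNoSecrets` is a hypothetical name.

```typescript
// Illustrative patterns only; a real scan would need a broader, reviewed list.
const SECRET_PATTERNS: RegExp[] = [
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/,
  /\bsk-[A-Za-z0-9_-]{16,}/,
  /encryptedApiKey/,
];

// Throws on the first export that contains any match, so nothing is written.
function assertNoSecrets(files: Map<string, string>): void {
  const hits: string[] = [];
  files.forEach((content, path) => {
    if (SECRET_PATTERNS.some((pattern) => pattern.test(content))) hits.push(path);
  });
  if (hits.length > 0) {
    throw new Error(`export blocked, possible secrets in: ${hits.join(", ")}`);
  }
}
```

Failing the whole export rather than silently dropping the offending file makes a false positive visible and keeps a partially scrubbed archive from being mistaken for a safe one.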

## Migration Plan

1. Ship schema migration `00X-drop-test-agents`: backfill `TestRun.testAgentName` from each run's `testAgent.name`, then soft-delete all `TestAgent` docs and `$unset` `TestCase.testAgent`; otherwise leave `TestRun` documents intact.


Migration omits playground backfill

High Severity

The migration plan for dropping test agents never backfills `playground: true` on legacy playground conversations, though D6 requires Builder and Agent histories to partition on that flag. Legacy chats with a `testAgent` reference would appear under Build and stay out of Agent history after `testAgent` is removed.
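
One possible shape for the missing backfill, shown on in-memory documents. It assumes every legacy playground conversation carries a non-null `testAgent` reference, which the old `testAgent: null` Builder filter suggests; `LegacyConversation` and `backfillPlaygroundFlag` are hypothetical names.

```typescript
interface LegacyConversation {
  id: string;
  testAgent?: string | null;
  playground?: boolean;
}

// Marks conversations that referenced a test agent as playground chats.
// Documents that already carry the flag are returned unchanged, so re-runs are safe.
function backfillPlaygroundFlag(conversations: LegacyConversation[]): LegacyConversation[] {
  return conversations.map((conversation) =>
    conversation.testAgent != null && conversation.playground === undefined
      ? { ...conversation, playground: true }
      : conversation,
  );
}
```

This step would have to run before `testAgent` references are cleared, since afterwards the two kinds of conversation can no longer be told apart.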


Reviewed by Cursor Bugbot for commit d400443.

2. Ship filesystem migration `migrate-data-models-layout.ts` (replaces `migrate-src-layout.ts`): on startup, for any project dir lacking `data_models/`, move model YAMLs into `data_models/` from the legacy `src/` directory (or, for very old projects, from the project root). `uploads/` and scaffold dirs are left in place. Idempotent and safe to re-run.
3. Ship startup `build/` cleanup: on startup, recursively remove any `build/` directory under each project dir (idempotent; mirrors the existing startup `AGENTS.md` cleanup). Drop `build/` from the `.gitignore` template.
4. Seed `.mcp.json` for existing projects lazily (on first builder-agent start or settings save) and for new projects on creation.
5. Frontend redirects: `/testing/playground → /agent`, `/testing/agents → /settings/agent`, `/connections/data → /connections?tool=browser`, `/connections/console → /connections?tool=console`, `/data → /connections?tool=browser`.
6. Rollback: the TestAgent migration is destructive by user decision; the `data_models/` move is reversible by moving files back to `src/`; the `build/` cleanup is non-destructive to source (build was a derived artifact). Rollback otherwise restores routes/UI only.
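
The layout migration in step 2 can be sketched as a pure planning function that a startup task would then execute with real filesystem moves. `planDataModelsMove` is a hypothetical name, and the sketch assumes "very old projects" are the ones with no YAML under `src/`.

```typescript
interface Move {
  from: string;
  to: string;
}

const isYaml = (path: string): boolean => /\.ya?ml$/i.test(path);

// `files` are project-relative paths; `hasDataModelsDir` reflects the current disk state.
function planDataModelsMove(files: string[], hasDataModelsDir: boolean): Move[] {
  // Already migrated: nothing to do, which is what makes re-runs safe.
  if (hasDataModelsDir) return [];

  const fromSrc = files.filter((path) => path.startsWith("src/") && isYaml(path));
  if (fromSrc.length > 0) {
    return fromSrc.map((path) => ({
      from: path,
      to: "data_models/" + path.slice("src/".length),
    }));
  }

  // Fallback for very old projects: model YAMLs sitting at the project root.
  return files
    .filter((path) => !path.includes("/") && isYaml(path))
    .map((path) => ({ from: path, to: "data_models/" + path }));
}
```

Rollback is the same plan in reverse: moving each `to` back to its `from` restores the `src/` layout.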

## Open Questions

- Should the scaffold export embed a generated `README.md` describing harness setup? (Lean yes, deferred to implementation detail of the export task.)
- Whether `uploads/` should optionally be includable in the export for harnesses that want source documents — excluded for now.