44 commits
28fc344
feat(platform-server): add teams grpc client
casey-brooks Mar 9, 2026
6d65d28
refactor(platform-server): streamline teams grpc client
casey-brooks Mar 9, 2026
c99d79d
fix(proto): drop local teams schema
casey-brooks Mar 9, 2026
2d528bd
fix(proto): split teams grpc defs
casey-brooks Mar 9, 2026
333bdba
refactor(grpc): migrate runner clients
casey-brooks Mar 11, 2026
44b78e1
fix(tests): close runner sessions
casey-brooks Mar 11, 2026
86078f0
fix(runner): stabilize exec teardown
casey-brooks Mar 11, 2026
4dc3ad5
chore(platform-server): update devspace (#1386)
casey-brooks Mar 12, 2026
ad7043f
feat(platform-server): add teams grpc client
casey-brooks Mar 9, 2026
2cc14c4
chore(platform-server): update devspace (#1386)
casey-brooks Mar 12, 2026
4f0e6c0
feat(platform-server): add teams grpc client
casey-brooks Mar 9, 2026
3ab9fc3
feat(platform-ui): migrate entity CRUD
casey-brooks Mar 12, 2026
a561b87
fix(platform-server): align teams grpc types
casey-brooks Mar 12, 2026
779ecec
chore(storybook): drop removed graph story
casey-brooks Mar 12, 2026
eef20bc
chore(platform-server): update devspace (#1386)
casey-brooks Mar 12, 2026
6967444
feat(platform-server): add teams grpc client
casey-brooks Mar 9, 2026
e88f9ae
chore(platform-server): update devspace (#1386)
casey-brooks Mar 12, 2026
de61dcf
feat(platform-server): add teams grpc client
casey-brooks Mar 9, 2026
c9794e9
fix(platform-ui): align team api contracts
casey-brooks Mar 12, 2026
ada29bb
fix(platform-ui): cap team page size
casey-brooks Mar 13, 2026
3070c98
feat(platform-server): add teams grpc client
casey-brooks Mar 9, 2026
e627e3f
feat(platform-server): integrate Teams graph
casey-brooks Mar 13, 2026
993c3c1
test(platform-server): add Teams graph tests
casey-brooks Mar 13, 2026
5f2615e
refactor(platform): remove docker-runner (#1399)
casey-brooks Mar 14, 2026
619b7dc
feat(platform-server): add teams grpc client
casey-brooks Mar 9, 2026
bc73fbf
refactor(grpc): migrate runner clients
casey-brooks Mar 11, 2026
914aec0
fix(runner): stabilize exec teardown
casey-brooks Mar 11, 2026
b1e1b4f
chore(platform-server): update devspace (#1386)
casey-brooks Mar 12, 2026
c39d234
feat(platform-server): add teams grpc client
casey-brooks Mar 9, 2026
161a2dd
feat(platform-ui): migrate entity CRUD
casey-brooks Mar 12, 2026
de33b66
chore(platform-server): update devspace (#1386)
casey-brooks Mar 12, 2026
659a922
fix(platform-ui): align team api contracts
casey-brooks Mar 12, 2026
b94bf68
refactor(graph): switch to Teams persistence
casey-brooks Mar 15, 2026
e7941b1
fix(docker-runner): restore sources
casey-brooks Mar 15, 2026
142612e
fix(graph): align teams snapshot updates
casey-brooks Mar 15, 2026
fc4a327
refactor(graph): drop full graph persistence
casey-brooks Mar 15, 2026
7613480
test(platform-server): update memory docs mocks
casey-brooks Mar 15, 2026
67dc925
fix(platform-server): align proto request names
casey-brooks Mar 15, 2026
3691491
fix(ui): resolve graph typecheck errors
casey-brooks Mar 15, 2026
165c8d4
refactor(graph): prune builder UI and repos
casey-brooks Mar 16, 2026
fe2998c
chore(storybook): remove graph stories
casey-brooks Mar 16, 2026
2d5f692
chore(prisma): remove graph state migration
casey-brooks Mar 16, 2026
c73aa71
chore: merge main
casey-brooks Mar 16, 2026
cd71d43
chore: update pnpm lockfile
casey-brooks Mar 16, 2026
7 changes: 2 additions & 5 deletions README.md
@@ -179,11 +179,9 @@ Key environment variables (server) from packages/platform-server/.env.example an
- LITELLM_MASTER_KEY — admin key for LiteLLM
- Optional LLM:
- OPENAI_API_KEY, OPENAI_BASE_URL
- Graph store:
- GRAPH_REPO_PATH (default ./data/graph)
- Graph metadata:
- GRAPH_BRANCH (default main)
- GRAPH_AUTHOR_NAME, GRAPH_AUTHOR_EMAIL (deprecated; retained for compatibility)
- GRAPH_LOCK_TIMEOUT_MS (default 5000)
- Vault:
- VAULT_ENABLED (default false), VAULT_ADDR (default http://localhost:8200), VAULT_TOKEN (default dev-root)
- Workspace/Docker:
@@ -289,7 +287,7 @@ pnpm --filter @agyn/platform-server run prisma:generate

## API Docs
- See docs/api/index.md for current HTTP and socket endpoints:
- /api/templates, /api/graph, /graph/templates, /graph/nodes/:nodeId/status, /api/agents/runs/:runId/events, /api/agents/context-items, dynamic-config schema, Vault proxy routes, Nix proxy routes, socket events.
- /api/templates, /graph/templates, /graph/nodes/:nodeId/status, /graph/nodes/:nodeId/discover-tools, /api/graph/variables, /api/agents/runs/:runId/events, /api/agents/context-items, Vault proxy routes, Nix proxy routes, socket events.
- No OpenAPI/Swagger spec checked in; discover via docs/api/index.md and controllers under packages/platform-server/src/graph/ (GraphApiModule wiring).

## Deployment
@@ -346,5 +344,4 @@ Secrets handling:
- Production Vault: dev auto-init script (vault/auto-init.sh) is not suitable; confirm production secret management approach and policies.
- UI Storybook deployment: CI builds and smoke-tests Storybook, but no public hosting config is present. Confirm desired publishing workflow.
- NCPS in production: ops/k8s manifests are examples; confirm production deployment/monitoring design.
- Filesystem-backed graph store (GRAPH_REPO_PATH=./data/graph, GRAPH_BRANCH=main) assumes the path is writable and durable. Confirm persistence strategy in production (persistent volumes/NFS) and keep legacy git repos out of the configured path; the server now reads/writes directly to the working tree without migrations.
- Confirm whether the general postgres service (5442) is used by other components or is purely for convenience; server uses agents-db (5443).
8 changes: 8 additions & 0 deletions buf.gen.yaml
@@ -8,3 +8,11 @@ plugins:
out: packages/platform-server/src/proto/gen
opt:
- target=ts
- plugin: buf.build/bufbuild/es
out: packages/docker-runner/src/proto/gen
opt:
- target=ts
- plugin: buf.build/bufbuild/connect-es
out: packages/docker-runner/src/proto/gen
opt:
- target=ts
8 changes: 4 additions & 4 deletions docs/README.md
@@ -2,7 +2,7 @@

How these docs are organized
- Product specification: end-to-end features, behaviors, and operations.
- API and graph store: HTTP/Socket APIs and persistence internals.
- API and graph sources: HTTP/Socket APIs and persistence internals.
- Containers and security: workspace lifecycle and secret handling.
- Observability and UI: traces, spans, and the graph builder.
- Contributing and ADRs: internal engineering references.
@@ -12,8 +12,8 @@ Index
- API Reference: [api/index.md](api/index.md)
- Realtime notifications: socket payloads are published via the notifications service and served by notifications-gateway when running docker-compose.yml.
- Graph
- Filesystem Store: [graph/fs-store.md](graph/fs-store.md)
- Status Updates: [graph/status-updates.md](graph/status-updates.md)
- Teams graph source & status updates: [graph/status-updates.md](graph/status-updates.md)
- Legacy filesystem store (deprecated): [graph/fs-store.md](graph/fs-store.md)
- Containers
- Workspaces: [containers/workspaces.md](containers/workspaces.md)
- Env Overlays: [config/env-overlays.md](config/env-overlays.md)
@@ -37,4 +37,4 @@ Index
Slack integration
- For Slack-triggered flows and outbound messages, see:
- Secrets and tokens: [security/vault.md](security/vault.md)
- Graph UI templates (SlackTrigger, SendSlackMessageTool): [ui/graph/README.md](ui/graph/README.md)
- Graph UI templates (SendSlackMessageTool): [ui/graph/README.md](ui/graph/README.md)
49 changes: 8 additions & 41 deletions docs/api/index.md
@@ -14,47 +14,21 @@ Templates
curl http://localhost:3010/api/templates
```

Graph state (filesystem-backed)
- GET `/api/graph`
- 200: Persisted graph document: `{ name, version, updatedAt, nodes, edges }`
- Example:
```bash
curl http://localhost:3010/api/graph
```
- POST `/api/graph`
- Body: `PersistedGraphUpsertRequest` → `{ name='main', version, nodes, edges }`
- Headers (optional): `x-graph-author-name`, `x-graph-author-email` are retained for compatibility but no longer influence persistence.
- Success: returns updated persisted graph `{ name, version, updatedAt, nodes, edges }`
- Errors (status → body):
- 409 `{ error: 'VERSION_CONFLICT', current?: PersistedGraph }`
- 409 `{ error: 'LOCK_TIMEOUT' }`
- 409 `{ error: 'MCP_COMMAND_MUTATION_FORBIDDEN' }` (enum value GraphErrorCode.McpCommandMutationForbidden)
- 500 `{ error: 'PERSIST_FAILED' }`
- 400 `{ error: 'Bad Request' | string }` (includes deterministic edge check; see notes)
- Notes:
- A provided `edge.id` must match the deterministic id `${source}-${sourceHandle}__${target}-${targetHandle}`. If it doesn't, the server returns `400` with `{ error: 'Edge id mismatch: expected <id> got <id>' }`.
- Persistence failures surface as `500 { error: 'PERSIST_FAILED' }`.
- Lock acquisition timeout surfaces as `409 { error: 'LOCK_TIMEOUT' }`.
- Example:
```bash
curl -X POST http://localhost:3010/api/graph \
-H 'content-type: application/json' \
-H 'x-graph-author-name: Jane Dev' \
-H 'x-graph-author-email: jane@example.com' \
-d '{"name":"main","version":1,"nodes":[],"edges":[]}'
```

Templates alias
- GET `/graph/templates` → same as `/api/templates`

Node status and actions
- GET `/graph/nodes/:nodeId/status`
- 200: `{ isPaused?, provisionStatus?, dynamicConfigReady? }`
- 200: `{ provisionStatus? }`
- POST `/graph/nodes/:nodeId/actions`
- Body: `{ action: 'pause'|'resume'|'provision'|'deprovision' }`
- Body: `{ action: 'provision'|'deprovision' }`
- 204: no body on success; server also emits a `node_status` socket event
- 400 `{ error: 'unknown_action' }`
- 500 `{ error: string }`
- POST `/graph/nodes/:nodeId/discover-tools`
- 200 `{ tools: Array<{ name: string; description?: string }>, updatedAt?: string }`
- 400 `{ error: 'node_not_mcp' }`
- 404 `{ error: 'node_not_found' }`

Agent runs timeline
- GET `/api/agents/runs/:runId/events`
@@ -75,12 +49,6 @@ Context items
- 200 `{ items: Array<{ id, role, contentText, contentJson, metadata, sizeBytes, createdAt }> }`
- Empty `ids` returns `{ items: [] }`.

Dynamic-config schema (read-only)
- GET `/graph/nodes/:nodeId/dynamic-config/schema`
- 200: `{ ready: boolean, schema?: JSONSchema }`
- 404: `{ error: 'node_not_found' }`
- 500: `{ error: 'dynamic_config_schema_error' | string }`

Vault proxy (enabled only when VAULT_ENABLED=true)
- GET `/api/vault/mounts` → `{ items: string[] }`
- GET `/api/vault/kv/:mount/paths?prefix=` → `{ items: string[] }`
@@ -93,13 +61,12 @@ Vault proxy (enabled only when VAULT_ENABLED=true)

Sockets
- Default namespace (no custom path)
- Event `node_status`: `{ nodeId, isPaused?, provisionStatus?, dynamicConfigReady?, updatedAt }`
- Event `node_config`: `{ nodeId, config, dynamicConfig, version }` (emitted after successful /api/graph save with changes)
- Event `node_status`: `{ nodeId, provisionStatus?, updatedAt? }`
- See docs/graph/status-updates.md and docs/ui/graph/index.md

Notes
- Route handlers surface structured errors and emit socket events on state changes.
- The filesystem store enforces deterministic edge IDs and uses a dataset-scoped file lock plus atomic writes.
- Graph snapshots are sourced from the Teams service; the platform no longer exposes `/api/graph`.
- MCP mutation guard prevents unsafe changes to MCP commands.
- Error codes align with the error envelope described above.
Nix proxy
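The node endpoints that remain after this change to docs/api/index.md can be sketched in TypeScript as follows. This is a minimal illustration, not code from the repository: the helper names (`nodeStatusPath`, `buildNodeActionRequest`) and the `NodeAction` and `NodeActionRequest` types are hypothetical. Only the routes, the `{ action }` body shape, and the `unknown_action` error string come from the documented API.

```typescript
// Hypothetical helper types; only the routes and body shape come from the docs.
type NodeAction = 'provision' | 'deprovision';

interface NodeActionRequest {
  method: 'POST';
  path: string;
  headers: { 'content-type': string };
  body: string;
}

// Actions the updated docs still list; 'pause' and 'resume' were removed.
const SUPPORTED_ACTIONS: ReadonlyArray<string> = ['provision', 'deprovision'];

function isNodeAction(value: string): value is NodeAction {
  return SUPPORTED_ACTIONS.indexOf(value) !== -1;
}

// Path for GET /graph/nodes/:nodeId/status.
function nodeStatusPath(nodeId: string): string {
  return `/graph/nodes/${encodeURIComponent(nodeId)}/status`;
}

// Describes a POST /graph/nodes/:nodeId/actions request without sending it.
function buildNodeActionRequest(nodeId: string, action: string): NodeActionRequest {
  if (!isNodeAction(action)) {
    // Client-side mirror of the documented 400 { error: 'unknown_action' }.
    throw new Error('unknown_action');
  }
  return {
    method: 'POST',
    path: `/graph/nodes/${encodeURIComponent(nodeId)}/actions`,
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ action }),
  };
}
```

Keeping the request description separate from the transport lets a caller pass the result to whichever HTTP client it already uses, and rejects `pause` or `resume` before a round trip to the server.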
2 changes: 1 addition & 1 deletion docs/contributing/style_guides.md
@@ -46,7 +46,7 @@ Our repo currently uses:
- Prefer functional, pure modules. Side effects live in service classes.

### Node.js server
- Keep services injectable and stateless. IO is abstracted behind services (e.g., PrismaService). Note: Slack no longer uses a global service; Slack integration is configured per node (see SlackTrigger and SendSlackMessageTool static configs).
- Keep services injectable and stateless. IO is abstracted behind services (e.g., PrismaService). Note: Slack integration is configured per node (see SendSlackMessageTool static configs).
- Configuration comes from `ConfigService` reading env. No direct `process.env` reads inside business logic.
- Log with structured messages. Avoid console.log in code; use Nest's `Logger` (per-class instance).
- Graceful shutdown handlers must close external connections.
4 changes: 4 additions & 0 deletions docs/graph/fs-store.md
@@ -1,5 +1,9 @@
# Filesystem-backed Graph Store (format: 2)

> **Deprecated:** Filesystem-backed graph persistence has been removed. The platform now sources graph
> configuration from the Teams service and `/api/graph` has been removed. This document is retained for
> legacy reference only.

Overview
- Graph persistence writes directly to the filesystem under `GRAPH_REPO_PATH` (default `./data/graph`).
- The layout matches the legacy git working tree: `graph.meta.yaml`, `nodes/`, `edges/`, and `variables.yaml` live at the path root.
41 changes: 15 additions & 26 deletions docs/graph/status-updates.md
@@ -7,15 +7,13 @@ Transport: socket.io
- Payload:
{
nodeId: string,
isPaused?: boolean,
provisionStatus?: { state: string; [k: string]: unknown },
dynamicConfigReady?: boolean,
updatedAt: string
updatedAt?: string
}

Client guidance
- Connect to the default namespace and subscribe to `node_status`.
- Server emits `node_status` for relevant changes: pause/resume, provision status updates, dynamic-config readiness.
- Server emits `node_status` for provision status changes.
- Initial render can still use HTTP GET /graph/nodes/:nodeId/status; subsequent updates should come via socket.io push.

Example (client)
@@ -27,38 +25,29 @@ socket.on('connect', () => {
});

socket.on('node_status', (payload) => {
// { nodeId, isPaused?, provisionStatus?, dynamicConfigReady?, updatedAt }
// { nodeId, provisionStatus?, updatedAt }
updateUI(payload);
});

Notes
- HTTP endpoints remain for actions (pause/resume, provision/deprovision) and configuration updates.
- HTTP endpoints remain for actions (provision/deprovision).
- Remove any polling loops (e.g., 2s intervals) for status; rely on socket events.

Config persistence
- Graph configuration changes persist via POST /api/graph (full-graph updates).
- The per-node dynamic-config save endpoint was removed; only the schema endpoint remains for rendering purposes.
Graph source and persistence
- Graph configuration is sourced from the Teams service; the platform no longer exposes a `/api/graph` snapshot endpoint.
- UI edits to layout are local-only; the backend does not accept full-graph writes.
- Node state is not persisted; node status reflects runtime provisioning only.
- Graph variables are managed via the Teams service and exposed via `/api/graph/variables`.
- MCP tool lists refresh via `POST /api/graph/nodes/:nodeId/discover-tools`.

## Template Capabilities & Static Config (Updated)
## Template Schema (Updated)

Each template now advertises its capabilities and optional static configuration schema via the `/api/templates` and `/graph/templates` endpoints. UI palette entries can introspect:
The `/api/templates` and `/graph/templates` endpoints return the palette schema:

- `capabilities.pausable`: Node supports pause/resume (triggers, agents).
- `capabilities.provisionable`: Node exposes provision/deprovision lifecycle (Slack trigger, MCP server).
- `capabilities.staticConfigurable`: Node accepts an initial static config that is applied through `setConfig` (agent, container provider, call_agent tool, MCP server).
- `capabilities.dynamicConfigurable`: Node exposes a dynamic runtime config surface (MCP server tool enable/disable) once `dynamicConfigReady` is true.
- `name`, `title`, `kind`
- `sourcePorts`, `targetPorts`

Static config schemas (all templates now expose one – some are currently empty placeholders to allow forward-compatible UI forms):
- `simpleAgent`: title, systemPrompt, summarization options.
- `containerProvider`: image, env map.
- `callAgentTool`: description, name override.
- `mcpServer`: namespace, command, workdir, timeouts, restart strategy.
- `shellTool`: (empty object for now).
- `githubCloneRepoTool`: (empty object for now).
- `sendSlackMessageTool`: (empty object for now).
- `slackTrigger`: debounceMs, waitForBusy (note: presently setConfig is a no-op; values must be supplied at creation time until runtime reconfiguration is implemented).

Dynamic config (currently only MCP server) becomes available after initial tool discovery; UI should check `dynamicConfigReady` before rendering its form.
Capability flags and config schemas are not included in the palette response.

Wiring timing and run state visibility
- During server bootstrap, globalThis.liveGraphRuntime and globalThis.__agentRunsService must be assigned before applying any persisted graph to the runtime.
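The trimmed `node_status` payload from docs/graph/status-updates.md could be folded into client state along these lines. This is a sketch under stated assumptions: `applyNodeStatus` and `NodeStatusState` are hypothetical names, `updatedAt` is assumed to be an ISO 8601 timestamp, and dropping older events is an illustrative policy rather than documented behavior.

```typescript
// Payload shape as documented for the `node_status` event after this change.
interface ProvisionStatus {
  state: string;
  [k: string]: unknown;
}

interface NodeStatusPayload {
  nodeId: string;
  provisionStatus?: ProvisionStatus;
  updatedAt?: string;
}

// Hypothetical client-side store keyed by nodeId.
type NodeStatusState = Record<string, NodeStatusPayload>;

function applyNodeStatus(state: NodeStatusState, payload: NodeStatusPayload): NodeStatusState {
  const previous: NodeStatusPayload | undefined = state[payload.nodeId];

  // Assumption: when both sides carry a timestamp, an older event is stale.
  if (
    previous && previous.updatedAt && payload.updatedAt &&
    Date.parse(payload.updatedAt) < Date.parse(previous.updatedAt)
  ) {
    return state;
  }

  // Fields missing from the payload keep their previous value.
  const merged: NodeStatusPayload = {
    nodeId: payload.nodeId,
    provisionStatus:
      payload.provisionStatus !== undefined
        ? payload.provisionStatus
        : previous ? previous.provisionStatus : undefined,
    updatedAt:
      payload.updatedAt !== undefined
        ? payload.updatedAt
        : previous ? previous.updatedAt : undefined,
  };

  // Return a new object so UI frameworks can detect the change by reference.
  return { ...state, [payload.nodeId]: merged };
}
```

The same function can take the result of the initial HTTP GET `/graph/nodes/:nodeId/status` (with `nodeId` added by the caller) and each later socket event, so the UI has a single update path.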
17 changes: 8 additions & 9 deletions docs/product-spec.md
@@ -33,9 +33,9 @@ Architecture and components
- Checkpointing via Postgres (default); streaming UI integration planned.
- Server
- HTTP APIs and Socket.IO for management and status streaming.
- Endpoints manage graph templates, graph state, node lifecycle/actions, dynamic-config schema, reminders, runs, vault proxy, and Nix proxy (when enabled).
- Endpoints manage graph templates, graph snapshots, node lifecycle/actions, MCP tool discovery, reminders, runs, vault proxy, variable CRUD, and Nix proxy (when enabled).
- Persistence
- Graph store: filesystem dataset (format: 2) with deterministic edge IDs, dataset-level file locks, and staged working-tree swaps. Each upsert builds a full graph tree in a sibling directory, fsyncs it, and atomically swaps it into place (conflict/timeout/persist error modes preserved).
- Graph store: Teams service snapshot via gRPC; platform-server keeps no local graph persistence.
- Container registry: Postgres table of workspace lifecycle and TTL; cleanup service with backoff.
- Containers and workspace runtime
- Workspaces via container provider; labeled hautech.ai/role=workspace and hautech.ai/thread_id; optional hautech.ai/platform for platform-aware reuse. Networking is managed by the runner. Optional DinD sidecar with DOCKER_HOST=tcp://localhost:2375. Optional HTTP-only registry mirror reachable at http://registry-mirror:5000.
@@ -54,11 +54,11 @@ Features and capabilities

Core data model and state
- Graph
- Nodes: typed by template; static config schema; capabilities (pausable, provisionable, static/dynamic configurable).
- Nodes: typed by template; configs applied via setConfig; template metadata provides kind and ports.
- Edges: deterministic IDs; reversible by PortsRegistry knowledge.
- Version: monotonically increasing; optimistic locking on apply; single graph “main”.
- Runtime status
- Per-node paused flag; provision status (not_ready, provisioning, ready, deprovisioning, error); per-node dynamic-config readiness.
- Per-node provision status (not_ready, provisioning, ready, deprovisioning, provisioning_error, deprovisioning_error).
- Containers
- container_id, node_id, thread_id, image, status, last_used_at, kill_after_at, termination_reason, metadata.labels, metadata.platform, metadata.ttlSeconds.
- Observability
@@ -95,10 +95,10 @@ Performance and scale
- Observability storage relies on Postgres; add indices on spans by nodeId, traceId, timestamps.

Upgrade and migration
- Graph store now writes directly to the working tree at `GRAPH_REPO_PATH` using staged swaps; legacy git guards and migration tooling have been removed. Ensure any old `.git` directories are deleted or copied elsewhere before pointing the server at the path.
- Graph configuration and variables are sourced from the Teams service via gRPC; node state is in-memory only; filesystem-backed graph storage is retired.
- UI dependency on change streams is retired alongside Mongo.
- MCP heartbeat/backoff planned; non-breaking once added.
- See: docs/graph/fs-store.md
- See: docs/graph/fs-store.md (legacy filesystem format reference)

Configuration matrix (server env vars)
- Required
@@ -111,9 +111,8 @@ Configuration matrix (server env vars)
- Optional
- LLM_PROVIDER (defaults to `litellm`; set to `openai` to bypass LiteLLM). Other values are rejected.
- LiteLLM tuning: LITELLM_KEY_ALIAS (default `agents/<env>/<deployment>`), LITELLM_KEY_DURATION (`30d`), LITELLM_MODELS (`all-team-models`)
- GRAPH_REPO_PATH (default ./data/graph)
- GRAPH_BRANCH (default main)
- GRAPH_AUTHOR_NAME / GRAPH_AUTHOR_EMAIL
- TEAMS_SERVICE_ADDR (Teams gRPC endpoint for graph source)
- GRAPH_BRANCH / GRAPH_AUTHOR_NAME / GRAPH_AUTHOR_EMAIL (legacy; ignored)
- VAULT_ENABLED: true|false (default false)
- VAULT_ADDR, VAULT_TOKEN
- DOCKER_MIRROR_URL (default http://registry-mirror:5000)
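The product spec keeps "Edges: deterministic IDs" in the data model. A minimal TypeScript sketch of that rule follows, using the id format and mismatch message from the `/api/graph` notes that this PR removes from docs/api/index.md. The names `EdgeEndpoints`, `deterministicEdgeId`, and `checkEdgeId` are hypothetical, and whether the Teams-backed path applies the same check is not stated in the diff.

```typescript
// Hypothetical type describing the four fields that determine an edge id.
interface EdgeEndpoints {
  source: string;
  sourceHandle: string;
  target: string;
  targetHandle: string;
}

// Format taken from the removed /api/graph notes:
// `${source}-${sourceHandle}__${target}-${targetHandle}`
function deterministicEdgeId(edge: EdgeEndpoints): string {
  return `${edge.source}-${edge.sourceHandle}__${edge.target}-${edge.targetHandle}`;
}

// Returns an error message when a provided id disagrees with the derived one,
// or null when the id is absent or matches.
function checkEdgeId(edge: EdgeEndpoints & { id?: string }): string | null {
  const expected = deterministicEdgeId(edge);
  if (edge.id !== undefined && edge.id !== expected) {
    return `Edge id mismatch: expected ${expected} got ${edge.id}`;
  }
  return null;
}
```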
4 changes: 2 additions & 2 deletions docs/security/vault.md
@@ -20,15 +20,15 @@ Workspace env vars
- `env: Array<{ key: string; value: string | SecretRef | VariableRef }>`
- Plain strings are injected verbatim.
- Vault references use `{ kind: 'vault', path: 'services/slack', key: 'BOT_TOKEN', mount?: 'secret' }`.
- Graph variable references use `{ kind: 'var', name: 'SLACK_BOT_TOKEN', default?: 'fallback' }`.
- Graph variable references use `{ kind: 'var', name: 'SLACK_BOT_TOKEN', default?: 'fallback' }` (resolved via Teams-managed graph variables).
- On provision, the server resolves vault-backed entries and injects values into the container environment.
- Legacy compatibility removed: envRefs is no longer supported. Providing envRefs will fail validation. A legacy plain env map may still be accepted by the server for convenience, but new configurations should use the array form.

GitHub Clone Repo auth
- New: `token?: string | SecretRef | VariableRef`
- Plain strings are used directly.
- Vault references are resolved server-side before cloning.
- Variable references allow graph variables to supply tokens.
- Variable references allow Teams-managed graph variables to supply tokens.
- Fallbacks: if not provided or resolution fails, server falls back to `ConfigService.githubToken`.
- Backward compatibility: legacy `authRef` remains supported at runtime but is not shown in templates.

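The env entry shapes described in docs/security/vault.md can be modelled as a discriminated union. The sketch below is illustrative only: `resolveEnv`, the `variables` lookup table, and the `readVault` callback are hypothetical stand-ins for the server's Vault client and the Teams-managed graph variables, and throwing on an unresolved variable is an assumption, not documented behavior.

```typescript
// Reference shapes as written in docs/security/vault.md.
interface SecretRef {
  kind: 'vault';
  path: string;
  key: string;
  mount?: string;
}

interface VariableRef {
  kind: 'var';
  name: string;
  default?: string;
}

interface EnvEntry {
  key: string;
  value: string | SecretRef | VariableRef;
}

// Hypothetical resolver: `variables` stands in for Teams-managed graph
// variables and `readVault` for the server's Vault lookup.
function resolveEnv(
  entries: EnvEntry[],
  variables: Record<string, string>,
  readVault: (ref: SecretRef) => string,
): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const entry of entries) {
    const value = entry.value;
    if (typeof value === 'string') {
      // Plain strings are injected verbatim.
      resolved[entry.key] = value;
    } else if (value.kind === 'vault') {
      resolved[entry.key] = readVault(value);
    } else {
      const found = Object.prototype.hasOwnProperty.call(variables, value.name)
        ? variables[value.name]
        : value.default;
      if (found === undefined) {
        // Assumption: an unresolved variable without a default is an error.
        throw new Error(`unresolved variable: ${value.name}`);
      }
      resolved[entry.key] = found;
    }
  }
  return resolved;
}
```

Branching on `kind` lets the compiler narrow each reference type, so a new reference kind added to the union shows up as a type error in the resolver rather than a silent fallthrough.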