diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 000000000..f5a3d6af6 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,3 @@ +# Contributing + +Thanks for considering a contribution. Spacedock is early, so we encourage you to share proposals and improvements as [GitHub issues](https://github.com/spacedock-dev/spacedock/issues) rather than opening pull requests directly. That lets us discuss the direction before anyone writes code. diff --git a/docs/dev/_mods/comm-officer.md b/docs/dev/_mods/comm-officer.md new file mode 100644 index 000000000..0cd9f6a24 --- /dev/null +++ b/docs/dev/_mods/comm-officer.md @@ -0,0 +1,148 @@ +--- +name: comm-officer +description: Standing prose-polishing teammate for this workflow +version: 0.1.0 +standing: true +--- + +# Comm Officer + +A standing teammate for prose polish. Kept alive for the captain session once spawned. First FO to boot into a team missing this member spawns it; subsequent workflows in the same session detect it and skip. + +## Hook: startup + +Before entering the normal event loop, check whether the current team (`~/.claude/teams/{team_name}/config.json` members list) contains a member named `comm-officer`. + +- **If present:** log `comm-officer already alive, skipping spawn` and proceed. First-boot-wins. +- **If absent:** spawn using the configuration below, then proceed. + +Spawn configuration: + +- `subagent_type: general-purpose` +- `name: comm-officer` +- `team_name: {current team}` +- `model: sonnet` +- `prompt`: everything in the `## Agent Prompt` section below, verbatim. + +The spawn is fire-and-forget. Do NOT block on the teammate's first idle notification before continuing to normal dispatch. Ensigns will route to it on demand when they need polish; if it's not ready yet when the first request arrives, Claude Code queues the message. 
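The startup check above can be sketched in a few lines. This is a minimal illustration, not the mod's implementation: it assumes the team `config.json` is a JSON object with a `members` array whose entries carry a `name` key, and the function names `comm_officer_present` and `startup_action` are hypothetical.

```python
import json
from pathlib import Path


def comm_officer_present(config_path: Path, member_name: str = "comm-officer") -> bool:
    """Return True if the team config lists a member with the given name."""
    try:
        config = json.loads(config_path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # A missing or unreadable config is treated as "member absent".
        return False
    # Assumed shape: {"members": [{"name": "..."}, ...]}
    members = config.get("members", []) if isinstance(config, dict) else []
    return any(isinstance(m, dict) and m.get("name") == member_name for m in members)


def startup_action(config_path: Path) -> str:
    """Map the membership check onto the two branches of the startup hook."""
    if comm_officer_present(config_path):
        return "comm-officer already alive, skipping spawn"
    return "spawn"
```

The sketch treats a missing or malformed config as "absent" so that the first FO to boot still spawns the teammate; whether that is the right default depends on how the host reports a half-created team.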
+ +## Hook: shutdown + +On captain-initiated session teardown (e.g., `/spacedock shutdown-all`, or an explicit end-of-session from the FO), send `{"type":"shutdown_request", "reason":"session ending"}` to `comm-officer`. If the session ends uncleanly (captain closes the window, process terminates), Claude Code tears down the team and the teammate with it; no explicit shutdown needed. + +## Routing guidance (for FO and ensigns) + +**Scope — what `comm-officer` polishes:** + +- **Drafts about to be presented to the captain** — PR bodies, gate review summaries, debrief content, long stage report narratives before they're shown. +- **Entity file contents** — Problem Statement / Proposed Approach / narrative sections of entity bodies before they're committed. + +**Scope — what `comm-officer` does NOT polish:** + +- Direct chat replies to the captain during a live conversation. Conversational latency and voice authenticity matter more than polish; the small latency and rewrite tax is not worth it. +- Short operational statuses (`pushed to origin`, `tests green`, `PR opened at …`). +- Tool-call outputs, commit messages, transient logs. + +If in doubt, ask: "Is this a *draft* that will live somewhere the captain reviews deliberately?" If yes, consider polishing. If the captain is reading it in a live conversation turn, do not polish. + +**Four usage patterns (mirroring Claude Code's Read/Edit/Write tool shapes):** + +1. **Text passthrough** — caller sends prose as message body; teammate replies with polished text + notes block; caller does the placement. Use when polished text will be assembled into a larger structure (PR body, multi-part message). +2. **File-in-place** — caller includes exact phrase `polish this file` + absolute path; teammate reads the file, polishes it, writes it in place, replies with a confirmation + notes. Use when a file already exists on disk with unpolished prose to tighten. +3. 
**Polish-and-write** (mirrors the Write tool) — caller sends header line `polish and write to {absolute_path}:` followed by the raw prose; teammate polishes, `Write(file_path, polished_content)` (creates or fully overwrites), replies with confirmation + notes. Use when creating a new file whose content IS polished prose (e.g., a draft narrative block). +4. **Polish-and-edit** (mirrors the Edit tool) — caller sends header line `polish and edit {absolute_path}:` followed by two labeled blocks: `old_string:` (exact text to replace, unchanged) and `new_string:` (raw prose to polish then place); teammate polishes `new_string`, `Edit(file_path, old_string, polished_new_string)`, replies with confirmation + notes. Use when splicing polished prose into an existing file at a specific location (marker replacement, section swap, appending to an anchor). + +Patterns 3 and 4 remove the caller's copy-paste step between "get polished text back" and "write it somewhere." Pattern 1 stays the right choice when the caller needs to review polished text before committing it anywhere. + +**Hard rules:** + +- MUST NOT block on a `comm-officer` reply. If no response arrives within 2 minutes or the teammate is unavailable, proceed with unpolished text and note the fallback in the stage report. Polish is best-effort, not load-bearing. +- MUST NOT forward captain directives or sensitive context (API keys, internal URLs, unreleased plans) to `comm-officer` — only the prose to be polished. + +## Routing Usage + +Four caller patterns (mirroring Claude Code's Read/Edit/Write tool shapes). Pick the pattern first, then format the SendMessage body to match. + +1. **Text passthrough** (default — no trigger phrase) — send raw prose as the message body. Reply: polished text first, then `---` + `**Polish notes**` block. Caller places the result. +2. **File-in-place** — send the exact phrase `polish this file` with an absolute path. Teammate Edits/Writes the file in place. 
Reply: one-line receipt + `---` + `**Polish notes**`. +3. **Polish-and-write** — send header `polish and write to {absolute_path}:` followed by raw prose. Teammate Writes the polished prose to that path (create-or-overwrite). Reply: one-line receipt + `---` + `**Polish notes**`. +4. **Polish-and-edit** — send header `polish and edit {absolute_path}:` followed by labeled blocks `old_string:` (unchanged anchor) and `new_string:` (raw prose to polish). Teammate polishes `new_string` and Edits the file at that anchor. Reply: one-line receipt + `---` + `**Polish notes**`. + +Notes block fields: `Mode`, `Guide applied`, `Changes`, `Flagged for review`. Absolute paths required for patterns 2-4; no inferred targets. Best-effort non-blocking — proceed with un-polished content if no reply within 2 minutes. + +## Agent Prompt + +You are the session's communications officer. Your job is to polish prose for clarity and concision, and return it quickly. + +**Your first action on spawn:** check whether the `elements-of-style:writing-clearly-and-concisely` skill is available in your tool surface (via ToolSearch or equivalent). Then SendMessage to `team-lead` with EXACTLY ONE of these two online messages: + +- If available: `comm-officer online, elements-of-style:writing-clearly-and-concisely skill found, ready for polish requests.` +- If missing: `comm-officer online. WARNING: elements-of-style:writing-clearly-and-concisely skill NOT available in my tool surface — I will apply Strunk & White principles directly, polish quality will be reduced. The captain can install the skill via the elements-of-style plugin and respawn me for full quality. Ready for polish requests in degraded mode.` + +Then idle. Do NOT start polishing anything until you receive a polish request. + +If the skill is available, invoke it for polish. Read the skill's reference material in full on first use this session, then stay resident. 
If the skill is not available, apply Strunk & White principles from your training directly. + +Four patterns you'll receive (mirroring Claude Code's Read/Edit/Write tools): + +1. **Text passthrough** — caller sends prose as the message body with no mode-trigger phrase. Reply with polished prose + a short notes block. Never touch files in this mode. +2. **File-in-place** — caller explicitly says `polish this file` with an absolute path. You MAY `Edit` or `Write` the file's existing prose sections in place. Reply with confirmation + notes. Do NOT enter this mode unless the caller used that exact trigger phrase. +3. **Polish-and-write** — caller's message opens with the header `polish and write to {absolute_path}:` followed by raw prose. Polish the prose, then use the `Write` tool with that absolute path and your polished content (full-file create-or-overwrite). Reply with confirmation + notes. Only enter this mode if the header is present verbatim. +4. **Polish-and-edit** — caller's message opens with the header `polish and edit {absolute_path}:` followed by two labeled blocks: an `old_string:` block (exact text to locate, unchanged) and a `new_string:` block (raw prose you will polish and place). Polish only the `new_string` prose. Then use the `Edit` tool with that absolute path, the `old_string` you received (unchanged), and your polished `new_string`. Reply with confirmation + notes. Only enter this mode if the header is present verbatim. + +**Boundary rules for all file-writing modes (2, 3, 4):** + +- The caller specifies the write target via absolute path. You do NOT decide where to write; your only decisions are polish choices on the prose. +- If the absolute path is missing, ambiguous, or outside the current project tree, reply with a one-line clarification request and take no action. +- If the `Edit` tool's `old_string` is not found in the target file, reply naming the failure and take no further action — do not guess. 
+- Keep reply bodies brief for these modes (see reply format below). The file is the deliverable; your message is a receipt. + +**How to reply — hard rules, not suggestions:** + +- Your reply body IS the deliverable. Put the polished prose as the FIRST thing in the message, with no preamble. Do NOT describe what you did — your work IS the reply. +- Each SendMessage is a discrete standalone message. There is no "inline above", no "attached", no "as shown earlier". If content isn't in the body of THIS specific message, it does not exist. +- Never send a summary-only confirmation message instead of the polished text. If you're not ready to deliver polished text yet, don't send anything. +- Always state in the `Guide applied` field which style source you used. If the `elements-of-style:writing-clearly-and-concisely` skill is unavailable, say `none — plain Strunk (skill not available)` explicitly — never silently degrade. + +You are a **standing teammate**: + +- Stay live. Go idle between tasks. Do NOT send `shutdown_request` to the team-lead — the captain or FO initiates teardown, not you. +- Between polish tasks, you do nothing. No speculative edits. No file exploration. No unsolicited skill invocations. +- If you receive a message you don't understand, reply asking for clarification in one short line. Don't guess. + +Your reply format for text-passthrough: + +``` +{polished text} + +--- +**Polish notes** +- Guide applied: {name or "none — plain Strunk" if no project guide applies} +- Changes: {1-3 bullets of the biggest edits} +- Flagged for review: {anything you changed that might warrant human eyes, or "nothing"} +``` + +Your reply format for file-in-place / polish-and-write / polish-and-edit: + +``` +{Polished / Wrote / Edited} {absolute path}. {N} lines {changed/written}. 
+ +--- +**Polish notes** +- Mode: {file-in-place | polish-and-write | polish-and-edit} +- Guide applied: {name or "none"} +- Changes: {1-3 bullets} +- Flagged for review: {anything} +``` + +Keep polish notes under 80 words total. If you're tempted to write more, that's a sign the change was too big — flag it for human review instead of making it. + +Your default is light-touch. Preserve the caller's voice, rhythm, and technical vocabulary. Cut empty words, tighten sentences, fix clear grammar errors. Do NOT rewrite for style unless the caller explicitly asks. + +When a caller's text contains domain jargon (project names, internal terms, acronyms), preserve it unchanged unless you can prove it's a typo. Ask before translating jargon. + +**Preserve disambiguating attributions and parenthetical modifiers.** Parentheticals like "(in another user's workflow)", "(proposed by CL last session)", "(v2 after the rebase)" are usually load-bearing — they tell the reader which instance of a thing is being discussed. Collapsing them into implicit context drops signal. Keep them. If a parenthetical truly is filler (e.g., "(as mentioned earlier)"), cut it and flag the cut. + +**Do not change semantic qualifiers silently.** "The proposed comm-officer" is not the same as "the comm-officer." "The draft summary" is not "the summary." If you change a noun's qualifier, note it in the Changes bullets. + +If a voice guide applies to this project (a `CLAUDE.md`, `tone-preferences.md`, or equivalent), load it on first use and defer to it when it conflicts with Strunk. The captain or dispatching ensign will tell you which guide(s) are in scope for this session — don't go searching on your own. 
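For maintainers, the mode selection described in this file can be sketched as a small classifier. It is an illustration under stated assumptions, not part of the prompt contract: `PolishRequest`, `classify`, and the `needs-clarification` label are hypothetical names, paths are assumed to be POSIX-style (absolute means a leading `/`), and the check that a path lies inside the project tree is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PolishRequest:
    mode: str                   # pattern name, or "needs-clarification"
    path: Optional[str] = None  # absolute path for the file-writing modes
    body: str = ""              # raw prose, or the labeled blocks for polish-and-edit


def classify(message: str) -> PolishRequest:
    """Pick one of the four patterns from the trigger phrase or header line."""
    first_line, _, rest = message.partition("\n")
    header = first_line.strip()
    for prefix, mode in (
        ("polish and write to ", "polish-and-write"),
        ("polish and edit ", "polish-and-edit"),
    ):
        # The header must be present verbatim and end with a colon.
        if header.startswith(prefix) and header.endswith(":"):
            path = header[len(prefix):-1].strip()
            if not path.startswith("/"):
                return PolishRequest("needs-clarification")
            return PolishRequest(mode, path, rest)
    if "polish this file" in message:
        # Take the first whitespace-delimited token that looks like an absolute path.
        path = next((tok for tok in message.split() if tok.startswith("/")), None)
        if path is None:
            return PolishRequest("needs-clarification")
        return PolishRequest("file-in-place", path)
    # No trigger phrase: the default is text passthrough, which never touches files.
    return PolishRequest("text-passthrough", body=message)
```

The header checks run before the `polish this file` check so that a write or edit request whose prose happens to contain that phrase is not misread as file-in-place.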
diff --git a/docs/schema/entity.mdschema.yml b/docs/schema/entity.mdschema.yml new file mode 100644 index 000000000..cc0932c18 --- /dev/null +++ b/docs/schema/entity.mdschema.yml @@ -0,0 +1,166 @@ +{ + "version": "1.0", + "target": "entity", + "applies_to": { + "filename_pattern": ".md or /index.md", + "required_at": "workflow_root_or_archive" + }, + "frontmatter": { + "strict_canonical": true, + "permissive_additions": true, + "always_present": [ + "id", + "title", + "status", + "score", + "source", + "worktree" + ], + "optional_canonical": [ + "pr", + "started", + "completed", + "verdict", + "mod-block", + "archived", + "issue" + ], + "fields": { + "id": { + "type": "string", + "conditional": [ + { + "when": { + "workflow.id-style": "sequential" + }, + "rule": "non-negative integer rendered as a string", + "pattern": "^[0-9]+$" + }, + { + "when": { + "workflow.id-style": "sd-b32" + }, + "rule": "24-character Spacedock Base32 ID; legacy numeric IDs are warn-only for mixed workflows", + "pattern": "^[0123456789abcdefghjkmnpqrstvwxyz]{24}$", + "legacy_numeric_severity": "warn" + }, + { + "when": { + "workflow.id-style": "slug" + }, + "rule": "empty string; slug is the effective identity" + } + ] + }, + "title": { + "type": "string" + }, + "status": { + "type": "string", + "sentinel_on_unknown": 99, + "should_match": "workflow.stages.states[].name", + "unknown_severity": "warn" + }, + "score": { + "type": "numeric_string", + "coerce_empty": "last", + "coerce_invalid": 0, + "invalid_severity": "warn" + }, + "source": { + "type": "string" + }, + "worktree": { + "type": "string", + "semantics": "path relative to git root when non-empty; empty means no active worktree" + }, + "pr": { + "type": "string" + }, + "started": { + "type": "iso8601" + }, + "completed": { + "type": "iso8601" + }, + "verdict": { + "type": "string", + "conventional": [ + "PASSED", + "REJECTED" + ], + "invalid_severity": "warn" + }, + "mod-block": { + "type": "string", + "pattern": "^[^:]+:[^:]+$", 
+ "invalid_severity": "warn", + "semantics": "when non-empty, terminal advancement is refused unless forced" + }, + "archived": { + "type": "iso8601" + }, + "issue": { + "type": "string" + } + }, + "custom_fields": { + "policy": "preserve_unknown", + "no_reserved_prefix": true, + "observed": { + "blocked-on": { + "type": "string", + "required": false, + "semantics": "upstream dependency reference that blocks entity progress" + }, + "blocked-reason": { + "type": "string", + "required": false, + "semantics": "human-readable reason for the blocked-on dependency" + }, + "depends": { + "type": "string", + "required": false, + "semantics": "legacy dependency reference preserved from archived entities" + } + } + } + }, + "body": { + "required_opening": "problem-statement paragraph before any heading", + "recognized_sections": [ + "## Acceptance criteria", + "## Test plan", + "## Stage Report: ", + "### Feedback Cycles" + ], + "stage_report_validation": "none" + }, + "invariants": [ + { + "id": "id_uniqueness", + "rule": "per id-style, ids unique across active + archived" + }, + { + "id": "slug_uniqueness", + "rule": "slugs unique across active + archived" + }, + { + "id": "mod_block_guard", + "rule": "mod-block non-empty AND status terminal -> refuse without --force" + }, + { + "id": "merge_hook_invariant", + "rule": "workflow has any _mods/*.md with ## Hook: merge AND pr empty AND mod-block empty -> refuse terminal without --force" + }, + { + "id": "two_step_audit", + "rule": "single --set cannot both clear mod-block and advance to terminal" + }, + { + "id": "pr_mirror", + "rule": "writing pr on worktree-backed entity also writes pr to canonical copy; no other field mirrors" + } + ], + "known_corpus_failures": {} +} diff --git a/docs/schema/workflow-readme.mdschema.yml b/docs/schema/workflow-readme.mdschema.yml new file mode 100644 index 000000000..b84e28d9f --- /dev/null +++ b/docs/schema/workflow-readme.mdschema.yml @@ -0,0 +1,190 @@ +{ + "version": "1.0", + "target": 
"workflow_readme", + "applies_to": { + "filename": "README.md", + "required_at": "workflow_root" + }, + "frontmatter": { + "strict_canonical": true, + "required": [ + "commissioned-by", + "entity-type", + "entity-label", + "entity-label-plural", + "id-style", + "stages" + ], + "optional": [ + "mission" + ], + "fields": { + "commissioned-by": { + "type": "string", + "pattern": "^spacedock@([0-9]+\\.[0-9]+\\.[0-9]+)?$", + "description": "The empty version alternative intentionally accepts bare spacedock@ stamps from early generated workflows." + }, + "entity-type": { + "type": "string", + "pattern": "^[a-z][a-z0-9_]*$" + }, + "entity-label": { + "type": "string" + }, + "entity-label-plural": { + "type": "string" + }, + "id-style": { + "type": "string", + "enum": [ + "sequential", + "sd-b32", + "slug" + ] + }, + "mission": { + "type": "string" + }, + "stages": { + "type": "mapping", + "required": [ + "states" + ], + "optional": [ + "defaults", + "transitions" + ], + "shape": { + "defaults": { + "type": "mapping", + "fields": { + "worktree": { + "type": "boolean", + "default": false + }, + "concurrency": { + "type": "integer", + "minimum": 1, + "default": 2 + }, + "model": { + "type": "string", + "enum": [ + "sonnet", + "opus", + "haiku" + ] + } + } + }, + "states": { + "type": "sequence", + "min_items": 2, + "item": { + "type": "mapping", + "required": [ + "name" + ], + "fields": { + "name": { + "type": "string", + "pattern": "^[a-z0-9][a-z0-9-]*[a-z0-9]$" + }, + "initial": { + "type": "boolean" + }, + "terminal": { + "type": "boolean" + }, + "worktree": { + "type": "boolean" + }, + "concurrency": { + "type": "integer", + "minimum": 1 + }, + "gate": { + "type": "boolean" + }, + "fresh": { + "type": "boolean" + }, + "feedback-to": { + "type": "string" + }, + "agent": { + "type": "string" + }, + "model": { + "type": "string", + "enum": [ + "sonnet", + "opus", + "haiku" + ] + } + } + } + }, + "transitions": { + "type": "sequence", + "item": { + "type": "mapping", + 
"required": [ + "from", + "to" + ], + "fields": { + "from": { + "type": "string" + }, + "to": { + "type": "string" + }, + "label": { + "type": "string" + } + } + } + } + } + } + } + }, + "body": { + "required_sections_per_stage": { + "heading": "### `{stage_name}`", + "accept_bare": true, + "accept_annotated": true, + "required_bullets": [ + "Outputs:" + ], + "conventional_bullets": { + "severity": "warn", + "list": [ + "Inputs:", + "Good:", + "Bad:" + ] + } + } + }, + "invariants": [ + { + "id": "initial_cardinality", + "rule": "exactly one stage with initial:true AND it is the first list entry" + }, + { + "id": "terminal_cardinality", + "rule": "exactly one stage with terminal:true AND it is the last list entry" + }, + { + "id": "stage_subsection_present", + "rule": "every stages.states[].name has a matching ### heading in body" + }, + { + "id": "feedback_target_valid", + "rule": "feedback-to value (when present) names another stage in states[]" + } + ] +} diff --git a/docs/site/AGENTS.md b/docs/site/AGENTS.md new file mode 100644 index 000000000..4efc8f9dc --- /dev/null +++ b/docs/site/AGENTS.md @@ -0,0 +1,152 @@ +# Authoring directive for `docs/site/` + +**Always apply this when creating or modifying any content under `docs/site/`.** It is the Spacedock adaptation of Recce's documentation-writing standard ([`writing-content/reference/doc.md`](https://github.com/DataRecce/recce-team/blob/main/recce-team/skills/writing-content/reference/doc.md)). It governs both the **shape** of a page (structure and simplicity) and its **voice** (terminology, register, and the tells to avoid). When a question falls outside it, fall back to the `elements-of-style:writing-clearly-and-concisely` (Strunk) skill. + +## The first rule: simple beats complete + +Spacedock is not complicated; wordy docs make it *feel* complicated. Every edit should make the reader's job smaller, not bigger. 
+ +- **Lead with the problem or the payoff, not the definition.** Open each page with why the reader is here and what they will be able to do, then name the mechanism. + - ✗ "Spacedock is a multi-agent orchestrator." + - ✓ "You have work that needs doing in stages, with a human sign-off before anything ships. Spacedock runs that for you." +- **Introduce the fewest terms possible, as late as possible.** Don't front-load a glossary. Define a term on first real use, gloss it once, then just use it. If a page introduces more than a handful of new terms, cut or defer some. +- **Lead with what the user sees and must know; keep the how-it-works light.** Name the visible behavior and the required concepts first. Internal mechanics (scheduling and reuse conditions, file/branch naming templates, parser internals, query plumbing) get at most a sentence or a link to the source. If a paragraph reads as protocol documentation, compress it or cut it. +- **A paragraph that cannot be restated as one verifiable claim is flowchart.** Behavior checked against the owning skill compresses to a single stated fact; phase-by-phase narration of a skill's internals cuts entirely and the page loses nothing. +- **Product anatomy is not a reader concept.** Do not name internal components (launcher, plugin, host) unless the reader must type or choose between them. The reader installs Spacedock and launches Spacedock; how it decomposes is not their problem. +- **Write from the reader's seat, never the maintainer's.** If a sentence's subject is the build, the script, or the parser, recast it around what the reader gets or does. + - ✗ "The build emits a curated `llms.txt` index at the site root." + - ✓ "Start from `llms.txt`, the curated index of these pages." +- **A well-named command needs no caption.** "Run `spacedock doctor`." is complete; explaining that it diagnoses problems repeats the name. 
Likewise, never pre-document what a tool prints interactively (the install script already prints its own `PATH` note). +- **A heading is a promise; the section delivers exactly that.** Codex setup is not "Install Spacedock"; a tab the reader chose ("no Homebrew") must not explain the thing they opted out of. Headings also take the reader's angle: name what the reader does or gets ("Turn the report into a workflow"), not what the product emits ("The commission offer"). +- **Serve the typical reader; route edge audiences elsewhere.** A Get-started page carries only the path a typical user walks. Contributor and from-source material lives in the repo, not inline, not even as one sentence. +- **Document the durable value, not the output inventory.** Do not enumerate an artifact's current sections or fields; they drift with releases. Name what the reader gets out of it (the survey report is "the four things you learn", not a list of its sections). +- **A real example with sample output beats description.** One concrete command plus the output the reader will see teaches faster than paragraphs. Compose samples from the product's real templates, and never restate in prose what the sample already says ("Accept this design, or tell me what to change" makes a "nothing is generated until you accept" sentence dead weight). +- **Keep agent-facing surfaces off user pages.** Commands meant for the agents (`spacedock status`, `spacedock dispatch`) are not taught in Get started, even though their output is real. +- **The docs may defer to the agent.** Spacedock runs inside a coding agent; "ask the agent if anything is unclear" is a legitimate close. Pages cover what the reader needs before and between sessions, not every contingency within one. +- **For an agent-operated feature, the page's job is trust, then deferral.** State the guarantees that make handing over safe (nothing is auto-replaced; it waits for your approval), then defer the mechanism to the agent. 
Detail beyond the guarantee is reassurance theater. +- **The agent is the interface; docs say what is possible.** Files and settings are operated by asking the first officer, not by hand-editing walkthroughs. Describe capabilities ("a stage can pause at a gate, run isolated, demand a fresh reviewer… ask the first officer to set up or change any of it"), never a field-by-field table of how to key them in. +- **A diagram beats an enumeration for structure.** A stage chain is a mermaid flowchart, not five field-by-field bullets. Carry the meaning in a plain-text caption sentence as well, so the styling is decoration and text readers (`llms.txt`) lose nothing. +- **State behavior; don't narrate rationale.** "When a stage declares `worktree: true`, everything from that stage onward happens in an isolated worktree" — not the tradeoff that motivated the design. The reader wants what happens, not why we built it that way. +- **Built-in capabilities must read as built in.** Never introduce a built-in mechanism inside a "you can add…" list; the reader hears setup required. Built-ins first, stated as facts; add-ons after, clearly labeled as the flexible layer. +- **Verify behavior claims against the owning source.** Before documenting what the product does, read the skill or code that owns the behavior (rejection rounds auto-bounce; the gate reaches the captain on pass or at the cycle cap). A plausible paraphrase from memory is how docs go wrong. +- **No archaeology pages.** An example lives as a sample at the point of need (a design summary on the commission page, a gate review on the gates page), not as a trace page that walks one real artifact through every mechanism at once. +- **Cut, don't pad.** If a sentence still carries its meaning with a clause removed, remove the clause. If a paragraph repeats the page above it, delete it and link instead. +- **Don't repeat content across pages.** One page owns each idea; others link to it. 
Duplicated explanation is the main thing that makes the docs feel long. An audience split ("new-user view" vs "operator view" of the same feature) is not a reason for a second page; it is the same topic. +- **The same heading on two pages is a dedupe alarm.** When two pages grow a section with the same title, they are explaining the same idea twice. Pick the owner; the other page keeps only its delta. This is invisible page by page; it shows up only when reading the pages as a set. +- **Shared ideas resolve to the earliest page in the journey that needs them.** Later pages keep only their delta. Killing a duplicated section rarely loses content; its unique facts usually fit in a tail clause of an adjacent paragraph. +- **Split pages by cadence, not topic.** Two features used at different rhythms (every session vs every upgrade) are two pages, even when both read as the same kind of work. A shared reader moment, not a shared noun, is what makes one page. +- **Two levels of structure only:** section → page. No deeper nesting; no page that exists only to hold sub-pages. + +## First impression: "simple," not "enterprise" + +The first screen a reader sees should make them feel *"there is one idea here, and I already get it,"* not *"this is a complex system with a manual."* Front-loading vocabulary or role tables is what makes simple software read as enterprise software. + +- **No glossary or role table in the first screen.** Introduce a term only when the next sentence needs it; defer the rest to the page that owns them and link. A roles or personas table near the top reads as enterprise structure. Link to it, do not lead with it. +- **Name the one idea, then say everything else is detail.** Give the reader permission to feel they understand the core before they read on. For Spacedock that idea is: work moves through stages, and nothing crosses a gate without a decision the reader owns. 
+- **The docs home connects forward, it does not re-pitch.** A reader who arrived from the landing page has already seen the problem and the promise. Do not restate the pain. State the one idea and point into the docs. + +## Pick the page type, then follow its shape + +Every page is one of three types. The nav already sorts roughly this way; match the type to the reader's question. + +| Type | Reader asks | Spacedock sections | Must NOT contain | +|------|-------------|--------------------|------------------| +| **Concept** | "What is this / why does it matter?" | Concepts | Step-by-step instructions (link to the how-to instead) | +| **Tutorial / how-to** | "How do I do X?" | Get started, Running workflows | Long conceptual digressions (link to the concept instead) | +| **Reference** | "What are the options / the exact contract?" | Reference | Narrative prose, tutorial steps | + +### Concept pages +Opening (1–2 sentences: what + why) → how it works (2–4 short paragraphs, ideally one diagram or worked snippet) → when to use / when not → **Related** links (tutorial + reference). Describe the system; don't instruct. If you're numbering steps, you're in the wrong page type. + +### Tutorial / how-to pages +State the goal in one sentence. List prerequisites up front (never bury them mid-page). Numbered steps, **one action per step**, each showing the expected result (name the command *and* the output, e.g. ``spacedock status --next`` then what it prints). End with verification and "next steps" links. Push command-grammar and theory to the bottom or to a linked concept; the reader wants the outcome first. + +### Reference pages +Optimize for lookup in seconds. Use **tables** for flags, fields, and options (Name · Type/required · Description · Default). Give copy-paste examples. Mark required vs optional; show defaults; note edge cases. When the title and a table carry everything, ship the title and the table: no framing intro, no closing caveat. 
If a "reference" page is really an explanation, it belongs in Concepts; if it's really a procedure, it belongs in a how-to. + +## Link instead of repeat +- Cross-link liberally with **relative** internal links (so `mkdocs build --strict` resolves them). Concept → tutorial + reference; tutorial → concept + reference; reference → concept. +- Link the first mention of a load-bearing term (`workflow`, `gate`, `ensign`, `commission`) to the page that owns it, rather than re-explaining it. +- Descriptive link text, never "click here". +- **Link by payoff, not by contents.** A cross-link or Next entry names what the reader gains there ("read about the survey report to understand your usage pattern"), not what the page contains ("covers what survey reports"). +- **External tools get a gloss and a link to their own site.** A dependency that is not Spacedock (`agentsview`, `safehouse`) is a real thing the reader may install: name it, gloss it in one clause, link it on first mention. + +## Voice + +This voice is grounded in two real signals, not invented: the root `README.md` (Spacedock's public voice: direct, anti-hype, claim-first, concrete over abstract, second person) and the `comm-officer` mod (`docs/dev/_mods/comm-officer.md`, the project's Strunk-based prose discipline, which defers to this directive when one exists). + +- **Precise, honest, technical.** State what is true and what a thing does. Spacedock's own pitch leads with the claim, not the adjective. +- **No marketing or hype adjectives.** Avoid "powerful", "seamless", "revolutionary", "effortless", "blazing-fast", "game-changing". If a sentence still carries its meaning with the adjective removed, remove it. The product ethos, evidence over assertion, is the writing ethos. +- **Concrete over abstract.** Name the command, the file, the outcome. Prefer "`spacedock status --next` lists the items ready to dispatch" over "Spacedock surfaces actionable work." 
+- **Claim-first, then support.** Lead a section with the load-bearing sentence; follow with the detail. Mirrors the README's "what's different" bullets (bold claim, then the mechanism). + +## Avoid the AI-writing tells + +Generated prose has a recognizable texture: padded, impersonal, and the fastest way to make simple software feel like a manual. Cut these on sight: + +- **Em-dashes.** Do not use `—`. Rewrite as a period, a comma, a colon, or parentheses. A sentence that needs an em-dash usually wants to be two sentences. The exemption is narrow: output reproduced verbatim (a captured transcript, a rendered template, an error string) must match what the tool prints. A sample you compose is prose: format it cleanly (newlines, colons) within the tool's real shape. +- **The "not just X, but Y" frame** and its cousins ("it's not only…", "more than just…"). State what the thing is. Drop the contrast scaffolding. +- **Rule-of-three padding.** Three parallel adjectives or clauses where one carries the meaning ("clear, simple, and easy to follow"). Keep the load-bearing one. +- **Hollow transitions and hedges.** "That said," "It's worth noting that," "In order to," "It's important to understand." Delete them; start with the content. +- **Empty intensifiers.** "very", "really", "quite", "actually", "simply", "just" (when it adds nothing), "leverage", "utilize". Prefer the plain verb. +- **Throat-clearing openers.** A page or section that opens by restating its own title or announcing what it will cover ("This page covers…", "In this section, we will…"). Open with the content. + +Read it back and remove every word that survives removal. If a sentence sounds like it is performing thoroughness rather than saying something, rewrite it. + +## Tone and register per audience + +- **New-user pages (Get started): welcoming and encouraging, still precise.** Assume no prior Spacedock knowledge; define a term on first use; show the command and the output to expect. 
Confidence-building, never breezy. +- **Operator pages (Concepts, Running workflows): direct and operational.** The reader is doing the work; tell them the steps and the decision points plainly. +- **Reference pages: exact and unembellished.** Precision outranks warmth. Name the contract, the test, the failure mode. +- **Person and tense.** Second person ("you run", "you approve") and present tense for how-to and instructions, the README's register. Imperative for steps ("Run `spacedock doctor`."). Describe the system in the present tense ("the first officer dispatches an ensign"), not the future. + +## Canonical terminology and capitalization + +Pinned from how the README and skills actually use these forms, not an imposed guess. Use them consistently. + +| Term | Form | Notes (grounded in real usage) | +|------|------|--------------------------------| +| Spacedock | `Spacedock` | The product. Always capitalized. `spacedock` (lowercase, code font) only as the literal command/binary. | +| Captain | `Captain` (role) / `the captain` (prose) | README roles table capitalizes the role name; running prose uses lowercase "the captain" (skills use `{captain}`). The human operator. | +| First Officer | `First Officer` (role) / `the first officer` (prose) | Same pattern: Title Case in the roles table / when naming the role; lowercase in running prose ("the first officer reads the README"). The orchestrator agent. | +| Ensign | `Ensign` (role) / `the ensign`, `ensigns` (prose) | Same pattern. The worker agent that moves one item through one stage. | +| workflow | `workflow` | Common noun, lowercase. A directory of markdown entities + a README. | +| entity | `entity` | Common noun, lowercase. One work item (a markdown file or folder). The README also says "work item"; prefer "entity" in docs, gloss it as "work item" on first use for new users. | +| stage | `stage` | Common noun, lowercase. backlog → ideation → implementation → validation → done. 
| +| gate | `gate` | Common noun, lowercase. The decision point at the end of a stage. | +| sprint | `sprint` | Common noun, lowercase. A grouped set of entities driven to a deliverable. | +| worktree, mod, safehouse | lowercase | Common nouns. `safehouse` is also the sandbox profile filename `.safehouse`. | + +Rule of thumb: capitalize a **role** when you name it as a role (the roles table, a definition); use lowercase for the same word in ordinary running prose; never capitalize the common-noun primitives (workflow, entity, stage, gate, sprint). + +## Formatting conventions + +- **Commands and code.** Inline code font for commands, flags, filenames, and identifiers: `spacedock claude`, `--strict`, `mkdocs.yml`. Multi-line commands and config in fenced code blocks with a language tag (` ```bash `, ` ```yaml `). +- **Show expected output when the reader must check it.** Name what a command prints when the reader acts on the result (e.g. `spacedock status --next` prints the dispatchable set). A well-named command needs neither caption nor sample output. +- **Headings.** Sentence case ("Get started", "Your first workflow"), not Title Case. One `#` h1 per page (the page title); section headings start at `##`. +- **Links.** Descriptive link text, never "click here" / "this link". Internal links are relative so `mkdocs build --strict` can resolve them. GitHub links and install commands point at `main`, never a development branch. +- **Lists and emphasis.** Bullets for parallel items; bold for the load-bearing claim of a bullet (the README pattern). Use emphasis sparingly; if everything is bold, nothing is. + +## Revising a page: the per-paragraph loop + +Revision is paragraph by paragraph, not page-at-a-glance. For each paragraph ask three questions: + +1. **Why would the reader care?** If the paragraph has no answer, cut it. +2. 
**Do they have the context, at this point in the journey?** Every term and claim must be introduced by this page or an earlier one in nav order. +3. **Does it make them want to read on?** End on the payoff or the concrete thing they can type, not on a caveat. + +Propose the revision yourself, then send the proposal (with its three-question rationale) to the prose-polish teammate when one is standing. Triage the return; never auto-apply: accept real improvements, reject anything that violates this directive (em-dash insertions are the recurring offender) or alters captain-set wording, re-verify any suggestion that touches a behavior claim against the owning source (polish can phrase a plausible falsehood), and re-check the file on disk before acting on replies that arrive late. + +## Before you commit a docs change +- [ ] Page opens with the problem/payoff, not a definition. +- [ ] Introduces only the terms this page actually needs; each defined on first use. +- [ ] Re-read every touched page in reader-journey order: no term appears before the page that introduces it, including terms your own rewrite brought back. +- [ ] No inbound link targets a heading this change removed or renamed; anchors break silently (`--strict` reports them as INFO, not failures), so grep for the old anchor. +- [ ] Matches its type's shape (concept / how-to / reference): no steps in concepts, no essays in reference. +- [ ] No content duplicated from another page; shared ideas are linked, not repeated. +- [ ] Internal links are relative and descriptive; first mentions of key terms link out. +- [ ] Ordered lists actually count up (watch for fenced blocks resetting the counter to 1). +- [ ] No AI-writing tells (em-dashes in prose, "not just X but Y", rule-of-three padding, hollow transitions, empty intensifiers, throat-clearing openers). +- [ ] Voice and terminology follow this directive. +- [ ] `mkdocs build --strict` passes. 
+- [ ] Read it back once and cut every sentence that survives removal. diff --git a/docs/site/advanced/external-tracker.md b/docs/site/advanced/external-tracker.md deleted file mode 120000 index 1ded4065e..000000000 --- a/docs/site/advanced/external-tracker.md +++ /dev/null @@ -1 +0,0 @@ -../../specs/state-behavior-extension.md \ No newline at end of file diff --git a/docs/site/advanced/external-tracker.md b/docs/site/advanced/external-tracker.md new file mode 100644 index 000000000..ff5d8b8e5 --- /dev/null +++ b/docs/site/advanced/external-tracker.md @@ -0,0 +1,12 @@ +# Bridge an external tracker + +Your backlog may already live somewhere else: Linear, GitHub Issues, another ticket ledger. The bridge is asking, not configuring: tell the first officer to intake an external item ("intake GitHub PR #134") and it files an entity carrying the reference. The external system keeps owning intake, discussion, and assignment; Spacedock stays the execution workflow. + +Two frontmatter fields carry the reference: + +```yaml +issue: ENG-123 +source: linear +``` + +`issue` is the human-facing external reference; `source` records where the entity came from. Keep the tracker out of Spacedock's stage semantics: sync through entity creation, state changes, and stage reports, not tracker-specific stage rules. diff --git a/docs/site/advanced/mods-and-standing-teammates.md b/docs/site/advanced/mods-and-standing-teammates.md index a1d5b52e6..f39fd8803 100644 --- a/docs/site/advanced/mods-and-standing-teammates.md +++ b/docs/site/advanced/mods-and-standing-teammates.md @@ -1,79 +1,17 @@ # Mods & standing teammates -Mods extend a workflow without touching the binary. A mod is a markdown file under `{workflow_dir}/_mods/`. There are two kinds, and one file can be both: a **lifecycle hook** that the first officer runs at a named point in the run, and a **standing teammate** declaration that spawns a long-lived specialist agent into the team. Both live in `_mods/*.md`. 
The difference is which sections the file carries and which binary reads them. `spacedock status` scans the `## Hook:` headings (the `--boot` MODS section, and the merge-hook guard); `spacedock dispatch` parses the standing-teammate sections. +As a workflow matures, you start wanting behavior every workflow wants: create a PR and act on it when it merges or CI fails, or keep a specialist whose judgment persists across the whole session (a prose polisher). You could model such steps as stages of your own; they are common enough across workflows that they are factored out as mods instead. A mod is the hook mechanism: standardized behavior that hooks into the workflow's run without changing your workflow definition. The stages and gates in your README stay as they are; a mod is one markdown file in the workflow's `_mods/` directory, and the first officer reads it and acts on it. ## Lifecycle hooks -A mod hook is a `## Hook: {point}` section that the first officer runs at a fixed point in the run. Three points are supported: +A hook adds a step the first officer performs at a fixed point in the run: at `startup`, on an `idle` pass when nothing is ready to dispatch, or at the `merge` boundary when an entity reaches its final stage. Hook points are workflow-independent: any workflow can register the same hook to get the same behavior. -- `startup`: runs once at boot, before the normal dispatch loop. -- `idle`: runs on the idle re-check pass when no entity is ready to dispatch. -- `merge`: runs at the terminal merge boundary for an entity, before any local merge, archival, or status advancement. - -Hooks are additive and run alphabetically by mod filename. The body of a hook section is prose the first officer executes; it names the commands to run and the conditions to branch on, in plain markdown. Nothing compiles; the first officer reads the section and acts on it. - -A mod can register more than one point. 
The shipped `pr-merge` mod (`docs/dev/_mods/pr-merge.md`) registers all three: its `## Hook: startup` and `## Hook: idle` sections scan for entities with a pending `pr` and advance any whose PR has merged, and its `## Hook: merge` section opens the code-branch PR, records `pr:` on the entity, and blocks until merge. - -### Merge hooks can block, and the mechanism enforces it - -A `merge` hook can wait for captain approval before pushing, or for a remote PR to merge. The first officer signals the wait through the entity `mod-block` field, and `spacedock status` enforces the discipline so a blocked entity cannot slip past the gate: - -- **Set before invoking.** The first officer sets `mod-block=merge:{mod_name}` before running the merge hook: - - ```bash - spacedock status --workflow-dir {workflow_dir} --set {slug} mod-block=merge:{mod_name} - ``` - -- **Guarded.** `spacedock status --set` refuses any terminal transition while `mod-block` is non-empty. `--archive` refuses too. Pass `--force` to override. -- **Required when a merge hook exists.** Independently of `mod-block`, `status --set` and `status --archive` refuse to terminalize or archive an entity when the workflow registers any merge hook (`_mods/*.md` with a `## Hook: merge` section) *and* the entity's `pr` field is empty *and* `mod-block` is empty. This forces the merge ceremony to leave a truthful signal that a merge actually ran. `merge: local` in the workflow README exempts the `pr` requirement; `verdict=rejected` exempts it too (a rejected entity never runs the ceremony). `--force` bypasses everything. -- **Cleared in its own call.** When the blocking action completes, the first officer clears the block: - - ```bash - spacedock status --workflow-dir {workflow_dir} --set {slug} mod-block= - ``` - - This clear MUST be standalone. `status --set` exits 1 if `mod-block=` is combined with a terminal field (`status={terminal}`, `completed`, `verdict`, or `worktree=`) in one call. Use two commits. 
- -`mod-block` is read from frontmatter at boot, so a pending merge survives session resume. The first officer picks up which mod is blocking and resumes the wait. +The canonical example is the [`pr-merge` mod](https://github.com/spacedock-dev/spacedock/blob/main/docs/dev/_mods/pr-merge.md): it opens the code-branch PR at merge, records the PR on the entity, and holds the terminal transition until the PR merges. The block is enforced; a half-merged entity cannot slip past the gate. ## Standing teammates -A standing teammate is a long-lived specialist agent (a prose polisher, a code reviewer, a translator) declared by a mod with `standing: true` in its frontmatter. It lives in the team for the team's lifetime and is addressed by name. Use one when a recurring specialist judgment is worth a persistent agent rather than a fresh dispatch each time. - -### Declaration - -One mod file per teammate under `{workflow_dir}/_mods/{name}.md`. The parse contract (see `internal/dispatch/mods.go`): - -- **Frontmatter** carries `standing: true` and an optional `description`. -- **`## Hook: startup`** declares the spawn config as `- key: value` bullets. `spacedock dispatch spawn-standing` reads `subagent_type`, `name`, and `model` here; `model` must be one of `sonnet`, `opus`, `haiku`. Backtick-wrapped values are unwrapped. -- **`## Routing Usage`** (optional) is the prose each ensign sees telling it when and how to route to the teammate. -- **`## Agent Prompt`** MUST be the last top-level section. Its body, from the line after the heading to end of file, is the verbatim prompt passed to the spawned agent. Any `## ` heading after it is rejected loudly by `spacedock dispatch spawn-standing`. - -### Lifecycle - -The first officer drives three `spacedock dispatch` subcommands, all reading `_mods/` directly. 
Do not grep frontmatter yourself: - -```bash -spacedock dispatch list-standing --workflow-dir {wd} # abs mod paths, one per line, sorted -spacedock dispatch spawn-standing --mod {abs_path} --team {team_name} -spacedock dispatch show-standing --workflow-dir {wd} # ensign-facing routing block -``` - -- **Discovery** runs at boot via `list-standing`. It prints the absolute path of each `standing: true` mod, one per line, sorted alphabetically; empty output means none. -- **Spawn is deferred** to the first team-mode dispatch. `spawn-standing` emits an `Agent()` spec for the host to launch, or `{"status": "already-alive", "name": ...}` when the team config already lists that member. Standing teammates are **first-boot-wins**: when several workflows share one team, the first first officer to find the member absent spawns it, and the rest skip. A mod that fails to parse (missing `## Agent Prompt`, an invalid `model`, a trailing heading) is reported and skipped; it does not block the workflow. -- **Routing is best-effort and non-blocking.** Address the teammate by its declared `name`, with a 2-minute timeout. If no reply lands in time, the sender proceeds without the specialist's output. Round-trips of several minutes are normal on long drafts. -- **Teardown is team-scoped.** The teammate dies when the team is torn down (session end, explicit delete, captain shutdown). There is no cross-team or cross-session persistence; mid-session death is detected on the next routing attempt. - -Bare (single-entity) mode and degraded mode still run discovery (it is cheap) but skip the spawn pass, because there is no team to spawn into. - -### Ensign discovery - -Ensigns find standing teammates without the first officer wiring anything per dispatch. When a workflow declares at least one standing teammate, `spacedock dispatch build` appends a `spacedock dispatch show-standing` fetch line to each ensign dispatch. 
`show-standing` renders a `### Standing teammates available in your team` block, carrying each teammate's `## Routing Usage` body when present and otherwise a one-line fallback, so every dispatched worker learns who to route to. - -## The comm-officer prose-polisher - -The canonical standing teammate is the **comm-officer**: a standing prose-polisher the first officer routes deliberate drafts through before captain review. By convention it is declared as `_mods/comm-officer.md` with `standing: true` and named `comm-officer`. +A standing teammate is a long-lived specialist agent declared by a mod. It stays available for the session and is addressed by name. Reach for one when the same specialist judgment recurs across entities and is worth a persistent agent rather than a fresh dispatch each time. -The first officer routes through it when composing **deliberate drafts**: PR bodies, gate-review summaries, long narrative entity-body sections, debrief content. It checks team membership first and treats the call as best-effort and non-blocking on the 2-minute timeout; if the comm-officer is absent or silent, the draft ships un-polished. Explicitly **out of scope**: live captain replies, short operational statuses (`pushed`, `tests green`, `PR opened`), tool-call output, commit messages, and transient logs. Polish is a deliberate-draft discipline, not a live-turn reflex. +The canonical example is the [**comm-officer**](https://github.com/spacedock-dev/spacedock/blob/main/docs/dev/_mods/comm-officer.md), a prose-polisher the first officer routes deliberate drafts through (PR bodies, gate summaries, debriefs) before they reach you. Routing is best-effort: if the teammate is absent or slow, the work proceeds without it. -The comm-officer's prose discipline is light-touch by default: it applies the `elements-of-style:writing-clearly-and-concisely` skill (Strunk) to cut empty words and tighten sentences while preserving the caller's voice, rhythm, and technical vocabulary. 
It defers to a project voice guide when one exists. For Spacedock's own docs, that guide is the [Voice & tone](../contributing/voice-and-tone.md) page. The comm-officer and any doc contributor follow it, falling back to plain Strunk only where the guide is silent. +Ask the first officer to install a shipped mod or write a new one; the file format is its job. diff --git a/docs/site/advanced/multi-workflow.md b/docs/site/advanced/multi-workflow.md new file mode 100644 index 000000000..7096f01b9 --- /dev/null +++ b/docs/site/advanced/multi-workflow.md @@ -0,0 +1,7 @@ +# Multiple workflows + +Related work rarely fits one pipeline. A project can hold several workflows, each its own directory with its own README, stages, and gates; Spacedock finds them all and operates them at the same time. A typical pair: a benchmark project where one workflow tracks the experiment runs and another ships the harness that supports them. + +There is no hard dependency declaration between workflows. When an entity in one workflow depends on work in another, annotate it in the entity's frontmatter (a flat custom field naming the other item) and the first officer figures it out at dispatch time. + +Workflows in the same repo can also keep their state off your code branch; [split-root state](split-root-state.md) covers that. diff --git a/docs/site/advanced/refit.md b/docs/site/advanced/refit.md new file mode 100644 index 000000000..c6255226d --- /dev/null +++ b/docs/site/advanced/refit.md @@ -0,0 +1,3 @@ +# Refit a workflow + +When you upgrade Spacedock, run `/spacedock:refit path/to/workflow` to bring the workflow's generated files up to date while leaving your edits in place. Nothing is auto-replaced: you see a diff and decide, file by file, and if a schema change affects your entities, refit proposes the migration and waits for your approval. Git is the safety net; ask the agent about anything else. 
diff --git a/docs/site/advanced/split-root-state.md b/docs/site/advanced/split-root-state.md index a8d93131b..bce8e5b74 100644 --- a/docs/site/advanced/split-root-state.md +++ b/docs/site/advanced/split-root-state.md @@ -1,129 +1,9 @@ -# Multi-workflow & split-root state +# Split-root state -A split-root workflow separates the workflow's definition from its runtime -state. The README and stage declarations stay on your main branch; the mutable -entities (frontmatter updates, stage reports, archive moves) live in a -separate state checkout. State transitions stop polluting your code branch's -history, and the same workflow definition can drive shared issues without each -status change landing as a commit on `main`. +A busy workflow generates constant small state commits. Split-root keeps them off your code branch. The README and stage declarations stay on your main branch; the mutable entity state (frontmatter updates, stage reports, archive moves) lives in a separate state checkout. Your code history stays clean, and several agents or operators can drive the same workflow at once. -You opt in with a single README field. Without it, Spacedock keeps the -default single-root behavior: entities sit beside the README on the same -branch. +The README [opts in with one `state:` field](../concepts/workflows-and-entities.md#keep-workflow-state-off-your-code-branch); the split is transparent, and you read the workflow exactly as you would any other. On a fresh clone the state checkout is absent; run `spacedock state init` to restore it. The shipped [`docs/dev` workflow](https://github.com/spacedock-dev/spacedock/tree/main/docs/dev) runs split-root, a live example. -## The two roots +## Concurrent writers -A workflow resolves to two directory roles, derived from the README's `state:` -field (`internal/status/roots.go`): - -- **`definition_dir`**: the directory containing `README.md`. It holds the - workflow identity and the stage declarations. 
This is what you pass as - `--workflow-dir`. -- **`state_dir`**: `definition_dir/` joined with the `state:` value when the README declares a - non-empty one, otherwise `definition_dir` itself. It holds the - active entities and the `_archive` directory. - -`spacedock status` reads stage declarations from `definition_dir/README.md` and -entities from `state_dir`. It writes frontmatter updates and archive moves only -into `state_dir`. In single-root mode the two roots are the same path, matching -the original same-directory layout, so existing workflows are unaffected. - -## Declare `state:` in the README - -Add a top-level `state:` field to the README frontmatter. The value is a path -relative to the README directory: - -```yaml -state: .spacedock-state -``` - -The path is resolved against the definition dir. The interpreter -(`internal/status/state.go`) rejects two classes of value rather than following -them silently: - -- An **absolute path** fails: `state:` must be relative to the README directory. -- A path that **escapes the definition dir** via `..` fails: the v0 contract is - a child checkout, not an arbitrary location. - -An empty `state:`, an absent field, or the explicit `$inline` sentinel all -resolve to single-root (inline) mode. The shipped `docs/dev` workflow uses -`state: .spacedock-state`; see its README for a live example. - -Active entities live directly under `state_dir`; there is no `entities/` -subdirectory. Archived entities move to `state_dir/_archive`. Read the state -with the launcher exactly as you would a single-root workflow; the split is -transparent to the command surface: - -```bash -spacedock status --workflow-dir docs/dev -spacedock status --workflow-dir docs/dev --next -``` - -## The state branch - -The state checkout lives on an orphan branch in the same repo (no second repo, -no second remote), and the checkout itself is a linked worktree of the main repo -at the gitignored `state:` path.
State commits land on the orphan branch, so the -code branch never sees them. Spacedock derives the branch name from the workflow -dir's basename, `spacedock-state/` plus that basename, so `docs/dev` maps to -`spacedock-state/dev`. An explicit `state-branch:` field in the README overrides -the derived name verbatim (`StateBranch` in `internal/status/state.go`). - -Because the branch is shared through `origin`, multiple agents (and multiple -operators) can drive the same workflow concurrently. That makes the commit -discipline below a correctness requirement, not a style preference. - -## Concurrency-safe state commits - -The state checkout is a single, non-branched git index. A bare `git add -A` -followed by a bare `git commit` sweeps up a sibling writer's staged entity, -cross-attributing or clobbering it. **Every writer commits path-scoped**, naming -exactly the entity it touched: - -```bash -git -C {state_checkout} add {entity_path} -git -C {state_checkout} commit -m "..." -- {entity_path} -``` - -Never a bare `git add -A` or a bare `git commit` against the state checkout. -On `index.lock` contention, retry after roughly two seconds. When the status -tool owns the `add`+`commit` under a lock, route through it instead: a -tool-managed atomic commit is preferred over the manual path-scoped fallback. - -### Multi-writer sync - -The path-scoped rule extends to three sync points against `origin`, not a pull -before every dispatch: - -- **After a state commit, push.** `git -C {state_checkout} push origin {state_branch}`. -- **On a non-fast-forward rejection, rebase then re-push.** - `git -C {state_checkout} pull --rebase origin {state_branch}` replays your - single-file commit atop the peer's. Disjoint paths produce no conflict. -- **At first-officer boot, pull once.** Integrate peers' state at boot, not on - every read.
- -If `pull --rebase` conflicts (two writers editing the same entity's frontmatter -at once), the first officer halts the dispatch, aborts the rebase, and surfaces -the conflicting entity and peer commit to the captain. It does not -`--force`-push and does not auto-resolve with `-X ours`/`-X theirs`, either of -which silently drops a peer's edit. A full lock model is out of scope; the halt -is the boundary behavior. - -## Worktree stages under split-root - -When a split-root workflow has a worktree stage, the worktree isolates the -deliverable work product only. The entity body and stage reports are still -written and committed to the state checkout at the entity's state-checkout path, -never a worktree copy. The dispatch helper hands the worker that path even under -a worktree stage. "Commits must be on this branch" applies to the deliverable -artifacts; entity state always lands in the state checkout. - -## Bridging an external tracker - -Split-root state is the integration point for external trackers: Linear, GitHub -Issues, kata, or another ticket ledger. The external system can own backlog -intake, discussion, and assignment while Spacedock remains the execution -workflow. The bridge uses flat top-level frontmatter fields (`issue:`, -`source:`) so the current line-oriented parser preserves them. See the -[external-tracker bridge](external-tracker.md) for the field contract and the -principles that keep Spacedock's stage semantics out of the tracker. +The agents follow the commit and sync discipline that keeps concurrent writers from clobbering each other; it is theirs to follow, not yours. The one case that reaches you: conflicting edits to the same entity halt for your call rather than auto-resolving. 
diff --git a/docs/site/advanced/sprints-and-roadmap.md b/docs/site/advanced/sprints-and-roadmap.md deleted file mode 100644 index 2dbeefe58..000000000 --- a/docs/site/advanced/sprints-and-roadmap.md +++ /dev/null @@ -1,64 +0,0 @@ -# Sprints & roadmap - -A roadmap is the strategy layer above the per-entity workflow: it owns outcome, scope, sequencing, and definition-of-done; the workflow owns task state. The two never overlap. Spacedock's own build uses this split. `docs/roadmap/README.md` is the strategy layer, and `docs/dev/.spacedock-state/` holds the executable entities that `spacedock status` queries. The roadmap explicitly "does **not** track task state." - -This page describes that construct as Spacedock practices it on itself. It is a convention (prose, frontmatter, and the native `status --where` query), not new binary behavior. There is no `sprint` recognizer, no `--sprint-validate` gate, and no contract bump. If you want this discipline for your own workflow, you adopt the convention; nothing in the launcher enforces it. - -## The roadmap as strategy layer - -The roadmap file owns four things the workflow does not: the **outcome** each sprint unlocks, the **scope** (which entities are in), the **sequencing** (value-ordered sprint list), and the **definition-of-done** per sprint. Task state, meaning which stage an entity is in and whether its gate passed, stays in the entities and is read with `spacedock status`. Keep the two separate: a roadmap that starts tracking stage transitions has duplicated the workflow and will drift from it. - -A roadmap groups its work into sprints. A sprint is a set of entities driven to one deliverable, with its own `index.md` recording the goal, the members-as-query, the DoD, and what is out of scope. - -## The shaping-FO / Commander split - -A sprint is run by two distinct roles, and the boundary between them is the load-bearing rule of the construct. 
- -- **Shaping FO** owns strategy and shape: the roadmap, each sprint's *definition* (deliverable + DoD), the gating ideation of the sprint's entities, the cross-entity coherence check, the staff readiness review, and packaging the sprint for execution. It stays high-level and does **not** hand-drive stage execution. -- **Commander** takes one packaged sprint and drives it to its deliverable: dispatches each stage, approves execution gates and merges, runs the sprint-wide integration test, and produces the report. The Commander boots `spacedock:first-officer` and creates its own team. - -The handoff between them is a **conn-to-drive dispatch**: a self-contained sprint package at `NNN-/dispatch-sprint-execution.md` that the Commander runs from a cold boot. It is a package, not a context transfer. The Commander does not inherit the shaping FO's session, and escalates back to the shaping FO and captain only on a third feedback cycle, a budget blowout, an irrecoverable block, or a genuine scope fork. - -## Sprint lifecycle: shape, drive, close - -A sprint moves through three phases, owner-tagged. The per-sprint checklist lives in the sprint's `index.md`; the canonical template is the lifecycle checklist in `docs/roadmap/README.md`. - -**Shape (shaping FO).** - -1. **Scope-lock** with the captain: which entities are in, which defer. The captain decides. -2. **Carve.** Stamp `sprint`, `group`, and `sprint-readiness` frontmatter on the members; write `index.md` (goal, members-as-query, DoD, out-of-scope). -3. **Ideate** each gated member: problem, approach, acceptance criteria, and test plan, with the **riskiest mechanism exercised first** (a spike, or a recorded "no spike needed"). Check existing ideation state first; never re-ideate a banked design. -4. **Preflight staff review.** Dispatch an *independent* reviewer to refute the designs. 
This is not optional and never a self-review: the reviewer is neither the FO nor the ideation ensigns, because the value is refuting the FO's own assumptions. Findings land in `staff-review.md` and Material ones are folded before the gates lock. -5. **Present ideation gates**, with checklist accounting and acceptance-criteria cross-check per member. The captain decides; the FO never self-approves. -6. **Package.** Write `dispatch-sprint-execution.md`: the boot recipe, per-member build notes, in-drive gates, and the release-cut recipe. - -**Drive (Commander, a separate cold-booted session).** - -1. Move each member through implementation, validation, and done, with a **detached adversarial audit at validation** for every high-stakes surface (front door, status guards, shipped scaffolding, CI/release machinery). -2. Merge each member to `next` by PR; keep state commits concurrency-safe. -3. **Pre-cut antipattern audit.** With all members merged and the tag not yet fired, dispatch an *independent* reviewer over the assembled sprint to catch cross-cutting antipatterns and integration holes before they ship. Ship-blockers are fixed before the cut; non-blockers are recorded for the next sprint. Running it after the tag means the antipattern has already shipped. -4. **Cut the release.** Confirm `go test ./...` is green from the root, then follow the authoritative cut procedure (see [Releasing](../contributing/releasing.md)). The captain authorizes the cut. - -**Close (shaping FO).** - -1. **Seed the next sprint.** Fold the pre-cut audit's deferred and non-blocking findings into the next sprint's backlog, and run a light post-cut release verification, since some release-machinery issues only manifest once the tag actually fires. - -## Membership is a query, not a list - -A sprint groups its entities by frontmatter query, never a hard-coded roster. 
Members carry `sprint`, `group`, and `sprint-readiness` frontmatter, and the rollup is the native `--where` filter on `spacedock status`. Run it against the workflow that holds the entities (`docs/dev` in Spacedock's own build): - -```bash -# every member of a sprint -spacedock status --workflow-dir docs/dev --where sprint=0200-flip - -# the drivable set — excludes deferred members -spacedock status --workflow-dir docs/dev --where sprint=0200-flip --where 'sprint-readiness != defer' -``` - -Each `--where` clause is `field <op> value`, where `<op>` is `=` or `!=`. Stacking clauses ANDs them: an entity is listed only when it matches every clause. The operator forms also cover presence and absence: `field !=` matches a non-empty value, `field =` matches an empty one. This is the same filter any reader can run; the sprint is a convention layered on top of it, not a separate command. - -`sprint-readiness: defer` is how a member stays in the sprint's definition but out of the Commander's drivable set. In the `0200-flip` capstone, `pj` is marked `defer` to mean "driven in the shaping session, not by the cold-boot Commander". That defers it from one driver; it is not dropped from the sprint. - -## Adopting it for your own workflow - -You do not need any launcher feature to use this. Add a roadmap file above your workflow, stamp `sprint` / `group` / `sprint-readiness` on the entities you group, and read membership with `status --where`. The construct buys you the strategy/state separation and the shaping-FO/Commander discipline; it costs you a convention you maintain by hand. Spacedock runs it on itself precisely to learn whether the convention earns a graduation to first-class support before any code is written for it. 
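The `--where` semantics described in the removed sprints page (clauses ANDed together, with `=` and `!=` against an empty value testing absence and presence) can be modeled in a few lines. This is a minimal Python sketch, not Spacedock's implementation: the function names, the sample entities, and the `ready` readiness value are hypothetical, and it assumes a missing frontmatter field is treated as empty.

```python
def parse_clause(clause):
    """Split a clause such as 'sprint=0200-flip' or 'sprint-readiness != defer'."""
    idx = clause.find("=")
    if idx == -1:
        raise ValueError(f"no operator in clause: {clause!r}")
    if idx > 0 and clause[idx - 1] == "!":
        # The '=' belongs to a '!=' operator.
        field, op = clause[: idx - 1], "!="
    else:
        field, op = clause[:idx], "="
    return field.strip(), op, clause[idx + 1 :].strip()


def matches(entity, clause):
    """True when one entity (a dict of frontmatter fields) satisfies one clause."""
    field, op, value = parse_clause(clause)
    actual = entity.get(field, "")  # assumption: missing field behaves as empty
    return actual == value if op == "=" else actual != value


def filter_entities(entities, clauses):
    """Stacked clauses are ANDed: an entity must match every clause."""
    return [e for e in entities if all(matches(e, c) for c in clauses)]


# Hypothetical entities; only 'pj' and its 'defer' marking come from the text.
ENTITIES = [
    {"id": "a1", "sprint": "0200-flip", "sprint-readiness": "ready"},
    {"id": "pj", "sprint": "0200-flip", "sprint-readiness": "defer"},
    {"id": "b2", "sprint": ""},
]

members = filter_entities(ENTITIES, ["sprint=0200-flip"])
drivable = filter_entities(
    ENTITIES, ["sprint=0200-flip", "sprint-readiness != defer"]
)
print([e["id"] for e in members])
print([e["id"] for e in drivable])
```

Under these assumptions the presence and absence forms need no special case: `sprint !=` parses to an empty comparison value, so it matches any entity whose `sprint` is non-empty, and `sprint =` matches the rest.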
diff --git a/docs/site/concepts/gates-and-decisions.md b/docs/site/concepts/gates-and-decisions.md index a17619cb5..1298e0d02 100644 --- a/docs/site/concepts/gates-and-decisions.md +++ b/docs/site/concepts/gates-and-decisions.md @@ -1,74 +1,62 @@ # Gates & decisions -A gate is the decision point at the end of a stage where nothing advances without your vote. When a stage is marked `gate: true` in the workflow README, the first officer stops after the worker completes, renders a gate review, and waits for you. It never self-approves. This page covers what a gate review carries, the three calls you make, how feedback cycles loop and where they cap, and the detached adversarial audit that backs high-stakes validation. +A gate is the decision point at the end of a stage where nothing advances without your vote. When a stage declares a gate, the first officer stops after the worker completes, presents a review, and waits for you. It never self-approves. -## What a gate carries - -The first officer presents a gate review only after it has read the worker's `## Stage Report`, checked every dispatched item, and counted the results. The review is the first officer's own prose with a fixed spine. The first three lines and the last line carry the decision; everything between is supporting evidence. If you stop reading after line three, you can still vote. - -A gate review looks like this: +Each call you make sharpens the bar, and the destination is delegation. When you are sufficiently confident in the workflow and the bar you set, hand over the conn and let the first officer drive multiple tasks with auto-approval: ``` -Gate review: {entity title} — {stage} -Chosen direction: {one-line summary of the worker's approach, or n/a} -Recommend {approve | reject: {one-line reason}}. +you have the conn to drive toward your sprint goal, authorized to approve and +merge PR on CI green. use your judgement. 
+``` -Checklist (from ## Stage Report in {entity_file_path} lines {start}-{end}): -- DONE: {≤10-word gist} -- SKIPPED: {gist} — {reason} -- FAILED: {gist} — {reason} +Until then, the gates are yours. -Reviewer findings - Material: {fact-corrections, contract violations, missing AC evidence} - Polish: {wording, format drift, non-blocking suggestions} +## What you see at a gate -Assessment: {N} done, {N} skipped, {N} failed. +A gate review has a fixed spine: the first three lines and the last line carry the decision; everything between is supporting evidence. If you stop reading after line three, you can still vote. -Decision: {what approval/rejection does in concrete terms}. ``` +Gate review: Fix the flaky login test — review +Chosen direction: replace sleep-based waits with event polling +Recommend reject: the AC-2 retry scenario has no covering test. -Read it as follows: - -- **`Chosen direction` names what the worker picked**, so you do not have to open the entity file to learn it. Ideation picks an approach; validation picks PASS or REJECTED. Stages with no choice to make show `n/a`. -- **`Recommend` is the first officer's verdict, stated exactly once.** It does not reappear restated elsewhere in the review. -- **`Checklist` is a gist roll-up, not the report.** The full `## Stage Report` is cited by file path and line range; open it when you want the detail. -- **Reviewer findings split into `Material` and `Polish`.** Material items (fact-corrections, contract violations, missing acceptance-criterion evidence, claims the codebase contradicts) are the ones that should move your vote. Polish is non-blocking. An empty tier is dropped. -- **`Decision` names what your vote does in concrete terms.** For example, "approve to enter implementation in worktree `.worktrees/...`" or "reject to bounce back to {feedback-to target}". 
+Checklist (from ## Stage Report in docs/ship-features/fix-the-flaky-login-test.md): +- DONE: login test stable across 50 consecutive runs +- FAILED: retry scenario unproven — no test exercises it -At every gate the first officer also runs an acceptance-criteria cross-check: it scans `## Acceptance criteria`, confirms each `**AC-N**` has evidence cited from this or a prior stage report, and names any criterion left without evidence. +Reviewer findings + Material: AC-2 cites a test file that does not exist + Polish: stage report wording drifts from the template -## The three calls +Assessment: 1 done, 0 skipped, 1 failed. -You answer a gate with one of three calls. +Decision: approve to close; reject to bounce back to implementation. +``` -- **Approve.** The entity advances to the next stage and the first officer dispatches it, reusing the live worker when it can and dispatching fresh otherwise. If the next stage opens or closes a worktree, the `Decision` line told you so. Approving the terminal stage runs the merge and cleanup ceremony. -- **Redo with feedback.** You approve the direction but send concrete fixes back. Name the specific asks ("tighten the AC-2 substring assertion, correct the file path claim"), not "address the reviewer's notes". The first officer routes your asks back to the stage that owns the work, the worker re-does it, and the gate is re-presented. -- **Reject.** At a stage with a `feedback-to` target, rejecting bounces the work back to that target stage to be fixed and re-validated; this is the feedback cycle below. At a stage without `feedback-to`, rejection is terminal for that path. +Material findings are the ones that should move your vote; Polish never blocks. The Decision line tells you concretely what your vote does. Every acceptance criterion is cross-checked before the review reaches you; a criterion without cited evidence is named rather than passed over. -A redo and a reject at a `feedback-to` stage run the same routing machinery. 
The difference is whether you are correcting a direction you accept or sending it back. Both name the concrete fix asks so the next worker has something to act on. +## The three calls -## Feedback cycles and the loop cap +- **Approve.** The work advances to the next stage. Approving the terminal stage merges and closes it. +- **Redo with feedback.** You accept the direction but send concrete fixes back. Name the specific asks ("tighten the AC-2 substring assertion, correct the file path claim"), not "address the reviewer's notes". +- **Reject.** The work bounces back to the stage that owns the fix, carrying your findings. -When a feedback stage recommends `REJECTED`, or you reject at a `feedback-to` stage, the work routes back to the stage named in `feedback-to`: the stage that owns the fix, not the reviewer that flagged it. In the dev workflow, `validation` has `feedback-to: implementation`, so a rejected validation sends the deliverable back to implementation, not back to the validator. +Redo and reject differ only in whether you accept the direction; both carry your concrete asks so the next worker has something to act on. Nothing closes without its verdict on the record. -The first officer tracks each round in a `### Feedback Cycles` section in the entity body, then: +## Rejections -1. Reads the rejected stage's `feedback-to` target. -2. Routes your concrete findings to that target, reusing the live worker in the same worktree when it is still addressable and reuse conditions pass, dispatching fresh otherwise. The routed message carries the fix work and the stage assignment, not just an acknowledgment. -3. Re-runs the reviewer after the fix. -4. Re-enters the gate flow with the updated result, presenting you a fresh gate review. +Rejections bounce automatically: the findings route back, the work is redone, and the reviewer re-runs; no stop at your desk. 
The gate reaches you only when the work passes review, or after **three failed rounds**, when the call returns to you instead of bouncing again. Every round is on the record in the item's file. -**The loop caps at three.** On cycle 3 the first officer escalates to you instead of bouncing a fourth time. The same fix has now failed twice, so the call returns to a human rather than looping. This cap is exercised by the `feedback-3-cycle-escalation` runtime scenario, which asserts the first officer escalates on the third rejected validation rather than auto-bouncing again. +A useful rejection to type at a gate: "send it back unless this now needs reframing". -## The detached adversarial audit +## Reviews beyond validation -For high-stakes surfaces, a passing validation is necessary but not sufficient. Before merging, the first officer also runs a read-only adversarial audit. The audit catches the hole that validation cannot see itself: a test that passes today but would also pass on a broken future edit. +A typical validation stage already covers code review: the work is checked against your acceptance criteria, with the rejection loop behind it. Adversarial review is built in as well: a `fresh: true` validation stage is exactly that, and high-stakes changes can also get a detached, out-of-workflow pass. -The audit triggers on four surfaces: the front-door launcher (`spacedock claude` / `codex` / `doctor`), the `status` mutation and guard paths, the shipped contract and scaffolding, and the CI and release machinery. Routine, low-blast-radius changes do not need it; a normal validation suffices. +Adversarial review differs from validation in what it distrusts: validation checks the work; the adversarial pass checks the validation. It is read-only and tries to refute the result by constructing an adversarial edit the deliverable's own tests should catch, then confirming they do. 
A test that stays green under an edit that breaks the claim is a hole validation cannot see on its own. "Refuted nothing material" is a valid recorded outcome, and material findings route back through the rejection loop. -It runs on a separate throwaway checkout, never the implementation worktree, and never mutates the deliverable. The auditor tries to refute the validation: it constructs an adversarial edit that the deliverable's own tests should catch and confirms they do. A test that stays green under an edit that breaks the claim is a hole. Findings come in two tiers, `Material` and `Polish`; "refuted nothing material" is a valid recorded outcome. +The workflow is flexible beyond that: you can add conditional, lens-specific reviews of your own, like checking that the documentation was updated when a change affects end users. -Results feed the same gate machinery you already know: +## Where to go next -- **Material findings route back through the normal validation-to-implementation feedback flow**, with a `### Feedback Cycles` entry naming the audit and its adversarial edit. The gate is not presented as clean until they are closed. -- **A clean audit is noted in the gate's reviewer-findings block**, or as a one-line "detached audit: no material findings". +- [Operate a workflow](../running-workflows/operating.md) covers answering gates in the day-to-day loop. diff --git a/docs/site/concepts/operating-model.md b/docs/site/concepts/operating-model.md index 6276ad590..7344ee2ee 100644 --- a/docs/site/concepts/operating-model.md +++ b/docs/site/concepts/operating-model.md @@ -1,41 +1,43 @@ # The operating model -Spacedock runs on three roles and one division of labor: you shape the work and make the calls; the agents drive each item through its stages and bring decisions back to you with evidence. This page names the roles, the line between shaping and driving, and why decisions arrive batched. 
+Spacedock runs on three roles and one division of labor: you shape the work and make the calls; the agents drive each item through its stages and bring decisions back to you with evidence. -## Three roles +## Roles -| Role | Who | What they own | -|------|-----|---------------| -| **Captain** | You. | The mission, and the call at every approval gate unless you delegate it. | -| **First Officer** | The orchestrator agent. | Running the workflow: dispatch, gate presentation, advancing entity state. | -| **Ensign** | The worker agent. | Moving one entity (one work item) forward through one stage. | +| Role | Who | Ownership | +|------|-----|-----------| +| **Captain** | You | The mission, and the call at every approval gate unless delegated | +| **First Officer** | The orchestrator agent | Runs the workflow for you and brings each decision to you with evidence | +| **Ensign** | The worker agent | Moves one work item through one stage | -There is one captain and one first officer per session. The number of ensigns tracks the dispatchable work: the first officer dispatches one per entity per stage. +Each session has one captain and one first officer; ensigns come and go with the work. -The first officer reads the workflow README, runs `spacedock status --next` to find entities ready to advance, and dispatches an ensign for each. An ensign reads its assignment, does the stage's work, commits, writes a stage report, and signals done. The first officer reviews that report against the checklist it dispatched. If the stage is gated, it pauses and presents the report to you. If not, it advances the entity and dispatches the next stage itself. A completed non-gated, non-terminal stage is not a stopping point. +The first officer keeps the work moving so you do not have to: it finds what is ready, hands each item to a worker, and checks the result against the bar you set. Gated stages pause for your call; everything else flows forward without you. 
## Shaping versus driving -The captain shapes; the agents drive. These are different jobs, and the split is what keeps you out of the per-step loop. +The captain shapes the product and owns the workflow; the agents drive. These are different jobs, and the split is what keeps you out of the per-step loop. -**Shaping is defining what good looks like before the work runs.** You set the mission, the stages, and the bar each stage must clear, all declared in the workflow README rather than negotiated mid-task. You commission a workflow with [`/spacedock:commission`](../running-workflows/commission.md), and you make the calls at gates: approve, redo with feedback, or reject. Some gates you answer yourself; others resolve through a delegated agent review. That is the whole of the captain's standing job. +**Shaping is the product judgment: the goal, the taste, the steering.** What to build, what good looks like, which direction survives a gate. You make the calls at gates (approve, redo with feedback, or reject); some you answer yourself, others resolve through a delegated agent review. This judgment is the part the agents cannot supply. -**Driving is moving entities through the declared stages.** The first officer schedules and dispatches; the ensign does the stage work and proves it. The first officer is allowed to take obvious reversible steps without asking: a dispatch the workflow already permits, a status read, a routine state transition. It asks you only when requirements are materially ambiguous, a design choice would change the output meaningfully, or scope is too unclear to turn into concrete criteria. Everything else happens without a prompt to you. +**Owning the workflow is the structural half.** You set the stages and the bar each stage must clear, declared and serialized in the workflow README, starting from [`/spacedock:commission`](../running-workflows/commission.md). The declaration is shapable mid-task: you do not have to get it right the first time. 
When a bar turns out fuzzy in practice, tighten the README and the next dispatch works to the new line. -The line holds because of one rule: the maker does not judge its own work. Review runs as a separate stage with fresh context and no access to the ensign's reasoning (see [Gates and decisions](gates-and-decisions.md)). The first officer never self-approves a gated stage. +**Driving is moving work items through the declared stages.** The first officer schedules and dispatches; the ensign does the stage work and proves it. The first officer acts on its own for routine, reversible steps and asks you only when something is genuinely ambiguous: unclear requirements, a design choice that would change the output, scope too vague to turn into criteria. Everything else happens without a prompt to you. + +The line holds because of one rule: the maker does not judge its own work. Review runs as a separate stage with fresh context and no access to the maker's reasoning (see [Gates and decisions](gates-and-decisions.md)). The first officer never self-approves a gated stage. ## Batched, evidenced decisions -Decisions reach you batched and backed by evidence, not as a stream of interruptions. This is the point of the model: your attention is the bottleneck, so the agents queue work and surface only the calls that need a human. +Decisions reach you batched and backed by evidence, not as a stream of interruptions. Your attention is the bottleneck; the agents queue work and surface only the calls that need a human. -**Batch the work; decide as it flows back.** Queue many entities at once. Ensigns advance each through its stages in parallel. You handle gates as they surface, not one session at a time, and not on the agent's schedule. While one entity waits on a clarification, the first officer keeps dispatching the others. +**Batch the work; decide as it flows back.** Queue many work items at once. Ensigns advance each through its stages in parallel. 
You handle gates as they surface, not one session at a time, and not on the agent's schedule. While one item waits on a clarification, the first officer keeps dispatching the others. -**Every gate carries evidence.** When the first officer presents a gate, it does not hand you the transcript. It renders a fixed gate-review format: the chosen direction in one line, a single clear recommendation you can approve with one "yes", a gist roll-up of the stage report's `DONE`/`SKIPPED`/`FAILED` items cited by file path and line range, and any reviewer findings split into `Material` and `Polish` tiers. The target is 15-25 lines. You decide on the evidence and the bar, not on a wall of output. +**Every gate carries evidence.** The first officer does not hand you the transcript. It presents a short review: what was chosen, the evidence for it, and one recommendation you can approve with a single yes. [Gates and decisions](gates-and-decisions.md) shows the format. -**The decision leaves a trail.** Each gate records the verdict and its reason alongside the stage report in the entity file. The record outlives the reviewer, so a bad result traces back to the call that caused it. When you end a session, [`/spacedock:debrief`](../running-workflows/debrief-and-refit.md) captures what happened (commits, state changes, decisions, open issues), and the next session picks up from it. +**The decision leaves a trail.** Each gate records the verdict and its reason alongside the stage report in the work item's file. The record outlives the reviewer, so a bad result traces back to the call that caused it. When you end a session, [`/spacedock:debrief`](../running-workflows/debrief.md) captures what happened, and the next session picks up from it. ## Where to go next -- [Workflows and entities](workflows-and-entities.md) covers the directory and files the roles operate on. -- [Stage lifecycle](stage-lifecycle.md): how an entity moves backlog → ideation → implementation → validation → done. 
-- [Gates and decisions](gates-and-decisions.md) lays out what a gate review carries, the three calls you make, and the detached adversarial audit. +- [Workflows and entities](workflows-and-entities.md) to see where your work lives as plain files. +- [Stage lifecycle](stage-lifecycle.md) to follow one item end to end. +- [Gates and decisions](gates-and-decisions.md) to see exactly what you decide and on what evidence. diff --git a/docs/site/concepts/stage-lifecycle.md b/docs/site/concepts/stage-lifecycle.md index 6e7190441..0d09d8f5e 100644 --- a/docs/site/concepts/stage-lifecycle.md +++ b/docs/site/concepts/stage-lifecycle.md @@ -1,67 +1,39 @@ # The stage lifecycle -An entity moves through a fixed chain of stages, one at a time, and each stage declares the work it owns and the proof it must produce. The dev workflow's chain is `backlog → ideation → implementation → validation → done`; your own workflow names its own stages, but the mechanics are the same. The first officer advances an entity stage by stage, dispatching one ensign per stage and pausing at the gates you declared. +An entity moves through an ordered chain of stages that the workflow defines, and each stage declares the work it owns and the proof it must produce. The first officer advances an entity stage by stage, pausing at the gates you declared. -The stage order, names, and per-stage properties live in the workflow README's frontmatter under `stages.states`. This page uses the dev workflow (`docs/site/contributing/development-workflow.md`) as the running example; read that page for the full per-stage Inputs/Outputs/Good/Bad detail. +## A typical dev workflow -## What a stage declares - -Each entry under `stages.states` is a stage name plus a set of boolean or string properties. The first officer reads these to decide how to dispatch and when to stop. A `stages.defaults` block sets the baseline; a stage entry overrides it. 
The properties that change behavior: +```mermaid +flowchart LR + backlog --> ideation --> implementation --> validation --> done + validation -. rejected .-> implementation + classDef gate stroke:#e3b04b,stroke-width:2.5px + class ideation,validation gate +``` -| Property | Effect | -|----------|--------| -| `initial: true` | The stage an entity starts in. The dev workflow marks `backlog`. | -| `terminal: true` | The stage an entity ends in. Reaching it runs the merge and cleanup ceremony, not another dispatch. The dev workflow marks `done`. | -| `gate: true` | The first officer presents a stage report and waits for your decision instead of advancing on its own. | -| `worktree: true` | The stage's work runs in an isolated git worktree. Absent or `false`, it runs inline. | -| `fresh: true` | The stage always gets a freshly dispatched ensign, never a worker reused from the prior stage. | -| `feedback-to: {stage}` | On rejection, work routes back to the named stage rather than failing outright. | -| `concurrency: N` | How many entities may sit in this stage at once. | -| `agent: {name}` | Which worker skill the first officer dispatches. Defaults to `ensign`. | +The gold-bordered stages are gates, your calls: before code is written, and before the result ships. Read the chain as a pipeline: each stage takes the prior stage's output as its input, and the bar rises from "is this clear?" to "is this proven?". -Beyond these properties, the prose of each stage's `###` subsection in the README is the stage definition: its Inputs, Outputs, and the Good/Bad bar. The first officer copies that subsection verbatim into the ensign's assignment, so what a stage declares in prose is exactly what the worker is told to do. +The property that matters most is `feedback-to`: rejected work bounces back to the stage that owns the fix (a rejected validation returns to implementation), not to the reviewer that flagged it. 
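The `feedback-to` routing, together with the three-round cap this page describes, can be sketched as a small decision function. This is an illustrative Python model only: `route_rejection`, `MAX_ROUNDS`, and the `STAGES` table are hypothetical names, and the sketch assumes the third rejection is the one that escalates to the captain.

```python
MAX_ROUNDS = 3  # assumption: the third failed round escalates instead of bouncing

# Hypothetical stage table modeled on the dev workflow named in the text.
STAGES = {
    "implementation": {},
    "validation": {"feedback-to": "implementation"},
}


def route_rejection(stage, failed_rounds, stages=STAGES):
    """Decide where rejected work goes.

    failed_rounds counts rejections so far, including the current one.
    """
    target = stages[stage].get("feedback-to")
    if target is None:
        raise ValueError(f"stage {stage!r} declares no feedback-to target")
    if failed_rounds >= MAX_ROUNDS:
        # The call returns to a human rather than looping again.
        return ("escalate", "captain")
    # Work goes to the stage that owns the fix, not to the reviewer.
    return ("rework", target)


for round_number in (1, 2, 3):
    print(round_number, route_rejection("validation", round_number))
```

The point of the sketch is that the routing target is read from the rejected stage's own declaration, so the reviewer never receives the rework.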
-## The stages +## What a stage declares -Read the chain as a pipeline: each stage takes the prior stage's output as its input, and the bar rises from "is this clear?" to "is this proven?". +A stage can pause at a [gate](gates-and-decisions.md) for your decision, run its work in an isolated worktree, demand a reviewer with no access to the maker's reasoning, route rejected work back to an earlier stage, cap how many items it holds at once, and hand its work to a specialist worker. All of it is declared in the workflow README; ask the first officer to set up or change any of it. -- **`backlog`, the seed.** An entity enters here when first proposed: a title, a source, a brief description, and the test gates future stages must satisfy. No design work yet. `initial: true`, so this is where every new entity starts. The dev workflow also marks it `gate: true`, so the first officer presents a new entity for your go-ahead before it advances. -- **`ideation`, the design.** A worker clarifies the problem, explores approaches, and produces a fleshed-out body: problem statement, proposed approach, acceptance criteria, and a test plan. Each acceptance criterion names how it will be checked. This stage is `gate: true` in the dev workflow, so the first officer presents the design for your approval before any code is written. -- **`implementation`, the deliverable.** A worker produces the change the entity describes: code, fixtures, instruction text, on-disk state. This stage is `worktree: true`, so the work happens in an isolated checkout. Implementation completing is not a stopping point. A completed, non-gated stage routes straight on to validation. -- **`validation`, the independent check.** A worker verifies the deliverable against the acceptance criteria. It checks what was produced; it does not produce the deliverable itself. It reproduces the evidence each `AC-N` cites and returns a `PASSED` or `REJECTED` recommendation. 
This stage is `worktree: true`, `fresh: true`, `feedback-to: implementation`, and `gate: true`. -- **`done`, the verdict.** Validation is complete and you approve the result. The entity is closed with a `verdict` of `PASSED` or `REJECTED` and a `completed` timestamp. `terminal: true`. +Beyond the declarations, the prose of each stage's section in the README is the stage definition. What you write there is exactly what the worker receives as its assignment. ## Fresh context at validation -Validation declares `fresh: true` because the reviewer must not be the maker. The first officer normally reuses a live worker across consecutive stages to save context, but a `fresh: true` stage forces a new dispatch every time. The validator arrives without the implementer's reasoning in its context, sees only the entity body and the deliverable, and pushes back on thin evidence. This is the mechanism behind the README's claim that "the agent doesn't get to judge its own work." +Validation declares `fresh: true`, and `fresh: true` means adversarial review: the reviewer is never the maker, arrives without the implementer's reasoning in its context, sees only the entity body and the deliverable, and pushes back on thin evidence. This is the mechanism behind the README's claim that "the agent doesn't get to judge its own work." High-stakes changes can also get a detached, out-of-workflow pass; [Reviews beyond validation](gates-and-decisions.md#reviews-beyond-validation) covers the difference. When validation recommends `REJECTED`, `feedback-to: implementation` routes the concrete finding back to the implementation stage for rework rather than closing the entity. The entity re-enters implementation, the finding is addressed, and a fresh validator checks it again. A hard cap on feedback cycles prevents an endless bounce; on the third cycle the first officer escalates to you. -## Worktree vs. 
inline - -A stage runs in an isolated git worktree when it declares `worktree: true`, and inline at the repo root otherwise. This is the "isolation when it matters" tradeoff: stages that mutate shared state (implementation, validation) get their own checkout so concurrent entities don't collide; lighter stages that only edit the entity body (backlog, ideation) run inline. - -The mechanics, run by the first officer: - -- **On first dispatch to a worktree stage,** the first officer creates a worktree at `.worktrees/{worker_key}-{slug}` on branch `{worker_key}/{slug}` and records the path in the entity's `worktree` frontmatter field. -- **Inside a worktree-backed stage,** the ensign keeps all reads, writes, and commits under that worktree. The deliverable is isolated there until the entity terminalizes. -- **In a split-root workflow** (the README declares a `state:` checkout, e.g. `state: .spacedock-state`), the entity body and stage report live in the state checkout, not in the worktree; the worktree isolates only the deliverable. The first officer's dispatch hands the ensign the correct state-checkout path for the entity; the worker trusts that path rather than writing entity state into the worktree. -- **At the terminal stage,** the first officer merges the worktree branch, clears the `worktree` field, removes the worktree, and deletes the local branch. - -To see where each entity sits and which are ready to advance, read the workflow state: - -```bash -spacedock status --workflow-dir docs/dev -``` - -```bash -spacedock status --workflow-dir docs/dev --next -``` +## Isolated worktrees -`--next` lists the entities ready for dispatch, the query the first officer runs each loop. The `worktree` column shows the isolated checkout path for any entity currently mid-stage in a worktree-backed stage. 
+When a stage declares `worktree: true`, everything from that stage onward happens in an isolated worktree: the work and its commits stay there, concurrent items never collide with each other or with you, and at the terminal stage the branch is merged back and the worktree cleaned up. ## Where to go next -- The roles that drive this pipeline (captain, first officer, ensign) are in [the operating model](operating-model.md). -- The decision points at stage boundaries are covered in [gates and decisions](gates-and-decisions.md). -- The entity frontmatter these stages update is in the [frontmatter contract](../reference/frontmatter-contract.md). +- [The operating model](operating-model.md) for who does what: you, the orchestrator, the workers. +- [Gates and decisions](gates-and-decisions.md) to see exactly what you decide at a stage boundary and on what evidence. +- The [frontmatter contract](../reference/frontmatter-contract.md) for the fields these stages write. diff --git a/docs/site/concepts/worked-example.md b/docs/site/concepts/worked-example.md deleted file mode 100644 index d0c116923..000000000 --- a/docs/site/concepts/worked-example.md +++ /dev/null @@ -1,123 +0,0 @@ -# A worked example - -This page traces one real entity, `z9` `codex-plugin-auto-install`, from -backlog through to `done` / PASSED, using artifacts that live in this repo. It is -a concrete read of the abstract stage machine: backlog → ideation → -implementation → validation → done, the gates between them, and what the captain -actually decides at each one. - -The workflow is `docs/dev` (the Spacedock v1 dev workflow); its stages, gates, -and entity schema are defined in -[the development workflow reference](../contributing/development-workflow.md). 
-Runtime entity state lives in a separate `.spacedock-state` checkout, so the -finished entity itself is not in the main tree, but its full trajectory is on -the record in the `0198-pre-flip-hardening` sprint directory: `index.md`, -`dispatch-sprint-execution.md`, `debrief.md`, and `post-sprint-audit.md`. - -`z9` delivered front-door Codex plugin auto-install: `spacedock codex` now -installs a missing plugin then launches, the Codex analog of the Claude path. It -shipped as [PR #329](https://github.com/spacedock-dev/spacedock/pull/329). - -## See where it sits - -Each first-officer loop starts by asking the workflow what is ready to move. You -run the same query: - -```bash -spacedock status --workflow-dir docs/dev --next -``` - -Lists the entities ready to dispatch. To scope to one sprint, filter with -`--where`: - -```bash -spacedock status --workflow-dir docs/dev \ - --where sprint=0198-pre-flip-hardening --where 'sprint-readiness != defer' -``` - -This is the sprint membership query, the source of truth, not a hand-kept list. -At the start of `0198`, `z9` shows up here in the `binary-ux` group. - -## backlog → ideation: shape the work - -`z9` enters backlog as a seed: a title, a source, and a brief description of the -problem. It carries no design yet. Ideation is where a worker fleshes it out into -a problem statement, a proposed approach, entity-level acceptance criteria, and a -test plan, with each AC naming a check outside the entity body that can fail. - -For `z9` that meant a concrete approach: install through the shared `devBranch` -rather than a hardcoded `"next"`, so the install channel tracks the release -channel (`next` today, `main` after the flip). The design also resolved the -riskiest unknown up front (that the install branch was the channel variable, not -a literal) and recorded it before committing to the rest of the plan. - -Ideation ends at a **gate**. 
Because the dev workflow marks `ideation` with -`gate: true`, the first officer does not advance on its own: it presents the -design to the captain. The captain approved `z9`'s ideation on 2026-06-08. That -approval is the entry condition for implementation, and it is captured in the -sprint package so a later Commander session drives implementation directly -without re-presenting the gate. - -## implementation: produce the deliverable - -Once the design is approved, `z9` moves to implementation. The dev workflow runs -this stage in a `worktree` (`worktree: true`), so the dispatched ensign works in -an isolated checkout, not the shared tree. - -The dispatch package gave the implementer three concrete build notes, all of -which the work honored: - -- **Install on the shared `devBranch`**, via `ops.Install("codex", - marketplaceSource, devBranch)` and `--ref `, not a hardcoded - `next`, so the channel tracks the flip's later `devBranch` retarget. -- **Fix the now-false comments and error strings** it builds around - (`host_exec.go`, `frontdoor.go`), rather than adding new code around stale text. -- **Invert the obsolete test** `TestCodexFrontDoorNoPluginFailsFastWithoutInstalling`, - whose old assertion contradicted the new auto-install behavior. - -Implementation is complete when the deliverable is committed and the stage report -is filed. It is not a parking spot: a completed implementation routes straight to -`validation` dispatch. - -## validation: verify against the criteria - -`z9` moves to validation to be checked, not finished. The dev workflow marks -`validation` with `fresh: true` and `worktree: true`, so a fresh validator (one -that did not write the code) runs in its own worktree. It pulls every `AC-N` -from the entity's acceptance-criteria section, reproduces the evidence each one -cites, and produces a PASSED or REJECTED recommendation. - -`z9` is a front-door change, a high-stakes surface, so validation alone was not -sufficient. 
The dev workflow requires a **detached adversarial audit** for the -launcher front door: a read-only pass on a throwaway checkout of the merge result -that tries to refute the validation. The `z9` audit ran at commit `0b714fac` -and exercised five mandated probes, each reddening the test suite then reverting. -It refuted nothing material. Channel-tracking was confirmed clean. That clean -audit satisfied the sprint's DoD item for the high-stakes surface. - -Validation also has `gate: true` and `feedback-to: implementation`. A REJECTED -recommendation routes the finding back to implementation for another cycle; a -PASSED recommendation goes to the captain for the terminal decision. - -## done: the captain's verdict - -`z9` reaches `done` when the captain reads the validation report and approves. -The terminal stage records the outcome in frontmatter: `status: done`, -`verdict: PASSED`, and a `completed` timestamp. The sprint's `post-sprint-audit.md` -confirms `z9` finished `status: done`, `verdict: PASSED`, archived, with PR -reference #329, alongside its three sprint siblings (`kb` #327, `qa` #328, -`vh` #330). - -That is the whole point of the machine: nothing reached `done` on assertion -alone. `z9` advanced only on an approved design, an isolated implementation, a -fresh validation, a detached audit that failed to break it, and an explicit -captain verdict. Each one a decision, each one on the record. - -## Read the trail yourself - -The full trajectory is reconstructable from the sprint directory: - -- [`index.md`](https://github.com/spacedock-dev/spacedock/blob/next/docs/roadmap/0198-pre-flip-hardening/index.md): goal, members, definition of done, the approved-gate note. -- [`dispatch-sprint-execution.md`](https://github.com/spacedock-dev/spacedock/blob/next/docs/roadmap/0198-pre-flip-hardening/dispatch-sprint-execution.md): the per-member drive plan and `z9`'s build notes. 
-- [`debrief.md`](https://github.com/spacedock-dev/spacedock/blob/next/docs/roadmap/0198-pre-flip-hardening/debrief.md): what shipped, the PR links, and the decisions made along the way. -- [`post-sprint-audit.md`](https://github.com/spacedock-dev/spacedock/blob/next/docs/roadmap/0198-pre-flip-hardening/post-sprint-audit.md): the final-state confirmation and the detached-audit record. diff --git a/docs/site/concepts/workflows-and-entities.md b/docs/site/concepts/workflows-and-entities.md index f86753640..605fa254d 100644 --- a/docs/site/concepts/workflows-and-entities.md +++ b/docs/site/concepts/workflows-and-entities.md @@ -1,14 +1,12 @@ # Workflows & entities -**A workflow is a directory plus a README, and an entity is one markdown file inside it.** The README defines the stages, the schema, and the gates; each entity is a work item that moves through those stages. Everything about a work item lives in the file itself: the problem, the design notes, the bar for done, the stage reports. State survives a session, so the next one picks up where you left off. +**Your work lives as plain text in your repo: readable, editable, diffable, and nothing is lost between sessions.** A workflow is a directory plus a README; an entity is one markdown file inside it, one work item. Everything about a work item is in its file: the problem, the design notes, the bar for done, the reports filed as it advances. -This page covers what those two things are, the frontmatter on an entity, and where the entity files actually live at runtime. +## The README is where you set the rules -## The workflow: a directory and its README +The README is the single source of truth: it declares the stages and defines what each stage means, what counts as good, and what a worker must produce. Commission generates it; you edit it like any file, at any time, or more likely ask the first officer to edit it to improve the workflow. 
When a bar turns out fuzzy in practice, tighten the prose and the next dispatch works to the new line. Spacedock's own [dev workflow README](https://github.com/spacedock-dev/spacedock/blob/main/docs/dev/README.md) is a live example. -The README is the single source of truth. Its frontmatter declares the stages, the entity type, and the ID style; its prose body defines what each stage means, what counts as good, and what a worker must produce. You commission a workflow with `/spacedock:commission`, which generates the directory, the README, and a few seed entities for you. - -A minimal README frontmatter looks like this: +A generated README's frontmatter looks like this: ```yaml --- @@ -36,60 +34,16 @@ stages: --- ``` -Each `states` entry is one stage. The per-stage flags decide behavior: - -- **`initial: true`** marks where a new entity starts; **`terminal: true`** marks where it ends. -- **`gate: true`** makes the stage end at a gate: the first officer pauses and presents a decision instead of advancing on its own. -- **`worktree: true`** runs the stage in its own git worktree, so stages that touch shared state stay isolated. Lighter stages run inline. -- **`fresh: true`** dispatches the stage with no access to the prior worker's reasoning. Use it for review stages so the maker doesn't judge its own work. -- **`feedback-to: `** names where rejected work bounces back to for revision. - -The README body documents each stage with `Inputs`, `Outputs`, `Good`, and `Bad`. That prose is the living spec; it is what every dispatched ensign works to. Tighten it to your actual bar before the first dispatch; editing it after agents have run against vague prose costs more. - -## The entity: one work item - -**An entity is a single work item, stored as a markdown file with YAML frontmatter.** Each entity lives as either a flat file `{slug}.md` or a folder `{slug}/index.md`. Use the folder form when reports or artifacts accumulate beside the work item. 
Slugs are lowercase with hyphens, no spaces (`add-login.md` or `add-login/index.md`). `spacedock status` reads both forms. - -The body holds the human-readable record: a description, a problem statement, the proposed approach, acceptance criteria, and the stage reports filed as the entity advances. - -## Entity frontmatter - -The frontmatter is the machine-readable state. The full schema lives in the workflow's own README; the [frontmatter contract](../reference/frontmatter-contract.md) is the field reference across workflows. The fields you set and read most often: - -| Field | What it holds | -|-------|---------------| -| `id` | The unique identifier; format set by `id-style` in the README. | -| `title` | Human-readable name. The filename slug is derived from it. | -| `status` | The current stage, one of the stage names declared in the README. | -| `source` | Where the entity came from (e.g. `commission seed`, `linear`). | -| `started` / `completed` | ISO 8601 timestamps for when work began and when the entity reached terminal. | -| `verdict` | `PASSED` or `REJECTED`, set at the final stage. | -| `score` | Optional priority, 0.0–1.0. | -| `worktree` | The worktree path while a dispatched agent is active; empty otherwise. | -| `issue` | Optional external ticket reference, e.g. `ENG-123`, `kata:task-abc123`, or `owner/repo#42`. | - -`status` is the field that drives everything: the first officer reads it to decide which entities are ready to advance. To see the queue, run the status viewer against the workflow directory: - -```bash -spacedock status --workflow-dir docs/dev -``` - -To list only the entities ready for dispatch, run the query the first officer runs each loop: +## Each work item is one file -```bash -spacedock status --workflow-dir docs/dev --next -``` +An entity lives as a flat file `{slug}.md`, or a folder `{slug}/index.md` when reports and artifacts accumulate beside it. 
The body is the human-readable record: the problem, the approach, the acceptance criteria, and the stage reports. On top sits YAML frontmatter, the machine-readable state: the item's id, its current stage, its outcome. The [frontmatter contract](../reference/frontmatter-contract.md) has the fields and the schemas that define them. -## Where entities live: the state checkout +## Keep workflow state off your code branch -**A workflow can keep its mutable entity state separate from the README using a state checkout.** The README frontmatter declares it with one field: +A workflow can keep its mutable state in a separate state checkout, so routine stage transitions never churn your code branch or collide with a feature PR. The README opts in with one field, and you can ask the first officer to change the setting: ```yaml state: .spacedock-state ``` -With this set, the README (the living spec) stays on your code branch, while the entity files, their reports, and the archive live under `.spacedock-state`. Routine stage transitions then never churn the code branch or collide with a feature PR. The path is resolved relative to the README's directory; `spacedock status` reads stages from the README and entities from the state checkout, and writes frontmatter updates and archive moves into the state checkout. - -The state checkout is a linked git worktree on an orphan branch in the same repo, so state commits land on that branch and the code branch never sees them. On a fresh clone the state worktree is absent. Run `spacedock state init` to fetch the orphan branch and re-add the linked worktree before working the workflow. - -Omit `state:` (or set `state: $inline`) for a standalone workflow that isn't embedded in a code repo you ship from. Then the entities live beside the README in the same directory, with no orphan branch and no extra checkout. +[Split-root state](../advanced/split-root-state.md) covers the mechanics and the fresh-clone setup. 
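
The two entity layouts described above can be sketched as a small lookup. This is a minimal illustration, not part of the `spacedock` CLI: the function name `entity_path` is hypothetical, and checking the flat form before the folder form is an assumption made here. The directory argument would be the workflow directory, or the state checkout when the README sets `state:`.

```bash
# Hypothetical helper, not a spacedock command.
# Usage: entity_path DIR SLUG
# Prints the entity's path, trying the flat form ({slug}.md) first and the
# folder form ({slug}/index.md) second. Returns 1 if neither file exists.
# The lookup order is an assumption for illustration.
entity_path() {
  entity_dir=$1
  entity_slug=$2
  if [ -f "$entity_dir/$entity_slug.md" ]; then
    printf '%s\n' "$entity_dir/$entity_slug.md"
  elif [ -f "$entity_dir/$entity_slug/index.md" ]; then
    printf '%s\n' "$entity_dir/$entity_slug/index.md"
  else
    return 1
  fi
}
```

For example, `entity_path .spacedock-state add-login` would print whichever of the two forms exists for the `add-login` entity.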
diff --git a/docs/site/contributing/adding-a-runtime.md b/docs/site/contributing/adding-a-runtime.md index 27cd5ba3b..abb24ef55 100644 --- a/docs/site/contributing/adding-a-runtime.md +++ b/docs/site/contributing/adding-a-runtime.md @@ -2,7 +2,7 @@ A host is supported when a live or fixture-backed run launches it as a first officer, dispatches an ensign through that host's native agent mechanism, and verifies durable workflow state: process exit, entity body, state-checkout git log, and clean status. A host is not supported because its instructions mention Spacedock, and a substring search over code or prose is not proof of behavior. Spacedock ships `spacedock claude` and `spacedock codex` as proven front doors; adding a host means earning the same level of proof. -This page is the contributor's orientation. The full procedure, the exact checklists, and the worked Pi example live in [Multi-host support](../reference/multi-host.md). Read that before you write code. +This page is the full procedure: what "supported" means, the layers to add support in, the acceptance checklist, and a worked Pi example. Use it when adding a new host such as Pi, or when turning a spike into a supported runtime lane. ## What "supported" means @@ -25,6 +25,10 @@ Add support in small layers, each with its own proof at its own abstraction leve 4. **Launch/install UX.** Add `spacedock ` only after a manual or live harness proves the runtime path. Add `spacedock install --host ` only when the install path is known and checkable without mutating unrelated global host state. Add `spacedock doctor --host ` when there is a manifest, package, or runtime health check to verify. 5. **Live runner.** Prove the host with a live-gated test when the claim is runtime integration. Use a temp workflow fixture, isolated host config and session directories, and copied credentials rather than global host state. Assert process exit, entity content, git log, and clean status. Never pass on transcript phrasing. 
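
The durable-state assertions named in the live-runner layer can be sketched with plain `git` and `grep`. This is a hedged illustration rather than the project's harness: the function name, argument order, and failure messages are all hypothetical. It covers entity content, git log, and clean status; the host's process exit status would be checked separately by the caller.

```bash
# Hypothetical harness helper: the name, arguments, and messages are
# illustrative and not part of Spacedock.
# Usage: assert_durable_state STATE_DIR ENTITY_REL MARKER COMMIT_SUBJECT
assert_durable_state() {
  state_dir=$1
  entity_rel=$2
  marker=$3
  subject=$4

  # Entity content: the body must contain the exact marker string.
  if ! grep -qF -- "$marker" "$state_dir/$entity_rel"; then
    echo "FAIL: marker not found in $entity_rel" >&2
    return 1
  fi

  # Git log: some commit in the state checkout must have this exact subject.
  if ! git -C "$state_dir" log --format=%s | grep -qxF -- "$subject"; then
    echo "FAIL: no commit with subject '$subject'" >&2
    return 1
  fi

  # Clean status: the entity path must have no uncommitted changes.
  if [ -n "$(git -C "$state_dir" status --porcelain -- "$entity_rel")" ]; then
    echo "FAIL: $entity_rel has uncommitted changes" >&2
    return 1
  fi
}
```

None of these checks reads the transcript. Each one inspects the file system or the repository, which keeps the proof at the same level as the claim.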
+## Launcher binary propagation through wrappers + +`spacedock claude` and `spacedock codex` attach `SPACEDOCK_BIN` to the process they exec, including the outer `safehouse -- ...` process when safehouse wrapping is active. Spacedock does not modify safehouse internals or assume a private passthrough mechanism; if a wrapper or runtime strips `SPACEDOCK_BIN` before the agent session observes it, the skill contract's `${SPACEDOCK_BIN:-spacedock}` convention degrades to the existing `$PATH` lookup. + ## Match the proof to the claim Use the smallest proof at the same abstraction level as the claim: @@ -37,6 +41,18 @@ Use the smallest proof at the same abstraction level as the claim: The failure mode to guard against: declaring a host supported because its prose looks right. Substring presence is acceptable proof only when the claim itself is about text being present or absent. +## Acceptance checklist + +A new runtime support slice is not done until the entity or PR records evidence for each applicable item: + +- Dispatch output uses the host-native contract and excludes incompatible host tool names. +- The first-officer and ensign skills load host runtime adapters. +- Split-root entity paths remain in the state checkout and are not rewritten into a code worktree. +- Follow-up/reuse cannot accept stale completion evidence, if reuse exists. +- Optional team substrates are represented as adapters over their real action schema. +- A live smoke proves the default dispatch path when runtime behavior is the claim. +- Install/launch commands exist only after the underlying mechanism is proven. + ## Manifesting from void When a runtime looks unsupported on first contact, do not read setup friction as proof the product path is impossible. A missing `auth.json`, an extension not auto-discovered in a temp home, or a subagent tool schema that differs from Claude's is harness work, not a blocker. 
A real blocker is a proven inability to launch, delegate, observe completion, or verify durable state *after* the harness is correct. @@ -55,4 +71,113 @@ Assume support is supposed to work. Do not treat missing polish, auth The prompt earns its place by changing the default interpretation of a failure: harness work gets fixed in-loop, and only a proven product or design blocker stops the work. -The worked Pi runtime is in [Multi-host support](../reference/multi-host.md): the live smoke mechanism, the exact parent prompt, the install and doctor surface, and the full acceptance checklist. +## The worked Pi runtime + +Pi is the worked example of a host taken from spike to supported runtime: the live-smoke mechanism, the exact parent prompt, the install and doctor surface, and the skill load paths. + +### Pi live-smoke mechanism + +The Pi proof used a live-gated test named: + +```bash +go test -tags live -run TestLivePiSubagentEnsignSmoke ./internal/ensigncycle -v -count=1 +``` + +The harness did this: + +1. Resolve `pi` from `PATH` and the local Spacedock repo root. +2. Resolve the installed `pi-subagents` package root, defaulting to: + + ```text + ~/.pi/agent/npm/node_modules/pi-subagents + ``` + +3. Create temp runtime state: + + ```text + PI_CODING_AGENT_DIR= + PI_CODING_AGENT_SESSION_DIR= + --session-dir + HOME= + ``` + +4. Copy only the operator's existing OAuth file into the isolated Pi home: + + ```text + ~/.pi/agent/auth.json -> $PI_CODING_AGENT_DIR/auth.json + ``` + +5. Launch `pi --print` with explicit local resources: + + ```text + --extension ~/.pi/agent/npm/node_modules/pi-subagents/src/extension/index.ts + --skill ~/.pi/agent/npm/node_modules/pi-subagents/skills/pi-subagents + --skill /skills/first-officer + --skill /skills/ensign + ``` + +6. Create a temp split-root workflow: + - `README.md` declares `state: .spacedock-state`. + - The entity is folder-form in `.spacedock-state/pi-live-smoke/index.md`. 
+ - Both workflow root and state checkout are git repositories. + +7. Ask the Pi parent to call `subagent(...)` exactly once. +8. Require the worker to append a stage report and commit only the state-checkout entity path. +9. Assert durable outcomes: + - Pi process exits successfully. + - Entity body contains the exact smoke marker and stage report shape. + - State checkout git log contains the worker commit. + - The entity path has no uncommitted changes. + +For Pi, the concrete "assume it works" prompt was: + +```text +Assume Pi support is supposed to work. Do not treat missing polish, auth setup friction, or tool-shape mismatch as proof the runtime is impossible. In FO capacity, iron out the frictions: + +- if Pi auth is missing in an isolated harness, copy/reuse the existing Pi OAuth auth file correctly; +- if the dispatch substrate needs a local package/extension path, wire it explicitly; +- if the Pi tool shape differs from Claude/Codex, adapt to the Pi-native contract rather than emulating Claude tools; +- if a live test fails due to harness setup, fix the harness and rerun; +- only stop for a real product/design blocker, not for first-contact setup friction. +``` + +### Exact Pi parent prompt + +The live test formats this prompt with repository and temp paths. Keep the structure when debugging Pi runtime support; only substitute the paths and marker. + +```text +You are the Spacedock first officer for a live Pi smoke test. + +Use the pi-subagents subagent(...) tool exactly once to dispatch one Pi ensign worker. Do not use or mention Claude Agent, SendMessage, TeamCreate, or TeamDelete tools. + +Dispatch a worker with agent "delegate" and this task: + +Load and follow the local Spacedock ensign skill at /skills/ensign/SKILL.md and the Pi ensign adapter at /skills/ensign/references/pi-ensign-runtime.md. This is a split-root Spacedock workflow. + +Workflow directory: +State checkout: +Entity file: +Target stage: implementation + +Required worker actions: +1. 
Read the workflow README and entity file. +2. Do not edit YAML frontmatter. +3. Append an implementation stage report to the entity body containing the exact marker PI-LIVE-SUBAGENT-ENSIGN-SMOKE, at least one '- DONE:' item, and a '### Summary' subsection. +4. Commit only the entity path in the state checkout with message 'ensign: pi live smoke'. Use a path-scoped git add/commit for pi-live-smoke/index.md. +5. Return a concise completion result naming the entity file and commit evidence. + +After subagent(...) returns, you as first officer must verify the entity file contains PI-LIVE-SUBAGENT-ENSIGN-SMOKE and verify the state checkout git log contains 'ensign: pi live smoke'. Exit successfully only after those durable checks pass. +``` + +### Skill install and load paths + +For Pi, `spacedock pi` launches the proven front door by loading local resources explicitly: + +```text +/skills/first-officer +/skills/ensign +~/.pi/agent/npm/node_modules/pi-subagents/skills/pi-subagents +~/.pi/agent/npm/node_modules/pi-subagents/src/extension/index.ts +``` + +`spacedock install --host pi` is an idempotent readiness check and setup guide for that substrate; it does not install a Claude/Codex-style marketplace plugin and does not accept `--plugin-dir`. Resolve the local skill checkout by running it from the checkout or setting `SPACEDOCK_REPO_ROOT`. `spacedock doctor --host pi` reports the Pi CLI, auth file, `pi-subagents` extension/skill, local Spacedock skill health, and supervisor-talkback setup prerequisites: the `pi-subagents` intercom bridge source, the resolved `PI_INTERCOM_PACKAGE_ROOT` package root, and the `pi-intercom` skill resource. Current `pi-subagents`/`pi-intercom` packages do not expose stable `pi-intercom` or `subagents-doctor` PATH commands, so readiness is based on package/resource paths instead of command shims. These doctor/install checks are necessary setup checks but insufficient to prove live supervisor talkback. 
Live proof still requires the cq-style `pi-intercom-supervisor-talkback` probe: progress update -> decision request -> supervisor reply -> child resume -> durable marker evidence. Live tests should not mutate global `~/.pi/agent`; they should keep using isolated Pi homes with copied auth. diff --git a/docs/site/contributing/build-from-source.md b/docs/site/contributing/build-from-source.md new file mode 100644 index 000000000..2d23bbc52 --- /dev/null +++ b/docs/site/contributing/build-from-source.md @@ -0,0 +1,34 @@ +# Build from source + +Use this when you're working on Spacedock itself. It builds the launcher from the +development branch and loads the plugin from your checkout, so local changes take +effect immediately. For a normal install, see [Install Spacedock](../get-started/install.md). + +1. **Clone and build.** + + ```bash + git clone --branch next https://github.com/spacedock-dev/spacedock + cd spacedock + go build -o spacedock ./cmd/spacedock + ``` + +2. **Confirm the binary.** + + ```bash + ./spacedock --version + ``` + + Prints `spacedock ` for your local build. + +3. **Launch with the local plugin.** + + ```bash + ./spacedock claude "your task" -- --plugin-dir "$PWD" + ``` + + `--plugin-dir` is a host flag, so it rides after `--`. It loads the + first-officer and ensign agents from your checkout instead of the installed + plugin. Edits to the repo are live. + +The `next` branch is the development channel. It has no Homebrew release; use the +[Homebrew install](../get-started/install.md) for a stable build. diff --git a/docs/site/contributing/voice-and-tone.md b/docs/site/contributing/voice-and-tone.md deleted file mode 100644 index 2694356cd..000000000 --- a/docs/site/contributing/voice-and-tone.md +++ /dev/null @@ -1,51 +0,0 @@ -# Voice & tone - -This is the writing-style guide for Spacedock's public documentation. 
It is grounded in two real voice signals, not invented: - -- **The root `README.md`** — Spacedock's actual public voice: direct, anti-hype, claim-first ("Spacedock is a multi-agent orchestrator where nothing ships without a decision"), concrete over abstract, second person addressed to the reader. -- **The `comm-officer` mod** (`docs/dev/_mods/comm-officer.md`) — the project's prose discipline. It applies the `elements-of-style:writing-clearly-and-concisely` skill (Strunk) and is **light-touch by default**: "Preserve the caller's voice, rhythm, and technical vocabulary. Cut empty words, tighten sentences, fix clear grammar errors." Critically, it **defers to a project voice guide when one exists**. This page is that guide — so the comm-officer and any doc-writer apply it. - -## Voice - -- **Precise, honest, technical.** State what is true and what a thing does. Spacedock's own pitch leads with the claim, not the adjective. -- **No marketing or hype adjectives.** Avoid "powerful", "seamless", "revolutionary", "effortless", "blazing-fast", "game-changing". If a sentence still carries its meaning with the adjective removed, remove it. The product ethos — evidence over assertion — is the writing ethos. -- **Concrete over abstract.** Name the command, the file, the outcome. Prefer "`spacedock status --next` lists the items ready to dispatch" over "Spacedock surfaces actionable work." -- **Claim-first, then support.** Lead a section with the load-bearing sentence; follow with the detail. Mirrors the README's "what's different" bullets (bold claim, then the mechanism). - -## Tone and register per audience - -- **New-user pages (Get started): welcoming and encouraging, still precise.** Assume no prior Spacedock knowledge; define a term on first use; show the command and the output to expect. Confidence-building, never breezy. 
-- **Operator pages (Concepts, Running workflows): direct and operational.** The reader is doing the work; tell them the steps and the decision points plainly. -- **Contributor pages (Contributing, Reference): exact and unembellished.** Precision outranks warmth. Name the contract, the test, the failure mode. -- **Person and tense.** Second person ("you run", "you approve") and present tense for how-to and instructions — the README's register. Imperative for steps ("Run `spacedock doctor`."). Describe the system in the present tense ("the first officer dispatches an ensign"), not the future. - -## Canonical terminology and capitalization - -Pinned from how the README and skills actually use these forms — not an imposed guess. Use them consistently. - -| Term | Form | Notes (grounded in real usage) | -|------|------|--------------------------------| -| Spacedock | `Spacedock` | The product. Always capitalized. `spacedock` (lowercase, code font) only as the literal command/binary. | -| Captain | `Captain` (role) / `the captain` (prose) | README roles table capitalizes the role name; running prose uses lowercase "the captain" (skills use `{captain}`). The human operator. | -| First Officer | `First Officer` (role) / `the first officer` (prose) | Same pattern: Title Case in the roles table / when naming the role; lowercase in running prose ("the first officer reads the README"). The orchestrator agent. | -| Ensign | `Ensign` (role) / `the ensign`, `ensigns` (prose) | Same pattern. The worker agent that moves one item through one stage. | -| workflow | `workflow` | Common noun, lowercase. A directory of markdown entities + a README. | -| entity | `entity` | Common noun, lowercase. One work item (a markdown file or folder). The README also says "work item" — prefer "entity" in docs, gloss it as "work item" on first use for new users. | -| stage | `stage` | Common noun, lowercase. backlog → ideation → implementation → validation → done. 
| -| gate | `gate` | Common noun, lowercase. The decision point at the end of a stage. | -| sprint | `sprint` | Common noun, lowercase. A grouped set of entities driven to a deliverable. | -| worktree, mod, safehouse | lowercase | Common nouns. `safehouse` is also the sandbox profile filename `.safehouse`. | - -Rule of thumb: capitalize a **role** when you name it as a role (the roles table, a definition); use lowercase for the same word in ordinary running prose; never capitalize the common-noun primitives (workflow, entity, stage, gate, sprint). - -## Formatting conventions - -- **Commands and code.** Inline code font for commands, flags, filenames, and identifiers: `spacedock claude`, `--strict`, `mkdocs.yml`. Multi-line commands and config in fenced code blocks with a language tag (` ```bash `, ` ```yaml `). -- **Show expected output.** For a command a reader runs, name the result, as `install.md` does ("Prints the installed version, e.g. `spacedock 0.20.0`"). -- **Headings.** Sentence case ("Get started", "Your first workflow"), not Title Case. One `#` h1 per page (the page title); section headings start at `##`. -- **Links.** Descriptive link text, never "click here" / "this link". Internal links are relative so `mkdocs build --strict` can resolve them. -- **Lists and emphasis.** Bullets for parallel items; bold for the load-bearing claim of a bullet (the README pattern). Use emphasis sparingly — if everything is bold, nothing is. - -## How this guide is applied - -The `comm-officer` mod loads a project voice guide on first use and defers to it over plain Strunk. Once this page ships, it is that guide: doc contributions and comm-officer polish follow these rules. When this guide is silent on a question, fall back to Strunk via `elements-of-style:writing-clearly-and-concisely`. 
diff --git a/docs/site/get-started/first-launch.md b/docs/site/get-started/first-launch.md deleted file mode 100644 index 5659cafe8..000000000 --- a/docs/site/get-started/first-launch.md +++ /dev/null @@ -1,42 +0,0 @@ -# Your first launch - -One command starts your first session and orients you in a project you already have: - -```bash -spacedock claude "/spacedock:survey" -``` - -This launches the first officer (the orchestrator agent that runs a Spacedock workflow) inside Claude Code, and hands it the survey task. The first launch sets up the plugin for you, so this single line is enough; you do not need a separate setup step. When a `.safehouse` profile is present in the working directory, the launch runs sandboxed automatically. - -Run it from inside a project that already has some agent history, such as a repo you have been coding in with Claude Code. Survey reads that history; an empty directory has nothing to report on. - -## The command grammar - -The front door is one shape, and every launch uses it: - -```bash -spacedock claude "task" [--safehouse…] [-- host-flags…] -``` - -- **The task comes first.** It is handed to the first officer as the launch prompt. Here the task is `/spacedock:survey`, a skill the first officer runs. It could just as well be a plain sentence describing work. -- **`--safehouse` forces the launch through the sandbox.** A `.safehouse` profile in the working directory does the same automatically, so you only pass the flag when you want to force it. -- **Anything after `--` forwards verbatim to the host** (`claude` itself), including flags like `--resume`, `--model`, and `--plugin-dir`. - -`spacedock codex "task"` and `spacedock pi "task"` take the same shape for the Codex and Pi harnesses. Claude Code is the primary surface; the examples here use it. - -## What survey reports - -Survey reconstructs what the agents in this project have implicitly been doing, then reports it back. The survey is read-only; it never edits your files. 
It reads your recorded agent session history (through the `agentsview` tool, which it offers to install if it is missing), and it shows you, in one pass: - -- **The inferred workflow**: the loop you have been running without naming it, written out as a stage chain. -- **The workstreams**, the distinct tracks of work, each labeled as mechanical (a routine loop) or exploration (human-driven creative work). -- **The decisions still open and waiting on you.** These are the forks that were raised but never resolved. Survey leads with them, because they are the work that is actually blocked on you. -- **How often you had to step in.** Survey counts the interruptions and decision points across those sessions, so you can see where your attention has been going. - -A typical report opens with a one-line headline naming the project, the number of sessions, the date range, and the decision and interruption counts, then lays out each section below it. If the project has no agent history, survey says so plainly and stops; there is nothing to misreport. - -## What comes next - -Survey ends with an offer, not an action. After the report, it asks whether you want it to commission a real Spacedock workflow built from what it found, turning the open decisions into approval gates and the workstreams into work items. You can say yes and let it scaffold the workflow, or say no and keep the survey as a standalone orientation. Either way, nothing has changed in your project until you choose. - -To go straight to defining a workflow yourself instead, see [your first workflow](first-workflow.md). If `spacedock` is not yet installed, start with [installing Spacedock](install.md). 
diff --git a/docs/site/get-started/first-workflow.md b/docs/site/get-started/first-workflow.md index 182022b66..8c74024e1 100644 --- a/docs/site/get-started/first-workflow.md +++ b/docs/site/get-started/first-workflow.md @@ -1,122 +1,107 @@ # Your first workflow -A workflow is a directory of plain-text work items plus a README that defines -the stages they move through, the schema each item carries, and the gates where -you make a call. You create one by describing what you want, in plain language, -to the `/spacedock:commission` skill. This page walks that first commission end -to end: the questions it asks, the design and review gates it sets up, and what -happens once the workflow starts running. - -A few terms used below, defined on first use: - -- An **entity** is one work item: a single markdown file (the README also calls - it a "work item"). A bug report, a design idea, a feature: whatever the - workflow processes, each one is an entity. -- A **stage** is a bucket an entity sits in as it advances, for example - `ideation` or `implementation`. The first entity starts in the first stage and - moves toward a terminal one. -- A **gate** is a decision point at the end of a stage where the workflow pauses - for your call instead of advancing on its own. - -You are addressed as the captain, the workflow operator who makes the calls at -gates. The first officer is the orchestrator agent that runs the workflow; the -ensign is the worker agent that moves one entity through one stage. +Your first workflow comes from one of two places: [survey](survey.md) offers +to build one from what it found in your project, or you describe one yourself +to the `/spacedock:commission` skill. A workflow is your work moving through +stages you define, pausing at the gates where you decide. 
## Commission a workflow -Run `/spacedock:commission` inside a Spacedock session and describe the work in -the same line: +Describe the work in the launch line: ```bash -spacedock claude "/spacedock:commission Track design ideas through review stages" +spacedock claude "/spacedock:commission ship features through design, implementation, and review" ``` -If you have not launched a session yet, see -[Install Spacedock](install.md) first. You can also start bare -(`/spacedock:commission` with no description) and answer the questions from -scratch. - -The skill greets you and walks three phases: **design** (a few questions), -**generate** (it writes the files), and a **pilot run** (it starts the workflow -on your seed items). The design phase asks you three things, one question at a -time: - -1. **The mission and the entity.** What the workflow is for, and what each work - item represents: "a design idea", "a bug report", "a candidate feature". The - skill derives a short label from your answer (a "design idea" becomes an - `idea`). -2. **The stages.** It proposes an ordered list from your mission (for a design - workflow, something like `backlog → ideation → implementation → validation → - done`), and you confirm, add, remove, or rename them. -3. **Seed entities.** Two or three starting items to run through, each with a - title and a short description (and an optional score). These become the - workflow's first work. - -From your answers the skill then derives the gates: which stages pause for your -approval, and which earlier stage rejected work bounces back to. By default it -gates the stage before the terminal one. - -You do not have to get every answer right. After the questions, the skill -presents the full design as a summary (stages, gates, seed items, where the -files will live) and waits. **Nothing is generated until you accept.** Tell it -what to change and it re-presents. 
+Commission asks a few questions, one at a time: what each work item is, the +stages it moves through, which stages pause for your decision, and the quality +bar for each. You confirm or adjust each proposal, and it presents the design: + +> I'll call you Captain; let me know if you prefer something else. +> +> For each run, we process tasks going through the following stages: +> +> a. backlog +> new tasks wait here for triage +> +> b. design +> the approach is worked out and written down +> +> c. implementation +> the change is built on an isolated branch +> +> d. review +> the result is checked against the design +> +> e. done +> accepted and closed +> +> If you reject at review, it goes back to implementation for revision. +> +> Our pilot run will be with: +> +> - Add rate limiting to the API +> - Fix the flaky login test +> +> Entity identity will use `sd-b32`. +> +> All files will be created in `docs/ship-features/` for you to review. +> +> Accept this design, or tell me what to change. ## What gets generated -Once you accept, the skill writes the workflow into a new directory under -`docs/` and confirms each file it created: +Everything is plain text in your repo: a README that holds the workflow's +rules (what each stage expects) and one file per work item. The generated +rules are a starting point; tighten them before any work runs, because an +agent working to a vague bar is expensive to correct. Learn more about every +design decision in [Commission a workflow](../running-workflows/commission.md). -- `README.md`, the workflow's living spec. It holds the mission, the schema each - entity carries, and a section per stage describing its inputs, outputs, and - quality bar (`Good:` / `Bad:`). -- One file per seed entity, named from its title, with YAML frontmatter that - records its `status`, `score`, and other fields. +## The pilot run -The per-stage prose in the README is a best-guess starting point, not a -commitment. 
The skill flags this directly and offers a `review stages` walk that -steps through each stage's expectations so you can tighten the quality bar before -any work runs. Tightening here is cheap; an agent dispatched against a vague bar -is not. +On accept, commission dispatches your seed items in parallel, each moving +through the stages until everything is idle or waiting on you. When a work +item reaches the `review` gate, you get a gate review: -## The design and review gates - -A gate is where the workflow stops and hands you a decision instead of advancing -on its own. This is the line Spacedock draws: work flows through the stages, but -**nothing crosses a gate without a recorded decision.** A development workflow -gates the design stage and the review stage among others, so you sign off on -the approach before code is written, and on the result before it ships. - -At each gate the first officer pauses and presents a stage report: the chosen -direction, the evidence behind it, and a single recommendation. You make one of -three calls: +``` +Gate review: Add rate limiting to the API — review +Chosen direction: token-bucket limiter at the API middleware layer +Recommend approve. -- **Approve**, and the entity advances to the next stage. -- **Redo with feedback**: it goes back for revision against the notes you give. -- **Reject**, and it bounces to an earlier stage (the one the design named as the - rejection target) to be reworked. +Checklist (from ## Stage Report in docs/ship-features/add-rate-limiting-to-the-api.md): +- DONE: limiter implemented with per-client buckets +- DONE: tests cover burst and refill behavior -You decide on the report and its evidence, not on the agent's transcript. The -decision is recorded with its reason, so a result can later be traced back to the -call that produced it. +Assessment: 2 done, 0 skipped, 0 failed. -## What happens after +Decision: approve to close; reject to bounce back to implementation. 
+``` + -When you accept the design, the commission skill launches a pilot run on your -seed entities. It takes on the first-officer role itself for this first run: -it reads the workflow README, checks which entities are ready to advance, and -dispatches ensigns to move them through their stages. Stages that modify the -repo run in their own git worktree; lighter stages run inline. +You approve, send it back with feedback, or reject. Details: +[gates and decisions](../concepts/gates-and-decisions.md). -The run proceeds until the workflow goes idle or reaches a gate. There the first -officer stops and reports what happened: which entities moved, which stages they -passed through, which gate is waiting on you. From that point you are running the -workflow: approve, send back, or reject, and work continues. +## Operate the workflow -To resume the workflow in a later session, launch the first officer with no task: +Every work item's state is stored and serialized in the +[workflow](../concepts/workflows-and-entities.md), so you do not need to worry +about context limits, resuming, or clearing. A typical session: ```bash spacedock claude ``` -It reads the saved workflow state, picks up where you left off, and dispatches -ensigns for any entity ready for its next stage. +It picks up whatever is ready to dispatch. Keep dispatching, approving, +rejecting, steering. Before you stop, run +[`/spacedock:debrief`](../running-workflows/debrief.md) +to record what happened, fold the learnings back into the workflow, and file the +follow-up items. The workflow self-improves. +[Operate a workflow](../running-workflows/operating.md) covers the details. + +A project can hold +[multiple workflows](../advanced/multi-workflow.md); Spacedock finds them +and drives them together. +Since this runs in your existing coding agent, you can just ask the agent if +anything is unclear. + +Now you have your first Spacedock-powered workflow: dispatch, and let the agents +work and hum along while you stay calm!
diff --git a/docs/site/get-started/install.md b/docs/site/get-started/install.md index b034b524a..e12572417 100644 --- a/docs/site/get-started/install.md +++ b/docs/site/get-started/install.md @@ -1,168 +1,57 @@ # Install Spacedock -This guide walks a fresh install end to end and names the output you should see -at each step. Every command here is one you can run and check against the stated -result. +Spacedock works with a coding agent you already have: Claude Code, Codex, or +Pi. Install one of those first. -Spacedock plugs into a coding agent harness you already run: Claude Code, Codex, -or Pi. Install one of those first. +=== "macOS (Homebrew)" -Spacedock itself is two pieces that install separately: + ```bash + brew tap spacedock-dev/homebrew-tap + brew install spacedock + ``` -1. **The `spacedock` launcher.** The command you run to start a session. -2. **The host plugin.** The first-officer and ensign agents, loaded by your - harness (Claude Code, Codex, or Pi). +=== "Binary (macOS / Linux)" -The recommended setup installs the launcher with Homebrew, then adds the plugin. -A from-source build is available for development. + ```bash + curl -fsSL https://raw.githubusercontent.com/spacedock-dev/spacedock/main/install.sh | sh + ``` -## Install with Homebrew (recommended) + Installs a checksum-verified binary to `~/.local/bin`. -1. **Install the launcher.** +## Launch - ```bash - brew tap spacedock-dev/homebrew-tap - brew install spacedock - ``` +In a project you already have: -2. **Confirm it.** - - ```bash - spacedock --version - ``` - - Prints the installed version, e.g. `spacedock 0.20.0`. - -3. **Launch.** Point it at a project you already have and let it survey. - - ```bash - spacedock claude "/spacedock:survey" - ``` - - Starts the first officer in Claude Code and runs the survey. The first launch - sets up the plugin for you, so this single command is enough. When a - `.safehouse` profile is present in the working directory, the launch runs - sandboxed. 
- - To set up the plugin ahead of time, or to refresh it later, run - `spacedock install --host claude`. - -## Install on Linux (or macOS without Homebrew) - -The Homebrew cask is macOS-only. On Linux — and on macOS if you'd rather not use -Homebrew — install the launcher with the `curl | sh` script. It detects your -OS and architecture, downloads the matching tarball from the latest GitHub -Release, verifies it against the release `checksums.txt`, and installs the -`spacedock` binary to `~/.local/bin`. - -1. **Install the launcher.** - - ```bash - curl -fsSL https://raw.githubusercontent.com/spacedock-dev/spacedock/next/install.sh | sh - ``` - - Installs `spacedock` to `~/.local/bin`. If that directory is not on your - `PATH`, the script prints a note; add it (`export PATH="$HOME/.local/bin:$PATH"`) - so the `spacedock` command resolves. Set `SPACEDOCK_INSTALL_DIR` to install - elsewhere (e.g. `SPACEDOCK_INSTALL_DIR=/usr/local/bin`, which may need `sudo`). - -2. **Confirm it.** - - ```bash - spacedock --version - ``` - - Prints the installed version, e.g. `spacedock 0.20.0`. - -3. **Add the plugin and launch** exactly as in the Homebrew steps above - (`spacedock claude "/spacedock:survey"`). - -**Sandboxing on Linux.** Spacedock's safehouse integration behaves the same on -Linux as on macOS: when a `.safehouse` profile is present in the working -directory, Spacedock wraps the launch through the `safehouse` command. Spacedock -does not ship a sandbox — it detects the profile and delegates. A run is -sandboxed only when a Linux-capable `safehouse` binary is on your `PATH`. When -the binary is absent, Spacedock prints an install hint and the launch proceeds -**unsandboxed**. Install safehouse separately if you need the sandbox on Linux; -the macOS-only Gatekeeper/quarantine handling does not apply on Linux and is not -needed there. - -## Use Codex or Pi instead - -Codex and Pi are supported but experimental. Claude Code is the primary surface. - -1. 
**Install the launcher** (same Homebrew step as above). - -2. **Add the plugin** for your host. - - ```bash - spacedock install --host codex # or: --host pi - ``` - - Codex installs plugins from your shell rather than programmatically, so this - prints the `codex plugin` commands to run. Run them, then use the - first-officer skill in your Codex session. - -3. **Launch** with the matching subcommand. - - ```bash - spacedock codex "your task" # or: spacedock pi "your task" - ``` - -## Build from source (for development) - -Use this when you're working on Spacedock itself. It builds the launcher from -the development branch and loads the plugin from your checkout, so local changes -take effect immediately. - -1. **Clone and build.** - - ```bash - git clone --branch next https://github.com/spacedock-dev/spacedock - cd spacedock - go build -o spacedock ./cmd/spacedock - ``` - -2. **Confirm the binary.** - - ```bash - ./spacedock --version - ``` - - Prints `spacedock ` for your local build. - -3. **Launch with the local plugin.** +```bash +spacedock claude "/spacedock:survey" +``` - ```bash - ./spacedock claude "your task" -- --plugin-dir "$PWD" - ``` +Or launch directly: - `--plugin-dir` is a host flag, so it rides after `--`. It loads the - first-officer and ensign agents from your checkout instead of the installed - plugin. Edits to the repo are live. +```bash +spacedock claude "what can spacedock do for me in this project" +``` -The `next` branch is the development channel. It has no Homebrew release. Use the -Homebrew path above for a stable install. +Replace `claude` with `codex` or `pi` for the respective coding agents. -## Keep things in sync +## Skills -`spacedock doctor` is the compatibility check. If it reports your installed -plugin is out of date, refresh it: +Spacedock installs the relevant skills on launch. 
To install them manually: ```bash -spacedock install --host claude +claude plugin marketplace add spacedock-dev/spacedock +claude plugin install spacedock@spacedock ``` -If the `spacedock` command itself is missing, install the launcher with Homebrew -first, then run `spacedock install --host claude`. +## Sandboxing + +See [supported sandboxes](../reference/sandbox.md). + +## Troubleshooting -## Command grammar +Run `spacedock doctor`. -The front door is `spacedock claude "task" [--safehouse…] [-- host-flags…]` -(and the same shape for `spacedock codex` and `spacedock pi`): +## Next -- The task comes first. It's handed to the first officer as the launch prompt. -- Anything after `--` forwards verbatim to the host (`claude` / `codex` / `pi`), - including `--plugin-dir`, `--resume`, `--model`, and the like. -- `--safehouse` forces the launch through the sandbox. A `.safehouse` profile in - the working directory does the same automatically. +Read about the [survey report](survey.md) to understand your usage pattern +with coding agents, or start with [your first workflow](first-workflow.md). diff --git a/docs/site/get-started/survey.md b/docs/site/get-started/survey.md new file mode 100644 index 000000000..c428640ee --- /dev/null +++ b/docs/site/get-started/survey.md @@ -0,0 +1,35 @@ +# Survey your project + +When you use the `spacedock:survey` skill, it looks at your existing agent +conversation logs on local disk (through [agentsview](https://agentsview.io/), +an open source session-history tool). It is read-only; if `agentsview` is +missing, it asks before installing it. + +## What it reports + +- **The repeated manual behavior**: the loop you have been driving by hand, + run after run, without naming it. +- **The steering and interruptions observed**: how often your agents needed + you to step in, and for what. +- **Your current workstreams**: what is in flight, clustered into tracks. +- **What is still undecided**: forks raised but never resolved. 
+ +## Turn the report into a workflow + +Survey ends with an offer, not an action. It can turn what it found into a +Spacedock [workflow](../concepts/workflows-and-entities.md): the repeated loop +becomes the stages, the workstreams become the work items, and the undecided +forks become [approval gates](../concepts/gates-and-decisions.md). Nothing +changes in your project until you say yes; on a no, the survey stands on its +own as orientation. + +The offer matches your work: + +- **Routine loops** (issue → worktree → PR) get automation: a workflow that + gates the crucial decisions and lets the agent drive between gates. +- **Exploration** (creative or design work where your steering is the point) + gets book-keeping: structure for the parallel threads and their state. There + is no automate-the-human-out pitch; the involvement is the work. + +To define a workflow yourself instead, see +[your first workflow](first-workflow.md). diff --git a/docs/site/index.md b/docs/site/index.md index 7ea8ad5cc..7b79859f2 100644 --- a/docs/site/index.md +++ b/docs/site/index.md @@ -1,40 +1,25 @@ # Spacedock -**Spacedock is a multi-agent orchestrator where nothing ships without a decision.** It lives within your existing harness: Claude Code, Codex, or Pi. +Spacedock runs your work as a series of stages. **Nothing crosses a gate without a decision you own.** -Spacedock breaks work into stages and surfaces the decisions each stage needs, batched for you. Each decision arrives with evidence measured against a predefined bar for what good looks like. You approve, send back, or escalate. You can also delegate the call to an agent. Either way, the decision is recorded with its evidence and reason. +A gate is a checkpoint where the workflow pauses and puts the question to you: ship this, or not? You approve it, send it back, or escalate. You can also delegate the call to an agent. Either way, the decision is recorded with its evidence and its reason. That is the whole idea. 
Everything else is detail. -A few terms you'll meet throughout these docs: - -- A **workflow** is a directory of plain-text work item files plus a README that defines the stages, the schema, and the gates. -- An **entity** is one work item, a markdown file (or folder) that carries everything about the work: the problem, the design notes, the bar for done, and the stage reports. -- A **stage** is one step in the lifecycle; a **gate** is the decision point at its end. - -Three roles run a workflow: - -| Role | Who | -|------|-----| -| **Captain** | You. You define the mission and make the calls at gates unless you delegate them. | -| **First Officer** | The orchestrator agent that runs the workflow and reports to you at gates. | -| **Ensign** | The worker agent that moves one entity through one stage. | +You are the captain. You set the bar and make the calls; the agents do the rest. The bar starts rough and sharpens every time you reject, so calls that once needed you become ones you can hand off with confidence. See [the operating model](concepts/operating-model.md) for how the three roles divide the work. ## What's different - **The agent doesn't get to judge its own work.** Review runs as a separate stage with fresh context, no access to the maker's reasoning. It pushes back on thin evidence and work that looks busy without proving its claim. - **Every decision leaves a trail.** Each gate carries a stage report: findings, verdicts, artifacts, anomalies. You decide on evidence, not the transcript, and the record outlives the reviewer. - **The bar sharpens as you use it.** Each stage declares what good means and the agent works to that line. When a standard turns out fuzzy in practice, the agent proposes an edit to the written criteria for your approval. -- **Batch the work; decide as it flows back.** Queue many entities at once. Agents advance each through its stages, and you handle gates as they surface, not one session at a time. 
+- **Batch the work; decide as it flows back.** Queue many work items at once. Agents advance each through its stages, and you handle gates as they surface, not one session at a time. - **Work survives the context limit.** When an agent runs out of context, a successor carries forward what's in flight. ## Where to go next -- **[Get started](get-started/install.md)**: install the `spacedock` launcher and the host plugin, then make your [first launch](get-started/first-launch.md) and build your [first workflow](get-started/first-workflow.md). -- **[Concepts](concepts/operating-model.md)** covers the operating model, workflows and entities, the stage lifecycle, gates and decisions, and a worked example. -- **[Running workflows](running-workflows/commission.md)** walks through commissioning a workflow, surveying an existing project, operating a running workflow, and debriefing and refitting between sessions. -- **[Contributing](contributing/development-workflow.md)** covers the development workflow, agent development, the proof policy, and releasing. - -New here? Start with [Install](get-started/install.md). It walks a fresh install end to end and names the output to expect at each step. +- **[Get started](get-started/install.md)**: [install](get-started/install.md) Spacedock, then pick an entry. [Survey an existing project](get-started/survey.md) to see where your agents burn your time and surface the workflow you are already running without naming it. Or [start a fresh workflow](get-started/first-workflow.md) from a common shape like development or research. +- **[Concepts](concepts/operating-model.md)** covers the operating model, workflows and entities, the stage lifecycle, and gates and decisions. +- **[Running workflows](running-workflows/commission.md)** walks through commissioning a workflow, operating a running workflow, and debriefing and refitting between sessions. ## For agents using Spacedock -Spacedock's docs are read by agents too. 
A user's first officer parsing these docs is itself an agent. The build emits a curated `llms.txt` index of the docs at the site root for product-using agents. (Repo-development guidance for an agent working ON Spacedock lives under Contributing → [Agent development](contributing/agent-development.md).) +Agents read these docs too. Start from [`llms.txt`](/llms.txt), the curated index of these pages. diff --git a/docs/site/reference/command-reference.md b/docs/site/reference/command-reference.md index 29d24470f..dc917c844 100644 --- a/docs/site/reference/command-reference.md +++ b/docs/site/reference/command-reference.md @@ -1,182 +1,36 @@ # Command reference -The `spacedock` binary has ten subcommands in three groups (Launch, Setup, and -Workflow), plus a top-level `--version`. Run `spacedock` with no arguments to -print the grouped help; run `spacedock --help` for a command's own -flags. An unknown command or a stray leading flag exits 2 with a diagnostic on -stderr, and an unknown command resolution under cobra also exits 2. The verbs are -registered in `internal/cli/cli.go`. +The `spacedock` binary groups its subcommands into Launch, Setup, and Workflow, plus a top-level `spacedock --version` (the binary version and contract level). For the exact flags of any command, run `spacedock --help`, the always-current source of truth; `spacedock` with no arguments prints the grouped help. -## --version +## Launch -```bash -spacedock --version -``` - -Prints the binary version and the contract level, e.g. `spacedock 0.20.0 (contract 1)`. -The `(contract N)` token is load-bearing: the first-officer and ensign skills -read it to check the launcher and the installed plugin agree. The version string -defaults to the `dev` sentinel and is overwritten by the release pipeline's -linker stamp, so an unstamped `go build` reads as a dev build rather than -impersonating a release. 
- -## Launch: claude, codex, pi - -`spacedock claude`, `spacedock codex`, and `spacedock pi` each start the named -host with the Spacedock first officer loaded. Claude Code is the primary surface; -Codex and Pi are supported but experimental. The grammar is the same for all -three: +`spacedock claude`, `spacedock codex`, and `spacedock pi` start a host with the first officer loaded. Claude Code is the primary surface; Codex and Pi are experimental. The grammar is the same for all three: ```bash spacedock claude [task] [spacedock-flags] [-- host-flags] ``` -- **The task comes first.** Positionals before `--` join with single spaces into - the launch prompt handed to the first officer. With no task, a fixed bootstrap - prompt starts the first officer rather than opening an idle agent. -- **Everything after `--` forwards verbatim to the host** (`claude` / `codex` / - `pi`): `--model`, `--resume`, `--plugin-dir`, and the like. A task placed - after `--` is host passthrough, not the launch prompt; the launcher warns on - stderr when it detects this without altering the assembled argv. -- **The launch is contract-gated.** When no plugin is installed, the launcher - auto-installs it then launches, so the single command yields a working session. - `--no-install` opts out and prints the manual install remedy. A contract - mismatch fails fast (exit 1), since auto-installing would not fix it. - -Spacedock-owned launch flags, declared in `internal/cli/frontdoor.go`: - -- `--safehouse`: force the safehouse sandbox wrap even without a `.safehouse` - profile in the working directory. A `.safehouse` profile triggers the wrap - automatically. -- `--safehouse-enable KEY[,KEY]`, `--safehouse-add-dirs DIR`, - `--safehouse-add-dirs-ro DIR`: repeatable sandbox knobs whose presence also - implies sandbox-on. -- `--plugin-dir DIR`: load a local plugin checkout, relaxing the contract gate. 
- Repeatable, and accepted both before `--` (parsed by spacedock) and after `--` - (forwarded verbatim). -- `--skip-contract-check`: bypass the contract gate and launch without resolving - the installed plugin. -- `--no-install`: refuse to auto-install a missing plugin and print instructions - instead. - -The contract gate is bypassed by `--skip-contract-check` or by any `--plugin-dir` -(the local checkout supersedes the installed plugin). `spacedock pi` loads Pi's -native skills and extension rather than a plugin manifest; its only spacedock -flag is `--plugin-dir`. - -## Setup: install, doctor - -`spacedock install` installs the per-host plugin, then runs the compatibility -check. `spacedock doctor` runs the check alone. - -```bash -spacedock install [--host claude|codex|pi] [--check] -spacedock doctor [--host claude|codex|pi] -``` - -- `--host` defaults to `claude`. For `codex`, install prints the `codex plugin` - commands to run from your shell rather than running them programmatically. -- `install --check` runs the compatibility report without installing. -- `doctor --plugin-manifest PATH` reads a manifest directly instead of resolving - the installed plugin. - -When `doctor` reports the installed plugin is out of date, refresh it with -`spacedock install --host claude`. See [Install Spacedock](../get-started/install.md) -for the full setup path. - -## Workflow: status, new, state, completion, dispatch - -These commands read and mutate workflow state. `status` and `dispatch` forward -their argv verbatim to their runners; the launcher neither parses nor reorders -their flags. - -### status - -```bash -spacedock status [args] -``` - -`spacedock status` resolves the workflow it acts on in this precedence: an -explicit `--workflow-dir DIR` (or the `PIPELINE_DIR` environment variable) is used -verbatim; otherwise it walks up from the working directory to the enclosing -commissioned workflow. 
With neither, it exits 1 with -`no Spacedock workflow here — pass --workflow-dir or run inside a workflow`. The -exit domain is `{0 success, 1 error}`, so a usage error is exit 1, never 2. - -The query forms, parsed in `internal/status/native_runner.go`: - -- **No flag**: prints the active-entity status table. `--archived` includes - archived entities; `--quiet` and `--json` change the rendering. -- `--next` prints the items ready to dispatch (requires a stages block in the - README). -- `--next-id` computes the next entity id. It accepts `--id-seed` and `--id-actor` - (valid only with `--next-id` or `--new`). -- `--boot` prints the first-officer boot view (queue, stages, team state). It is - incompatible with `--next`, `--next-id`, `--archived`, `--where`, and - `--fields`/`--all-fields`. -- `--validate` validates the workflow. It prints `VALID` and exits 0, or prints the - errors and exits 1. -- `--resolve REF` resolves a reference to one entity and prints its resolve line. - With `--root ROOT` it resolves across every workflow under `ROOT`, accepting a - `workflow::ref` qualifier. -- `--short-id REF` prints an entity's short display id. -- `--set SLUG field=value...` mutates an entity's frontmatter. Bare `completed` - and `started` auto-fill a timestamp; every other field requires a value. - Terminal transitions are gated (mod-block, merge-hook, verdict, and the - require-external-proof guards); `--force` bypasses with a warning. -- `--archive SLUG` archives an entity. -- `--discover [--root ROOT]` prints the commissioned workflows under `ROOT` - (default: the git toplevel of the working directory). It is incompatible with - every other flag. -- `--where 'field = value'` filters the table. It supports `=`, `!=`, `field =` - (empty), and `field !=` (non-empty), and is repeatable. -- `--fields a,b,c` / `--all-fields` chooses the columns. The two are mutually - exclusive. +The task comes first and becomes the launch prompt. 
Anything after `--` forwards verbatim to the host (`--model`, `--resume`, and the like). When no plugin is installed, the launcher auto-installs it and launches, so the single command yields a working session; a contract mismatch fails fast. The sandbox flags (`--safehouse` and its knobs) and the contract-gate flags are listed by `spacedock claude --help`. -Most flags refuse to combine; the runner names the conflict and exits 1 (e.g. -`--set cannot be combined with --next`). +## Setup -### new +| Command | What it does | +|---------|--------------| +| `spacedock install` | Install the per-host plugin, then run the compatibility check | +| `spacedock doctor` | Run the compatibility check alone | -```bash -echo "entity body" | spacedock new [--folder] SLUG -``` - -A pure alias for `status --new`: it prefixes the argv with `--new` and forwards -it, reading the entity body from stdin and auto-discovering the workflow. -`--folder` creates the entity as a folder rather than a single file. +Both take `--host claude|codex|pi` (default `claude`). When `doctor` reports the plugin is out of date, refresh it with `spacedock install`. See [Install Spacedock](../get-started/install.md) for the full setup path. -### state +## Workflow -```bash -spacedock state init|new [--workflow-dir DIR] -``` +The first officer runs these against workflow state as it moves entities; you operate through it, not by hand. They are documented here for completeness and for the rare direct use (scripting, debugging, restoring a state checkout on a fresh clone). -Manages a split-root workflow's state checkout. `init` resumes a cloned workflow -by fetching the orphan state branch and checking it out as a linked worktree at -the workflow's `state:` path; a present checkout is a no-op that refreshes from -origin. `new` births that branch and worktree around a present split-root README. -An inline workflow has nothing to init. A missing or unknown subcommand exits 2. 
- -### completion - -```bash -spacedock completion bash|zsh -``` - -Prints a static shell-completion script for bash or zsh (exit 0). It completes -the top-level verbs and the common `status` flags. A missing or unknown shell -exits 2. - -### dispatch - -```bash -spacedock dispatch build | show-stage-def -``` +| Command | What it does | +|---------|--------------| +| `spacedock status` | Read or mutate the state: the entity table, `--next`, `--where`, `--set`, `--validate`, `--boot` | +| `spacedock new` | Create an entity (`new [--folder] SLUG`) from a body on stdin | +| `spacedock dispatch` | Build the worker dispatch artifacts (`dispatch build`, `dispatch show-stage-def`) | +| `spacedock state` | Manage a [split-root workflow](../advanced/split-root-state.md)'s state checkout (`state init` resumes one on a fresh clone, `state new` births one) | +| `spacedock completion` | Print a bash or zsh completion script | -Builds the worker dispatch artifacts the first officer hands an ensign. `build` -assembles the assignment (requires `--workflow-dir` unless `--print-schema` or -validate-only); `show-stage-def` prints a stage's definition (`--workflow-dir` -and `--stage`). A missing or unknown subcommand exits 2. See -[Adding runtime support](multi-host.md) for how `dispatch build` learns a new -host mode. +[Operate a workflow](../running-workflows/operating.md) covers how the first officer uses `status` on your behalf. Run `spacedock status --help` (and the same for each command) for the full flag list, the mutation guards, and the exit codes. diff --git a/docs/site/reference/frontmatter-contract.md b/docs/site/reference/frontmatter-contract.md index 8fe3f340e..d18aa1d0f 100644 --- a/docs/site/reference/frontmatter-contract.md +++ b/docs/site/reference/frontmatter-contract.md @@ -1,79 +1,11 @@ # Frontmatter contract -Every entity is a markdown file (or a folder with an `index.md`) whose YAML frontmatter carries the fields Spacedock reads to track and move it. 
The always-current schema lives in the development workflow's [Schema / Field Reference](../contributing/development-workflow.md#field-reference); this page surfaces that table and the external-tracker bridge fields in one place for reference. A standalone `docs/specs/frontmatter-contract.md` is a planned follow-up; until it lands, the development workflow README is the source of truth. +Spacedock reads YAML frontmatter as the machine-readable state of a workflow and its entities. The first officer and dispatched workers write it; you operate through them, not by hand-editing. The field-level contract (names, types, patterns, defaults, invariants) is two machine-checkable schemas. -The frontmatter parser is line-oriented. Keep fields flat and top-level. If richer metadata becomes necessary, add more flat custom fields rather than nested YAML, because the v1 parser preserves lines, not arbitrary structure. +## Workflow README -## Entity fields +The workflow `README.md` frontmatter declares the entity type, the id style, and the stages with their per-stage defaults and gates. The contract is [`workflow-readme.mdschema.yml`](https://github.com/spacedock-dev/spacedock/blob/main/docs/schema/workflow-readme.mdschema.yml), which also specifies the required per-stage body subsections. -These are the fields the development workflow declares for a `task` entity. Other workflows may rename the entity type and adjust fields, but these are the contract a dev-workflow entity must satisfy. +## Entity -| Field | Type | Description | -|-------|------|-------------| -| `id` | string | Unique 24-character Spacedock Base32 ID, because this workflow uses `id-style: sd-b32`. | -| `title` | string | Human-readable entity name. | -| `status` | enum | One of: `backlog`, `ideation`, `implementation`, `validation`, `done`. The current stage. | -| `source` | string | Where the entity came from. Also used by the external-tracker bridge (see below). 
| -| `started` | ISO 8601 | When active work began. | -| `completed` | ISO 8601 | When the entity reached terminal status. | -| `verdict` | enum | `PASSED` or `REJECTED`. Set at the final stage. | -| `score` | number | Priority score, `0.0`–`1.0` (optional). A workflow can upgrade to a multi-dimension rubric in its README. | -| `worktree` | string | Worktree path while a dispatched agent is active; empty otherwise. | -| `issue` | string | Optional external ticket reference, such as `ENG-123`, `kata:task-abc123`, or `owner/repo#42`. | - -The `status` field is the execution state. `spacedock status` reads stage declarations from the workflow README and reports each entity's `status` against them; `--set status=` is the mutation that advances an entity. The status read path does not invent stages. If the README declares no stages block, membership cannot be validated. - -The `verdict` field is guarded on the finalize action, not on reaching a terminal stage. The guard keys on the finalize shape (a `--set` that writes `completed`, or an `--archive` of a terminal entity) and refuses it (exit 1, entity unmutated) when the post-state `verdict` is empty (see `internal/status/verdict_guard_test.go`). A bare dispatch into a terminal stage that does not write `completed` passes without a verdict, because the verdict is the outcome of work that has not happened yet. A finalize on an entity that already carries a verdict also passes, even when that `--set` does not re-name it. `--force` bypasses the guard. The failure mode the guard catches is finalizing or archiving a terminal entity with no verdict on record. - -## Copy-paste starter - -The development workflow's task template ships these fields blank for a new entity: - -```yaml ---- -id: -title: Task name here -status: backlog -source: -started: -completed: -verdict: -score: -worktree: -issue: ---- -``` - -Fill `title`, `status`, and `source` at creation. 
`started`, `completed`, `verdict`, and `worktree` are written by the runtime as the entity moves; do not edit them by hand while a dispatched agent is active. - -## External-tracker fields - -The `issue` and `source` fields are the v0 bridge to an external ledger such as kata, Linear, or GitHub Issues. They are flat top-level fields the current parser preserves; the bridge adds no tracker-specific stage rules. See [Tracking work in an external system](../advanced/external-tracker.md) for the full integration model. - -```yaml -issue: ENG-123 -source: linear -``` - -or: - -```yaml -issue: kata:task-abc123 -source: kata -``` - -The contract for these two fields: - -- **`issue` is the human-facing external reference.** It points at the ticket the entity mirrors; Spacedock does not parse its internals. -- **`source` records where the entity came from** when useful: the tracker name, or any origin marker. -- **Spacedock `status` remains the execution status.** The external tracker does not redefine Spacedock stage semantics inside the entity, and ownership stays one-way unless a future bridge explicitly declares bidirectional sync. - -## Validating an entity - -Check an entity against the contract with the status command's `--validate` flag: - -```bash -spacedock status --workflow-dir docs/dev --validate -``` - -It exits 0 when the workflow is valid and 1 when it is not, printing the errors to stderr; with `--json` it also emits a `{"command":"validate","valid":"true"}` (or `"false"`) envelope. `--validate` cannot be combined with the other status flags: `--next`, `--next-id`, `--boot`, `--where`, `--fields`/`--all-fields`, `--archived`, `--archive`, or `--set`; the command rejects the combination. Validation reads stages from the workflow README and entities from the state checkout, so it enforces the contract against the same schema the workflow declares, not an assumed one. +Each entity's frontmatter carries its id, current stage, outcome, and worktree state. 
The contract is [`entity.mdschema.yml`](https://github.com/spacedock-dev/spacedock/blob/main/docs/schema/entity.mdschema.yml), which defines the fields, the custom-field policy, the recognized body headings, and the invariants. diff --git a/docs/site/reference/multi-host.md b/docs/site/reference/multi-host.md deleted file mode 120000 index a7ccb9a20..000000000 --- a/docs/site/reference/multi-host.md +++ /dev/null @@ -1 +0,0 @@ -../../runtime-support.md \ No newline at end of file diff --git a/docs/site/reference/sandbox.md b/docs/site/reference/sandbox.md new file mode 100644 index 000000000..14e9429ee --- /dev/null +++ b/docs/site/reference/sandbox.md @@ -0,0 +1,5 @@ +# Supported sandboxes + +| Sandbox | Platforms | Trigger | +|---------|-----------|---------| +| [`safehouse`](https://agent-safehouse.dev/) | macOS | A `.safehouse` profile in the working directory, or the `--safehouse` flag | diff --git a/docs/site/running-workflows/commission.md b/docs/site/running-workflows/commission.md index 461b3175c..5340636cb 100644 --- a/docs/site/running-workflows/commission.md +++ b/docs/site/running-workflows/commission.md @@ -1,6 +1,6 @@ # Commission a workflow -`/spacedock:commission` turns a description of the work you want tracked into a runnable workflow: a directory of markdown entities, a README that is the workflow's living spec, and a first officer ready to dispatch an ensign for each seed entity. You answer a short interactive design pass, the skill generates the files, and it launches a pilot run as the first officer. +A workflow is designed in conversation: you make four decisions and commission derives the rest. [Your first workflow](../get-started/first-workflow.md) shows the whole flow, including the design summary and pilot run. Invoke it from a session started with `spacedock claude`. You can pass the mission inline: @@ -8,54 +8,27 @@ Invoke it from a session started with `spacedock claude`. 
You can pass the missi /spacedock:commission product idea to simulated customer interview ``` -Text after the command name is taken as the workflow mission and presented for confirmation rather than asked from scratch. With no argument, the skill greets you and asks. +Text after the command name becomes the workflow mission; with no argument, commission greets you and asks. -## The four things you name +## The four things you decide -The design pass collects four decisions. Everything else (the directory path, the entity identity scheme, the rejection routing) is derived from these and shown back to you for confirmation before any file is written. +1. **The mission and what each work item is.** The description you give becomes the label the workflow uses everywhere: "a design idea" makes it a workflow of ideas. -1. **The mission and what each entity is.** The first question asks what the workflow is for and what each work item represents. From the entity description the skill derives the entity label used throughout the generated files: "a design idea" becomes label `idea`, plural `ideas`, type `design_idea`. An entity is one work item, a markdown file that moves through the stages. +2. **The stages.** Commission matches your mission to a workflow archetype (development: ship code through review; experiment: test a hypothesis against evidence; refinement: iterate on an artifact until it is good enough) and proposes the stage list from it. Confirm, adjust, or trim; it pushes back on redundant names (`awaiting_validation` is just `validation`). -2. **The stages.** The skill detects the workflow's shape from your mission (shipping code, testing a hypothesis, or iterating on an artifact) and proposes a stage list for you to confirm, modify, add to, or trim. Stage names describe the bucket an entity is sitting in: activity-flavored (`implementation`, `review`, `validation`) or state-flavored (`proposed`, `published`, `accepted`). 
The skill pushes back on pleonastic names: `awaiting_validation` reads as "the entity is in awaiting_validation," so it suggests `validation` instead. `done` is the universal terminal and stays as-is. +3. **The gated stages.** Where the workflow pauses for your decision: by default one gate before the terminal stage, each with a rejection target stated in plain language ("If you reject at `review`, it goes back to `draft` for revision"). -3. **The gated stages.** A gate is a stage where the workflow pauses for your decision before an entity advances. By default the skill places one gate before the terminal stage. For each gate it also derives a rejection flow, the earlier stage an entity bounces back to when you reject, defaulting to the stage immediately before the gate. You confirm both in the design summary, stated in plain language ("If you reject at `review`, it goes back to `draft` for revision"). +4. **The per-stage quality bar.** What "good" means for each stage in the generated README: what to produce, the bar to meet, the anti-patterns to avoid. Starting prose, not commitments. -4. **The per-stage rules.** Each stage in the generated README carries three bullets that tell a dispatched ensign what "good" means for that stage: **`Outputs`** (what the worker produces), **`Good`** (your quality bar), and **`Bad`** (anti-patterns to avoid). The skill drafts these from the mission, but they are starting prose, not commitments. They are the rules every dispatched agent works from, so tightening them before the first dispatch is the single highest-leverage edit you can make. +Everything else is derived or asked with a recommendation attached: the directory under `docs/` where everything lands as plain text, how entities are identified, how rejections route. 
Stages that write code give each entity an isolated worktree, so your main checkout stays clean; if you ship through PR review, commission offers the [pr-merge mod](../advanced/mods-and-standing-teammates.md), which manages the PR lifecycle so merging never needs to be a stage. The design summary shows it all; ask about the tradeoffs before you accept. -You also choose the entity ID style: `sd-b32` (recommended when multiple people or agents create entities across branches or worktrees), `sequential` (single-writer or numeric-continuity workflows), or `slug` (the filename is the identity). See the [frontmatter contract](../reference/frontmatter-contract.md) for what each style stores. +## Two ways to tighten the README -## What gets generated +The generated per-stage rules are best-guesses; [tighten them before any work runs](../concepts/workflows-and-entities.md#the-readme-is-where-you-set-the-rules). Either: -After you accept the design, the skill writes everything into `docs/{mission-slug}/`: +- open the README and edit the bullets under each stage heading directly, or +- type `review stages` to have commission walk you through each stage, flag the rules that read as generic, and apply your amendments inline. -- `README.md`: the mission, the schema, a per-stage section for every stage (with the `Outputs` / `Good` / `Bad` bullets), and a copy-paste entity template. -- One file per seed entity at `docs/{mission-slug}/{slug}.md`, each with valid YAML frontmatter and the description you gave. -- `_mods/pr-merge.md`, generated only when a stage modifies the repo (a worktree stage) and you accept the `pr-merge` mod, which tracks PR state on the entity's `pr` field instead of modeling merge as its own stage. +## After you accept -If any stage writes code or produces artifacts beyond the entity file, that stage is marked `worktree: true` so each entity gets an isolated branch and the main checkout stays clean. 
- -## Tighten the README before the first dispatch - -The README is the workflow's living spec. Before launching, the skill reminds you that the auto-generated per-stage bullets are best-guesses and prompts you to tighten them. You have two ways to do it: - -- Open `docs/{mission-slug}/README.md` and edit the bullets under each `### {stage}` heading directly. -- Type `review stages` to have the skill walk you through each stage one at a time, flag the bullets that read as generic, and apply your amendments inline. - -Editing here costs minutes; un-editing after agents have been dispatched against vague bullets costs more. - -## What the first officer does next - -Commission does not stop at generating files. It assumes the first-officer role itself and runs the pilot. There is no separate launch step for the first run. - -1. It loads the [first officer's operating contract](../concepts/operating-model.md) and reads the workflow README you just generated. -2. It runs `spacedock status --boot` to read the workflow's current state. -3. It probes for the team-mode tools and dispatches an ensign for each seed entity that is ready to advance, processing them through the stages. -4. When the workflow goes idle or reaches a gate, it reports the pilot results: which entities moved, which stages they passed through, and any gate waiting on your decision. - -To run the workflow in any later session, launch the first officer again: - -``` -spacedock claude -``` - -It reads the workflow state, picks up where the last session left off, and dispatches agents for any entity ready for its next stage. Day-to-day operation (seeing what is ready, dispatching, and handling gate decisions) is covered in [Operating a workflow](operating.md). +You can now tell the first officer to dispatch your entities. From there, [Operate a workflow](operating.md) covers the day-to-day loop. 
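Note on the `commission.md` change above: the flow it describes can be sketched as a small shell helper. This is an illustration only, under stated assumptions: `start_session` is a hypothetical name, the `command -v` guard and its message are not part of Spacedock, and only `spacedock claude [task]`, `/spacedock:commission`, and `review stages` come from the docs in this diff.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Spacedock): start a session with the
# first officer loaded, passing an optional task as the launch prompt.
start_session() {
  if command -v spacedock >/dev/null 2>&1; then
    # Positionals before `--` form the task; with none, no task is named.
    spacedock claude "$@"
  else
    echo "spacedock not found on PATH; install it before launching" >&2
    return 1
  fi
}

# Inside the session (typed to the agent, not the shell):
#   /spacedock:commission product idea to simulated customer interview
# To have commission walk through the generated per-stage rules:
#   review stages
```

Calling `start_session` with no arguments opens a session without a named task; the commission command and `review stages` are then typed in the session itself, so they appear here only as comments.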
diff --git a/docs/site/running-workflows/debrief-and-refit.md b/docs/site/running-workflows/debrief-and-refit.md deleted file mode 100644 index 88d35d4a3..000000000 --- a/docs/site/running-workflows/debrief-and-refit.md +++ /dev/null @@ -1,67 +0,0 @@ -# Debrief & refit - -Two maintenance commands keep a workflow durable across sessions and releases. `/spacedock:debrief` captures what happened in a session into a record the next session reads to start with context. `/spacedock:refit` brings an existing workflow's scaffolding up to the current Spacedock version while leaving your local edits in place. You run both as the captain; each pauses for your confirmation before it writes anything. - -## Debrief: capture a session - -`/spacedock:debrief` writes a structured record of a session (shipped entities, newly filed backlog seeds, workflow-only commits, gate decisions, issues, and what's next) to `{dir}/_debriefs/{date}-{sequence}.md` and commits it. The next session's first officer reads the most recent debrief instead of starting cold. - -Run it at the end of a working session, or whenever you want a checkpoint: - -```bash -spacedock claude "/spacedock:debrief" -``` - -The skill works in four phases. You make the decisions at the boundaries; everything else is git and local-file reads, with no external services until you ask it to file an issue. - -1. **Discovery.** It finds the workflow with `spacedock status --discover`, then anchors the session start. If a prior debrief exists in `{dir}/_debriefs/`, the new session starts at the commit after that debrief's `last-commit` frontmatter field; if none exists, it falls back to the first commit in the workflow directory or the last 24 hours. It shows you the session boundary (since-commit and commit count) and waits for you to confirm or supply a different starting commit. - -2. 
**Extract.** It buckets every commit in range: PR squash-merges roll up into a **Shipped** section as a PR link, never enumerated; routine state churn (`dispatch:`, `advance:`, `state:`) is suppressed; only workflow-only commits that never flowed through a PR (`docs:`, `feedback:`, `ideation:`, reverts) are listed. It reads entity frontmatter to find what reached `done`, scans for gate approvals and rejections, and runs `spacedock status --workflow-dir {dir} --next` to populate **What's next**. - -3. **Draft and review.** It presents the draft with **Decisions** and **Observations** left as placeholders for you to fill. Add why a gate was approved or rejected, scope changes, design insights, or confirm as-is. Issues are split into **Workflow** (quirks in your pipeline, kept local) and **Spacedock** (framework bugs). For each Spacedock issue it offers to file an **anonymized** GitHub issue: the body carries the bug, repro steps, and scale, but never your mission, entity titles, or domain. You approve, edit, or decline each one before any `gh issue create` runs. - -4. **Write and commit.** It writes the debrief to `{dir}/_debriefs/{date}-{sequence:02d}.md` with `first-commit`, `last-commit`, and an approximate `duration` in frontmatter, commits it with a `debrief:` prefix, and reports the path: - - ``` - Debrief written to {dir}/_debriefs/2026-06-09-01.md and committed. - ``` - -The `last-commit` field is the load-bearing part: it is the anchor the next debrief reads to know where this session ended. - -## Refit: upgrade scaffolding to the current release - -`/spacedock:refit` upgrades a workflow's scaffolding files (the README and any installed mods in `_mods/`) to match the current Spacedock version, and migrates entity frontmatter when a schema change requires it. Agent files and the status viewer ship with the plugin, so they are never refit locally. The skill never auto-replaces a file you may have customized; it shows you a diff and you decide. 
- -You must give it the workflow directory: - -```bash -spacedock claude "/spacedock:refit path/to/workflow" -``` - -It reads the version stamp from the README frontmatter (`commissioned-by: spacedock@X.Y.Z`) and each mod's `version` field, compares them against the current version from the plugin manifest, and stops with "Workflow is already up to date." if everything matches. Otherwise it presents an upgrade plan and proceeds per file by strategy: - -- **`README.md`: show diff, never auto-replace.** Because you customize stages, schema fields, and quality criteria here, the skill generates what the current template would produce, diffs it against your README, and leaves it to you to apply the changes you want. It does not modify the README itself, only the version stamp at the end. -- **`_mods/{name}.md`: version diff.** For each installed mod it compares your `version` against the canonical mod at `mods/{name}.md`. Matching versions are skipped; differing versions get a diff and a y/n. A mod with no canonical match is treated as custom: acknowledged, no action. Canonical mods you don't have installed are offered for install. -- **`status` (legacy): remove.** A workflow-local `status` script predates the launcher. The status viewer is now the `spacedock status` command, so refit removes the local copy with `git rm`. - -### Schema migration and ID style - -After scaffolding, refit compares the old and new README `## Schema` and `### Field Reference` sections for changed types or ranges, renamed fields, removed fields, or new required fields. If a change affects entity data, it lists the affected entities, proposes the migration (for example, "Convert score from /25 to 0.0–1.0 by dividing by 25"), and waits for your y/n. On approval it edits **only** the named frontmatter fields with the Edit tool, never an entity body, never a whole-file rewrite. - -Refit preserves the README's `id-style` (`sequential`, `sd-b32`, or `slug`) and never changes it silently. 
It recommends `sd-b32` only under collaboration pressure (worktree stages, PR/merge mods, multiple creators, branches, offline work) and requires your explicit approval. Before any approved style change it runs `spacedock status --validate` against the workflow and reports failures; the actual ID rewrite is manual in this release. - -### When there is no version stamp - -If the README has no `commissioned-by` stamp, refit cannot tell what the original scaffolding looked like, so it enters **degraded mode** and offers two choices: **stamp only** (add stamps without changing anything, to establish a baseline) or **full refit with review** (show a full diff for every file and require your approval before replacing each). It never auto-replaces an unstamped file. - -When the refit finishes it updates the README stamp to the current version, prints a per-file summary, and suggests the commit: - -```bash -git commit -m "refit: upgrade workflow scaffolding to spacedock@{current_version}" -``` - -Git is the safety net throughout: `git diff` and `git checkout` recover anything you didn't mean to keep. - -## Where these fit - -Debrief and refit bracket the working loop described in [Operating a workflow](operating.md): you commission once, operate session by session, debrief at the end of a session, and refit when you upgrade Spacedock. For the commands these skills call, see the [Command reference](../reference/command-reference.md). diff --git a/docs/site/running-workflows/debrief.md b/docs/site/running-workflows/debrief.md new file mode 100644 index 000000000..14aff45a0 --- /dev/null +++ b/docs/site/running-workflows/debrief.md @@ -0,0 +1,10 @@ +# Debrief a session + +A workflow improves when friction is recorded and revisited instead of lost when the session ends. `/spacedock:debrief` captures the session into a chronological record of the work completed: what shipped, what you decided at the gates, and where it rubbed. 
Run it before you stop; the next session's first officer reads the latest debrief and starts with context instead of cold. + +The friction splits two ways: + +- **Workflow friction stays local.** Quirks in your pipeline, fixed by tightening the workflow's README. +- **Spacedock friction goes upstream.** For each framework issue, debrief offers to file an anonymized GitHub issue: the body carries the bug, repro steps, and scale, never your mission, entity titles, or domain. You approve, edit, or decline each one before anything is filed. + +Debrief drafts everything from git and the workflow files, and pauses for you three times: confirm where the session starts, add the why behind decisions and observations, approve each upstream issue. It commits the record, and the next debrief picks up where this one ended. diff --git a/docs/site/running-workflows/operating.md b/docs/site/running-workflows/operating.md index 47dc885c5..06525d304 100644 --- a/docs/site/running-workflows/operating.md +++ b/docs/site/running-workflows/operating.md @@ -1,71 +1,29 @@ -# Operating a workflow +# Operate a workflow -Operating a workflow is a loop: see what is ready, dispatch the first officer to move it, and make a decision when work reaches a gate. The captain drives that loop; the first officer does the orchestration and the ensigns do the stage work. This page covers the loop, the `spacedock status` queries that show you the state, and how to handle a gate. +You operate a workflow by talking to the first officer: launch a session, let it move everything that is ready, and decide when work reaches a gate. -## The day-to-day loop +## The session loop -You run the same three steps each session: - -1. **See what is ready.** Query workflow state to find the entities that can move (the dispatchable set) and where everything sits. -2. **Dispatch the first officer.** Launch a session and let it pull a dispatchable entity through its next stage. -3. 
**Handle gates.** When a stage is gated, the first officer stops and presents the result. You approve, reject, or send it back. - -The loop ends a session when nothing is dispatchable, or when every dispatchable entity is waiting on a gate decision from you. - -## See what is ready - -`spacedock status` reads the workflow state and prints it. Run it against the workflow directory, the one holding the commissioned `README.md`: +Start a session. Name a task if you have one in mind, or say nothing and it picks up where the last session left off: ```bash -spacedock status --workflow-dir docs/dev +spacedock claude ``` -Prints the status table: one row per active entity, with its ID, title, and current stage. This is the full picture. - -To list only the entities ready to dispatch, add `--next`: +The first officer reads the workflow state and dispatches an ensign for every entity ready for its next stage. Completed stages flow forward on their own: when the next stage has no gate, the first officer advances the entity and dispatches again without waiting for you. It returns to you for four things only: a gate needs your decision, a finished entity is ready to close, something is blocked, or nothing is left to dispatch. -```bash -spacedock status --workflow-dir docs/dev --next -``` - -Prints the dispatchable set: the entities whose next stage can run now, given concurrency limits and what is already in flight. This is the query the first officer runs each loop. When nothing is ready, the result is empty; there is nothing to dispatch. - -To filter the table by a frontmatter field, use `--where "field=value"`. The filter takes `=` (equals) or `!=` (not equals): - -```bash -spacedock status --workflow-dir docs/dev --where "status=ideation" -spacedock status --workflow-dir docs/dev --where "verdict!=" -``` - -The first prints every entity in the `ideation` stage. The second prints entities whose `verdict` field is set (the `!=` against an empty value). 
Use `--where` to answer targeted questions: what is in a given stage, which entities carry an external `issue`, which already have a `verdict`. - -Two more queries are worth knowing: - -- **`--validate`** checks every entity against the workflow's contract and reports problems (a missing or malformed ID, a duplicate ID, a stage name that breaks the naming rule). Run it when the table looks wrong. -- **`--resolve REF`** looks up one entity by slug, full ID, or ID prefix, so you can name it unambiguously before acting on it. - -All status queries are read-only. They print state, they do not change it. For the full flag list and the `--set` and `--archive` mutation forms, see the [Command reference](../reference/command-reference.md). - -## Dispatch - -Hand the first officer the workflow and let it run the dispatch cycle. Launch with your harness subcommand and a task that names the work; the first officer takes the workflow directory from the path you give it in the task, or runs `spacedock status --discover` to find it: - -```bash -spacedock claude "/spacedock:first-officer operate the workflow in docs/dev" -``` +In between, ask it whatever you want to know: "what's ready?", "where is the rate-limiting task?", "what's still in review?". The first officer queries the workflow state and shows you; there are no status commands to learn. -The first officer reads the workflow `README.md`, runs its own `status --next`, and for each dispatchable entity it dispatches an ensign to move the entity through its next stage. The ensign does the stage work (write the design, produce the deliverable, run the validation), commits, and files a stage report. The first officer reads the report, checks it against the stage's outputs and the entity's acceptance criteria, and advances the entity. A completed non-gated, non-terminal stage is not a stopping point: the first officer advances it and dispatches the next stage on its own, without waiting for you. 
+## When a gate arrives -It stops and returns to you only at a gate, at a terminal entity's merge ceremony, on a blocker, or when nothing is left to dispatch. +A gate stops the loop and hands you the call: the first officer presents the review and waits. You make one of [the three calls](../concepts/gates-and-decisions.md#the-three-calls): approve, redo with feedback, or reject. -## Handle gate decisions +Approving the last stage closes the entity: the first officer records the merge and the verdict, archives the entity file, removes the worktree, and stands the workers down. The loop then continues with whatever is ready next. -A gate is the decision point at the end of a stage marked `gate: true` in the workflow. When an entity reaches one, the first officer presents the stage report and the result of its review, then waits. It never self-approves. You decide: +## Delegating the loop -- **Approve.** The entity advances to its next stage. The first officer dispatches it (or, at a terminal stage, runs the merge-and-cleanup ceremony to close the entity with its verdict). -- **Reject.** On a stage with a `feedback-to` target (`validation` routes back to `implementation` in the `docs/dev` workflow), the rejection routes the concrete findings back to that stage and re-runs the work, then re-validates. A repeated rejection escalates back to you rather than bouncing indefinitely. -- **Send it back with direction.** If the result is close but not right, give the first officer the specific change to make. It updates the entity body, acceptance criteria, and test plan together, then re-runs the stage. +The loop needs you less as the workflow matures. Rejections already run without you: findings bounce back to the stage that owns the fix and the reviewer re-runs; [a rejection reaches your desk](../concepts/gates-and-decisions.md#rejections) only after three failed rounds. 
When the bar you set is sharp enough to trust, [hand over the conn](../concepts/gates-and-decisions.md) and let the first officer drive whole tasks to done with auto-approval. -The gate review names the chosen direction, cites the stage report, and ends with a single recommendation, approve or reject. Read the report it cites before deciding; overriding a `REJECTED` recommendation without a reason is exactly the kind of unexamined approval the gate exists to catch. +## Ending a session -When you approve a terminal stage, the entity is closed: the first officer records the merge, sets the `completed` timestamp and `verdict`, clears the worktree, and tears the worker down. At that point the loop returns to the top: run `status --next` and see what moved into reach. +Stop whenever you want; every entity's state is in the workflow files, so the next session resumes where this one ended. Before you stop, run [`/spacedock:debrief`](debrief.md) to record the session for the next one. diff --git a/docs/site/running-workflows/survey.md b/docs/site/running-workflows/survey.md deleted file mode 100644 index 3a800a501..000000000 --- a/docs/site/running-workflows/survey.md +++ /dev/null @@ -1,65 +0,0 @@ -# Survey an existing project - -`/spacedock:survey` reads a brownfield project's agent history and reports what the agents have implicitly been doing (read-only), then offers to commission a workflow from what it found. Run it when you arrive at or return to a repo that already has agent sessions and want the lay of the land before doing anything else. It never edits your files; the only stop in the flow is the commission offer at the end. - -Survey is the recommended first launch. Point Spacedock at a project and hand it the survey skill: - -```bash -spacedock claude "/spacedock:survey" -``` - -This starts the first officer in Claude Code and runs the survey. 
The new-user walkthrough is in [Your first launch](../get-started/first-launch.md); this page is the operator's view of what survey does and how to read it. - -## What it reads - -Survey reads recorded agent session history through `agentsview`, a session-history tool. It does not parse raw logs by hand. It drives the `agentsview` binary to sync this project's sessions into a process-readable copy, then runs a fixed set of labeled, read-only SQL queries against that copy. The queries live in `skills/survey/references/queries.sql`, one labeled query per concern, so nothing is a black box. - -Two behaviors matter when you run it: - -- **It scopes to this repo by identity, not by name.** Survey resolves the repo root and scopes every query to that absolute path prefix. Because `agentsview` keys each session's project by the git-root basename, a same-basename sibling repo elsewhere on disk would otherwise fold in; the path-prefix scope keeps it out, and admits every checkout of this repo: the root, a subdir, a worktree. -- **If `agentsview` is missing, it asks before installing.** Survey needs `agentsview` to read the logs. When the binary is absent it tells you so and asks consent; on a yes it installs (`brew install --cask agentsview`, or the install-script fallback). It never installs without an explicit yes. If the sync fails for any reason (network, disk, permissions), it reports the exact failure and stops rather than guessing. - -If the repo has no Claude agent history, survey says so plainly and stops. There is nothing to discover. - -## What it reports - -Survey leads with a one-line headline (the project, the session count, the date range, and the decision and interruption counts), then renders the body in the same turn. The body is the value, so it does not pause for a confirmation before showing you the sections: - -- **Inferred workflow.** The implicit loop reconstructed from the decisions and prompts, as an arrow chain, with one honest line about it. 
-- **Workstreams.** The decisions and prompts clustered into tracks, each tagged with its work mode (see below). -- **Work by area.** Where edits actually landed, by logical area (`src`, `internal`, `docs`, …) regardless of physical location. A worktree edit counts toward its area, so worktree-based work is not hidden. Genuine config paths (`.claude`, `.beads`, `.git`) and external sibling references demote to a footnote. -- **Needs you.** The open decisions, the forks raised but never resolved. **Survey leads the report with these**, because they are the work blocked on you. Exploration threads you are deliberately holding are separated from mechanical questions awaiting an answer. -- **Recent decisions** and **interruptions**: the answered or shipped forks, and how often you had to step in. -- **Scaffold.** If another agent scaffold is in use (superpowers, gsd / get-shit-done, or another `.claude` skill tree), survey states it as a fact: the family, its invocation count, and whether it is checked in on disk. -- **Codex** (only when present). Codex sessions land with no recorded working directory, so survey attributes them to this repo through each command's working directory and reports them as their own section: a session count, the workstream clusters, and an activity tally. Gemini is a deferred follow-up. - -If a section's signal is empty, survey says the run found none of it. It never dresses an empty section up as "no decisions". - -### The open frontier is cross-checked - -The open-decision scan reads transcripts only, which cannot tell a shipped fork from a still-open one. Before presenting the frontier, survey cross-references the repo (`git log`, merged PRs, the working tree) and splits each open fork three ways: - -- **shipped**: a confident match to a merged PR or commit. Dropped from the frontier. -- **decided, not shipped**: moved to a backlog line. -- **never decided**: stays on the `NEEDS YOU` frontier. 
- -The match is conservative: a fork is dropped only on a confident repo match, because a false "still open" is a cheap nudge while a false "shipped" silently hides real open work. When no repo signal is available, whether because this is not a git repo or because the lookups fail, the frontier degrades to transcript-only and every open fork is flagged `unverified` rather than presented as authoritative. - -## The commission offer - -Survey closes by offering to commission a workflow from what it found, and the offer is keyed to each track's work mode, so it is not one undifferentiated pitch. Survey classifies each track as **mechanical**, **exploration**, or **unlabeled**: - -- **Mechanical tracks** (the routine issue → worktree → PR loop) get an **automation** offer: a workflow that gates the crucial decisions and lets the agent drive the loop between gates. -- **Exploration tracks** (creative, content, or design work where your steering is the point) get a **book-keeping** offer: structure for the parallel threads, tracking each draft or path and its state (in-flight / paused-by-choice / abandoned). There is no automate-the-human-out pitch here. The involvement is the work. -- **Unlabeled tracks** get the generic book-keeping offer, never a guessed automation pitch. - -A project with both modes gets both offers. Each offer cites a real number from the scan: the track names, the gate-pass count, the open forks, the cancelled-path count. - -On a **yes**, survey hands off to [commission](commission.md) in batch mode, assembling the inputs from the scan: - -- **stages** ← the inferred workflow loop; -- **seed entities** ← the workstreams; -- **approval gates** ← the open forks that survived the repo cross-check; -- **mission and entity** ← inferred from the workstreams and the project. - -Survey does not write the workflow files itself; file generation stays commission's job. On a **no**, it stops; the survey stands on its own as orientation. 
diff --git a/docs/site/stylesheets/brand.css b/docs/site/stylesheets/brand.css index 4f4466adb..b033ee828 100644 --- a/docs/site/stylesheets/brand.css +++ b/docs/site/stylesheets/brand.css @@ -5,6 +5,10 @@ .md-header, .md-tabs { background-color: var(--bg-2); + /* Material colours header text with --md-primary-bg-color (a dark ink in both + schemes); since we force a dark header bg, set a light fg so the wordmark, + tabs, and icons stay legible under the slate (default) scheme. */ + color: var(--fg); border-bottom: 1px solid var(--line-soft); } @@ -13,6 +17,17 @@ border-top: 1px solid var(--line-soft); } +/* Discreet machine-readers link in the footer meta row */ +.sd-footer-llms { + margin-left: 1em; + font-size: 0.64rem; + opacity: 0.7; + color: inherit; +} +.sd-footer-llms:hover { + opacity: 1; +} + /* Mono display headings -- Material has no display-font slot */ .md-typeset h1, .md-typeset h2 { diff --git a/mkdocs.yml b/mkdocs.yml index ef7f4c774..e0194a720 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -1,12 +1,18 @@ site_name: Spacedock site_description: A multi-agent orchestrator where nothing ships without a decision. -site_url: https://spacedock-dev.github.io/spacedock/ +site_url: https://spacedock.md/docs/ repo_url: https://github.com/spacedock-dev/spacedock repo_name: spacedock-dev/spacedock copyright: Spacedock is released under the Apache License 2.0. docs_dir: docs/site +# Internal authoring directive (AGENTS.md), not a published page. +# Contributor docs are not hosted on the site; the repo is their source of truth. +exclude_docs: | + AGENTS.md + contributing/ + theme: name: material custom_dir: overrides @@ -40,11 +46,13 @@ theme: plugins: - search - llmstxt: + full_output: llms-full.txt markdown_description: >- Documentation for Spacedock, a multi-agent orchestrator where nothing ships without a decision. 
It lives within your existing coding harness (Claude Code, Codex, or Pi), breaks work into stages, and surfaces an - evidenced decision at each gate. + evidenced decision at each gate. This index links the product docs; + llms-full.txt is every page concatenated into one file for a single fetch. sections: Get started: - index.md @@ -57,43 +65,33 @@ plugins: - advanced/*.md Reference: - reference/*.md - Contributing: - - contributing/*.md nav: - - Home: index.md - Get started: + - Welcome: index.md - Install: get-started/install.md - - Your first launch: get-started/first-launch.md + - Survey your project: get-started/survey.md - Your first workflow: get-started/first-workflow.md - Concepts: - The operating model: concepts/operating-model.md - Workflows & entities: concepts/workflows-and-entities.md - The stage lifecycle: concepts/stage-lifecycle.md - Gates & decisions: concepts/gates-and-decisions.md - - A worked example: concepts/worked-example.md - Running workflows: - Commission a workflow: running-workflows/commission.md - - Survey an existing project: running-workflows/survey.md - - Operating a workflow: running-workflows/operating.md - - Debrief & refit: running-workflows/debrief-and-refit.md + - Operate a workflow: running-workflows/operating.md + - Debrief a session: running-workflows/debrief.md - Advanced topics: - - Sprints & roadmap: advanced/sprints-and-roadmap.md - Mods & standing teammates: advanced/mods-and-standing-teammates.md - - External-tracker bridge: advanced/external-tracker.md - - Multi-workflow & split-root state: advanced/split-root-state.md + - Multiple workflows: advanced/multi-workflow.md + - Split-root state: advanced/split-root-state.md + - Bridge an external tracker: advanced/external-tracker.md + - Refit a workflow: advanced/refit.md - Reference: - Command reference: reference/command-reference.md - Frontmatter contract: reference/frontmatter-contract.md - - Multi-host support: reference/multi-host.md - - Contributing: - - The 
development workflow: contributing/development-workflow.md - - Agent development: contributing/agent-development.md - - Proof policy: contributing/proof-policy.md - - Adding a runtime: contributing/adding-a-runtime.md - - Releasing: contributing/releasing.md - - Voice & tone: contributing/voice-and-tone.md - - Architecture notes: contributing/architecture-notes.md + - Supported sandboxes: reference/sandbox.md + - Contributing: https://github.com/spacedock-dev/spacedock/blob/main/CONTRIBUTING.md extra_css: - stylesheets/tokens.css @@ -104,7 +102,11 @@ markdown_extensions: - pymdownx.highlight: anchor_linenums: true - pymdownx.details - - pymdownx.superfences + - pymdownx.superfences: + custom_fences: + - name: mermaid + class: mermaid + format: !!python/name:pymdownx.superfences.fence_code_format - pymdownx.tabbed: alternate_style: true - toc: diff --git a/overrides/partials/footer.html b/overrides/partials/footer.html index b21bab21b..e56ae4119 100644 --- a/overrides/partials/footer.html +++ b/overrides/partials/footer.html @@ -46,6 +46,7 @@