fix(coordinator): resolve externalized state in agent governance file #1117

Closed
ahhlun wants to merge 10 commits into bradygaster:dev from ahhlun:fix/external-state-resolution

ahhlun commented May 14, 2026

Fix: Coordinator resolves externalized state before mode-switch check

Fixes #1116

Problem

When .squad/ state is externalized via squad externalize, the coordinator agent (squad.agent.md) fails to detect the existing team and erroneously enters Init Mode. The CLI correctly creates the external state directory and leaves a .squad/config.json marker, but the agent governance file's session-start logic never reads this marker.

Changes

  1. External State Resolution section — inserted before the mode-switch check. Reads .squad/config.json, detects stateLocation: "external", and resolves the AppData path as team root.
  2. Mode-switch check — updated to reference {team_root}/team.md instead of hardcoded .squad/team.md.
  3. Worktree Awareness algorithm — added step 0 to short-circuit when external state is detected.

Applied identically to both:

  • .github/agents/squad.agent.md (runtime)
  • .squad-templates/squad.agent.md (template for squad upgrade)

Testing

Manual verification: with stateLocation: "external" in config.json, the coordinator now resolves the external directory and finds team.md there, entering Team Mode correctly.

When .squad/config.json has stateLocation: 'external', resolve the team root from %APPDATA%/squad/projects/{projectKey}/ before checking for team.md. Without this, the coordinator always enters Init Mode on externalized projects.
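The resolution order described above can be sketched roughly as follows. This is an illustration only: the actual change is prose in squad.agent.md, not code. The `SquadConfig` interface, the `resolveTeamMd` function, and the `appData` parameter are hypothetical names; only the `stateLocation` and `projectKey` fields and the `%APPDATA%/squad/projects/{projectKey}/` layout are taken from this description.

```typescript
// Hypothetical shape of .squad/config.json. Only these two fields are
// mentioned in the PR; any other fields are unknown.
interface SquadConfig {
  stateLocation?: string;
  projectKey?: string;
}

interface Resolution {
  mode: "external" | "local";
  teamMdPath: string;
}

// configText: raw contents of .squad/config.json, or null if it is missing.
// appData: stands in for %APPDATA% (only the Windows layout is sketched).
function resolveTeamMd(configText: string | null, appData: string): Resolution {
  if (configText !== null) {
    let config: SquadConfig = {};
    try {
      const parsed = JSON.parse(configText);
      if (parsed !== null && typeof parsed === "object") {
        config = parsed as SquadConfig;
      }
    } catch {
      // Unreadable marker: fall through to local resolution.
    }
    if (config.stateLocation === "external" && config.projectKey) {
      const root = `${appData}/squad/projects/${config.projectKey}`;
      return { mode: "external", teamMdPath: `${root}/team.md` };
    }
  }
  // No marker, or state is not external: baseline behaviour.
  return { mode: "local", teamMdPath: ".squad/team.md" };
}
```

With a marker of `{"stateLocation":"external","projectKey":"demo"}` the sketch looks for team.md under the external directory; with no marker it falls back to `.squad/team.md`, which is the path the coordinator checked before this PR.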

Closes #PENDING (issue to be filed manually)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI (Contributor) left a comment

Pull request overview

This PR updates the coordinator governance prompt (squad.agent.md) to recognize when .squad/ state has been externalized (via .squad/config.json) so the coordinator enters Team Mode instead of incorrectly falling back to Init Mode.

Changes:

  • Add an “External State Resolution” section intended to resolve external state before the Team/Init mode check.
  • Update the mode-switch check and the Worktree Awareness algorithm to reference the resolved location.
  • Add a changeset proposing a patch release for @bradygaster/squad-cli.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| .squad-templates/squad.agent.md | Adds external-state resolution + modifies mode-switch/worktree logic in the canonical template. |
| .github/agents/squad.agent.md | Applies the same prompt updates to the runtime agent file. |
| .changeset/fix-external-state-resolution.md | Declares a patch release note for the CLI. |
Comments suppressed due to low confidence (2)

.github/agents/squad.agent.md:41

  • The new mode-switch check uses {team_root}/team.md, but elsewhere in this file team_root is resolved to the repo/worktree root (e.g., Worktree Awareness step 1 sets team root = CWD when .squad/ exists). In local (non-external) state, team.md lives at {team_root}/.squad/team.md, so this check will fail and incorrectly enter Init Mode. Consider introducing a separate state_root/squad_dir variable (local: {team_root}/.squad, external: external state dir) and checking {state_root}/team.md instead.
   c. Set **team root** = that external directory. ALL `.squad/` paths (team.md, routing.md, agents/, decisions/, etc.) resolve from this external root.
   d. Skip the Worktree Awareness resolution below — external state is already branch-independent.
3. If `.squad/config.json` does not exist, or `stateLocation` is not `"external"` → proceed with normal resolution (Worktree Awareness) below.

Check: Does `{team_root}/team.md` exist? (where `team_root` is the resolved path from External State Resolution or Worktree Awareness above; fall back to `.ai-team/team.md` for repos migrating from older installs)
- **No** → Init Mode
- **Yes, but `## Members` has zero roster entries** → Init Mode (treat as unconfigured — scaffold exists but no team was cast)

.squad-templates/squad.agent.md:41

  • The mode-switch check now uses {team_root}/team.md, but in local state the file is .squad/team.md under the repo root. Unless team_root is redefined everywhere to mean the .squad directory itself, this will cause false Init Mode. Consider using a separate state_root (local: {team_root}/.squad, external: external dir) and consistently referring to {state_root}/team.md.
   c. Set **team root** = that external directory. ALL `.squad/` paths (team.md, routing.md, agents/, decisions/, etc.) resolve from this external root.
   d. Skip the Worktree Awareness resolution below — external state is already branch-independent.
3. If `.squad/config.json` does not exist, or `stateLocation` is not `"external"` → proceed with normal resolution (Worktree Awareness) below.

Check: Does `{team_root}/team.md` exist? (where `team_root` is the resolved path from External State Resolution or Worktree Awareness above; fall back to `.ai-team/team.md` for repos migrating from older installs)
- **No** → Init Mode
- **Yes, but `## Members` has zero roster entries** → Init Mode (treat as unconfigured — scaffold exists but no team was cast)
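
The `state_root` separation suggested in these two comments can be sketched as below. This is a hedged illustration of the reviewer's idea, not code from the repository: `stateRoot`, `teamMdPath`, and `chooseMode` are hypothetical helpers, and how roster entries under `## Members` are counted is deliberately left out.

```typescript
type Mode = "init" | "team";

// Local state lives in <teamRoot>/.squad; external state is a flat
// directory with no nested .squad/ folder.
// externalDir is the resolved external directory, or null in local mode.
function stateRoot(teamRoot: string, externalDir: string | null): string {
  return externalDir !== null ? externalDir : `${teamRoot}/.squad`;
}

function teamMdPath(teamRoot: string, externalDir: string | null): string {
  return `${stateRoot(teamRoot, externalDir)}/team.md`;
}

// rosterEntries: number of entries under "## Members", or null when
// team.md does not exist at all.
function chooseMode(rosterEntries: number | null): Mode {
  if (rosterEntries === null || rosterEntries === 0) {
    return "init"; // missing file, or scaffold with no team cast
  }
  return "team";
}
```

In local mode `teamMdPath("/repo", null)` gives `/repo/.squad/team.md`, whereas the expression `{team_root}/team.md` flagged in the review would give `/repo/team.md` and trigger a false Init Mode.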

ahhlun and others added 2 commits May 14, 2026 16:42
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
- Add projectKey sanitization rules to External State Resolution prompt section
- Sync templates
- Clarify team_root semantics in external state mode

ahhlun (Author) commented May 14, 2026

cc @diberry — original author of squad externalize. Would value your review on the coordinator-side resolution since this completes the externalize feature loop.

ahhlun (Author) commented May 14, 2026

Addressed Copilot review feedback in 5737c55:

  • Added projectKey sanitization rules
  • Synced templates
  • Clarified team_root semantics in external mode

ahhlun (Author) commented May 15, 2026

cc @tamirdresher — original author of squad externalize (PR #797). This PR completes the externalize feature loop on the coordinator side; would value your review alongside @diberry.

tamirdresher (Collaborator) left a comment

@ahhlun looks real good

Can you check what I commented and then we decide how to merge?

Comment thread .github/agents/squad.agent.md Outdated
- **Linux:** `$XDG_CONFIG_HOME/squad/projects/{projectKey}/` (default `~/.config/squad/projects/{projectKey}/`)
c. Set **team root** = that external directory. In external mode, `team_root` points directly to the flat external state directory — files like `team.md`, `routing.md`, and `agents/` live at the top level of this path (no nested `.squad/` subfolder). ALL state paths resolve from this external root.
d. Skip the Worktree Awareness resolution below — external state is already branch-independent.
3. If `.squad/config.json` does not exist, or `stateLocation` is not `"external"` → proceed with normal resolution (Worktree Awareness) below.

Collaborator

@ahhlun could you please try something? The custom agent MD file is becoming long (and it is already long :-)). Can you try putting the externalize instructions in another MD file and telling the custom agent to read the instructions from there if this squad is externalized?

Author

That's a great suggestion — the file is already long and this adds ~20 lines. We'll extract the External State Resolution block into a dedicated file (e.g. .github/agents/external-state.md) and replace it with a short instruction like:

If .squad/config.json has stateLocation: "external", read and follow .github/agents/external-state.md to resolve the team root.

This keeps the main agent prompt lean while preserving discoverability. We'll include this refactor in the next commit.

Author

Done in 4eef10e. Extracted the algorithm into .github/agents/external-state.md (+ .squad-templates/external-state.md mirror). squad.agent.md now has a one-line on-demand pointer matching the existing reference-file pattern (ceremonies, casting, etc.). Diff is tight: 4 files, +34/-24.

Author

Hi Tamir — I'm reversing the extraction approach, and wanted to flag why so you can push back if you disagree.

When testing the PR end-to-end, we hit a bootstrap problem: squad externalize moves the entire .squad/ directory (including .squad/templates/) into %APPDATA%/squad/projects/{key}/. So if external-state.md lives in .squad/templates/, the coordinator can't find it post-externalize without already knowing the external-state remap rule — chicken-and-egg.

The External State Resolution algorithm has to live in the always-accessible file (.github/agents/squad.agent.md) because that's the entry point Copilot loads regardless of externalize state. This was actually the structural concern Copilot bot flagged in the first round; the empirical break confirmed it.
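
The bootstrap problem can be illustrated with a toy model of the repository as a list of paths. This is only a sketch of the behaviour described in this comment: `externalize` here is a hypothetical stand-in, not the CLI's implementation, and the external root is passed in rather than computed.

```typescript
interface Layout {
  repo: string[];     // paths still reachable from the repository
  external: string[]; // paths that now live in the external state directory
}

// Toy stand-in for `squad externalize`: everything under .squad/ moves to
// externalRoot, and a .squad/config.json marker is left behind in the repo.
function externalize(repoFiles: string[], externalRoot: string): Layout {
  const prefix = ".squad/";
  const repo: string[] = [];
  const external: string[] = [];
  for (const file of repoFiles) {
    if (file.indexOf(prefix) === 0) {
      external.push(`${externalRoot}/${file.slice(prefix.length)}`);
    } else {
      repo.push(file);
    }
  }
  repo.push(".squad/config.json"); // marker that stays in the repo
  return { repo, external };
}
```

Running this on `.github/agents/squad.agent.md`, `.squad/team.md`, and `.squad/templates/external-state.md` leaves only `squad.agent.md` and the `config.json` marker in the repo, so a pointer to `.squad/templates/external-state.md` has nothing to resolve against.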

Re-inlined in 6d832e07. The block is ~18 lines — short, and now structurally robust against externalize. The squad.agent.md is unavoidably ~18 lines longer than today; if file length is the bigger concern, I'd push back on the inline-everything-else convention as a separate refactor rather than special-casing external-state.

Re-opening this thread for your call. If you have a different bootstrap path in mind (e.g., a dedicated .github/agents/external-state.md with a test-allowlist exception), happy to follow that instead.

Collaborator

@ahhlun thanks for digging into this, your bootstrap concern is real, but I think the externalization can still work with a small tweak. The key insight: .squad/config.json is left behind in the repo as a marker after squad externalize runs (per your PR description). That file never moves, so it can serve as the always-reachable bootstrap anchor.

Proposed loading rule in squad.agent.md (≈4 lines instead of ~18):
Read .squad/config.json (or the monorepo sub-folder equivalent). If it contains stateLocation: "external", treat externalPath from that config as the team root and read {externalPath}/templates/external-state.md for the detailed resolution algorithm. Otherwise, treat the repo (or sub-folder) as the team root and read .squad/templates/external-state.md. Follow whichever file you loaded.

Why this avoids the chicken-and-egg:

  • squad.agent.md doesn't need to compute the external path (no sanitization rules, no AppData logic). It just reads externalPath from the config marker.
  • external-state.md travels with the state — externalized installs have their copy in %APPDATA%/.../templates/, local installs have theirs in .squad/templates/. Either way it's reachable once you know which root to use.
  • Monorepo case is handled naturally: the config marker lives next to the sub-folder's .squad/, so the same rule applies at whichever level the squad is rooted.
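
A minimal sketch of the proposed loading rule, for illustration. The `ConfigMarker` interface and the `externalStateDocPath` function are hypothetical, and whether config.json actually carries an `externalPath` field is not confirmed anywhere in this thread (the earlier prompt text reads `projectKey` instead).

```typescript
// Hypothetical view of the .squad/config.json marker.
interface ConfigMarker {
  stateLocation?: string;
  externalPath?: string; // field name taken from the review comment; unverified
}

// Returns the path of the external-state.md copy the coordinator should read.
// marker is null when .squad/config.json does not exist.
function externalStateDocPath(marker: ConfigMarker | null): string {
  if (marker !== null && marker.stateLocation === "external" && marker.externalPath) {
    // Externalized install: the reference file travelled with the state.
    return `${marker.externalPath}/templates/external-state.md`;
  }
  // Local install: the reference file is still under the repo's .squad/.
  return ".squad/templates/external-state.md";
}
```

The point of the rule is that the coordinator only reads a path from the marker and never derives it, so no sanitization or per-platform logic is needed in squad.agent.md itself.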

This matches the on-demand reference pattern you used in commit 8888543 (ceremonies, casting, plugin-marketplace) and keeps squad.agent.md lean. Your earlier extraction was structurally correct — it just needed the pointer to key off the config marker rather than off a path the agent had to derive itself.

Happy to be wrong if there's a case I'm missing - does config.json get moved in some scenario I'm not seeing?

…nd reference

Per @tamirdresher review on bradygaster#1117: move the External State Resolution
algorithm out of squad.agent.md into a dedicated .github/agents/external-state.md
file, following the existing pattern for on-demand references
(e.g., ceremony-reference.md, casting-reference.md). squad.agent.md
retains a short pointer line.

Mirror template at .squad-templates/external-state.md.

ahhlun (Author) commented May 17, 2026

Addressed in 4eef10e — extracted the External State Resolution algorithm into .github/agents/external-state.md (and the template mirror) per the on-demand reference pattern. squad.agent.md now retains a one-line pointer.

Also reverted the inadvertent 0.0.0-source / 0.9.6-build.1 version-line change in this push so the diff is purely the refactor.

Scoped tight on purpose: just the 2 new doc files + 2 pointer-replacement edits. No CLI/sync-templates/test changes — happy to follow up in a separate PR if those need updating to recognize the new file.

ahhlun force-pushed the fix/external-state-resolution branch from 583a6bd to 4eef10e on May 17, 2026 04:29

ahhlun (Author) commented May 17, 2026

Note: force-pushed back to 4eef10e to keep this PR minimal (4 files). Template-mirror sync (originally in 583a6bd) will land in a separate follow-up PR per the round-2 plan.

allai-dev added 2 commits May 16, 2026 22:07
Reverts inadvertent version-line change from 4eef10e. The 0.9.6-build.1
marker is intentional and should remain in source — distinguishes the
canonical source from the 0.0.0-source placeholder used by CLI/SDK
templates before stamping at install/upgrade time.

Reverts the version-block changes inadvertently included in this PR.
Baseline values per dev branch:
- .github/agents/squad.agent.md: 0.9.1
- .squad-templates/squad.agent.md: 0.0.0-source

Per Tamir's review ('Is this change needed?'), no change to the version
block was intended; only the External State Resolution refactor.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

Comments suppressed due to low confidence (5)

templates/squad.agent.md.template:35

  • The projectKey sanitization/validation described here doesn’t match the SDK’s resolveExternalStateDir() implementation: the SDK trims leading/trailing dashes after replacement and does not reject keys just because they start with .. If the coordinator follows these instructions, it can compute a different folder name than the CLI/SDK and fail to find the external state. Align the prompt’s sanitization rules with packages/squad-sdk/src/resolution.ts so the resolved directory name is consistent.
2. If it exists and contains `"stateLocation": "external"`:
   a. Read the `projectKey` field from the same config. Sanitize the key: replace path separators and non-`[a-zA-Z0-9._-]` chars with `-`. Reject keys that are empty, start with `.`, or contain `..`.
   b. Resolve the external state directory:
      - **Windows:** `%APPDATA%\squad\projects\{projectKey}\`
      - **macOS:** `~/Library/Application Support/squad/projects/{projectKey}/`
      - **Linux:** `$XDG_CONFIG_HOME/squad/projects/{projectKey}/` (default `~/.config/squad/projects/{projectKey}/`)
   c. Set **team root** = that external directory. In external mode, `team_root` points directly to the flat external state directory — files like `team.md`, `routing.md`, and `agents/` live at the top level of this path (no nested `.squad/` subfolder). ALL state paths resolve from this external root.
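
The per-platform paths in step b can be sketched as a pure function. This is an illustration under assumptions: `externalStateDir`, `Platform`, and `Env` are hypothetical names, the environment is passed in rather than read from the process, and `projectKey` is assumed to be already sanitized.

```typescript
type Platform = "win32" | "darwin" | "linux";

interface Env {
  APPDATA?: string;
  HOME?: string;
  XDG_CONFIG_HOME?: string;
}

// Fails loudly when a required variable is missing instead of building
// a path that starts with "undefined".
function need(value: string | undefined, name: string): string {
  if (!value) {
    throw new Error(`${name} is not set`);
  }
  return value;
}

function externalStateDir(platform: Platform, env: Env, projectKey: string): string {
  if (platform === "win32") {
    return `${need(env.APPDATA, "APPDATA")}\\squad\\projects\\${projectKey}\\`;
  }
  if (platform === "darwin") {
    return `${need(env.HOME, "HOME")}/Library/Application Support/squad/projects/${projectKey}/`;
  }
  // Linux: XDG_CONFIG_HOME, defaulting to ~/.config when unset.
  const base = env.XDG_CONFIG_HOME || `${need(env.HOME, "HOME")}/.config`;
  return `${base}/squad/projects/${projectKey}/`;
}
```

For example, `externalStateDir("linux", { HOME: "/home/a" }, "demo")` yields `/home/a/.config/squad/projects/demo/`.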

packages/squad-cli/templates/squad.agent.md.template:35

  • The projectKey sanitization/validation described here doesn’t match the SDK’s resolveExternalStateDir() implementation (it trims leading/trailing dashes and doesn’t reject keys solely for starting with .). Diverging here can make squad init/upgrade installs look in a different external folder than the CLI created. Please align this prompt’s sanitization rules with packages/squad-sdk/src/resolution.ts to ensure consistent paths.
2. If it exists and contains `"stateLocation": "external"`:
   a. Read the `projectKey` field from the same config. Sanitize the key: replace path separators and non-`[a-zA-Z0-9._-]` chars with `-`. Reject keys that are empty, start with `.`, or contain `..`.
   b. Resolve the external state directory:
      - **Windows:** `%APPDATA%\squad\projects\{projectKey}\`
      - **macOS:** `~/Library/Application Support/squad/projects/{projectKey}/`
      - **Linux:** `$XDG_CONFIG_HOME/squad/projects/{projectKey}/` (default `~/.config/squad/projects/{projectKey}/`)
   c. Set **team root** = that external directory. In external mode, `team_root` points directly to the flat external state directory — files like `team.md`, `routing.md`, and `agents/` live at the top level of this path (no nested `.squad/` subfolder). ALL state paths resolve from this external root.

packages/squad-sdk/templates/squad.agent.md.template:35

  • The projectKey sanitization/validation rules here differ from the SDK’s resolveExternalStateDir() (SDK trims leading/trailing dashes after replacement and doesn’t reject keys just for starting with .). If this prompt diverges, the coordinator may resolve a different external directory than the CLI/SDK and miss team.md. Please align with packages/squad-sdk/src/resolution.ts.
2. If it exists and contains `"stateLocation": "external"`:
   a. Read the `projectKey` field from the same config. Sanitize the key: replace path separators and non-`[a-zA-Z0-9._-]` chars with `-`. Reject keys that are empty, start with `.`, or contain `..`.
   b. Resolve the external state directory:
      - **Windows:** `%APPDATA%\squad\projects\{projectKey}\`
      - **macOS:** `~/Library/Application Support/squad/projects/{projectKey}/`
      - **Linux:** `$XDG_CONFIG_HOME/squad/projects/{projectKey}/` (default `~/.config/squad/projects/{projectKey}/`)
   c. Set **team root** = that external directory. In external mode, `team_root` points directly to the flat external state directory — files like `team.md`, `routing.md`, and `agents/` live at the top level of this path (no nested `.squad/` subfolder). ALL state paths resolve from this external root.

.squad-templates/external-state.md:13

  • The sanitization rules for projectKey here appear to diverge from the SDK’s resolveExternalStateDir() (which trims leading/trailing dashes and doesn’t reject keys solely for starting with .). If the coordinator follows these instructions, it may compute a different external folder name than the CLI/SDK created. Consider matching the exact SDK rules to avoid path mismatches.
2. If it exists and contains `"stateLocation": "external"`:
   a. Read the `projectKey` field from the same config. Sanitize the key: replace path separators and non-`[a-zA-Z0-9._-]` chars with `-`. Reject keys that are empty, start with `.`, or contain `..`.
   b. Resolve the external state directory:
      - **Windows:** `%APPDATA%\squad\projects\{projectKey}\`
      - **macOS:** `~/Library/Application Support/squad/projects/{projectKey}/`
      - **Linux:** `$XDG_CONFIG_HOME/squad/projects/{projectKey}/` (default `~/.config/squad/projects/{projectKey}/`)

.github/agents/external-state.md:13

  • The sanitization/validation described for projectKey here doesn’t exactly match the SDK’s resolveExternalStateDir() (SDK trims leading/trailing dashes after replacement and doesn’t reject keys just because they start with .). If these instructions are followed literally, the resolved directory name can differ from what the CLI/SDK used. Aligning this doc with the SDK implementation would avoid external-state lookup failures.
2. If it exists and contains `"stateLocation": "external"`:
   a. Read the `projectKey` field from the same config. Sanitize the key: replace path separators and non-`[a-zA-Z0-9._-]` chars with `-`. Reject keys that are empty, start with `.`, or contain `..`.
   b. Resolve the external state directory:
      - **Windows:** `%APPDATA%\squad\projects\{projectKey}\`
      - **macOS:** `~/Library/Application Support/squad/projects/{projectKey}/`
      - **Linux:** `$XDG_CONFIG_HOME/squad/projects/{projectKey}/` (default `~/.config/squad/projects/{projectKey}/`)
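
The divergence these comments describe can be made concrete with two small functions. Both are sketches: `promptRule` follows the wording of step a (applying the reject checks after replacement, which is one possible reading), and `sdkLikeRule` is based only on the review's description of `resolveExternalStateDir()`, not on the source of packages/squad-sdk/src/resolution.ts.

```typescript
// Shared first step: replace anything outside [a-zA-Z0-9._-] with "-".
function replaceUnsafe(key: string): string {
  return key.replace(/[^a-zA-Z0-9._-]/g, "-");
}

// Rules as worded in the prompt: reject empty keys, keys starting
// with ".", and keys containing "..". Returns null when rejected.
function promptRule(key: string): string | null {
  const cleaned = replaceUnsafe(key);
  if (cleaned === "" || cleaned.charAt(0) === "." || cleaned.indexOf("..") !== -1) {
    return null;
  }
  return cleaned;
}

// Behaviour attributed to the SDK in the review: trim leading and trailing
// dashes after replacement, and do not reject a leading ".".
// Any other validation the SDK performs is unknown and omitted here.
function sdkLikeRule(key: string): string {
  return replaceUnsafe(key).replace(/^-+|-+$/g, "");
}
```

For the key `my/project/`, `promptRule` returns `my-project-` while `sdkLikeRule` returns `my-project`; for `.hidden`, `promptRule` rejects the key while `sdkLikeRule` keeps it. Either mismatch would send the coordinator to a different folder than the one the CLI created.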

allai-dev added 4 commits May 16, 2026 23:29
- Pointer now references .squad/templates/external-state.md (matches existing
  on-demand reference convention used by ceremony-reference, casting-reference, etc.)
- Delete .github/agents/external-state.md (was violating template-sync negative-guard
  test that asserts .github/agents/ contains only squad.agent.md)
- Add .squad/templates/external-state.md so the squad repo's own dogfooded coordinator
  can resolve the pointer at runtime
- Run scripts/sync-templates.mjs --sync to propagate external-state.md and the updated
  squad.agent.md.template to all 3 mirror dirs (templates/, packages/squad-cli/templates/,
  packages/squad-sdk/templates/)
- Fix changeset frontmatter: bump @bradygaster/squad-cli and @bradygaster/squad-sdk patch

All 158 template-sync.test.ts tests pass locally.

…ad.agent.md

The previous commit ran sync-templates.mjs which clobbered the
stamped version on the active .github/agents/squad.agent.md file
with the unstamped 0.0.0-source canonical value.

Only the .github/agents/ copy carries the published semver (matches
origin/dev baseline). The mirror copies under templates/, packages/*/
templates/ correctly stay at 0.0.0-source (canonical source).

… for local mode)

The earlier {team_root}/team.md change broke local mode:
- Local: team_root = repo root (per Worktree Awareness), so
  {team_root}/team.md resolved to <repo>/team.md instead of
  the correct <repo>/.squad/team.md.
- External: team_root = external dir (flat layout), so the
  same expression accidentally happened to work.

Reverting to the baseline .squad/team.md keeps local mode
identical to dev. External mode handles the path remap via
the external-state.md instructions ('In external mode, .squad/*
paths resolve from this external root') already linked from
the on-demand reference pointer above.

Also re-stamps .github/agents/squad.agent.md to 0.9.1 since
sync-templates.mjs special-cases this path and clobbered the
stamped version.

…t.md

Reverts the extract-to-separate-file refactor from earlier commits.

Bootstrap problem: squad externalize moves the entire .squad/ directory
to %APPDATA%/squad/projects/{key}/, including .squad/templates/. With
external-state.md sitting in .squad/templates/, the coordinator can't
discover it after externalize without already knowing the remap rule —
chicken-and-egg.

The algorithm must live in the always-accessible coordinator file
(.github/agents/squad.agent.md) which is read directly by Copilot
regardless of externalize state.

This is exactly what the Copilot bot review flagged in its first pass
('keep inline because of template-sync constraints'). User's empirical
discovery: 'squad externalize deletes all files under .squad including
.squad/templates/external-state.md' confirmed the structural issue.

Also re-stamps .github/agents/squad.agent.md to 0.9.1 (sync clobber).

ahhlun (Author) commented May 17, 2026

Structural pivot at 6d832e07 — flagging for visibility

End-to-end testing surfaced that squad externalize moves the entire .squad/ directory (including .squad/templates/) into the external dir. With external-state.md sitting in .squad/templates/, the coordinator couldn't find it post-externalize without already knowing the external-state remap rule — chicken-and-egg.

Reverted the extract-to-separate-file approach. The External State Resolution algorithm is now inlined directly into squad.agent.md (~18 lines), which is the always-accessible coordinator file Copilot loads regardless of externalize state. This is the structural approach Copilot bot's first-round review argued for; the empirical test confirmed they were right.

Net diff vs dev (6 files, +99/-3):

  • .squad-templates/squad.agent.md + 4 synced copies — inline External State Resolution section added
  • .changeset/fix-external-state-resolution.md — patch bump for squad-cli + squad-sdk

All 155 template-sync tests pass locally. Tamir's "extract to another MD file" thread reopened above with details — if there's a different bootstrap path you'd prefer (e.g., dedicated .github/agents/external-state.md with a negative-guard exception), happy to follow that instead.

ahhlun (Author) commented May 21, 2026

Closing this PR since it was fixed with

ahhlun closed this May 21, 2026
Development

Successfully merging this pull request may close these issues.

fix(coordinator): squad.agent.md does not resolve externalized state — always enters Init Mode

4 participants