Skip to content

fix: bridge station file paths to supervisor artifact retrieval (#21)#22

Open
randymorales wants to merge 2 commits into
dmora:mainfrom
randymorales:fix/artifact-file-path-bridge
Open

fix: bridge station file paths to supervisor artifact retrieval (#21)#22
randymorales wants to merge 2 commits into
dmora:mainfrom
randymorales:fix/artifact-file-path-bridge

Conversation

@randymorales

Copy link
Copy Markdown

Summary

Fixes the supervisor error failed to load artifact: file does not exist, raised when the user asks the supervisor about a file a station wrote to disk.

Two parallel "artifact" systems had no link between them:

  • System A — stations save their text result into the ADK artifact service under synthetic names (station-<x>-result), retrievable via load_artifacts.
  • System B — stations also return artifact_path, a real path to a file on disk.

The role-sealed supervisor has no filesystem read tool, so when asked about a file it passes the System B disk path into System A's load_artifacts. No artifact exists under that path → fs.ErrNotExist → the error.

Closes #21 — full root-cause analysis and code references are in the issue.

Changes

Belt-and-suspenders fix covering both in-tree and out-of-tree files:

  • read_file tool (internal/agent/read_file_tool.go) — sandboxed file read for the supervisor, restricted to the session's project roots (workingDir + each station's resolvedCWD). Symlink-safe traversal guards (EvalSymlinks on both target and roots before prefix check), 256KB cap, binary rejection. Reads in-tree files live.
  • Auto-register artifact files (harness_persist.go, agentrun.go) — after a station completes with a non-empty artifact_path, snapshot the file's content into the artifact service keyed by its path, so the supervisor's existing load_artifacts(path) reflex resolves. Covers out-of-tree files (e.g. plan output in ~/.claude/plans/) that read_file's sandbox would block.
  • Prompt (coder.md.tpl) — new <reading_files> block steering the supervisor to use read_file for paths and not pass disk paths to load_artifacts.
  • Tests (read_file_tool_test.go) — sandbox traversal/symlink rejection, multi-root allow, binary/dir/oversize/missing handling.

Division of responsibility

File location Mechanism Freshness
In-tree (repo) read_file(path) live
Out-of-tree (e.g. ~/.claude/plans/) auto-registered artifact via load_artifacts(path) snapshot at station completion

Known trade-off: the auto-registered artifact is a snapshot taken at station completion; if a later station mutates the same file, that artifact goes stale. For in-tree paths the supervisor has both routes and the prompt steers it to read_file (live), so the live read wins in practice.

Validation

Ran the full CI matrix locally (.github/workflows/{ci,lint}.yml):

Step Result
go mod verify
go vet ./...
go build -race ./...
go test -race ./... ✅ all packages ok
go mod tidy + diff ✅ go.mod/go.sum unchanged
gofmt -l .
golangci-lint run ✅ 0 issues
govulncheck ./... ⚠️ pre-existing only

The govulncheck findings (11, via go-git/stdlib through internal/fsext/ls.go) are pre-existing on main and unrelated to this change — no dependencies added (go.mod/go.sum unchanged).

🤖 Generated with Claude Code

…a#21)

The supervisor failed with "failed to load artifact: file does not
exist" when asked about a file a station wrote to disk. Two parallel
"artifact" systems had no link: stations save their text result into the
ADK artifact service under synthetic names (station-<x>-result), but also
return artifact_path pointing at a real file on disk. The role-sealed
supervisor has no filesystem read tool, so it passed the disk path into
load_artifacts, which has no entry for it and returns fs.ErrNotExist.

Fix both sides of the gap:

- read_file tool (Option 2): sandboxed file read for the supervisor,
  restricted to the session's project roots (workingDir + station CWDs),
  with symlink-safe traversal guards, a 256KB cap, and binary rejection.
  Handles in-tree files, read live.

- Auto-register artifact files (Option 1): after a station completes with
  a non-empty artifact_path, snapshot the file content into the artifact
  service keyed by its path, so load_artifacts(path) resolves. Covers
  out-of-tree files (e.g. plan output in ~/.claude/plans) that read_file's
  project sandbox would block.

- Prompt: add a <reading_files> block steering the supervisor to use
  read_file for paths and not pass disk paths to load_artifacts.

Closes dmora#21

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new read_file tool that allows the supervisor to read files directly from disk within allowed project roots, and adds automatic file artifact persistence. The review feedback suggests using filepath.Rel to avoid platform-specific path validation issues and adopting a more robust truncation detection method that does not rely on os.Stat's size reporting.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread internal/agent/read_file_tool.go Outdated
Comment thread internal/agent/read_file_tool.go Outdated
Address Gemini Code Assist review on dmora#22:

- resolveWithinRoots: use filepath.Rel for containment instead of
  strings.HasPrefix(abs, root+sep), which has edge cases at filesystem
  roots ("/" -> "//", "C:\" -> "C:\\").
- readCappedFile: detect truncation by reading maxReadFileBytes+1 and
  comparing the read count, instead of trusting info.Size() (unreliable
  for /proc, /sys, network shares, or TOCTOU between Stat and Open).
- Tests: sibling name-prefix rejection, root-resolves-to-root, and
  exact-cap / one-byte-over truncation boundaries.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@randymorales

Copy link
Copy Markdown
Author

Review feedback addressed — 8de7792

Both Gemini Code Assist findings on internal/agent/read_file_tool.go are fixed. (Codex posted no review — usage limit.)

# Sev Finding Resolution Status
1 HIGH HasPrefix(abs, root+sep) containment has filesystem-root edge cases (///, C:\C:\\) resolveWithinRoots now uses filepath.Rel (rel != ".." && not ".."+sep-prefixed); EvalSymlinks on both sides retained ✅ Fixed
2 MEDIUM Truncation via info.Size() unreliable (/proc, /sys, network shares, TOCTOU) readCappedFile reads maxReadFileBytes+1 and derives truncated := n > cap; os.Stat kept only for IsDir(); binary check on capped data[:n] ✅ Fixed

Tests added/adjusted (read_file_tool_test.go): sibling name-prefix rejection, root-resolves-to-root, exact-cap (not truncated), and one-byte-over-cap (truncated). All 15 subtests pass under -race; gofmt clean; go vet/go build ./... OK.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Supervisor 'failed to load artifact: file does not exist' when asked about a file a station wrote to disk

1 participant