Skip to content

feat(orchestrator): product context injection + publish spec to knowledge-repo PR#19

Merged
lhpaul merged 12 commits into
developfrom
feature/session-12-context-and-publish
May 24, 2026
Merged

feat(orchestrator): product context injection + publish spec to knowledge-repo PR#19
lhpaul merged 12 commits into
developfrom
feature/session-12-context-and-publish

Conversation

@lhpaul
Copy link
Copy Markdown
Owner

@lhpaul lhpaul commented May 24, 2026

Summary

  • Part A — context injection: new fetchProductContext() fetches README.md + AGENT.md/CLAUDE.md from raw.githubusercontent.com and injects a ## Product Context section into the spec-writer prompt (product slug, workflow stages, README excerpt, agent instructions). Truncated to 2 000 chars per file; 404 → skip gracefully; fetch errors non-fatal.
  • Part B — publish spec: new publishSpecToPR() clones/updates the knowledge repo locally under data/knowledge-repos/{slug}/, creates branch helm/spec/{externalId}, copies the spec, commits as helm-bot, force-pushes, and opens a PR via gh. Idempotent: reuses existing open PR for the branch. Stage transition discovery → spec-draft happens after successful publish.
  • DispatchResult.prUrl records the knowledge-repo PR URL.
  • DataPaths.knowledgeReposdata/knowledge-repos/ pre-created by ensureDataDir().
  • RunGit / RunGh injectable for testing without child processes.

Test plan

  • pnpm --filter @helm/storage test --run — 14/14 pass
  • pnpm --filter @helm/orchestrator test --run — 67/67 pass (11 new tests for fetchProductContext, 9 new for publishSpecToPR, 8 new in spec-writer)
  • pnpm --filter @helm/api test --run — 132/132 pass
  • pnpm --filter @helm/api build — tsc clean, bundle OK
  • Smoke test: reset issue_1 to discovery, start server with GITHUB_TOKEN, dispatch, poll until done, verify prUrl in result, verify PR in helm-playground-knowledge, verify re-dispatch reuses PR

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Optional product context fetched from GitHub (README/agent instructions) is injected into spec prompts.
    • Optional publishing of generated specs to knowledge repos with PR creation/reuse; PR URL returned and added to transition notes.
    • Public helpers to fetch product context and publish specs are exported for reuse.
  • Bug Fixes

    • Stricter validation of product and item identifiers to prevent unsafe paths.
    • API dispatch route forwards trimmed GitHub token and data directory into background jobs.
  • Tests

    • Expanded coverage for URL parsing, context fetching, spec publishing, token handling/sanitization, and dispatch options.
  • Chores

    • Data storage now creates and exposes a knowledge-repos directory.

Review Change Stack

…pec to knowledge-repo PR

Part A — context injection:
- New fetchProductContext() fetches README.md + AGENT.md/CLAUDE.md fallback from
  raw.githubusercontent.com using GITHUB_TOKEN; truncates to 2000 chars; 404 → skip
- buildSpecWriterPrompt() injects a ## Product Context section (product slug,
  workflow stages, README, agent instructions) when context is provided
- dispatchStageHandler() fetches context when githubToken present; errors non-fatal

Part B — publish spec to knowledge-repo PR:
- New publishSpecToPR() clones/updates knowledge repo, creates branch
  helm/spec/{externalId}, copies spec, commits, push --force, opens PR via gh
- Idempotent: reuses existing open PR for the branch; never creates duplicates
- Token-embedded HTTPS URL for auth (no interactive git prompts)
- RunGit / RunGh injectable for testing; descriptive error messages per step
- handleSpecWriterResult() accepts optional SpecPublishOptions; publish step
  runs before stage transition (stage change reflects PR state)
- DispatchResult gains optional prUrl field

Storage:
- DataPaths.knowledgeRepos → data/knowledge-repos/ pre-created by ensureDataDir

Tests: 67 orchestrator + 132 API (all passing); tsc --noEmit clean

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@haystack-code-reviewer-pr-hook
Copy link
Copy Markdown
Contributor

Haystack Code Reviewer: Analysis started

PR opened by @lhpaul bb935ac

Haystack is analyzing this PR. We'll let you know when results are ready!

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 24, 2026

Warning

Review limit reached

@lhpaul, we couldn't start this review because you've used your available PR reviews for now.

Your plan includes 5 reviews of capacity. Refill in 25 minutes and 39 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9ae61df1-a7c9-4ac5-b84a-418acd5f61a7

📥 Commits

Reviewing files that changed from the base of the PR and between 791c69d and 0f1e331.

📒 Files selected for processing (6)
  • packages/orchestrator/src/dispatcher.test.ts
  • packages/orchestrator/src/specialists/fetch-product-context.test.ts
  • packages/orchestrator/src/specialists/spec-publisher.test.ts
  • packages/orchestrator/src/specialists/spec-publisher.ts
  • packages/orchestrator/src/specialists/spec-writer.test.ts
  • packages/orchestrator/src/specialists/spec-writer.ts
📝 Walkthrough

Walkthrough

Adds product-context fetching from GitHub, a spec publisher that creates/reuses PRs, storage support for knowledge repos, threads dataRoot and GITHUB_TOKEN through dispatch, and integrates optional context injection and publishing into the spec-writer flow.

Changes

Spec publication with product context integration

Layer / File(s) Summary
Knowledge repo storage directory
packages/storage/src/data-dir.ts, packages/storage/src/data-dir.test.ts
DataPaths adds knowledgeRepos directory path; ensureDataDir() populates and creates the data/knowledge-repos/ subdirectory during initialization.
Product context fetching from GitHub
packages/orchestrator/src/specialists/fetch-product-context.ts, packages/orchestrator/src/specialists/fetch-product-context.test.ts
ProductContext type with optional readme and agentMd fields; parseGitHubRepoUrl() parses owner/repo from GitHub URLs; fetchProductContext() fetches and truncates README and agent instructions (AGENT.md preferred, CLAUDE.md fallback) with injectable FetchFn for testing.
Spec publishing to knowledge repo PR
packages/orchestrator/src/specialists/spec-publisher.ts, packages/orchestrator/src/specialists/spec-publisher.test.ts
publishSpecToPR() validates externalId, clones/syncs knowledge repo using token-embedded HTTPS URL, creates/reset branch helm/spec/{externalId}, writes specs/{externalId}.md, commits with Helm-bot identity, force-pushes, and creates or reuses open PR; supports injectable RunGit/RunGh; errors are token-sanitized and prefixed.
Spec-writer with context and publishing
packages/orchestrator/src/specialists/spec-writer.ts, packages/orchestrator/src/specialists/spec-writer.test.ts
buildSpecWriterPrompt() accepts optional ProductContext and prepends a "## Product Context" block; buildSpecWriterParams() threads context into the prompt; SpecWriterResult gains optional prUrl; handleSpecWriterResult() conditionally calls publishSpecToPR() when publishOpts provided, logs publish failures without transitioning, and includes PR URL in transition note on success.
Dispatcher orchestration of context and publishing
packages/orchestrator/src/dispatcher.ts, packages/orchestrator/src/dispatcher.test.ts
DispatchOptions adds dataRoot, githubToken, fetchFn, runGit, runGh; DispatchResult gains optional prUrl; dispatchStageHandler unconditionally validates path components, optionally fetches product context when githubToken is present (errors logged), and passes publish options into spec-writer when appropriate.
Public API exports
packages/orchestrator/src/index.ts
Re-exports SpecPublishOptions and spec-writer functions; exports fetchProductContext, parseGitHubRepoUrl, ProductContext, FetchFn; exports publishSpecToPR and spec-publisher types PublishSpecOpts, PublishSpecResult, RunGit, RunGh.
Request dispatch credential threading
apps/api/src/routes/dispatch.ts, apps/api/src/routes/dispatch-options.test.ts
runDispatchJob context now includes dataRoot and githubToken; the HTTP handler forwards dataRoot and a trimmed GITHUB_TOKEN (when present) into the background job; tests assert forwarded DispatchOptions contain dataRoot and correct githubToken handling.

Sequence Diagram(s)

sequenceDiagram
  participant HTTPHandler as HTTP POST /dispatch
  participant DispatchHandler as dispatchStageHandler
  participant Fetcher as fetchProductContext
  participant PromptBuilder as buildSpecWriterParams
  participant Writer as handleSpecWriterResult
  participant Publisher as publishSpecToPR

  HTTPHandler->>DispatchHandler: context { workdir, dataRoot, specialistId, githubToken }
  alt spec-writer specialist
    DispatchHandler->>Fetcher: fetch product context (product, githubToken)
    alt githubToken provided
      Fetcher-->>DispatchHandler: ProductContext { readme?, agentMd? }
    else
      Fetcher-->>DispatchHandler: undefined
    end
    DispatchHandler->>PromptBuilder: build params with context
    PromptBuilder-->>DispatchHandler: SpawnParams (prompt includes Product Context?)
    DispatchHandler->>Writer: run spec-writer (publishOpts?)
    alt publishOpts present
      Writer->>Publisher: publish spec (specPath, product, externalId, token, dataRoot)
      Publisher-->>Writer: prUrl
      Writer-->>DispatchHandler: { transitioned: true, prUrl }
    else
      Writer-->>DispatchHandler: { transitioned: true }
    end
    DispatchHandler-->>HTTPHandler: DispatchResult { prUrl? }
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • lhpaul/helm#18: Modifies the dispatch background execution path in apps/api/src/routes/dispatch.ts; related to threading job-store/dispatch context.
  • lhpaul/helm#14: Touched the dispatch/spec-writer flow; related to extending how dispatchStageHandler is parameterized and used.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly describes the main changes: adding product context injection and spec publishing to knowledge-repo PRs, which are the two primary features introduced across all files in this changeset.
Docstring Coverage ✅ Passed Docstring coverage is 93.75% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/session-12-context-and-publish

Comment @coderabbitai help to get the list of available commands and usage tips.

@haystack-code-reviewer-pr-hook
Copy link
Copy Markdown
Contributor

Haystack Code Reviewer: PR Analysis Ready!

Haystack Code Reviewer has analyzed your PR and organized it into a format that's easy to parse.

View PR

This link provides a "walkthrough" and explanation of your PR.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/orchestrator/src/specialists/spec-writer.ts (1)

180-183: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sanitize client-facing errors in failure paths.

These branches currently return internal details (absolute filesystem paths and raw publish errors). Keep detailed info in console.error, but return generic messages to callers.

Suggested fix
   } catch {
     return {
       transitioned: false,
-      error: `spec file not found at ${specPath} — agent did not create it`,
+      error: 'Spec file was not created by the agent',
     };
   }
@@
     } catch (err) {
       const message = err instanceof Error ? err.message : String(err);
       console.error('[spec-writer] Publish step failed', { externalId, error: message });
       return {
         transitioned: false,
-        error: message,
+        error: 'Failed to publish spec to knowledge repository',
       };
     }
   }

Also applies to: 200-205

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/orchestrator/src/specialists/spec-writer.ts` around lines 180 - 183,
Replace the client-facing error strings that expose internal details (e.g., the
branch that returns `error: \`spec file not found at ${specPath} — agent did not
create it\`` and the branch around lines 200-205 that returns raw publish
errors) with generic messages (e.g., "spec file could not be created" or "failed
to publish spec"); keep the full error objects/paths logged via console.error
for debugging. Locate these returns in spec-writer.ts (look for the `specPath`
variable and the publish error return) and change only the returned `error` text
to a sanitized, non-path, non-raw-error string while leaving existing
console.error calls (or add one if missing) to record the detailed internal
error.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/orchestrator/src/dispatcher.ts`:
- Around line 130-135: The publish path construction uses item.productSlug to
build knowledgeRepoLocalPath without ensuring the same whitelist/sanitization
applied elsewhere (options.workdir guard), creating a path-safety gap; update
the logic around publishOpts (where knowledgeRepoLocalPath is composed) to
validate/sanitize item.productSlug the same way enforced when options.workdir is
absent (reuse the existing whitelist check or sanitizer function) before joining
it into a filesystem path, and ensure any bypass via options.workdir still
triggers validation of item.productSlug so malicious slugs cannot escape the
intended dataRoot.

In `@packages/orchestrator/src/specialists/fetch-product-context.ts`:
- Around line 26-29: The parseGitHubRepoUrl function currently matches any
string containing "github.com/..." instead of validating it is a proper GitHub
URL; update parseGitHubRepoUrl to construct a URL via new URL(url) (wrapped in
try/catch to handle non-URLs), check urlObj.hostname is exactly "github.com" or
"www.github.com", then parse urlObj.pathname (split on "/") to extract owner and
repo, strip a trailing ".git" from repo if present, and return null for invalid
hostnames or malformed paths; ensure the function still returns the same {
owner, repo } | null shape and handle exceptions by returning null.

In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 123-124: Validate and sanitize externalId before using it to
construct branch names and file paths: reject or normalize any values containing
path traversal or unsafe characters (e.g., "../", "/", null bytes) and allow
only a safe pattern such as alphanumerics, dashes and underscores; if invalid,
throw an error or return a handled failure. Update the code that builds const
branchName = `helm/spec/${externalId}` and any code that writes
`${externalId}.md` so it either uses a sanitized/validated externalId (or
path.basename/strict whitelist) and ensure generated Git branch names are safe;
add unit checks where externalId is accepted. Ensure the same
validation/sanitization is applied to the other occurrences noted in this file.
- Around line 47-58: Replace hardcoded binary paths in defaultRunGit and
defaultRunGh to invoke "git" and "gh" via PATH (remove '/usr/bin/git' and
'/opt/homebrew/bin/gh'), and keep GIT_TERMINAL_PROMPT=0 behavior by preserving
env merging in those functions; stop embedding githubToken directly into HTTPS
URLs—use a credential mechanism instead (provide the token via env/credential
helper or gh auth) so commands use credentials from the environment/credential
store rather than appearing in command args; and sanitize/validate externalId
before using it in refs or filenames (reject or normalize path separators and
disallowed characters, then canonicalize to a safe ref name like replacing
unsafe chars) so usages of externalId (e.g., in git ref helm/spec/${externalId}
and file paths join(destDir, `${externalId}.md`)) cannot cause path traversal or
unsafe ref names.

---

Outside diff comments:
In `@packages/orchestrator/src/specialists/spec-writer.ts`:
- Around line 180-183: Replace the client-facing error strings that expose
internal details (e.g., the branch that returns `error: \`spec file not found at
${specPath} — agent did not create it\`` and the branch around lines 200-205
that returns raw publish errors) with generic messages (e.g., "spec file could
not be created" or "failed to publish spec"); keep the full error objects/paths
logged via console.error for debugging. Locate these returns in spec-writer.ts
(look for the `specPath` variable and the publish error return) and change only
the returned `error` text to a sanitized, non-path, non-raw-error string while
leaving existing console.error calls (or add one if missing) to record the
detailed internal error.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9bddb1d1-4a45-41cb-bc82-f12f6f72576c

📥 Commits

Reviewing files that changed from the base of the PR and between 9242eb8 and bb935ac.

📒 Files selected for processing (11)
  • apps/api/src/routes/dispatch.ts
  • packages/orchestrator/src/dispatcher.ts
  • packages/orchestrator/src/index.ts
  • packages/orchestrator/src/specialists/fetch-product-context.test.ts
  • packages/orchestrator/src/specialists/fetch-product-context.ts
  • packages/orchestrator/src/specialists/spec-publisher.test.ts
  • packages/orchestrator/src/specialists/spec-publisher.ts
  • packages/orchestrator/src/specialists/spec-writer.test.ts
  • packages/orchestrator/src/specialists/spec-writer.ts
  • packages/storage/src/data-dir.test.ts
  • packages/storage/src/data-dir.ts

Comment thread packages/orchestrator/src/dispatcher.ts Outdated
Comment thread packages/orchestrator/src/specialists/fetch-product-context.ts
Comment thread packages/orchestrator/src/specialists/spec-publisher.ts
Comment thread packages/orchestrator/src/specialists/spec-publisher.ts Outdated
Comment thread packages/orchestrator/src/specialists/spec-publisher.ts
lhpaul and others added 2 commits May 24, 2026 11:50
CR-1 (major): make isSafePathPart guard unconditional — was gated on
!options?.workdir, missing the publish path that also uses productSlug

CR-2 (major): parseGitHubRepoUrl now uses new URL() + hostname === 'github.com'
check; the regex previously matched any string containing 'github.com/...'

CR-3/CR-4 (critical): stop embedding githubToken in HTTPS remote URLs — token
appeared in .git/config and git error messages. Switch to GIT_HTTP_EXTRAHEADER
via GIT_CONFIG_COUNT/KEY/VALUE env; also fix defaultRunGh to resolve 'gh' from
PATH instead of hardcoded /opt/homebrew/bin/gh

CR-5/H-1 (critical): validate externalId at top of publishSpecToPR with
/^[A-Za-z0-9._-]+$/ before any branch/path construction

H-3: add 4 dispatcher tests — fetchFn called when token present, skipped when
absent, dispatch continues on context-fetch failure (fallback), prUrl absent
when githubToken missing

72/72 orchestrator tests, 132/132 API tests, tsc + bun build clean

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
H-1/H-2 (block-dot-segment-path-inputs): both isSafePathPart in dispatcher
and EXTERNAL_ID_SAFE in publishSpecToPR used /^[A-Za-z0-9._-]+$/ which
allowed '.' and '..'; add (?!\.) lookahead to block dot-segment values

H-5: defaultRunGit now calls 'git' from PATH instead of /usr/bin/git for
portability (NixOS, containers, non-standard installations)

H-4 (test coverage): new dispatch-options.test.ts verifies GITHUB_TOKEN
trimming and dataRoot forwarding to dispatchStageHandler via a focused
vi.mock('@helm/orchestrator') that captures options without running the
full spec-writer flow

Skipped H-3 (error-detail sanitization): internal tool — operators need
full diagnostic detail from git/gh failures

72/72 orchestrator tests, 136/136 API tests, tsc + bun build clean

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
packages/orchestrator/src/specialists/spec-publisher.ts (1)

119-124: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Tighten externalId validation to Git branch rules.

This regex still accepts values like foo..bar, foo., and foo.lock, but those are not valid Git branch/ref names, so the input passes here and then fails later when Line 163 builds helm/spec/${externalId}. (git-scm.com)

Suggested direction
-  const EXTERNAL_ID_SAFE = /^(?!\.)[A-Za-z0-9._-]+$/;
+  const EXTERNAL_ID_SAFE = /^(?!\.)(?!.*\.\.)(?!.*\.lock$)(?!.*\.$)[A-Za-z0-9._-]+$/;
#!/bin/bash
set -euo pipefail

python - <<'PY'
import re
rx = re.compile(r'^(?!\.)[A-Za-z0-9._-]+$')
for value in ["HLM-99", "foo..bar", "foo.", "foo.lock"]:
    print(f"{value}: regex_accepts={bool(rx.fullmatch(value))}")
PY

for value in "HLM-99" "foo..bar" "foo." "foo.lock"; do
  if git check-ref-format --branch "helm/spec/$value" >/dev/null 2>&1; then
    echo "git_accepts=$value"
  else
    echo "git_rejects=$value"
  fi
done
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/orchestrator/src/specialists/spec-publisher.ts` around lines 119 -
124, The current EXTERNAL_ID_SAFE (/^(?!\.)[A-Za-z0-9._-]+$/) in
spec-publisher.ts still allows invalid Git ref names like "foo..bar", "foo.",
and "foo.lock"; update the validation for externalId used in the function that
throws `[spec-publisher] Invalid externalId: "${externalId}"` so it enforces Git
branch/ref rules: disallow consecutive dots, disallow a trailing dot, disallow
names ending with ".lock", and reject any leading dot; you can either replace
the single regex EXTERNAL_ID_SAFE with a stricter pattern that enforces those
constraints (no "\.\.", no trailing ".", no /\.lock$/ and still block leading
".") or call out to git check-ref-format --branch to validate
"helm/spec/${externalId}" and throw the same error if git rejects it; ensure the
change references EXTERNAL_ID_SAFE and externalId so callers still hit the same
error path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 80-96: The current ensureKnowledgeRepo flow mutates the shared
checkout (knowledgeRepoLocalPath) via runGit operations (fetch/checkout/reset)
leading to race conditions when concurrent publishes run; fix by making
ensureKnowledgeRepo use a per-call isolated worktree/temporary clone or acquire
a mutex keyed by knowledgeRepoLocalPath before mutating the repo, perform all
runGit actions (clone/fetch/checkout/reset/copy/commit/push) while holding the
lock, and always release the lock in a finally block; reference
ensureKnowledgeRepo, knowledgeRepoLocalPath, and runGit when implementing the
per-call temp clone or lock to prevent interleaved commits/force-pushes (apply
same change to the related code region around the second occurrence).

---

Duplicate comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 119-124: The current EXTERNAL_ID_SAFE (/^(?!\.)[A-Za-z0-9._-]+$/)
in spec-publisher.ts still allows invalid Git ref names like "foo..bar", "foo.",
and "foo.lock"; update the validation for externalId used in the function that
throws `[spec-publisher] Invalid externalId: "${externalId}"` so it enforces Git
branch/ref rules: disallow consecutive dots, disallow a trailing dot, disallow
names ending with ".lock", and reject any leading dot; you can either replace
the single regex EXTERNAL_ID_SAFE with a stricter pattern that enforces those
constraints (no "\.\.", no trailing ".", no /\.lock$/ and still block leading
".") or call out to git check-ref-format --branch to validate
"helm/spec/${externalId}" and throw the same error if git rejects it; ensure the
change references EXTERNAL_ID_SAFE and externalId so callers still hit the same
error path.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 62df23ef-70d7-435d-8fdf-f78fa307d1b0

📥 Commits

Reviewing files that changed from the base of the PR and between bb935ac and 70838c9.

📒 Files selected for processing (6)
  • apps/api/src/routes/dispatch-options.test.ts
  • packages/orchestrator/src/dispatcher.test.ts
  • packages/orchestrator/src/dispatcher.ts
  • packages/orchestrator/src/specialists/fetch-product-context.ts
  • packages/orchestrator/src/specialists/spec-publisher.test.ts
  • packages/orchestrator/src/specialists/spec-publisher.ts

Comment thread packages/orchestrator/src/specialists/spec-publisher.ts Outdated
lhpaul and others added 3 commits May 24, 2026 13:53
H-new-1 (SSH URLs): parseGitHubRepoUrl now handles git@github.com:owner/repo.git
before calling new URL() — previously threw and silently dropped context;
3 new tests cover SSH with/without .git, non-GitHub SSH host

H-new-2 (concurrent publish isolation): knowledgeRepoLocalPath in dispatcher now
includes item.externalId as a subdirectory so concurrent dispatches for different
items of the same product each get their own git clone, preventing working-tree
clobber; on re-dispatch the same path is reused (fetch+reset, not re-clone)

ENOTEMPTY race fix in dispatch-options.test.ts: waitForJobDone() now polls job
status instead of only checking mockDispatch.toHaveBeenCalled() — ensures the
jobStore.updateJob write completes before afterEach rm() runs

Skipped H-new-3 (sanitize-and-map-http-errors): 3rd occurrence, same reasoning —
internal tool where operators need full diagnostic detail from git/gh failures

75/75 orchestrator tests, 136/136 API tests, build clean

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rsal tests

Fixes a bug where parseGitHubRepoUrl rejected SSH URLs for repos whose
names contain dots (e.g. git@github.com:org/my.repo.git) — the [^/.]+?
character class was incorrectly excluding dots from the repo-name segment.
Changed to [^/]+? so dotted names parse correctly while still stripping
the optional .git suffix via the trailing (?:\.git)? group.

Added tests: SSH repo with dots (.git), SSH repo with dots (no suffix),
HTTPS repo with dots, dispatcher dot-prefix productSlug → error, dispatcher
'..' externalId → error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SSH knowledge repo URLs are unsupported for token-based auth because
GIT_HTTP_EXTRAHEADER only applies to HTTPS transport. publishSpecToPR now
fails early with a clear message directing operators to use an HTTPS URL
instead of silently attempting the operation and producing a confusing
auth failure.

Also threads runGit/runGh from DispatchOptions through to the publish step
so integration tests can verify the full token+dataRoot→prUrl happy path
without spawning real git/gh processes. Added test cases for both.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 136-145: The guard that rejects SSH knowledge repo URLs only
checks for scp-style "git@" strings and misses "ssh://" URLs; update the check
around knowledgeRepo.url in spec-publisher.ts (the existing if that throws the
`[spec-publisher] SSH knowledge repo URLs are not supported...` error) to also
detect URLs that start with "ssh://” (e.g., by expanding the condition to
include startsWith('ssh://') or a regex like /^git@|^ssh:\/\//) so both
scp-style and ssh://-prefixed SSH URLs are rejected with the same error message.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b7c376c9-0efc-4f58-ae51-126c21abb386

📥 Commits

Reviewing files that changed from the base of the PR and between 70838c9 and 2ed6798.

📒 Files selected for processing (7)
  • apps/api/src/routes/dispatch-options.test.ts
  • packages/orchestrator/src/dispatcher.test.ts
  • packages/orchestrator/src/dispatcher.ts
  • packages/orchestrator/src/specialists/fetch-product-context.test.ts
  • packages/orchestrator/src/specialists/fetch-product-context.ts
  • packages/orchestrator/src/specialists/spec-publisher.test.ts
  • packages/orchestrator/src/specialists/spec-publisher.ts

Comment thread packages/orchestrator/src/specialists/spec-publisher.ts Outdated
lhpaul and others added 2 commits May 24, 2026 14:48
The existing git@ check missed URLs using the ssh:// URI scheme
(e.g. ssh://git@github.com/org/repo.git), which would bypass validation
and fail later with a confusing auth error. Extended the guard to reject
both git@ and ssh:// prefixes, and added a matching test case.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ent cross-item races

publishSpecToPR now clones the knowledge repo to a fresh temporary directory
(tmpdir/helm-publish-{externalId}-{uuid}/) on every call instead of mutating
a shared checkout. The temp dir is always removed in a finally block, even on
failure. Two concurrent publishes for any combination of items can never
interleave their git operations or corrupt each other's branches.

Breaking: knowledgeRepoLocalPath removed from PublishSpecOpts and
SpecPublishOptions — the publish function manages its own working directory.
DispatchOptions.dataRoot is retained for future specialists but is no longer
needed by the spec-writer publish step (githubToken alone enables it).

Also removes the ensureKnowledgeRepo abstraction (always-fresh clone
replaces the clone-or-update logic) and updates all affected tests,
including a new concurrent-publish test and a cleanup-on-failure test.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 261-264: The final cleanup call await rm(workDir, { recursive:
true, force: true }) in the finally block of spec-publisher.ts can throw and
mask the original error; wrap that rm call in a try/catch inside the same
finally block, catch any error, and log a non-fatal warning (using the module's
logger or console.warn) for errors other than expected "not found" cases, then
swallow the cleanup error so it doesn't overwrite the original exception from
the try block.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 05147e2b-7ff5-4eac-b18f-65f63702bc01

📥 Commits

Reviewing files that changed from the base of the PR and between 2ed6798 and 791c69d.

📒 Files selected for processing (6)
  • packages/orchestrator/src/dispatcher.test.ts
  • packages/orchestrator/src/dispatcher.ts
  • packages/orchestrator/src/specialists/spec-publisher.test.ts
  • packages/orchestrator/src/specialists/spec-publisher.ts
  • packages/orchestrator/src/specialists/spec-writer.test.ts
  • packages/orchestrator/src/specialists/spec-writer.ts

Comment thread packages/orchestrator/src/specialists/spec-publisher.ts
lhpaul and others added 2 commits May 24, 2026 16:03
…inal error

rm() in the finally block could throw (e.g. if the OS temp dir was already
purged), which would replace the actual publish error with an ENOENT and make
debugging harder. Wrapping with .catch(() => {}) ensures the original error
always propagates.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…token from errors

Root cause: GIT_CONFIG_COUNT/KEY/VALUE env vars for http.extraheader require
Git 2.31+ and were silently ignored on older versions, causing 'invalid
credentials' on private knowledge repos.

Fix: embed the token as x-access-token credentials in the clone URL
(https://x-access-token:{token}@github.com/{owner}/{repo}). The 'origin'
remote inside the temp clone inherits this URL, so git push origin is
authenticated without additional configuration.

Security: the authenticated URL is never logged. All error paths now run
through sanitizeToken() which replaces both the x-access-token:{token}@
pattern and any bare occurrence of the token with *** before propagating
or logging. The token lives on-disk only in the temp clone's .git/config
for the duration of the publish operation and is removed with the workdir
in the finally block.

Also folds two manual edits to spec-writer.ts:
- Prompt: replaced the literal '{workdir}' placeholder with a clearer
  instruction to write relative to cwd (avoids confusing the agent with
  an unresolved template variable)
- Artifact-missing error: includes agentResult.finalOutput in both the
  returned error string and a new console.error so operators can see WHY
  the agent didn't write the spec

New tests: authenticated clone URL format, token sanitization in errors.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
packages/orchestrator/src/specialists/spec-publisher.ts (1)

64-81: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Remove the PAT from clone URLs.

Embedding githubToken in https://x-access-token:...@ leaks repo credentials via child-process argv and persists them in remote.origin.url. If the process crashes or the temp dir is inspected before cleanup, the token is still exposed. Use a non-URL git auth path and keep the remote URL clean.

#!/bin/bash
set -euo pipefail

file="packages/orchestrator/src/specialists/spec-publisher.ts"

echo "== Current implementation =="
nl -ba "$file" | sed -n '64,81p;148,175p;223,227p'

echo
echo "== Demonstrate that authenticated remote URLs are persisted in git config =="
tmp="$(mktemp -d)"
trap 'rm -rf "$tmp"' EXIT
git init "$tmp/repo" >/dev/null
git -C "$tmp/repo" remote add origin "https://x-access-token:demo-token@github.com/test-org/knowledge"
git -C "$tmp/repo" config --get remote.origin.url
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/orchestrator/src/specialists/spec-publisher.ts` around lines 64 -
81, The current buildAuthenticatedUrl embeds the PAT in the clone URL
(buildAuthenticatedUrl), which leaks credentials; change buildAuthenticatedUrl
to return the clean HTTPS repo URL (https://github.com/{owner}/{repo}) and stop
writing token into remote.origin.url, and update the git clone flow that calls
buildAuthenticatedUrl to instead pass credentials via a git config
http.extraHeader (e.g. run git with -c http.extraHeader="Authorization: token
${githubToken}" or set GIT_HTTP_EXTRA_HEADER for the subprocess) so the token is
used for authentication but not recorded in the repo config or child argv;
ensure any logging uses the plain knowledgeRepo.url and sanitizeToken for raw
errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 91-92: The sanitizeToken function currently uses String.replace
which only replaces the first match and can leak tokens when they appear
multiple times; update sanitizeToken (the function named sanitizeToken in
spec-publisher.ts) to use replaceAll (or a global RegExp) for both the
`x-access-token:${token}@` pattern and the bare token replacement so every
occurrence is redacted (e.g., replace(...).replace(...) ->
replaceAll(...).replaceAll(...) or use /.../g regexes referencing the token
safely).

In `@packages/orchestrator/src/specialists/spec-writer.ts`:
- Around line 182-193: The return payload in spec-writer.ts currently includes
raw agent output (agentResult.finalOutput) which must not be sent to callers;
keep the detailed console.error log as-is for server logs but change the
returned object (the block that currently references externalId, specPath and
agentResult.finalOutput) to return a bounded generic error string that omits
agentResult.finalOutput (e.g. "spec file not found at {specPath} — agent did not
create it.") while still returning transitioned: false and any safe identifiers
like specPath or externalId if needed.

---

Duplicate comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 64-81: The current buildAuthenticatedUrl embeds the PAT in the
clone URL (buildAuthenticatedUrl), which leaks credentials; change
buildAuthenticatedUrl to return the clean HTTPS repo URL
(https://github.com/{owner}/{repo}) and stop writing token into
remote.origin.url, and update the git clone flow that calls
buildAuthenticatedUrl to instead pass credentials via a git config
http.extraHeader (e.g. run git with -c http.extraHeader="Authorization: token
${githubToken}" or set GIT_HTTP_EXTRA_HEADER for the subprocess) so the token is
used for authentication but not recorded in the repo config or child argv;
ensure any logging uses the plain knowledgeRepo.url and sanitizeToken for raw
errors.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7377069f-2931-4320-b0e1-93a05aa5fa2d

📥 Commits

Reviewing files that changed from the base of the PR and between 791c69d and 678d6a2.

📒 Files selected for processing (3)
  • packages/orchestrator/src/specialists/spec-publisher.test.ts
  • packages/orchestrator/src/specialists/spec-publisher.ts
  • packages/orchestrator/src/specialists/spec-writer.ts

Comment thread packages/orchestrator/src/specialists/spec-publisher.ts Outdated
Comment thread packages/orchestrator/src/specialists/spec-writer.ts
…put from error payload

sanitizeToken used .replace() (replaces first occurrence only) — a verbose
git error echoing the authenticated URL twice or including both the URL
pattern and the bare token would leak the token in remaining occurrences.
Changed both substitutions to .replaceAll() so every occurrence is redacted.

handleSpecWriterResult was including agentResult.finalOutput in the error
string returned to callers. With product context injected into the prompt,
finalOutput can contain README/AGENT.md fragments. Removed finalOutput from
the returned error string; the console.error server-side diagnostic retains
the full output for operator debugging.

Also fixes a pre-existing noUncheckedIndexedAccess TS error in
fetch-product-context.test.ts (calls[0] → calls[0]!).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@lhpaul
Copy link
Copy Markdown
Owner Author

lhpaul commented May 24, 2026

@coderabbitai resume

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 24, 2026

✅ Actions performed

Reviews resumed.

- Remove contradictory "Confirm once the file is written" step from the
  spec-writer prompt; the IMPORTANT block already says to write directly
  without asking for confirmation — having both caused agent ambiguity.

- Guard buildContextSection call: only render the "## Product Context"
  section when context actually contains readme or agentMd; an empty
  object no longer emits a scaffolded section that wastes context tokens.

- Add path-validation tests without a custom workdir to confirm the
  isSafePathPart guard fires before the default workdir (which embeds
  productSlug/externalId in a filesystem path) is computed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@lhpaul lhpaul merged commit 1e7e83d into develop May 24, 2026
1 check passed
@lhpaul lhpaul deleted the feature/session-12-context-and-publish branch May 24, 2026 23:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant