feat(orchestrator): product context injection + publish spec to knowledge-repo PR by lhpaul · Pull Request #19 · lhpaul/helm

lhpaul · 2026-05-24T15:30:51Z

Summary

Part A — context injection: new fetchProductContext() fetches README.md + AGENT.md/CLAUDE.md from raw.githubusercontent.com and injects a ## Product Context section into the spec-writer prompt (product slug, workflow stages, README excerpt, agent instructions). Truncated to 2 000 chars per file; 404 → skip gracefully; fetch errors non-fatal.
Part B — publish spec: new publishSpecToPR() clones/updates the knowledge repo locally under data/knowledge-repos/{slug}/, creates branch helm/spec/{externalId}, copies the spec, commits as helm-bot, force-pushes, and opens a PR via gh. Idempotent: reuses existing open PR for the branch. Stage transition discovery → spec-draft happens after successful publish.
DispatchResult.prUrl records the knowledge-repo PR URL.
DataPaths.knowledgeRepos → data/knowledge-repos/ pre-created by ensureDataDir().
RunGit / RunGh injectable for testing without child processes.

Test plan

pnpm --filter @helm/storage test --run — 14/14 pass
pnpm --filter @helm/orchestrator test --run — 67/67 pass (11 new tests for fetchProductContext, 9 new for publishSpecToPR, 8 new in spec-writer)
pnpm --filter @helm/api test --run — 132/132 pass
pnpm --filter @helm/api build — tsc clean, bundle OK
Smoke test: reset issue_1 to discovery, start server with GITHUB_TOKEN, dispatch, poll until done, verify prUrl in result, verify PR in helm-playground-knowledge, verify re-dispatch reuses PR

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Optional product context fetched from GitHub (README/agent instructions) is injected into spec prompts.
- Optional publishing of generated specs to knowledge repos with PR creation/reuse; PR URL returned and added to transition notes.
- Public helpers to fetch product context and publish specs are exported for reuse.
Bug Fixes
- Stricter validation of product and item identifiers to prevent unsafe paths.
- API dispatch route forwards trimmed GitHub token and data directory into background jobs.
Tests
- Expanded coverage for URL parsing, context fetching, spec publishing, token handling/sanitization, and dispatch options.
Chores
- Data storage now creates and exposes a knowledge-repos directory.

…pec to knowledge-repo PR Part A — context injection: - New fetchProductContext() fetches README.md + AGENT.md/CLAUDE.md fallback from raw.githubusercontent.com using GITHUB_TOKEN; truncates to 2000 chars; 404 → skip - buildSpecWriterPrompt() injects a ## Product Context section (product slug, workflow stages, README, agent instructions) when context is provided - dispatchStageHandler() fetches context when githubToken present; errors non-fatal Part B — publish spec to knowledge-repo PR: - New publishSpecToPR() clones/updates knowledge repo, creates branch helm/spec/{externalId}, copies spec, commits, push --force, opens PR via gh - Idempotent: reuses existing open PR for the branch; never creates duplicates - Token-embedded HTTPS URL for auth (no interactive git prompts) - RunGit / RunGh injectable for testing; descriptive error messages per step - handleSpecWriterResult() accepts optional SpecPublishOptions; publish step runs before stage transition (stage change reflects PR state) - DispatchResult gains optional prUrl field Storage: - DataPaths.knowledgeRepos → data/knowledge-repos/ pre-created by ensureDataDir Tests: 67 orchestrator + 132 API (all passing); tsc --noEmit clean Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

haystack-code-reviewer-pr-hook · 2026-05-24T15:30:56Z

Haystack Code Reviewer: Analysis started

PR opened by @lhpaul bb935ac

Haystack is analyzing this PR. We'll let you know when results are ready!

coderabbitai · 2026-05-24T15:31:01Z

Warning

Review limit reached

@lhpaul, we couldn't start this review because you've used your available PR reviews for now.

Your plan includes 5 reviews of capacity. Refill in 25 minutes and 39 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9ae61df1-a7c9-4ac5-b84a-418acd5f61a7

📥 Commits

Reviewing files that changed from the base of the PR and between 791c69d and 0f1e331.

📒 Files selected for processing (6)

packages/orchestrator/src/dispatcher.test.ts
packages/orchestrator/src/specialists/fetch-product-context.test.ts
packages/orchestrator/src/specialists/spec-publisher.test.ts
packages/orchestrator/src/specialists/spec-publisher.ts
packages/orchestrator/src/specialists/spec-writer.test.ts
packages/orchestrator/src/specialists/spec-writer.ts

📝 Walkthrough

Walkthrough

Adds product-context fetching from GitHub, a spec publisher that creates/reuses PRs, storage support for knowledge repos, threads dataRoot and GITHUB_TOKEN through dispatch, and integrates optional context injection and publishing into the spec-writer flow.

Changes

Spec publication with product context integration

Layer / File(s)	Summary
Knowledge repo storage directory `packages/storage/src/data-dir.ts`, `packages/storage/src/data-dir.test.ts`	`DataPaths` adds `knowledgeRepos` directory path; `ensureDataDir()` populates and creates the `data/knowledge-repos/` subdirectory during initialization.
Product context fetching from GitHub `packages/orchestrator/src/specialists/fetch-product-context.ts`, `packages/orchestrator/src/specialists/fetch-product-context.test.ts`	`ProductContext` type with optional `readme` and `agentMd` fields; `parseGitHubRepoUrl()` parses owner/repo from GitHub URLs; `fetchProductContext()` fetches and truncates README and agent instructions (AGENT.md preferred, CLAUDE.md fallback) with injectable `FetchFn` for testing.
Spec publishing to knowledge repo PR `packages/orchestrator/src/specialists/spec-publisher.ts`, `packages/orchestrator/src/specialists/spec-publisher.test.ts`	`publishSpecToPR()` validates `externalId`, clones/syncs knowledge repo using token-embedded HTTPS URL, creates/reset branch `helm/spec/{externalId}`, writes `specs/{externalId}.md`, commits with Helm-bot identity, force-pushes, and creates or reuses open PR; supports injectable `RunGit`/`RunGh`; errors are token-sanitized and prefixed.
Spec-writer with context and publishing `packages/orchestrator/src/specialists/spec-writer.ts`, `packages/orchestrator/src/specialists/spec-writer.test.ts`	`buildSpecWriterPrompt()` accepts optional `ProductContext` and prepends a "## Product Context" block; `buildSpecWriterParams()` threads context into the prompt; `SpecWriterResult` gains optional `prUrl`; `handleSpecWriterResult()` conditionally calls `publishSpecToPR()` when `publishOpts` provided, logs publish failures without transitioning, and includes PR URL in transition note on success.
Dispatcher orchestration of context and publishing `packages/orchestrator/src/dispatcher.ts`, `packages/orchestrator/src/dispatcher.test.ts`	`DispatchOptions` adds `dataRoot`, `githubToken`, `fetchFn`, `runGit`, `runGh`; `DispatchResult` gains optional `prUrl`; `dispatchStageHandler` unconditionally validates path components, optionally fetches product context when `githubToken` is present (errors logged), and passes publish options into spec-writer when appropriate.
Public API exports `packages/orchestrator/src/index.ts`	Re-exports `SpecPublishOptions` and spec-writer functions; exports `fetchProductContext`, `parseGitHubRepoUrl`, `ProductContext`, `FetchFn`; exports `publishSpecToPR` and spec-publisher types `PublishSpecOpts`, `PublishSpecResult`, `RunGit`, `RunGh`.
Request dispatch credential threading `apps/api/src/routes/dispatch.ts`, `apps/api/src/routes/dispatch-options.test.ts`	`runDispatchJob` context now includes `dataRoot` and `githubToken`; the HTTP handler forwards `dataRoot` and a trimmed `GITHUB_TOKEN` (when present) into the background job; tests assert forwarded `DispatchOptions` contain `dataRoot` and correct `githubToken` handling.

Sequence Diagram(s)

sequenceDiagram
  participant HTTPHandler as HTTP POST /dispatch
  participant DispatchHandler as dispatchStageHandler
  participant Fetcher as fetchProductContext
  participant PromptBuilder as buildSpecWriterParams
  participant Writer as handleSpecWriterResult
  participant Publisher as publishSpecToPR

  HTTPHandler->>DispatchHandler: context { workdir, dataRoot, specialistId, githubToken }
  alt spec-writer specialist
    DispatchHandler->>Fetcher: fetch product context (product, githubToken)
    alt githubToken provided
      Fetcher-->>DispatchHandler: ProductContext { readme?, agentMd? }
    else
      Fetcher-->>DispatchHandler: undefined
    end
    DispatchHandler->>PromptBuilder: build params with context
    PromptBuilder-->>DispatchHandler: SpawnParams (prompt includes Product Context?)
    DispatchHandler->>Writer: run spec-writer (publishOpts?)
    alt publishOpts present
      Writer->>Publisher: publish spec (specPath, product, externalId, token, dataRoot)
      Publisher-->>Writer: prUrl
      Writer-->>DispatchHandler: { transitioned: true, prUrl }
    else
      Writer-->>DispatchHandler: { transitioned: true }
    end
    DispatchHandler-->>HTTPHandler: DispatchResult { prUrl? }
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

lhpaul/helm#18: Modifies the dispatch background execution path in apps/api/src/routes/dispatch.ts; related to threading job-store/dispatch context.
lhpaul/helm#14: Touched the dispatch/spec-writer flow; related to extending how dispatchStageHandler is parameterized and used.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly describes the main changes: adding product context injection and spec publishing to knowledge-repo PRs, which are the two primary features introduced across all files in this changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 93.75% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/session-12-context-and-publish

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

haystack-code-reviewer-pr-hook · 2026-05-24T15:35:04Z

Haystack Code Reviewer: PR Analysis Ready!

Haystack Code Reviewer has analyzed your PR and organized it into a format that's easy to parse.

View PR

This link provides a "walkthrough" and explanation of your PR.

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/orchestrator/src/specialists/spec-writer.ts (1)

180-183: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Sanitize client-facing errors in failure paths.

These branches currently return internal details (absolute filesystem paths and raw publish errors). Keep detailed info in console.error, but return generic messages to callers.

Suggested fix

   } catch {
     return {
       transitioned: false,
-      error: `spec file not found at ${specPath} — agent did not create it`,
+      error: 'Spec file was not created by the agent',
     };
   }
@@
     } catch (err) {
       const message = err instanceof Error ? err.message : String(err);
       console.error('[spec-writer] Publish step failed', { externalId, error: message });
       return {
         transitioned: false,
-        error: message,
+        error: 'Failed to publish spec to knowledge repository',
       };
     }
   }

Also applies to: 200-205

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/orchestrator/src/specialists/spec-writer.ts` around lines 180 - 183,
Replace the client-facing error strings that expose internal details (e.g., the
branch that returns `error: \`spec file not found at ${specPath} — agent did not
create it\`` and the branch around lines 200-205 that returns raw publish
errors) with generic messages (e.g., "spec file could not be created" or "failed
to publish spec"); keep the full error objects/paths logged via console.error
for debugging. Locate these returns in spec-writer.ts (look for the `specPath`
variable and the publish error return) and change only the returned `error` text
to a sanitized, non-path, non-raw-error string while leaving existing
console.error calls (or add one if missing) to record the detailed internal
error.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/orchestrator/src/dispatcher.ts`:
- Around line 130-135: The publish path construction uses item.productSlug to
build knowledgeRepoLocalPath without ensuring the same whitelist/sanitization
applied elsewhere (options.workdir guard), creating a path-safety gap; update
the logic around publishOpts (where knowledgeRepoLocalPath is composed) to
validate/sanitize item.productSlug the same way enforced when options.workdir is
absent (reuse the existing whitelist check or sanitizer function) before joining
it into a filesystem path, and ensure any bypass via options.workdir still
triggers validation of item.productSlug so malicious slugs cannot escape the
intended dataRoot.

In `@packages/orchestrator/src/specialists/fetch-product-context.ts`:
- Around line 26-29: The parseGitHubRepoUrl function currently matches any
string containing "github.com/..." instead of validating it is a proper GitHub
URL; update parseGitHubRepoUrl to construct a URL via new URL(url) (wrapped in
try/catch to handle non-URLs), check urlObj.hostname is exactly "github.com" or
"www.github.com", then parse urlObj.pathname (split on "/") to extract owner and
repo, strip a trailing ".git" from repo if present, and return null for invalid
hostnames or malformed paths; ensure the function still returns the same {
owner, repo } | null shape and handle exceptions by returning null.

In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 123-124: Validate and sanitize externalId before using it to
construct branch names and file paths: reject or normalize any values containing
path traversal or unsafe characters (e.g., "../", "/", null bytes) and allow
only a safe pattern such as alphanumerics, dashes and underscores; if invalid,
throw an error or return a handled failure. Update the code that builds const
branchName = `helm/spec/${externalId}` and any code that writes
`${externalId}.md` so it either uses a sanitized/validated externalId (or
path.basename/strict whitelist) and ensure generated Git branch names are safe;
add unit checks where externalId is accepted. Ensure the same
validation/sanitization is applied to the other occurrences noted in this file.
- Around line 47-58: Replace hardcoded binary paths in defaultRunGit and
defaultRunGh to invoke "git" and "gh" via PATH (remove '/usr/bin/git' and
'/opt/homebrew/bin/gh'), and keep GIT_TERMINAL_PROMPT=0 behavior by preserving
env merging in those functions; stop embedding githubToken directly into HTTPS
URLs—use a credential mechanism instead (provide the token via env/credential
helper or gh auth) so commands use credentials from the environment/credential
store rather than appearing in command args; and sanitize/validate externalId
before using it in refs or filenames (reject or normalize path separators and
disallowed characters, then canonicalize to a safe ref name like replacing
unsafe chars) so usages of externalId (e.g., in git ref helm/spec/${externalId}
and file paths join(destDir, `${externalId}.md`)) cannot cause path traversal or
unsafe ref names.

---

Outside diff comments:
In `@packages/orchestrator/src/specialists/spec-writer.ts`:
- Around line 180-183: Replace the client-facing error strings that expose
internal details (e.g., the branch that returns `error: \`spec file not found at
${specPath} — agent did not create it\`` and the branch around lines 200-205
that returns raw publish errors) with generic messages (e.g., "spec file could
not be created" or "failed to publish spec"); keep the full error objects/paths
logged via console.error for debugging. Locate these returns in spec-writer.ts
(look for the `specPath` variable and the publish error return) and change only
the returned `error` text to a sanitized, non-path, non-raw-error string while
leaving existing console.error calls (or add one if missing) to record the
detailed internal error.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9bddb1d1-4a45-41cb-bc82-f12f6f72576c

📥 Commits

Reviewing files that changed from the base of the PR and between 9242eb8 and bb935ac.

📒 Files selected for processing (11)

apps/api/src/routes/dispatch.ts
packages/orchestrator/src/dispatcher.ts
packages/orchestrator/src/index.ts
packages/orchestrator/src/specialists/fetch-product-context.test.ts
packages/orchestrator/src/specialists/fetch-product-context.ts
packages/orchestrator/src/specialists/spec-publisher.test.ts
packages/orchestrator/src/specialists/spec-publisher.ts
packages/orchestrator/src/specialists/spec-writer.test.ts
packages/orchestrator/src/specialists/spec-writer.ts
packages/storage/src/data-dir.test.ts
packages/storage/src/data-dir.ts

CR-1 (major): make isSafePathPart guard unconditional — was gated on !options?.workdir, missing the publish path that also uses productSlug CR-2 (major): parseGitHubRepoUrl now uses new URL() + hostname === 'github.com' check; the regex previously matched any string containing 'github.com/...' CR-3/CR-4 (critical): stop embedding githubToken in HTTPS remote URLs — token appeared in .git/config and git error messages. Switch to GIT_HTTP_EXTRAHEADER via GIT_CONFIG_COUNT/KEY/VALUE env; also fix defaultRunGh to resolve 'gh' from PATH instead of hardcoded /opt/homebrew/bin/gh CR-5/H-1 (critical): validate externalId at top of publishSpecToPR with /^[A-Za-z0-9._-]+$/ before any branch/path construction H-3: add 4 dispatcher tests — fetchFn called when token present, skipped when absent, dispatch continues on context-fetch failure (fallback), prUrl absent when githubToken missing 72/72 orchestrator tests, 132/132 API tests, tsc + bun build clean Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

H-1/H-2 (block-dot-segment-path-inputs): both isSafePathPart in dispatcher and EXTERNAL_ID_SAFE in publishSpecToPR used /^[A-Za-z0-9._-]+$/ which allowed '.' and '..'; add (?!\.) lookahead to block dot-segment values H-5: defaultRunGit now calls 'git' from PATH instead of /usr/bin/git for portability (NixOS, containers, non-standard installations) H-4 (test coverage): new dispatch-options.test.ts verifies GITHUB_TOKEN trimming and dataRoot forwarding to dispatchStageHandler via a focused vi.mock('@helm/orchestrator') that captures options without running the full spec-writer flow Skipped H-3 (error-detail sanitization): internal tool — operators need full diagnostic detail from git/gh failures 72/72 orchestrator tests, 136/136 API tests, tsc + bun build clean Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

packages/orchestrator/src/specialists/spec-publisher.ts (1)

119-124: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Tighten externalId validation to Git branch rules.

This regex still accepts values like foo..bar, foo., and foo.lock, but those are not valid Git branch/ref names, so the input passes here and then fails later when Line 163 builds helm/spec/${externalId}. (git-scm.com)

Suggested direction

-  const EXTERNAL_ID_SAFE = /^(?!\.)[A-Za-z0-9._-]+$/;
+  const EXTERNAL_ID_SAFE = /^(?!\.)(?!.*\.\.)(?!.*\.lock$)(?!.*\.$)[A-Za-z0-9._-]+$/;

#!/bin/bash
set -euo pipefail

python - <<'PY'
import re
rx = re.compile(r'^(?!\.)[A-Za-z0-9._-]+$')
for value in ["HLM-99", "foo..bar", "foo.", "foo.lock"]:
    print(f"{value}: regex_accepts={bool(rx.fullmatch(value))}")
PY

for value in "HLM-99" "foo..bar" "foo." "foo.lock"; do
  if git check-ref-format --branch "helm/spec/$value" >/dev/null 2>&1; then
    echo "git_accepts=$value"
  else
    echo "git_rejects=$value"
  fi
done

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/orchestrator/src/specialists/spec-publisher.ts` around lines 119 -
124, The current EXTERNAL_ID_SAFE (/^(?!\.)[A-Za-z0-9._-]+$/) in
spec-publisher.ts still allows invalid Git ref names like "foo..bar", "foo.",
and "foo.lock"; update the validation for externalId used in the function that
throws `[spec-publisher] Invalid externalId: "${externalId}"` so it enforces Git
branch/ref rules: disallow consecutive dots, disallow a trailing dot, disallow
names ending with ".lock", and reject any leading dot; you can either replace
the single regex EXTERNAL_ID_SAFE with a stricter pattern that enforces those
constraints (no "\.\.", no trailing ".", no /\.lock$/ and still block leading
".") or call out to git check-ref-format --branch to validate
"helm/spec/${externalId}" and throw the same error if git rejects it; ensure the
change references EXTERNAL_ID_SAFE and externalId so callers still hit the same
error path.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 80-96: The current ensureKnowledgeRepo flow mutates the shared
checkout (knowledgeRepoLocalPath) via runGit operations (fetch/checkout/reset)
leading to race conditions when concurrent publishes run; fix by making
ensureKnowledgeRepo use a per-call isolated worktree/temporary clone or acquire
a mutex keyed by knowledgeRepoLocalPath before mutating the repo, perform all
runGit actions (clone/fetch/checkout/reset/copy/commit/push) while holding the
lock, and always release the lock in a finally block; reference
ensureKnowledgeRepo, knowledgeRepoLocalPath, and runGit when implementing the
per-call temp clone or lock to prevent interleaved commits/force-pushes (apply
same change to the related code region around the second occurrence).

---

Duplicate comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 119-124: The current EXTERNAL_ID_SAFE (/^(?!\.)[A-Za-z0-9._-]+$/)
in spec-publisher.ts still allows invalid Git ref names like "foo..bar", "foo.",
and "foo.lock"; update the validation for externalId used in the function that
throws `[spec-publisher] Invalid externalId: "${externalId}"` so it enforces Git
branch/ref rules: disallow consecutive dots, disallow a trailing dot, disallow
names ending with ".lock", and reject any leading dot; you can either replace
the single regex EXTERNAL_ID_SAFE with a stricter pattern that enforces those
constraints (no "\.\.", no trailing ".", no /\.lock$/ and still block leading
".") or call out to git check-ref-format --branch to validate
"helm/spec/${externalId}" and throw the same error if git rejects it; ensure the
change references EXTERNAL_ID_SAFE and externalId so callers still hit the same
error path.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 62df23ef-70d7-435d-8fdf-f78fa307d1b0

📥 Commits

Reviewing files that changed from the base of the PR and between bb935ac and 70838c9.

📒 Files selected for processing (6)

apps/api/src/routes/dispatch-options.test.ts
packages/orchestrator/src/dispatcher.test.ts
packages/orchestrator/src/dispatcher.ts
packages/orchestrator/src/specialists/fetch-product-context.ts
packages/orchestrator/src/specialists/spec-publisher.test.ts
packages/orchestrator/src/specialists/spec-publisher.ts

H-new-1 (SSH URLs): parseGitHubRepoUrl now handles git@github.com:owner/repo.git before calling new URL() — previously threw and silently dropped context; 3 new tests cover SSH with/without .git, non-GitHub SSH host H-new-2 (concurrent publish isolation): knowledgeRepoLocalPath in dispatcher now includes item.externalId as a subdirectory so concurrent dispatches for different items of the same product each get their own git clone, preventing working-tree clobber; on re-dispatch the same path is reused (fetch+reset, not re-clone) ENOTEMPTY race fix in dispatch-options.test.ts: waitForJobDone() now polls job status instead of only checking mockDispatch.toHaveBeenCalled() — ensures the jobStore.updateJob write completes before afterEach rm() runs Skipped H-new-3 (sanitize-and-map-http-errors): 3rd occurrence, same reasoning — internal tool where operators need full diagnostic detail from git/gh failures 75/75 orchestrator tests, 136/136 API tests, build clean Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…rsal tests Fixes a bug where parseGitHubRepoUrl rejected SSH URLs for repos whose names contain dots (e.g. git@github.com:org/my.repo.git) — the [^/.]+? character class was incorrectly excluding dots from the repo-name segment. Changed to [^/]+? so dotted names parse correctly while still stripping the optional .git suffix via the trailing (?:\.git)? group. Added tests: SSH repo with dots (.git), SSH repo with dots (no suffix), HTTPS repo with dots, dispatcher dot-prefix productSlug → error, dispatcher '..' externalId → error. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

SSH knowledge repo URLs are unsupported for token-based auth because GIT_HTTP_EXTRAHEADER only applies to HTTPS transport. publishSpecToPR now fails early with a clear message directing operators to use an HTTPS URL instead of silently attempting the operation and producing a confusing auth failure. Also threads runGit/runGh from DispatchOptions through to the publish step so integration tests can verify the full token+dataRoot→prUrl happy path without spawning real git/gh processes. Added test cases for both. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 136-145: The guard that rejects SSH knowledge repo URLs only
checks for scp-style "git@" strings and misses "ssh://" URLs; update the check
around knowledgeRepo.url in spec-publisher.ts (the existing if that throws the
`[spec-publisher] SSH knowledge repo URLs are not supported...` error) to also
detect URLs that start with "ssh://” (e.g., by expanding the condition to
include startsWith('ssh://') or a regex like /^git@|^ssh:\/\//) so both
scp-style and ssh://-prefixed SSH URLs are rejected with the same error message.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b7c376c9-0efc-4f58-ae51-126c21abb386

📥 Commits

Reviewing files that changed from the base of the PR and between 70838c9 and 2ed6798.

📒 Files selected for processing (7)

apps/api/src/routes/dispatch-options.test.ts
packages/orchestrator/src/dispatcher.test.ts
packages/orchestrator/src/dispatcher.ts
packages/orchestrator/src/specialists/fetch-product-context.test.ts
packages/orchestrator/src/specialists/fetch-product-context.ts
packages/orchestrator/src/specialists/spec-publisher.test.ts
packages/orchestrator/src/specialists/spec-publisher.ts

The existing git@ check missed URLs using the ssh:// URI scheme (e.g. ssh://git@github.com/org/repo.git), which would bypass validation and fail later with a confusing auth error. Extended the guard to reject both git@ and ssh:// prefixes, and added a matching test case. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ent cross-item races publishSpecToPR now clones the knowledge repo to a fresh temporary directory (tmpdir/helm-publish-{externalId}-{uuid}/) on every call instead of mutating a shared checkout. The temp dir is always removed in a finally block, even on failure. Two concurrent publishes for any combination of items can never interleave their git operations or corrupt each other's branches. Breaking: knowledgeRepoLocalPath removed from PublishSpecOpts and SpecPublishOptions — the publish function manages its own working directory. DispatchOptions.dataRoot is retained for future specialists but is no longer needed by the spec-writer publish step (githubToken alone enables it). Also removes the ensureKnowledgeRepo abstraction (always-fresh clone replaces the clone-or-update logic) and updates all affected tests, including a new concurrent-publish test and a cleanup-on-failure test. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 261-264: The final cleanup call await rm(workDir, { recursive:
true, force: true }) in the finally block of spec-publisher.ts can throw and
mask the original error; wrap that rm call in a try/catch inside the same
finally block, catch any error, and log a non-fatal warning (using the module's
logger or console.warn) for errors other than expected "not found" cases, then
swallow the cleanup error so it doesn't overwrite the original exception from
the try block.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 05147e2b-7ff5-4eac-b18f-65f63702bc01

📥 Commits

Reviewing files that changed from the base of the PR and between 2ed6798 and 791c69d.

📒 Files selected for processing (6)

packages/orchestrator/src/dispatcher.test.ts
packages/orchestrator/src/dispatcher.ts
packages/orchestrator/src/specialists/spec-publisher.test.ts
packages/orchestrator/src/specialists/spec-publisher.ts
packages/orchestrator/src/specialists/spec-writer.test.ts
packages/orchestrator/src/specialists/spec-writer.ts

…inal error rm() in the finally block could throw (e.g. if the OS temp dir was already purged), which would replace the actual publish error with an ENOENT and make debugging harder. Wrapping with .catch(() => {}) ensures the original error always propagates. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…token from errors Root cause: GIT_CONFIG_COUNT/KEY/VALUE env vars for http.extraheader require Git 2.31+ and were silently ignored on older versions, causing 'invalid credentials' on private knowledge repos. Fix: embed the token as x-access-token credentials in the clone URL (https://x-access-token:{token}@github.com/{owner}/{repo}). The 'origin' remote inside the temp clone inherits this URL, so git push origin is authenticated without additional configuration. Security: the authenticated URL is never logged. All error paths now run through sanitizeToken() which replaces both the x-access-token:{token}@ pattern and any bare occurrence of the token with *** before propagating or logging. The token lives on-disk only in the temp clone's .git/config for the duration of the publish operation and is removed with the workdir in the finally block. Also folds two manual edits to spec-writer.ts: - Prompt: replaced the literal '{workdir}' placeholder with a clearer instruction to write relative to cwd (avoids confusing the agent with an unresolved template variable) - Artifact-missing error: includes agentResult.finalOutput in both the returned error string and a new console.error so operators can see WHY the agent didn't write the spec New tests: authenticated clone URL format, token sanitization in errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

packages/orchestrator/src/specialists/spec-publisher.ts (1)

64-81: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Remove the PAT from clone URLs.

Embedding githubToken in https://x-access-token:...@ leaks repo credentials via child-process argv and persists them in remote.origin.url. If the process crashes or the temp dir is inspected before cleanup, the token is still exposed. Use a non-URL git auth path and keep the remote URL clean.

#!/bin/bash
set -euo pipefail

file="packages/orchestrator/src/specialists/spec-publisher.ts"

echo "== Current implementation =="
nl -ba "$file" | sed -n '64,81p;148,175p;223,227p'

echo
echo "== Demonstrate that authenticated remote URLs are persisted in git config =="
tmp="$(mktemp -d)"
trap 'rm -rf "$tmp"' EXIT
git init "$tmp/repo" >/dev/null
git -C "$tmp/repo" remote add origin "https://x-access-token:demo-token@github.com/test-org/knowledge"
git -C "$tmp/repo" config --get remote.origin.url

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/orchestrator/src/specialists/spec-publisher.ts` around lines 64 -
81, The current buildAuthenticatedUrl embeds the PAT in the clone URL
(buildAuthenticatedUrl), which leaks credentials; change buildAuthenticatedUrl
to return the clean HTTPS repo URL (https://github.com/{owner}/{repo}) and stop
writing token into remote.origin.url, and update the git clone flow that calls
buildAuthenticatedUrl to instead pass credentials via a git config
http.extraHeader (e.g. run git with -c http.extraHeader="Authorization: token
${githubToken}" or set GIT_HTTP_EXTRA_HEADER for the subprocess) so the token is
used for authentication but not recorded in the repo config or child argv;
ensure any logging uses the plain knowledgeRepo.url and sanitizeToken for raw
errors.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 91-92: The sanitizeToken function currently uses String.replace
which only replaces the first match and can leak tokens when they appear
multiple times; update sanitizeToken (the function named sanitizeToken in
spec-publisher.ts) to use replaceAll (or a global RegExp) for both the
`x-access-token:${token}@` pattern and the bare token replacement so every
occurrence is redacted (e.g., replace(...).replace(...) ->
replaceAll(...).replaceAll(...) or use /.../g regexes referencing the token
safely).

In `@packages/orchestrator/src/specialists/spec-writer.ts`:
- Around line 182-193: The return payload in spec-writer.ts currently includes
raw agent output (agentResult.finalOutput) which must not be sent to callers;
keep the detailed console.error log as-is for server logs but change the
returned object (the block that currently references externalId, specPath and
agentResult.finalOutput) to return a bounded generic error string that omits
agentResult.finalOutput (e.g. "spec file not found at {specPath} — agent did not
create it.") while still returning transitioned: false and any safe identifiers
like specPath or externalId if needed.

---

Duplicate comments:
In `@packages/orchestrator/src/specialists/spec-publisher.ts`:
- Around line 64-81: The current buildAuthenticatedUrl embeds the PAT in the
clone URL (buildAuthenticatedUrl), which leaks credentials; change
buildAuthenticatedUrl to return the clean HTTPS repo URL
(https://github.com/{owner}/{repo}) and stop writing token into
remote.origin.url, and update the git clone flow that calls
buildAuthenticatedUrl to instead pass credentials via a git config
http.extraHeader (e.g. run git with -c http.extraHeader="Authorization: token
${githubToken}" or set GIT_HTTP_EXTRA_HEADER for the subprocess) so the token is
used for authentication but not recorded in the repo config or child argv;
ensure any logging uses the plain knowledgeRepo.url and sanitizeToken for raw
errors.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7377069f-2931-4320-b0e1-93a05aa5fa2d

📥 Commits

Reviewing files that changed from the base of the PR and between 791c69d and 678d6a2.

📒 Files selected for processing (3)

packages/orchestrator/src/specialists/spec-publisher.test.ts
packages/orchestrator/src/specialists/spec-publisher.ts
packages/orchestrator/src/specialists/spec-writer.ts

…put from error payload sanitizeToken used .replace() (replaces first occurrence only) — a verbose git error echoing the authenticated URL twice or including both the URL pattern and the bare token would leak the token in remaining occurrences. Changed both substitutions to .replaceAll() so every occurrence is redacted. handleSpecWriterResult was including agentResult.finalOutput in the error string returned to callers. With product context injected into the prompt, finalOutput can contain README/AGENT.md fragments. Removed finalOutput from the returned error string; the console.error server-side diagnostic retains the full output for operator debugging. Also fixes a pre-existing noUncheckedIndexedAccess TS error in fetch-product-context.test.ts (calls[0] → calls[0]!). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

lhpaul · 2026-05-24T22:53:07Z

@coderabbitai resume

coderabbitai · 2026-05-24T22:53:23Z

✅ Actions performed

Reviews resumed.

- Remove contradictory "Confirm once the file is written" step from the spec-writer prompt; the IMPORTANT block already says to write directly without asking for confirmation — having both caused agent ambiguity. - Guard buildContextSection call: only render the "## Product Context" section when context actually contains readme or agentMd; an empty object no longer emits a scaffolded section that wastes context tokens. - Add path-validation tests without a custom workdir to confirm the isSafePathPart guard fires before the default workdir (which embeds productSlug/externalId in a filesystem path) is computed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

lhpaul and others added 2 commits May 24, 2026 11:50

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

Comment thread packages/orchestrator/src/specialists/spec-publisher.ts Outdated

lhpaul and others added 3 commits May 24, 2026 13:53

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

Comment thread packages/orchestrator/src/specialists/spec-publisher.ts Outdated

lhpaul and others added 2 commits May 24, 2026 14:48

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

Comment thread packages/orchestrator/src/specialists/spec-publisher.ts

lhpaul and others added 2 commits May 24, 2026 16:03

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

Comment thread packages/orchestrator/src/specialists/spec-publisher.ts Outdated

Comment thread packages/orchestrator/src/specialists/spec-writer.ts

lhpaul merged commit 1e7e83d into develop May 24, 2026
1 check passed

lhpaul deleted the feature/session-12-context-and-publish branch May 24, 2026 23:36

Conversation

lhpaul commented May 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

haystack-code-reviewer-pr-hook Bot commented May 24, 2026

Haystack Code Reviewer: Analysis started

Uh oh!

coderabbitai Bot commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Uh oh!

haystack-code-reviewer-pr-hook Bot commented May 24, 2026

Haystack Code Reviewer: PR Analysis Ready!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

lhpaul commented May 24, 2026

Uh oh!

coderabbitai Bot commented May 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

lhpaul commented May 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 24, 2026 •

edited

Loading