feat(@helm/api): helm sync command for manual item hydration from GitHub Projects by lhpaul · Pull Request #12 · lhpaul/helm

lhpaul · 2026-05-18T17:00:07Z

Summary

@helm/adapters: extends GET_PROJECT to query both organization and user fields — supports personal GitHub accounts (like lhpaul) alongside org accounts. ensureProjectId falls back to user.projectV2 when organization is null. No breaking change for org accounts.
apps/api/src/services/sync.ts: syncProductItems(product, token, dataRoot, options?) — fetches all items via GitHubProjectsAdapter.listItems() (handles pagination), maps to ItemState with triggeredBy: 'sync:github-projects', writes via writeJsonAtomic (idempotent overwrite). Injectable _listItems and _writeJson for testing.
apps/api/src/scripts/sync.ts: CLI entry point — validates args, reads GITHUB_TOKEN, finds product in registry, exits gracefully for non-github providers.
package.json: adds "sync": "bun run src/scripts/sync.ts".

Usage

pnpm --filter @helm/api sync helm-playground
# [sync] Starting sync for product "helm-playground"…
# [sync] product=helm-playground item=issue_1 title="Hello world endpoint" stage=discovery
# Synced 1 item in 800ms

Test plan

pnpm --filter @helm/adapters test — 68 tests pass (adapter org+user fallback tested via existing mock)
pnpm --filter @helm/api test — 75 tests pass (13 new sync tests)
pnpm --filter @helm/api build — clean

Summary by CodeRabbit

New Features
- Added a product-level CLI sync to import items from GitHub Projects (runs for products using the GitHub Projects provider).
Bug Fixes / Behavior
- Reading now supports personal-account (user) projects, while webhook registration is restricted to organization-owned projects and will be rejected for personal accounts.
- Sync is idempotent and skips unchanged items; invalid external IDs are skipped.
Tests
- Added comprehensive tests covering sync logic, idempotency, stage mapping, skips, and error cases.
Chores
- Updated API package scripts: explicit lint target and new sync script.

…script entry point Adapters: extend GET_PROJECT query to include user(login:) alongside organization(login:) so GitHub personal account projects are resolved correctly. ensureProjectId falls back to res.user?.projectV2 when organization returns null. No breaking change for org accounts. Services: syncProductItems(product, token, dataRoot, options?) fetches all items via GitHubProjectsAdapter.listItems() (paginates automatically), maps each NormalizedItem to ItemState with 'sync:github-projects' as triggeredBy, writes via writeJsonAtomic (idempotent overwrite). Supports _listItems and _writeJson injectables for tests. Scripts: src/scripts/sync.ts CLI reads slug from argv[2], validates GITHUB_TOKEN, finds product in registry, exits gracefully for non-github providers. package.json: adds "sync" script. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Covers: happy path write shape, file path, multi-item, stage mapping (subStage used when set / falls back to INITIAL_STAGE when null), idempotency (two runs produce identical state), error on non-github provider, externalId validation (path traversal + leading dot skipped), and empty adapter response. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-05-18T17:00:19Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6cd7ce5f-afc7-4ecb-a684-95aae23a875e

📥 Commits

Reviewing files that changed from the base of the PR and between 2987440 and e32bc4b.

📒 Files selected for processing (4)

packages/adapters/src/github-projects/adapter.test.ts
packages/adapters/src/github-projects/adapter.ts
packages/adapters/src/github-projects/graphql-queries.ts
packages/adapters/src/github-projects/graphql-types.ts

📝 Walkthrough

Walkthrough

Adds end-to-end product sync: adapter owner-resolution updates for GitHub Projects, a core sync service that writes per-item ItemState JSON files, a CLI entry and npm script, and tests covering mapping, idempotency, and error cases.

Changes

Product sync feature

Layer / File(s)	Summary
GitHub Projects adapter: org + user account support `packages/adapters/src/github-projects/graphql-queries.ts`, `packages/adapters/src/github-projects/graphql-types.ts`, `packages/adapters/src/github-projects/adapter.ts`	Query and types adjusted to resolve `projectV2` from `repositoryOwner` (User
Adapter tests: personal-account and webhook `packages/adapters/src/github-projects/adapter.test.ts`	Fixtures and tests added to exercise `repositoryOwner` User vs Organization cases, not-found behaviors, regression coverage, and webhook registration rejection for personal accounts.
Sync service: core item processing `apps/api/src/services/sync.ts`	`syncProductItems` lists normalized items (injectable), Zod-validates existing ItemState, validates `externalId`, maps to `ItemState` (stage = `subStage ?? INITIAL_STAGE`), preserves `createdAt`, appends transition events on stage change, writes atomic JSON per item under `dataRoot/items/{externalId}.json`, and returns `{ synced, skipped, durationMs }`.
Sync service tests: behavior and error handling `apps/api/src/services/sync.test.ts`	Vitest suite validating synced/skipped counts, payload shape and filenames, stage mapping, idempotency (preserve createdAt), unsupported provider rejection, invalid `externalId` skipping (warn), and empty adapter list handling.
Sync CLI and package configuration `apps/api/package.json`, `apps/api/src/scripts/sync.ts`	New `sync` npm script; CLI accepts `<productSlug>`, validates `GITHUB_TOKEN` and `HELM_KNOWLEDGE_REPO_PATH` (and optional `HELM_DATA_DIR`), resolves product via registry, gates by `github_projects` provider, invokes `syncProductItems`, logs results, and exits with appropriate status codes.

Sequence Diagram(s)

sequenceDiagram
  participant ComponentA
  participant ComponentB
  ComponentA->>ComponentB: observable interaction

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

lhpaul/helm#7: Related to adapter interfaces and NormalizedItem shapes consumed by the sync service.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main change: adding a helm sync command for manual item hydration from GitHub Projects, which is directly reflected in the new CLI script, service function, and package.json script additions.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/session-7.5-sync-command

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/api/src/scripts/sync.ts`:
- Line 9: Add an explicit validation for HELM_KNOWLEDGE_REPO_PATH similar to the
existing GITHUB_TOKEN check: after the GITHUB_TOKEN validation block (the same
place where you check process.env.GITHUB_TOKEN), verify
process.env.HELM_KNOWLEDGE_REPO_PATH is present and non-empty, call the same
logger (e.g., processLogger.error) with a clear message and exit with
process.exit(1) on failure; this prevents obscure errors later inside
getProductRegistry() and makes the required env var explicit.

In `@apps/api/src/services/sync.test.ts`:
- Around line 175-191: The test "writes the same data on two successive runs"
only compares externalId and currentStage so it misses drift in other fields;
update the test in sync.test.ts (the it block using makeItem, writes1/writes2,
opts1/opts2 and calling syncProductItems) to assert full payload equality
between writes: compare the entire recorded objects s1 and s2 (including
createdAt, updatedAt, history and any nested fields) using a deep equality
assertion (e.g., toEqual) or by normalizing/removing timestamps if those are
expected to differ, so the test truly verifies idempotent writes across runs.
- Around line 211-232: The test 'skips items with invalid externalIds and
increments skipped count' creates a console.warn spy (consoleSpy) but only
restores it at the end of the test, which can leak if an assertion throws; wrap
the test body in try/finally and call consoleSpy.mockRestore() in the finally
block (or remove the per-test spy and add a centralized afterEach(() =>
vi.restoreAllMocks()) to restore mocks globally). Update the test surrounding
syncProductItems/expect calls to ensure consoleSpy.mockRestore() always runs.

In `@apps/api/src/services/sync.ts`:
- Around line 33-34: The idempotency claim is broken because the code
regenerates a timestamp (`now`) on each run and forcibly overwrites stored
`createdAt`/`updatedAt` values, causing content drift; update the sync logic so
that when processing items (the block that computes `now` and assigns
`createdAt`/`updatedAt`) you only set `createdAt` if it does not already exist
and only update `updatedAt` when the content actually changes, or preserve
existing timestamps from the existing record/file instead of always using the
new `now` value; locate the timestamp generation and assignment sites (the `now`
variable and the code that sets `createdAt` and `updatedAt`) and change them to
read existing metadata first and conditionally write timestamps to maintain
idempotency.

In `@packages/adapters/src/github-projects/adapter.test.ts`:
- Around line 32-34: Add a new unit test in adapter.test.ts that exercises the
personal-account fallback branch by creating a GetProjectResponse fixture where
organization is null and user.projectV2 is present (mirror projectRes but set
organization: null and user: { projectV2: { id: PROJECT_ID, title: 'Helm' } }).
Call the same code path under test (the adapter function that reads
GetProjectResponse) and assert the adapter returns/uses the user-owned project
as the owner fallback (same assertions you have for organization-owned cases),
ensuring the personal-account behavior is locked in.

In `@packages/adapters/src/github-projects/adapter.ts`:
- Around line 352-359: The code currently accepts personal account projects by
using res.user?.projectV2 but still calls the org-only webhook endpoint in
registerWebhook, which will fail; update initialization to detect when the
resolved project is a user-owned project (e.g., when projectV2 came from
res.user) and either throw a clear GitHubNotFoundError-like rejection stating
"Personal account projects do not support webhooks" (set a flag or avoid setting
this.projectId for user-owned projects) or modify registerWebhook to check that
flag/ownership before calling the org endpoint and short-circuit with the same
clear error; use the symbols projectV2, res.user, this.projectId,
registerWebhook and this.config.org to locate and implement the detection and
error path.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 33ff7c22-9c70-4232-af5c-2fd0c85d1276

📥 Commits

Reviewing files that changed from the base of the PR and between 4a6a0b4 and de6a01a.

📒 Files selected for processing (8)

apps/api/package.json
apps/api/src/scripts/sync.ts
apps/api/src/services/sync.test.ts
apps/api/src/services/sync.ts
packages/adapters/src/github-projects/adapter.test.ts
packages/adapters/src/github-projects/adapter.ts
packages/adapters/src/github-projects/graphql-queries.ts
packages/adapters/src/github-projects/graphql-types.ts

1. scripts/sync.ts: validate HELM_KNOWLEDGE_REPO_PATH upfront with clear error message (same pattern as GITHUB_TOKEN). 2. sync.ts idempotency: preserve existing createdAt on re-sync; skip write entirely when stage is unchanged; append transition event when stage changes. 3. sync.test.ts: vi.restoreAllMocks() in afterEach eliminates console.warn spy leak on assertion failure. 4. sync.test.ts: idempotency tests now verify skip-on-unchanged and createdAt preservation on stage change (replaces shallow field checks). 5. adapter.test.ts: add user-owner fallback tests — resolves via user when organization is null; throws GitHubNotFoundError when both null. 6. adapter.ts registerWebhook: detect personal account (resolved via user) and throw GitHubConfigError with clear explanation before attempting the org-only /orgs/{org}/hooks endpoint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/api/src/scripts/sync.ts`:
- Around line 31-38: Whitelist and validate the filesystem path environment
variables before using them: apply a regex allowlist (e.g., EXTERNAL_ID_REGEX or
a PATH_ALLOWLIST_REGEX) to process.env.HELM_KNOWLEDGE_REPO_PATH and
process.env.HELM_DATA_DIR (the variables assigned to knowledgeRepoPath and
envDataDir) and reject any values that do not match the pattern, logging a clear
error and exiting early; then only use the validated value to compute dataRoot
(do not forward or join raw env values without this check).

In `@apps/api/src/services/sync.test.ts`:
- Around line 70-72: Replace the POSIX-only split-based key extraction in
_readJson with path.basename to handle platform-specific separators: import
basename from 'path' (or use path.basename) and compute key = basename(filePath,
'.json') instead of filePath.split('/').at(-1)!.replace('.json',''); keep
returning (existingState[key] ?? null) as T | null so existingState lookup and
behavior remain unchanged.

In `@apps/api/src/services/sync.ts`:
- Around line 93-118: The code assumes the JSON returned by
readJsonFn<ItemState>(filePath) is a valid ItemState and directly dereferences
existing.history (causing crashes on malformed disk data); fix by validating the
parsed payload with a strict runtime schema (e.g. a Zod schema for
ItemState/WorkflowEvent) or a guarded type predicate right after readJsonFn
returns, use safeParse/guard to detect invalid shape and treat invalid result as
null (rebuild a fresh ItemState or initialize history = [] and createdAt
defaults) before referencing existing.history, and update the branch that builds
state (the event creation and state = { ...existing, history:
[...existing.history, event], ... }) to only use existing.history after
validation so you never spread undefined or non-array values.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1377d4b6-e6fe-495c-8aff-56302a1e4d1a

📥 Commits

Reviewing files that changed from the base of the PR and between de6a01a and f51ffef.

📒 Files selected for processing (5)

apps/api/src/scripts/sync.ts
apps/api/src/services/sync.test.ts
apps/api/src/services/sync.ts
packages/adapters/src/github-projects/adapter.test.ts
packages/adapters/src/github-projects/adapter.ts

7. scripts/sync.ts: validate HELM_KNOWLEDGE_REPO_PATH and HELM_DATA_DIR against SAFE_FS_PATH_REGEX whitelist before forwarding to file ops. 8. sync.test.ts: replace split('/').at(-1) with basename() for portable path key extraction across OS path separators. 9. sync.ts: validate ItemState read from disk with Zod before dereferencing — malformed JSON is treated as null and rebuilds the item from scratch rather than crashing on spread/property access mid-run. Addresses CodeRabbit review comments on PR #12. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

♻️ Duplicate comments (2)

apps/api/src/services/sync.ts (1)
13-27: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add .strict() to Zod schemas for robust validation of persisted state.

The WorkflowEventSchema and ItemStateSchema validate external data (disk JSON) but don't use .strict() mode, allowing unexpected fields to pass validation silently. This can mask schema drift or corruption in persisted files.
🔒 Proposed fix
 const WorkflowEventSchema = z.object({
   fromStage: z.string().nullable(),
   toStage: z.string(),
   triggeredBy: z.string(),
   at: z.string(),
   note: z.string().optional(),
-});
+}).strict();
+
 const ItemStateSchema = z.object({
   externalId: z.string(),
   productSlug: z.string(),
   currentStage: z.string(),
   history: z.array(WorkflowEventSchema),
   createdAt: z.string(),
   updatedAt: z.string(),
-});
+}).strict();
As per coding guidelines, Use Zod for API request/response validation with .strict() mode enabled. While this validates persisted state rather than API input, disk data is still external input that should be strictly validated.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/api/src/services/sync.ts` around lines 13 - 27, The Zod schemas
WorkflowEventSchema and ItemStateSchema are currently permissive and can accept
unexpected fields; update both schema definitions to enable strict mode by
chaining .strict() onto WorkflowEventSchema and ItemStateSchema so any extra
properties in persisted JSON will cause validation to fail, ensuring you apply
this to the top-level objects (the WorkflowEventSchema and the ItemStateSchema)
used for loading disk state.
apps/api/src/scripts/sync.ts (1)
37-37: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Regex pattern breaks Windows filesystem path compatibility.

The current pattern /^[A-Za-z0-9._/\-]+$/ rejects backslash and colon, which are essential for Windows paths like C:\Users\username\helm. The previous review suggested including \\ and : to support both Unix and Windows paths, but the current regex still omits them.
🔧 Proposed fix to support cross-platform paths
-  const SAFE_FS_PATH_REGEX = /^[A-Za-z0-9._/\-]+$/;
+  const SAFE_FS_PATH_REGEX = /^[A-Za-z0-9._/\\:-]+$/;
As per coding guidelines, validate external filesystem paths with regex whitelist before constructing file paths to prevent path traversal attacks (and ensure cross-platform compatibility).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/api/src/scripts/sync.ts` at line 37, The SAFE_FS_PATH_REGEX currently
disallows Windows-specific characters (backslash and colon) causing valid
Windows paths to be rejected; update SAFE_FS_PATH_REGEX in
apps/api/src/scripts/sync.ts to include both ":" and "\" inside the character
class (escape the backslash in the literal) so the pattern accepts
cross-platform paths while still using the start/end anchors to prevent
traversal (i.e., add ":" and "\\" to the allowed characters in
SAFE_FS_PATH_REGEX).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@apps/api/src/scripts/sync.ts`:
- Line 37: The SAFE_FS_PATH_REGEX currently disallows Windows-specific
characters (backslash and colon) causing valid Windows paths to be rejected;
update SAFE_FS_PATH_REGEX in apps/api/src/scripts/sync.ts to include both ":"
and "\" inside the character class (escape the backslash in the literal) so the
pattern accepts cross-platform paths while still using the start/end anchors to
prevent traversal (i.e., add ":" and "\\" to the allowed characters in
SAFE_FS_PATH_REGEX).

In `@apps/api/src/services/sync.ts`:
- Around line 13-27: The Zod schemas WorkflowEventSchema and ItemStateSchema are
currently permissive and can accept unexpected fields; update both schema
definitions to enable strict mode by chaining .strict() onto WorkflowEventSchema
and ItemStateSchema so any extra properties in persisted JSON will cause
validation to fail, ensuring you apply this to the top-level objects (the
WorkflowEventSchema and the ItemStateSchema) used for loading disk state.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 10b51e5b-7c0c-4517-be88-a1a897cd6e7a

📥 Commits

Reviewing files that changed from the base of the PR and between f51ffef and 2987440.

📒 Files selected for processing (3)

apps/api/src/scripts/sync.ts
apps/api/src/services/sync.test.ts
apps/api/src/services/sync.ts

…t support The dual organization+user query caused GraphqlResponseError on personal accounts: GitHub returns data+errors[] simultaneously when organization() fails, and @octokit/graphql throws on any errors[], so the user fallback was never reached (bug reproduced with lhpaul/helm-playground). Fix: replace dual query with repositoryOwner(login:) + inline fragments — polymorphic, no partial-data errors, works for both Org and User: query GetProject { repositoryOwner(login:) { __typename ... on Organization { projectV2(number:) { id title } } ... on User { projectV2(number:) { id title } } }} ensureProjectId now reads res.repositoryOwner.projectV2 directly. isPersonalAccount set via owner.__typename === 'User'. mapError: include gqlErrors[0].message in thrown error so sync logs show actionable detail instead of generic 'GitHub API error'. Tests: add 5 tests for org/user/null-owner/null-project/regression cases. Regression test documents the exact failure mode (data+errors) and proves repositoryOwner resolves without errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

lhpaul · 2026-05-18T22:48:37Z

@coderabbitai review

coderabbitai · 2026-05-18T22:48:43Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

lhpaul and others added 2 commits May 18, 2026 12:58

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

Comment thread apps/api/src/scripts/sync.ts

Comment thread apps/api/src/services/sync.test.ts

Comment thread apps/api/src/services/sync.ts Outdated

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

lhpaul merged commit 976e382 into develop May 18, 2026
1 check passed

lhpaul deleted the feature/session-7.5-sync-command branch May 18, 2026 22:47

Conversation

lhpaul commented May 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Usage

Test plan

Related

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

lhpaul commented May 18, 2026

Uh oh!

coderabbitai Bot commented May 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

lhpaul commented May 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 18, 2026 •

edited

Loading