feat(@helm/api): helm sync command for manual item hydration from GitHub Projects#12
Conversation
…script entry point Adapters: extend GET_PROJECT query to include user(login:) alongside organization(login:) so GitHub personal account projects are resolved correctly. ensureProjectId falls back to res.user?.projectV2 when organization returns null. No breaking change for org accounts. Services: syncProductItems(product, token, dataRoot, options?) fetches all items via GitHubProjectsAdapter.listItems() (paginates automatically), maps each NormalizedItem to ItemState with 'sync:github-projects' as triggeredBy, writes via writeJsonAtomic (idempotent overwrite). Supports _listItems and _writeJson injectables for tests. Scripts: src/scripts/sync.ts CLI reads slug from argv[2], validates GITHUB_TOKEN, finds product in registry, exits gracefully for non-github providers. package.json: adds "sync" script. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers: happy path write shape, file path, multi-item, stage mapping (subStage used when set / falls back to INITIAL_STAGE when null), idempotency (two runs produce identical state), error on non-github provider, externalId validation (path traversal + leading dot skipped), and empty adapter response. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughAdds end-to-end product sync: adapter owner-resolution updates for GitHub Projects, a core sync service that writes per-item ItemState JSON files, a CLI entry and npm script, and tests covering mapping, idempotency, and error cases. ChangesProduct sync feature
Sequence Diagram(s)sequenceDiagram
participant ComponentA
participant ComponentB
ComponentA->>ComponentB: observable interaction
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/api/src/scripts/sync.ts`:
- Line 9: Add an explicit validation for HELM_KNOWLEDGE_REPO_PATH similar to the
existing GITHUB_TOKEN check: after the GITHUB_TOKEN validation block (the same
place where you check process.env.GITHUB_TOKEN), verify
process.env.HELM_KNOWLEDGE_REPO_PATH is present and non-empty, call the same
logger (e.g., processLogger.error) with a clear message and exit with
process.exit(1) on failure; this prevents obscure errors later inside
getProductRegistry() and makes the required env var explicit.
In `@apps/api/src/services/sync.test.ts`:
- Around line 175-191: The test "writes the same data on two successive runs"
only compares externalId and currentStage so it misses drift in other fields;
update the test in sync.test.ts (the it block using makeItem, writes1/writes2,
opts1/opts2 and calling syncProductItems) to assert full payload equality
between writes: compare the entire recorded objects s1 and s2 (including
createdAt, updatedAt, history and any nested fields) using a deep equality
assertion (e.g., toEqual) or by normalizing/removing timestamps if those are
expected to differ, so the test truly verifies idempotent writes across runs.
- Around line 211-232: The test 'skips items with invalid externalIds and
increments skipped count' creates a console.warn spy (consoleSpy) but only
restores it at the end of the test, which can leak if an assertion throws; wrap
the test body in try/finally and call consoleSpy.mockRestore() in the finally
block (or remove the per-test spy and add a centralized afterEach(() =>
vi.restoreAllMocks()) to restore mocks globally). Update the test surrounding
syncProductItems/expect calls to ensure consoleSpy.mockRestore() always runs.
In `@apps/api/src/services/sync.ts`:
- Around line 33-34: The idempotency claim is broken because the code
regenerates a timestamp (`now`) on each run and forcibly overwrites stored
`createdAt`/`updatedAt` values, causing content drift; update the sync logic so
that when processing items (the block that computes `now` and assigns
`createdAt`/`updatedAt`) you only set `createdAt` if it does not already exist
and only update `updatedAt` when the content actually changes, or preserve
existing timestamps from the existing record/file instead of always using the
new `now` value; locate the timestamp generation and assignment sites (the `now`
variable and the code that sets `createdAt` and `updatedAt`) and change them to
read existing metadata first and conditionally write timestamps to maintain
idempotency.
In `@packages/adapters/src/github-projects/adapter.test.ts`:
- Around line 32-34: Add a new unit test in adapter.test.ts that exercises the
personal-account fallback branch by creating a GetProjectResponse fixture where
organization is null and user.projectV2 is present (mirror projectRes but set
organization: null and user: { projectV2: { id: PROJECT_ID, title: 'Helm' } }).
Call the same code path under test (the adapter function that reads
GetProjectResponse) and assert the adapter returns/uses the user-owned project
as the owner fallback (same assertions you have for organization-owned cases),
ensuring the personal-account behavior is locked in.
In `@packages/adapters/src/github-projects/adapter.ts`:
- Around line 352-359: The code currently accepts personal account projects by
using res.user?.projectV2 but still calls the org-only webhook endpoint in
registerWebhook, which will fail; update initialization to detect when the
resolved project is a user-owned project (e.g., when projectV2 came from
res.user) and either throw a clear GitHubNotFoundError-like rejection stating
"Personal account projects do not support webhooks" (set a flag or avoid setting
this.projectId for user-owned projects) or modify registerWebhook to check that
flag/ownership before calling the org endpoint and short-circuit with the same
clear error; use the symbols projectV2, res.user, this.projectId,
registerWebhook and this.config.org to locate and implement the detection and
error path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 33ff7c22-9c70-4232-af5c-2fd0c85d1276
📒 Files selected for processing (8)
apps/api/package.jsonapps/api/src/scripts/sync.tsapps/api/src/services/sync.test.tsapps/api/src/services/sync.tspackages/adapters/src/github-projects/adapter.test.tspackages/adapters/src/github-projects/adapter.tspackages/adapters/src/github-projects/graphql-queries.tspackages/adapters/src/github-projects/graphql-types.ts
1. scripts/sync.ts: validate HELM_KNOWLEDGE_REPO_PATH upfront with clear error message (same pattern as GITHUB_TOKEN). 2. sync.ts idempotency: preserve existing createdAt on re-sync; skip write entirely when stage is unchanged; append transition event when stage changes. 3. sync.test.ts: vi.restoreAllMocks() in afterEach eliminates console.warn spy leak on assertion failure. 4. sync.test.ts: idempotency tests now verify skip-on-unchanged and createdAt preservation on stage change (replaces shallow field checks). 5. adapter.test.ts: add user-owner fallback tests — resolves via user when organization is null; throws GitHubNotFoundError when both null. 6. adapter.ts registerWebhook: detect personal account (resolved via user) and throw GitHubConfigError with clear explanation before attempting the org-only /orgs/{org}/hooks endpoint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/api/src/scripts/sync.ts`:
- Around line 31-38: Whitelist and validate the filesystem path environment
variables before using them: apply a regex allowlist (e.g., EXTERNAL_ID_REGEX or
a PATH_ALLOWLIST_REGEX) to process.env.HELM_KNOWLEDGE_REPO_PATH and
process.env.HELM_DATA_DIR (the variables assigned to knowledgeRepoPath and
envDataDir) and reject any values that do not match the pattern, logging a clear
error and exiting early; then only use the validated value to compute dataRoot
(do not forward or join raw env values without this check).
In `@apps/api/src/services/sync.test.ts`:
- Around line 70-72: Replace the POSIX-only split-based key extraction in
_readJson with path.basename to handle platform-specific separators: import
basename from 'path' (or use path.basename) and compute key = basename(filePath,
'.json') instead of filePath.split('/').at(-1)!.replace('.json',''); keep
returning (existingState[key] ?? null) as T | null so existingState lookup and
behavior remain unchanged.
In `@apps/api/src/services/sync.ts`:
- Around line 93-118: The code assumes the JSON returned by
readJsonFn<ItemState>(filePath) is a valid ItemState and directly dereferences
existing.history (causing crashes on malformed disk data); fix by validating the
parsed payload with a strict runtime schema (e.g. a Zod schema for
ItemState/WorkflowEvent) or a guarded type predicate right after readJsonFn
returns, use safeParse/guard to detect invalid shape and treat invalid result as
null (rebuild a fresh ItemState or initialize history = [] and createdAt
defaults) before referencing existing.history, and update the branch that builds
state (the event creation and state = { ...existing, history:
[...existing.history, event], ... }) to only use existing.history after
validation so you never spread undefined or non-array values.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1377d4b6-e6fe-495c-8aff-56302a1e4d1a
📒 Files selected for processing (5)
apps/api/src/scripts/sync.tsapps/api/src/services/sync.test.tsapps/api/src/services/sync.tspackages/adapters/src/github-projects/adapter.test.tspackages/adapters/src/github-projects/adapter.ts
7. scripts/sync.ts: validate HELM_KNOWLEDGE_REPO_PATH and HELM_DATA_DIR
against SAFE_FS_PATH_REGEX whitelist before forwarding to file ops.
8. sync.test.ts: replace split('/').at(-1) with basename() for portable
path key extraction across OS path separators.
9. sync.ts: validate ItemState read from disk with Zod before dereferencing
— malformed JSON is treated as null and rebuilds the item from scratch
rather than crashing on spread/property access mid-run.
Addresses CodeRabbit review comments on PR #12.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
♻️ Duplicate comments (2)
apps/api/src/services/sync.ts (1)
13-27:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd
.strict()to Zod schemas for robust validation of persisted state.The
WorkflowEventSchemaandItemStateSchemavalidate external data (disk JSON) but don't use.strict()mode, allowing unexpected fields to pass validation silently. This can mask schema drift or corruption in persisted files.🔒 Proposed fix
const WorkflowEventSchema = z.object({ fromStage: z.string().nullable(), toStage: z.string(), triggeredBy: z.string(), at: z.string(), note: z.string().optional(), -}); +}).strict(); + const ItemStateSchema = z.object({ externalId: z.string(), productSlug: z.string(), currentStage: z.string(), history: z.array(WorkflowEventSchema), createdAt: z.string(), updatedAt: z.string(), -}); +}).strict();As per coding guidelines,
Use Zod for API request/response validation with .strict() mode enabled. While this validates persisted state rather than API input, disk data is still external input that should be strictly validated.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/services/sync.ts` around lines 13 - 27, The Zod schemas WorkflowEventSchema and ItemStateSchema are currently permissive and can accept unexpected fields; update both schema definitions to enable strict mode by chaining .strict() onto WorkflowEventSchema and ItemStateSchema so any extra properties in persisted JSON will cause validation to fail, ensuring you apply this to the top-level objects (the WorkflowEventSchema and the ItemStateSchema) used for loading disk state.apps/api/src/scripts/sync.ts (1)
37-37:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRegex pattern breaks Windows filesystem path compatibility.
The current pattern
/^[A-Za-z0-9._/\-]+$/rejects backslash and colon, which are essential for Windows paths likeC:\Users\username\helm. The previous review suggested including\\and:to support both Unix and Windows paths, but the current regex still omits them.🔧 Proposed fix to support cross-platform paths
- const SAFE_FS_PATH_REGEX = /^[A-Za-z0-9._/\-]+$/; + const SAFE_FS_PATH_REGEX = /^[A-Za-z0-9._/\\:-]+$/;As per coding guidelines, validate external filesystem paths with regex whitelist before constructing file paths to prevent path traversal attacks (and ensure cross-platform compatibility).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/scripts/sync.ts` at line 37, The SAFE_FS_PATH_REGEX currently disallows Windows-specific characters (backslash and colon) causing valid Windows paths to be rejected; update SAFE_FS_PATH_REGEX in apps/api/src/scripts/sync.ts to include both ":" and "\" inside the character class (escape the backslash in the literal) so the pattern accepts cross-platform paths while still using the start/end anchors to prevent traversal (i.e., add ":" and "\\" to the allowed characters in SAFE_FS_PATH_REGEX).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@apps/api/src/scripts/sync.ts`:
- Line 37: The SAFE_FS_PATH_REGEX currently disallows Windows-specific
characters (backslash and colon) causing valid Windows paths to be rejected;
update SAFE_FS_PATH_REGEX in apps/api/src/scripts/sync.ts to include both ":"
and "\" inside the character class (escape the backslash in the literal) so the
pattern accepts cross-platform paths while still using the start/end anchors to
prevent traversal (i.e., add ":" and "\\" to the allowed characters in
SAFE_FS_PATH_REGEX).
In `@apps/api/src/services/sync.ts`:
- Around line 13-27: The Zod schemas WorkflowEventSchema and ItemStateSchema are
currently permissive and can accept unexpected fields; update both schema
definitions to enable strict mode by chaining .strict() onto WorkflowEventSchema
and ItemStateSchema so any extra properties in persisted JSON will cause
validation to fail, ensuring you apply this to the top-level objects (the
WorkflowEventSchema and the ItemStateSchema) used for loading disk state.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 10b51e5b-7c0c-4517-be88-a1a897cd6e7a
📒 Files selected for processing (3)
apps/api/src/scripts/sync.tsapps/api/src/services/sync.test.tsapps/api/src/services/sync.ts
…t support
The dual organization+user query caused GraphqlResponseError on personal
accounts: GitHub returns data+errors[] simultaneously when organization()
fails, and @octokit/graphql throws on any errors[], so the user fallback
was never reached (bug reproduced with lhpaul/helm-playground).
Fix: replace dual query with repositoryOwner(login:) + inline fragments
— polymorphic, no partial-data errors, works for both Org and User:
query GetProject { repositoryOwner(login:) {
__typename
... on Organization { projectV2(number:) { id title } }
... on User { projectV2(number:) { id title } }
}}
ensureProjectId now reads res.repositoryOwner.projectV2 directly.
isPersonalAccount set via owner.__typename === 'User'.
mapError: include gqlErrors[0].message in thrown error so sync logs show
actionable detail instead of generic 'GitHub API error'.
Tests: add 5 tests for org/user/null-owner/null-project/regression cases.
Regression test documents the exact failure mode (data+errors) and
proves repositoryOwner resolves without errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Summary
@helm/adapters: extendsGET_PROJECTto query bothorganizationanduserfields — supports personal GitHub accounts (likelhpaul) alongside org accounts.ensureProjectIdfalls back touser.projectV2whenorganizationis null. No breaking change for org accounts.apps/api/src/services/sync.ts:syncProductItems(product, token, dataRoot, options?)— fetches all items viaGitHubProjectsAdapter.listItems()(handles pagination), maps toItemStatewithtriggeredBy: 'sync:github-projects', writes viawriteJsonAtomic(idempotent overwrite). Injectable_listItemsand_writeJsonfor testing.apps/api/src/scripts/sync.ts: CLI entry point — validates args, readsGITHUB_TOKEN, finds product in registry, exits gracefully for non-github providers.package.json: adds"sync": "bun run src/scripts/sync.ts".Usage
Test plan
pnpm --filter @helm/adapters test— 68 tests pass (adapter org+user fallback tested via existing mock)pnpm --filter @helm/api test— 75 tests pass (13 new sync tests)pnpm --filter @helm/api build— cleanRelated
lhpaul/helm-knowledgePR — adds Step 4.5 (item hydration) tooperations/onboarding-product.md🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes / Behavior
Tests
Chores