Skip to content

SF-28 add monorepo-aware CI workflow#80

Merged
thyaravind merged 1 commit intodevfrom
SF-28-ci-posture
Apr 16, 2026
Merged

SF-28 add monorepo-aware CI workflow#80
thyaravind merged 1 commit intodevfrom
SF-28-ci-posture

Conversation

@thyaravind
Copy link
Copy Markdown
Member

@thyaravind thyaravind commented Apr 4, 2026

Summary

  • add a monorepo-aware CI workflow for JS, Python, and docs checks
  • compute impacted packages from the current branch without assuming origin/dev and next/next have the same project set
  • install local Python package dependencies in topological order before package checks

Validation

  • node scripts/ci-plan.mjs
  • node scripts/ci-run-python-package.mjs --dry-run authfn/python
  • node scripts/ci-run-python-package.mjs --dry-run searchfn/python

Summary by Sourcery

Introduce a monorepo-aware CI workflow that plans and runs targeted JavaScript, Python, and docs checks based on changed files.

New Features:

  • Add a CI planning script that determines impacted JS, Python, and docs components from Git diffs and emits workflow outputs.
  • Add a Python package CI runner that installs local package dependencies in topological order and runs linting, type checking, tests, and builds per package.
  • Add a GitHub Actions CI workflow that uses the planning script to conditionally run JS, Python, and docs jobs with appropriate scoping.

Enhancements:

  • Introduce shared CI utility functions for discovering JS/Python packages, parsing pyproject.toml metadata, and resolving local dependency order.

Summary by cubic

Implements SF-28 by adding a monorepo‑aware CI that runs only impacted JavaScript, Python, and docs checks. Speeds up CI and installs Python local deps in topological order for correct builds.

  • New Features
    • Adds .github/workflows/ci.yml with a Plan job that resolves a safe diff base and determines impacted JS, Python, and docs.
    • JavaScript: runs turbo build/test/lint/typecheck with scoped --filter targets; falls back to full runs on global changes.
    • Python: builds a matrix of affected packages; installs local deps in topo order; runs ruff, mypy, pytest, and build via scripts/ci-run-python-package.mjs.
    • Docs: builds docs and runs guardrails (check:coverage, check:content, check:snippets, check:links).
    • Adds scripts/ci-plan.mjs and scripts/ci-utils.mjs to detect changed packages and compute filters without assuming origin/dev and origin/next have the same projects.

Written for commit 52f9a01. Summary will update on new commits.

@linear
Copy link
Copy Markdown

linear bot commented Apr 4, 2026

@sourcery-ai
Copy link
Copy Markdown

sourcery-ai bot commented Apr 4, 2026

Reviewer's Guide

Introduces a monorepo-aware CI planning script and GitHub Actions workflow that determine impacted JS, Python, and docs projects from git diffs, then run targeted builds/tests/lint/typechecks, including topological installation and checking of Python packages and their local dependencies.

Sequence diagram for Python package CI execution per matrix entry

sequenceDiagram
  participant GH as GithubActions_python_job
  participant RUNPY as ci_run_python_package_mjs
  participant UTIL as ci_utils_mjs
  participant PIP as python_pip
  participant TOOLS as python_tools_build_ruff_mypy_pytest

  GH->>RUNPY: node scripts/ci-run-python-package.mjs packageDir
  RUNPY->>UTIL: discoverPythonPackages(repoRoot)
  UTIL-->>RUNPY: python package manifests
  RUNPY->>RUNPY: normalize names
  RUNPY->>RUNPY: find targetManifest by dir
  RUNPY->>UTIL: topologicalLocalDependencies(targetManifest.name, manifestsByName)
  UTIL-->>RUNPY: ordered manifests
  RUNPY->>RUNPY: split dependencyInstalls vs target

  alt has local dependencyInstalls
    RUNPY->>PIP: python -m pip install -e dep1 -e dep2 ...
    PIP-->>RUNPY: dependencies installed
  end

  RUNPY->>PIP: python -m pip install -e target[dev or base]
  PIP-->>RUNPY: target package installed

  alt targetManifest.hasDevDependencyGroup is false
    RUNPY->>PIP: python -m pip install build pytest pytest-asyncio pytest-cov mypy ruff
    PIP-->>RUNPY: dev tools installed
  end

  RUNPY->>TOOLS: python -m build --wheel --sdist .
  TOOLS-->>RUNPY: build artifacts

  opt targetManifest.hasRuff
    RUNPY->>TOOLS: python -m ruff check .
    TOOLS-->>RUNPY: ruff complete
  end

  opt targetManifest.hasMypy
    RUNPY->>TOOLS: python -m mypy .
    TOOLS-->>RUNPY: mypy complete
  end

  opt targetManifest.hasPytest or tests dir exists
    RUNPY->>TOOLS: python -m pytest .
    TOOLS-->>RUNPY: tests complete
  end

  RUNPY-->>GH: JSON summary (package, path, dependencies, dryRun)
Loading

File-Level Changes

Change Details Files
Add shared CI utilities to discover JS/Python packages and resolve their dependency relationships from manifests.
  • Implement a filesystem walker that finds manifest files while skipping common build/output directories.
  • Add discovery of JS packages from package.json files with basic metadata and scripts extraction.
  • Add parsing of Python pyproject.toml files to extract package name, dependency list, dev-dependency presence, and whether ruff, mypy, and pytest are configured.
  • Add utilities for resolving nearest manifest directory, computing topological order of local Python dependencies, and checking manifest existence.
scripts/ci-utils.mjs
Add a CI planning script that computes the diff base, determines changed files, and selects affected JS, Python, and docs targets for the workflow.
  • Resolve a suitable diff base commit using GitHub event data, falling back through PR base, push.before, remote default branch, parent commit, and HEAD.
  • Parse git diff output into structured change entries, including rename/copy handling, and collect unique changed paths.
  • Define patterns that trigger full JS/Python runs or docs checks, and detect per-package changes via nearest-manifest directory resolution, including deleted manifests.
  • Select impacted JS packages and Python packages (including reverse-dependents for Python), compute JS turbo filter flags by task type, and build a Python job matrix.
  • Emit a JSON summary to stdout and expose all relevant planning data via GITHUB_OUTPUT for downstream jobs.
scripts/ci-plan.mjs
Add a monorepo-aware GitHub Actions CI workflow that uses the plan to conditionally run JS, Python, and docs jobs.
  • Add a planning job that checks out the repo, refreshes common target branches, runs the CI planning script, and publishes its outputs as reusable job outputs.
  • Add a JavaScript job that installs root dependencies and runs turbo build/test/lint/typecheck either for all packages or with dynamically computed filter flags.
  • Add a Python matrix job that uses the planned package matrix to run per-package checks via a dedicated script, with cached Python setup.
  • Add a docs job that conditionally runs docs build and guardrail checks when docs-related paths change.
.github/workflows/ci.yml
Add a script to run Python package checks including local dependency installation, builds, linting, type checking, and tests in a topological and configurable way.
  • Discover all Python packages, normalize names, and map them by directory and name to validate the requested target package.
  • Compute topological order of local dependencies for the target package and install them in editable mode prior to the target package.
  • Install the target package (with [dev] extras if available), and fall back to installing build/pytest/mypy/ruff tooling explicitly when no dev group is declared.
  • Run build, ruff, mypy, and pytest conditionally based on manifest flags and test directory presence, supporting a --dry-run mode that only prints the planned commands and a JSON summary.
scripts/ci-run-python-package.mjs

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 4, 2026

Warning

Rate limit exceeded

@thyaravind has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 7 minutes and 43 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 7 minutes and 43 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 91a6921e-30fc-4df6-8ea6-b37a8902fbbd

📥 Commits

Reviewing files that changed from the base of the PR and between e105c93 and 52f9a01.

📒 Files selected for processing (4)
  • .github/workflows/ci.yml
  • scripts/ci-plan.mjs
  • scripts/ci-run-python-package.mjs
  • scripts/ci-utils.mjs
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch SF-28-ci-posture

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@qodo-code-review
Copy link
Copy Markdown

Review Summary by Qodo

Add monorepo-aware CI workflow with smart change detection

✨ Enhancement

Grey Divider

Walkthroughs

Description
• Add monorepo-aware CI workflow for JavaScript, Python, and docs checks
• Compute impacted packages from branch without assuming origin/dev and next/next consistency
• Install local Python package dependencies in topological order before checks
• Implement intelligent change detection to run only affected packages
Diagram
flowchart LR
  A["Git Changes"] --> B["CI Plan Script"]
  B --> C["Detect JS Changes"]
  B --> D["Detect Python Changes"]
  B --> E["Detect Docs Changes"]
  C --> F["JavaScript Job"]
  D --> G["Python Job"]
  E --> H["Docs Job"]
  F --> I["Build/Test/Lint/Typecheck"]
  G --> J["Install Dependencies<br/>Topologically"]
  J --> K["Run Checks"]
  H --> L["Docs Checks"]
Loading

Grey Divider

File Changes

1. .github/workflows/ci.yml ⚙️ Configuration changes +191/-0

GitHub Actions CI workflow with monorepo support

• New GitHub Actions workflow triggered on push/PR to dev, next, main branches
• Plan job computes impacted packages and outputs filters for downstream jobs
• JavaScript job runs build, typecheck, test, lint with turbo using computed filters
• Python job runs matrix of affected packages with dependency installation
• Docs job runs build and multiple checks (coverage, content, snippets, links)

.github/workflows/ci.yml


2. scripts/ci-plan.mjs ✨ Enhancement +338/-0

CI planning script with change detection logic

• Resolves diff base from GitHub event, push event, or default branch
• Parses changed files and matches against JS/Python/Docs patterns
• Detects full runs when global files change or manifests are deleted
• Collects Python dependents transitively to ensure dependent packages are tested
• Outputs turbo filters and matrix for downstream jobs

scripts/ci-plan.mjs


3. scripts/ci-run-python-package.mjs ✨ Enhancement +105/-0

Python package CI runner with dependency ordering

• Validates target Python package directory exists
• Computes topological order of local dependencies
• Installs dependencies in order before target package installation
• Runs build, linting, type checking, and tests based on manifest configuration
• Supports dry-run mode for validation

scripts/ci-run-python-package.mjs


View more (1)
4. scripts/ci-utils.mjs ✨ Enhancement +211/-0

Shared utilities for package discovery and parsing

• Walks directory tree to discover JavaScript and Python packages
• Parses pyproject.toml files to extract package metadata and dependencies
• Normalizes dependency names and extracts tool configurations (ruff, mypy, pytest)
• Finds nearest manifest directory for changed files
• Computes topological sort of local dependencies for installation order

scripts/ci-utils.mjs


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review bot commented Apr 4, 2026

Code Review by Qodo

🐞 Bugs (3) 📘 Rule violations (0) 📎 Requirement gaps (0) 🎨 UX Issues (0)

Grey Divider


Action required

1. Broken docs workspace commands 🐞 Bug ≡ Correctness
Description
The docs job runs npm -w docs and several check:* scripts that are not defined by any existing
docs workspace, so the job will fail whenever it is triggered. Additionally, there is no docs
workspace declared in the repo’s root workspaces list, so npm -w docs cannot resolve a target.
Code

.github/workflows/ci.yml[R185-191]

+      - name: Docs checks
+        run: |
+          npm -w docs run build
+          npm -w docs run check:coverage
+          npm -w docs run check:content
+          npm -w docs run check:snippets
+          npm -w docs run check:links
Evidence
The workflow invokes npm -w docs run ..., but the root workspace list does not define a docs
workspace, and the existing docs apps are @datafn/docs and @searchfn/docs (neither defines
check:coverage, check:content, check:snippets, or check:links). This makes the docs job
commands unresolvable and/or the scripts missing, causing an immediate CI failure when the job runs.

.github/workflows/ci.yml[176-191]
package.json[6-27]
datafn/docs/package.json[1-10]
searchfn/docs/package.json[1-10]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The GitHub Actions `docs` job targets a non-existent workspace (`docs`) and runs `check:*` scripts that aren’t defined in the repo’s docs packages, causing the job to fail when triggered.

### Issue Context
- Root workspaces include `datafn/docs` and `searchfn/docs` packages (named `@datafn/docs` / `@searchfn/docs`).
- Neither of these packages defines `check:coverage`, `check:content`, `check:snippets`, or `check:links` scripts.

### Fix Focus Areas
- .github/workflows/ci.yml[176-191]
- package.json[6-27]
- datafn/docs/package.json[1-10]
- searchfn/docs/package.json[1-10]

### What to change
- Update the workflow to run docs checks against the actual workspace(s), e.g. `npm -w @datafn/docs run build` and/or `npm -w @searchfn/docs run build`.
- Either add the referenced `check:*` scripts to the targeted docs package(s) or remove/replace those commands with scripts that actually exist (e.g., only run `build` if that’s the intended guardrail).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Docs changes never trigger 🐞 Bug ≡ Correctness
Description
ci-plan.mjs only marks run_docs true for paths like docs/ and docsfn/, but the repo’s docs
apps live under datafn/docs/ and searchfn/docs/, so docs edits won’t trigger the Docs Guardrails
job. This creates a systematic coverage gap where docs changes bypass docs validation entirely.
Code

scripts/ci-plan.mjs[R31-37]

+const DOCS_PATTERNS = [
+  /^docs\//,
+  /^docsfn\//,
+  /^packages\/docs-theme\//,
+  /^scripts\/check-docs-/,
+  /^scripts\/docs-coverage-check\.mjs$/,
+];
Evidence
DOCS_PATTERNS does not match the repo’s actual docs directories, so runDocs remains false for
edits under datafn/docs or searchfn/docs, preventing the docs job from running for real docs
changes.

scripts/ci-plan.mjs[31-37]
datafn/docs/package.json[1-10]
searchfn/docs/package.json[1-10]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
Docs changes in this repo won’t set `run_docs=true` because `DOCS_PATTERNS` doesn’t include the actual docs app locations.

### Issue Context
The repo contains docs apps under `datafn/docs` and `searchfn/docs`, but `DOCS_PATTERNS` only covers `docs/`, `docsfn/`, and a few other paths.

### Fix Focus Areas
- scripts/ci-plan.mjs[31-37]
- datafn/docs/package.json[1-10]
- searchfn/docs/package.json[1-10]

### What to change
- Expand `DOCS_PATTERNS` to include the real docs directories (e.g., `^datafn/docs/` and `^searchfn/docs/`) and any other docs locations you intend to guard.
- Keep the patterns aligned with whatever docs job targets in `.github/workflows/ci.yml` to avoid drift.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. JS plan mixes subrepos 🐞 Bug ≡ Correctness
Description
ci-plan.mjs discovers JS packages by walking all package.json files in the entire repo, which
includes separate subprojects like searchfn/* that have their own package-lock.json and are not
included as root workspaces. The workflow then runs only a root-level npm ci and Turbo commands,
so changes under those subprojects can lead to skipped checks or failing task runs due to missing
install context.
Code

scripts/ci-plan.mjs[R226-233]

+const jsPackages = discoverJsPackages(repoRoot);
+const pythonPackages = discoverPythonPackages(repoRoot).map((manifest) => ({
+  ...manifest,
+  name: manifest.name.toLowerCase(),
+}));
+
+const jsByDir = new Map(jsPackages.map((manifest) => [manifest.dir, manifest]));
+const pythonByDir = new Map(pythonPackages.map((manifest) => [manifest.dir, manifest]));
Evidence
The CI installs dependencies only once at the repo root (npm ci), but the plan discovers packages
across the whole repo (including searchfn/core etc.). Root workspaces list does not include
searchfn/*, and searchfn has its own lockfile, indicating a separate install boundary;
planning/running Turbo tasks for packages in that boundary from the root install context is
inconsistent and can break CI coverage for those areas.

scripts/ci-plan.mjs[226-233]
scripts/ci-utils.mjs[37-49]
.github/workflows/ci.yml[76-84]
package.json[6-27]
searchfn/core/package.json[1-30]
searchfn/package-lock.json[1-20]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The JS CI plan currently discovers *all* `package.json` files across the repository, including packages that are not part of the root workspace/install context (notably the `searchfn/*` subproject which has its own lockfile). The JS workflow job installs only root dependencies (`npm ci` at repo root) and then runs Turbo from the root, so changes in those subprojects won’t be handled correctly.

### Issue Context
- Root `package.json` workspaces don’t include `searchfn/*`.
- `searchfn/` has its own `package-lock.json` and multiple nested packages (e.g. `searchfn/core/package.json`).

### Fix Focus Areas
- scripts/ci-utils.mjs[37-49]
- scripts/ci-plan.mjs[226-233]
- .github/workflows/ci.yml[76-84]
- package.json[6-27]

### What to change
- Adjust `discoverJsPackages`/`ci-plan` to only include packages that are actually part of the root workspace graph (e.g., derive allowed package roots from the root `workspaces` globs and only collect manifests under those).
- OR, if subprojects like `searchfn/*` are intended to be checked, add explicit CI handling for them (e.g., a separate job that runs `npm ci` in `searchfn/` and runs that subproject’s build/test/lint/typecheck commands).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

CI Feedback 🧐

A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

Action: Python (searchfn, searchfn/python)

Failed stage: Check package [❌]

Failed test name: ""

Failure summary:

The action failed during dependency installation for searchfn:
- python -m pip install -e
searchfn/python[dev] failed because searchfn requires datafn>=0.1.0, but only datafn version 0.0.1
was available (installed earlier in the job).
- Pip reported: Could not find a version that
satisfies the requirement datafn>=0.1.0 (from searchfn) (from versions: 0.0.1), leading to exit code
1.
- The failure surfaced in the CI script scripts/ci-run-python-package.mjs at lines 47 and 64,
where the pip command was executed.

Relevant error logs:
1:  ##[group]Runner Image Provisioner
2:  Hosted Compute Agent
...

250:  Stored in directory: /tmp/pip-ephem-wheel-cache-47gqltco/wheels/77/65/16/1d336c35a0f349b0786ffcc2d953e502951b537fb98b30c85f
251:  Successfully built datafn
252:  Installing collected packages: datafn
253:  Successfully installed datafn-0.0.1
254:  $ python -m pip install -e searchfn/python[dev]
255:  Obtaining file:///home/runner/work/super-functions/super-functions/searchfn/python
256:  Installing build dependencies: started
257:  Installing build dependencies: finished with status 'done'
258:  Checking if build backend supports build_editable: started
259:  Checking if build backend supports build_editable: finished with status 'done'
260:  Getting requirements to build editable: started
261:  Getting requirements to build editable: finished with status 'done'
262:  Preparing editable metadata (pyproject.toml): started
263:  Preparing editable metadata (pyproject.toml): finished with status 'done'
264:  INFO: pip is looking at multiple versions of searchfn to determine which version is compatible with other requirements. This could take a while.
265:  ERROR: Could not find a version that satisfies the requirement datafn>=0.1.0 (from searchfn) (from versions: 0.0.1)
266:  ERROR: No matching distribution found for datafn>=0.1.0
267:  node:internal/errors:984
268:  const err = new Error(message);
269:  ^
270:  Error: Command failed: python -m pip install -e searchfn/python[dev]
271:  at genericNodeError (node:internal/errors:984:15)
272:  at wrappedFn (node:internal/errors:538:14)
273:  at checkExecSyncError (node:child_process:891:11)
274:  at execFileSync (node:child_process:927:15)
275:  at run (file:///home/runner/work/super-functions/super-functions/scripts/ci-run-python-package.mjs:47:3)
276:  at file:///home/runner/work/super-functions/super-functions/scripts/ci-run-python-package.mjs:64:1
277:  at ModuleJob.run (node:internal/modules/esm/module_job:325:25)
278:  at async ModuleLoader.import (node:internal/modules/esm/loader:606:24)
279:  at async asyncRunEntryPointWithESMLoader (node:internal/modules/run_main:117:5) {
280:  status: 1,
281:  signal: null,
282:  output: [ null, null, null ],
283:  pid: 2474,
284:  stdout: null,
285:  stderr: null
286:  }
287:  Node.js v20.20.2
288:  ##[error]Process completed with exit code 1.
289:  Post job cleanup.

Copy link
Copy Markdown

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • The TOML parsing in parsePyprojectManifest (e.g., readTomlArray + extractQuotedStrings) assumes a very specific formatting (double-quoted strings, simple arrays, no inline tables), which may break on valid but differently formatted pyproject.toml files; consider either using a TOML parser or loosening these assumptions (e.g., support single quotes and more general array syntax).
  • The Python dev tooling detection is tightly coupled to a single optional-dependency group named dev; if a package uses different group names (e.g., test or lint) or defines tools only in [tool.*] sections, hasDevDependencyGroup/hasPytest/hasMypy/hasRuff may be inaccurate, so it could be worth making this detection more flexible or explicitly keyed on tool presence rather than a hard-coded group name.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The TOML parsing in `parsePyprojectManifest` (e.g., `readTomlArray` + `extractQuotedStrings`) assumes a very specific formatting (double-quoted strings, simple arrays, no inline tables), which may break on valid but differently formatted `pyproject.toml` files; consider either using a TOML parser or loosening these assumptions (e.g., support single quotes and more general array syntax).
- The Python dev tooling detection is tightly coupled to a single optional-dependency group named `dev`; if a package uses different group names (e.g., `test` or `lint`) or defines tools only in `[tool.*]` sections, `hasDevDependencyGroup`/`hasPytest`/`hasMypy`/`hasRuff` may be inaccurate, so it could be worth making this detection more flexible or explicitly keyed on tool presence rather than a hard-coded group name.

## Individual Comments

### Comment 1
<location path="scripts/ci-utils.mjs" line_range="6-13" />
<code_context>
+import { existsSync, readdirSync, readFileSync } from "node:fs";
+import path from "node:path";
+
+const IGNORED_DIRS = new Set([
+  ".git",
+  ".next",
+  ".turbo",
+  "build",
+  "coverage",
+  "dist",
+  "node_modules",
+]);
+
</code_context>
<issue_to_address>
**suggestion (performance):** Consider ignoring common Python artifact directories in the recursive file walk to avoid unnecessary IO.

`IGNORED_DIRS` currently covers JS build artifacts but not common Python ones such as `.venv`, `.mypy_cache`, `.pytest_cache`, `.ruff_cache`, and `__pycache__`. Since `walkFiles` only needs to find `pyproject.toml`, adding these would cut down unnecessary traversal and avoid walking into in-repo virtualenvs, especially in large repos.
</issue_to_address>

### Comment 2
<location path="scripts/ci-utils.mjs" line_range="52-42" />
<code_context>
+    });
+}
+
+function extractQuotedStrings(source) {
+  const values = [];
+  const matcher = /"([^"]+)"/g;
+  let match;
+
+  while ((match = matcher.exec(source)) !== null) {
+    values.push(match[1]);
+  }
+
+  return values;
+}
+
+function readTomlArray(lines, startIndex) {
+  let cursor = startIndex;
+  let combined = lines[cursor];
+
+  while (!combined.includes("]") && cursor + 1 < lines.length) {
+    cursor += 1;
+    combined += `\n${lines[cursor]}`;
+  }
+
+  return {
+    items: extractQuotedStrings(combined),
+    nextIndex: cursor + 1,
</code_context>
<issue_to_address>
**issue (bug_risk):** The ad‑hoc TOML parsing for pyproject dependencies is fairly brittle; consider hardening it or documenting the assumptions.

This pipeline currently only handles a narrow TOML subset (double-quoted strings, simple `key = value` lines, `[...]` arrays) and will misparse or ignore many valid `pyproject.toml` constructs (single quotes, inline tables, trailing comments, multi-line strings, etc.), which can silently drop dependencies from the graph.

If adding a TOML parser is out of scope, at least tighten the heuristics (e.g., explicitly ignoring lines with trailing comments or non-string values) so unexpected formats are skipped rather than producing incorrect dependency names.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread scripts/ci-utils.mjs
Comment on lines +6 to +13
const IGNORED_DIRS = new Set([
".git",
".next",
".turbo",
"build",
"coverage",
"dist",
"node_modules",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (performance): Consider ignoring common Python artifact directories in the recursive file walk to avoid unnecessary IO.

IGNORED_DIRS currently covers JS build artifacts but not common Python ones such as .venv, .mypy_cache, .pytest_cache, .ruff_cache, and __pycache__. Since walkFiles only needs to find pyproject.toml, adding these would cut down unnecessary traversal and avoid walking into in-repo virtualenvs, especially in large repos.

Comment thread scripts/ci-utils.mjs
.filter((relativePath) => relativePath !== "package.json")
.map((relativePath) => {
const manifest = JSON.parse(readFileSync(path.join(rootDir, relativePath), "utf8"));
return {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): The ad‑hoc TOML parsing for pyproject dependencies is fairly brittle; consider hardening it or documenting the assumptions.

This pipeline currently only handles a narrow TOML subset (double-quoted strings, simple key = value lines, [...] arrays) and will misparse or ignore many valid pyproject.toml constructs (single quotes, inline tables, trailing comments, multi-line strings, etc.), which can silently drop dependencies from the graph.

If adding a TOML parser is out of scope, at least tighten the heuristics (e.g., explicitly ignoring lines with trailing comments or non-string values) so unexpected formats are skipped rather than producing incorrect dependency names.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 4 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="scripts/ci-utils.mjs">

<violation number="1" location="scripts/ci-utils.mjs:6">
P2: Missing Python-specific directories (`.venv`, `__pycache__`, `.mypy_cache`, `.pytest_cache`, `.ruff_cache`) from `IGNORED_DIRS`. Since `walkFiles` traverses the actual filesystem (not git), a local `.venv` directory would cause `discoverPythonPackages` to pick up `pyproject.toml` files from installed third-party packages, corrupting the dependency graph and CI matrix.</violation>

<violation number="2" location="scripts/ci-utils.mjs:184">
P2: Detect and fail on circular local dependencies instead of silently returning an invalid “topological” order.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread scripts/ci-utils.mjs
const visited = new Set();

function visit(name) {
if (visited.has(name)) {
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Detect and fail on circular local dependencies instead of silently returning an invalid “topological” order.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At scripts/ci-utils.mjs, line 184:

<comment>Detect and fail on circular local dependencies instead of silently returning an invalid “topological” order.</comment>

<file context>
@@ -0,0 +1,211 @@
+  const visited = new Set();
+
+  function visit(name) {
+    if (visited.has(name)) {
+      return;
+    }
</file context>
Fix with Cubic

Comment thread scripts/ci-utils.mjs
import { existsSync, readdirSync, readFileSync } from "node:fs";
import path from "node:path";

const IGNORED_DIRS = new Set([
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai bot Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Missing Python-specific directories (.venv, __pycache__, .mypy_cache, .pytest_cache, .ruff_cache) from IGNORED_DIRS. Since walkFiles traverses the actual filesystem (not git), a local .venv directory would cause discoverPythonPackages to pick up pyproject.toml files from installed third-party packages, corrupting the dependency graph and CI matrix.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At scripts/ci-utils.mjs, line 6:

<comment>Missing Python-specific directories (`.venv`, `__pycache__`, `.mypy_cache`, `.pytest_cache`, `.ruff_cache`) from `IGNORED_DIRS`. Since `walkFiles` traverses the actual filesystem (not git), a local `.venv` directory would cause `discoverPythonPackages` to pick up `pyproject.toml` files from installed third-party packages, corrupting the dependency graph and CI matrix.</comment>

<file context>
@@ -0,0 +1,211 @@
+import { existsSync, readdirSync, readFileSync } from "node:fs";
+import path from "node:path";
+
+const IGNORED_DIRS = new Set([
+  ".git",
+  ".next",
</file context>
Fix with Cubic

Comment thread .github/workflows/ci.yml
Comment on lines +185 to +191
- name: Docs checks
run: |
npm -w docs run build
npm -w docs run check:coverage
npm -w docs run check:content
npm -w docs run check:snippets
npm -w docs run check:links
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Broken docs workspace commands 🐞 Bug ≡ Correctness

The docs job runs npm -w docs and several check:* scripts that are not defined by any existing
docs workspace, so the job will fail whenever it is triggered. Additionally, there is no docs
workspace declared in the repo’s root workspaces list, so npm -w docs cannot resolve a target.
Agent Prompt
### Issue description
The GitHub Actions `docs` job targets a non-existent workspace (`docs`) and runs `check:*` scripts that aren’t defined in the repo’s docs packages, causing the job to fail when triggered.

### Issue Context
- Root workspaces include `datafn/docs` and `searchfn/docs` packages (named `@datafn/docs` / `@searchfn/docs`).
- Neither of these packages defines `check:coverage`, `check:content`, `check:snippets`, or `check:links` scripts.

### Fix Focus Areas
- .github/workflows/ci.yml[176-191]
- package.json[6-27]
- datafn/docs/package.json[1-10]
- searchfn/docs/package.json[1-10]

### What to change
- Update the workflow to run docs checks against the actual workspace(s), e.g. `npm -w @datafn/docs run build` and/or `npm -w @searchfn/docs run build`.
- Either add the referenced `check:*` scripts to the targeted docs package(s) or remove/replace those commands with scripts that actually exist (e.g., only run `build` if that’s the intended guardrail).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment thread scripts/ci-plan.mjs
Comment on lines +31 to +37
const DOCS_PATTERNS = [
/^docs\//,
/^docsfn\//,
/^packages\/docs-theme\//,
/^scripts\/check-docs-/,
/^scripts\/docs-coverage-check\.mjs$/,
];
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

2. Docs changes never trigger 🐞 Bug ≡ Correctness

ci-plan.mjs only marks run_docs true for paths like docs/ and docsfn/, but the repo’s docs
apps live under datafn/docs/ and searchfn/docs/, so docs edits won’t trigger the Docs Guardrails
job. This creates a systematic coverage gap where docs changes bypass docs validation entirely.
Agent Prompt
### Issue description
Docs changes in this repo won’t set `run_docs=true` because `DOCS_PATTERNS` doesn’t include the actual docs app locations.

### Issue Context
The repo contains docs apps under `datafn/docs` and `searchfn/docs`, but `DOCS_PATTERNS` only covers `docs/`, `docsfn/`, and a few other paths.

### Fix Focus Areas
- scripts/ci-plan.mjs[31-37]
- datafn/docs/package.json[1-10]
- searchfn/docs/package.json[1-10]

### What to change
- Expand `DOCS_PATTERNS` to include the real docs directories (e.g., `^datafn/docs/` and `^searchfn/docs/`) and any other docs locations you intend to guard.
- Keep the patterns aligned with whatever docs job targets in `.github/workflows/ci.yml` to avoid drift.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment thread scripts/ci-plan.mjs
Comment on lines +226 to +233
const jsPackages = discoverJsPackages(repoRoot);
const pythonPackages = discoverPythonPackages(repoRoot).map((manifest) => ({
...manifest,
name: manifest.name.toLowerCase(),
}));

const jsByDir = new Map(jsPackages.map((manifest) => [manifest.dir, manifest]));
const pythonByDir = new Map(pythonPackages.map((manifest) => [manifest.dir, manifest]));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

3. Js plan mixes subrepos 🐞 Bug ≡ Correctness

ci-plan.mjs discovers JS packages by walking all package.json files in the entire repo, which
includes separate subprojects like searchfn/* that have their own package-lock.json and are not
included as root workspaces. The workflow then runs only a root-level npm ci and Turbo commands,
so changes under those subprojects can lead to skipped checks or failing task runs due to missing
install context.
Agent Prompt
### Issue description
The JS CI plan currently discovers *all* `package.json` files across the repository, including packages that are not part of the root workspace/install context (notably the `searchfn/*` subproject which has its own lockfile). The JS workflow job installs only root dependencies (`npm ci` at repo root) and then runs Turbo from the root, so changes in those subprojects won’t be handled correctly.

### Issue Context
- Root `package.json` workspaces don’t include `searchfn/*`.
- `searchfn/` has its own `package-lock.json` and multiple nested packages (e.g. `searchfn/core/package.json`).

### Fix Focus Areas
- scripts/ci-utils.mjs[37-49]
- scripts/ci-plan.mjs[226-233]
- .github/workflows/ci.yml[76-84]
- package.json[6-27]

### What to change
- Adjust `discoverJsPackages`/`ci-plan` to only include packages that are actually part of the root workspace graph (e.g., derive allowed package roots from the root `workspaces` globs and only collect manifests under those).
- OR, if subprojects like `searchfn/*` are intended to be checked, add explicit CI handling for them (e.g., a separate job that runs `npm ci` in `searchfn/` and runs that subproject’s build/test/lint/typecheck commands).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 52f9a0167f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .github/workflows/ci.yml

- name: Docs checks
run: |
npm -w docs run build
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Point docs workflow at an existing workspace

The docs job currently runs npm -w docs ..., but this repo has no docs workspace (the docs packages are @datafn/docs and @searchfn/docs), so the step fails immediately when triggered. I verified this in this repo by running npm -w docs run build, which returns No workspaces found: --workspace=docs; any change that sets run_docs=true will therefore fail CI regardless of code correctness.

Useful? React with 👍 / 👎.

Comment thread scripts/ci-plan.mjs
Comment on lines +31 to +33
const DOCS_PATTERNS = [
/^docs\//,
/^docsfn\//,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Match docs change patterns to real docs directories

The run_docs selector only matches docs/ and docsfn/, but this repository’s actual docs apps live under datafn/docs/ and searchfn/docs/ (both already exist in the tree). As a result, edits in those docs directories do not set run_docs=true, so the Docs Guardrails job is skipped for the primary docs surfaces the workflow is meant to protect.

Useful? React with 👍 / 👎.

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Apr 4, 2026

Greptile Summary

Adds a monorepo-aware CI workflow that uses a planning step to compute impacted JS and Python packages from git diff, then runs targeted builds and checks. There is one parsing bug in ci-utils.mjs that can silently drop Python dependencies when any TOML array entry uses PEP 508 extras notation (e.g. "ruff[format]"), corrupting topological install ordering and tool-detection flags for packages listed after it.

  • P1 — readTomlArray silent data loss: !combined.includes("]") terminates early on ] inside quoted strings, dropping all subsequent entries in the array

Confidence Score: 4/5

Safe to merge after fixing the TOML array parser early-termination bug; no other logic issues found.

One confirmed P1 bug: readTomlArray silently drops array entries when any dependency uses PEP 508 extras notation (e.g. ruff[format], boto3[crt]), breaking topological install order and tool-detection flags. All other CI logic across the workflow, plan script, and runner script is correct.

scripts/ci-utils.mjs — readTomlArray function (lines 64–76)

Important Files Changed

Filename Overview
scripts/ci-utils.mjs Shared helpers for package discovery and TOML parsing; readTomlArray stops early when any array entry contains ] (e.g. PEP 508 extras), silently dropping subsequent entries
scripts/ci-plan.mjs Computes CI plan from git diff, discovers JS/Python packages, and writes GitHub Actions outputs; logic is sound
scripts/ci-run-python-package.mjs Installs a Python package with topological local dependency ordering before running build/lint/test
.github/workflows/ci.yml Monorepo-aware CI workflow with plan/javascript/python/docs jobs; structure and concurrency settings look correct

Sequence Diagram

sequenceDiagram
    participant GH as GitHub Actions
    participant plan as plan job
    participant js as javascript job
    participant py as python job (matrix)
    participant docs as docs job

    GH->>plan: trigger (push / PR / manual)
    plan->>plan: git fetch origin dev/next/main
    plan->>plan: node scripts/ci-plan.mjs
    plan-->>GH: outputs (run_js, python_matrix, run_docs, filters)

    par if run_js == true
        GH->>js: checkout + npm ci
        js->>js: npx turbo run build [filters]
        js->>js: npx turbo run typecheck [filters]
        js->>js: npx turbo run test [filters]
        js->>js: npx turbo run lint [filters]
    and if run_python == true
        GH->>py: checkout + pip install (per matrix entry)
        py->>py: node ci-run-python-package.mjs path
        py->>py: pip install local deps in topo order
        py->>py: python -m build + ruff + mypy + pytest
    and if run_docs == true
        GH->>docs: checkout + npm ci
        docs->>docs: npm -w docs run build + checks
    end
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: scripts/ci-utils.mjs
Line: 64-76

Comment:
**`readTomlArray` stops early on `]` inside quoted strings**

The loop condition `!combined.includes("]")` halts as soon as any `]` character appears anywhere in `combined` — including inside a quoted PEP 508 specifier like `"ruff[format]"` or `"boto3[crt]"`. Every array entry after that line is silently dropped. A local monorepo package listed after a bracketed dep in `[project] dependencies` will never be discovered (breaking topological install order in `ci-run-python-package.mjs`), and entries in the `dev` optional-dependency group after a bracketed dep won't contribute to the `hasRuff`/`hasMypy`/`hasPytest` detection flags.

```suggestion
function readTomlArray(lines, startIndex) {
  let cursor = startIndex;
  let combined = lines[cursor];

  while (cursor + 1 < lines.length) {
    const unquoted = combined.replace(/"[^"]*"/g, '""');
    if (unquoted.includes("]")) break;
    cursor += 1;
    combined += `\n${lines[cursor]}`;
  }

  return {
    items: extractQuotedStrings(combined),
    nextIndex: cursor + 1,
  };
}
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "SF-28 add monorepo-aware CI workflow" | Re-trigger Greptile

Comment thread scripts/ci-utils.mjs
Comment on lines +64 to +76
function readTomlArray(lines, startIndex) {
let cursor = startIndex;
let combined = lines[cursor];

while (!combined.includes("]") && cursor + 1 < lines.length) {
cursor += 1;
combined += `\n${lines[cursor]}`;
}

return {
items: extractQuotedStrings(combined),
nextIndex: cursor + 1,
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 readTomlArray stops early on ] inside quoted strings

The loop condition !combined.includes("]") halts as soon as any ] character appears anywhere in combined — including inside a quoted PEP 508 specifier like "ruff[format]" or "boto3[crt]". Every array entry after that line is silently dropped. A local monorepo package listed after a bracketed dep in [project] dependencies will never be discovered (breaking topological install order in ci-run-python-package.mjs), and entries in the dev optional-dependency group after a bracketed dep won't contribute to the hasRuff/hasMypy/hasPytest detection flags.

Suggested change
function readTomlArray(lines, startIndex) {
let cursor = startIndex;
let combined = lines[cursor];
while (!combined.includes("]") && cursor + 1 < lines.length) {
cursor += 1;
combined += `\n${lines[cursor]}`;
}
return {
items: extractQuotedStrings(combined),
nextIndex: cursor + 1,
};
function readTomlArray(lines, startIndex) {
let cursor = startIndex;
let combined = lines[cursor];
while (cursor + 1 < lines.length) {
const unquoted = combined.replace(/"[^"]*"/g, '""');
if (unquoted.includes("]")) break;
cursor += 1;
combined += `\n${lines[cursor]}`;
}
return {
items: extractQuotedStrings(combined),
nextIndex: cursor + 1,
};
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: scripts/ci-utils.mjs
Line: 64-76

Comment:
**`readTomlArray` stops early on `]` inside quoted strings**

The loop condition `!combined.includes("]")` halts as soon as any `]` character appears anywhere in `combined` — including inside a quoted PEP 508 specifier like `"ruff[format]"` or `"boto3[crt]"`. Every array entry after that line is silently dropped. A local monorepo package listed after a bracketed dep in `[project] dependencies` will never be discovered (breaking topological install order in `ci-run-python-package.mjs`), and entries in the `dev` optional-dependency group after a bracketed dep won't contribute to the `hasRuff`/`hasMypy`/`hasPytest` detection flags.

```suggestion
function readTomlArray(lines, startIndex) {
  let cursor = startIndex;
  let combined = lines[cursor];

  while (cursor + 1 < lines.length) {
    const unquoted = combined.replace(/"[^"]*"/g, '""');
    if (unquoted.includes("]")) break;
    cursor += 1;
    combined += `\n${lines[cursor]}`;
  }

  return {
    items: extractQuotedStrings(combined),
    nextIndex: cursor + 1,
  };
}
```

How can I resolve this? If you propose a fix, please make it concise.

@thyaravind thyaravind merged commit 5ab0a1d into dev Apr 16, 2026
14 of 28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant