fix(cactus,chrome-ai): security and correctness fixes from review by sroussey · Pull Request #531 · workglow-dev/libs

sroussey · 2026-05-23T08:29:31Z

Summary

Four small fixes surfaced by a security / DX / correctness review of recent commits to the Cactus and Chrome-AI providers (in particular #530 SHA-256 integrity + #514 Chrome AI). One commit per fix, no squashing.

fix(cactus): catch only ENOENT in fetchAssetBytesNode — the outer try/catch around the cache-read path used to swallow every filesystem error and silently refetch over the network, masking real problems like EACCES/EIO/EISDIR. The catch now lets ENOENT fall through (legitimate cache miss) and re-wraps every other code with a cause-carrying error so operators see what actually went wrong.
fix(cactus): validate model_id inside fetchAssetBytesBrowser — defense-in-depth parity with fetchAssetBytesNode. The browser helper now re-asserts assertSafeModelId at the top, matching the Node variant. A future refactor that hits the helper from a new entry point can no longer slip a hostile model_id past the allowlist.
fix(chrome-ai): touch chat session on each delta to defer idle eviction — long-running streams (>30 min) used to have their cached LanguageModel session destroyed mid-flight by the unified store's idle timer. The chat run-fn now calls touchWebBrowserSession(sessionId) on every text-delta, so active output defers eviction; post-stream silence still evicts at the normal 30-min mark.
fix(cactus): add hash-catalog script + production placeholder guard — a Bun-runnable providers/cactus/scripts/hash-catalog.ts fetches every catalog asset URL, computes sha256 + byte length, prints JSON, and (with --write) rewrites the catalog in place. Module-load throws in production / CACTUS_REQUIRE_REAL_HASHES=1 when any asset still uses the placeholder or has non-positive size; exports CATALOG_HAS_PLACEHOLDERS so tooling can opt-in. The script was run against the live HuggingFace URLs and the catalog now ships real hashes + sizes for the three needle-26m assets.

Findings source

These four items came from a focused review of the most-recent merged commits in providers/cactus and providers/chrome-ai. They are all small, locally-scoped, and have regression-test coverage (except commit 2 which mirrors an already-tested pattern from the Node variant).

Test plan

npx vitest run packages/test/src/test/ai-provider-cactus/ — 11 files, 31 tests (2 skipped integration)
npx vitest run packages/test/src/test/ai-provider/ — 14 files, 163 tests
bun test packages/test/src/test/ai-provider-cactus/Cactus_Runtime.node.test.ts packages/test/src/test/ai-provider-cactus/Cactus_ModelCatalog.test.ts packages/test/src/test/ai-provider/WebBrowser_Chat.idleTouch.test.ts — passes under Bun runner
bun run build:js — all 61 turbo tasks green
bun providers/cactus/scripts/hash-catalog.ts — verified against live HuggingFace asset URLs; real hashes/sizes committed

New / modified files at a glance:

providers/cactus/src/ai/common/Cactus_Runtime.ts — ENOENT-only catch
providers/cactus/src/ai/common/Cactus_Runtime.browser.ts — model_id re-validation in helper
providers/cactus/src/ai/common/Cactus_ModelCatalog.ts — production guard, CATALOG_HAS_PLACEHOLDERS, real hashes
providers/cactus/scripts/hash-catalog.ts — new script
providers/cactus/package.json — hash-catalog npm script
providers/chrome-ai/src/ai/common/WebBrowser_Chat.ts — touch on text-delta
packages/test/src/test/ai-provider-cactus/Cactus_Runtime.node.test.ts — new
packages/test/src/test/ai-provider-cactus/Cactus_ModelCatalog.test.ts — new
packages/test/src/test/helpers/cactus-placeholder-guard-runner.ts — new (child-process driver for the guard test)
packages/test/src/test/ai-provider/WebBrowser_Chat.idleTouch.test.ts — new

Open follow-up

None blocking — the hash-catalog script was successfully run against HuggingFace and the catalog now contains real hashes for needle-26m. Re-running bun run --filter @workglow/cactus hash-catalog -- --write is the documented path after any future revision bump.

https://claude.ai/code/session_014HFcmuoPdbo9YWs29okjyF

Generated by Claude Code

Previously the outer try/catch around the cache-read path swallowed every filesystem error (EACCES, EIO, EISDIR, EMFILE, ...) and silently fell through to the network refetch. That masked real misconfigurations as innocuous cache misses, hiding the underlying cause from operators and forcing unnecessary HuggingFace fetches. The catch block now distinguishes: - CactusIntegrityError re-throws (unreachable; handled inside), - ENOENT falls through (the legitimate cache-miss case), - any other code wraps the original error with a descriptive message and `cause` so callers can see what actually went wrong. Regression test in Cactus_Runtime.node.test.ts covers both branches by driving the production code through real fs state — ENOENT via an empty temp dir, and a non-ENOENT (EISDIR) by making the asset path a directory.

`fetchAssetBytes` (the public entry) already called `assertSafeModelId` once before delegating, but the helper `fetchAssetBytesBrowser` did not — so a future refactor that hits the helper from a different code path could slip a hostile model_id past the allowlist. The Node variant `fetchAssetBytesNode` already re-validates at the call site; this brings the browser helper to parity. The helper now takes `model_id` as an explicit parameter and asserts it as the first statement, matching the Node helper's pattern. The single call site in `fetchAssetBytes` threads `model_id` through.

The unified session store evicts idle entries after 30 minutes (WEB_BROWSER_SESSION_IDLE_MS). Before this fix, a single long-running chat turn whose stream exceeded the idle window would have its cached session destroyed mid-stream by the idle timer, even though the model was actively producing output. The next turn would then look up a now-missing session. The fix wires `touchWebBrowserSession(sessionId)` into the chat run-fn's `trackingEmit` helper alongside the existing `deltaEmitted` bookkeeping, so every `text-delta` event resets the idle clock. Idle eviction still fires after 30 minutes of true silence (no further deltas), so the mechanism is preserved — only active streams defer it. Regression test in WebBrowser_Chat.idleTouch.test.ts: pre-populates the cache, drives a controllable ReadableStream through two chunks separated by 25 simulated minutes (50 min total elapsed), and asserts the session is still cached. A second test confirms post-stream silence still evicts at the 30-min mark.

Three changes that close the loop on the SHA-256 integrity machinery that landed in #530: 1. New `providers/cactus/scripts/hash-catalog.ts` (Bun) fetches every asset URL referenced by `CACTUS_CATALOG`, computes `sha256Hex` and byte length, prints a JSON report, and rewrites the catalog file in place when `--write` is passed. Wired as `bun run hash-catalog` in the cactus package. Re-running after a `revision` bump regenerates everything; the matcher is conservative and only replaces placeholder blocks so partially-populated catalogs are safe. 2. `Cactus_ModelCatalog.ts` gains a production guard: when `NODE_ENV === "production"` or `CACTUS_REQUIRE_REAL_HASHES === "1"`, module load throws if any asset still has the placeholder sha256 or a non-positive size. Dev/test stays permissive (no env var) so contributors can iterate against `TODO_FILL_AT_RELEASE`. Exposes `CATALOG_HAS_PLACEHOLDERS` so release tooling can opt-in to the same check. 3. Catalog values populated with real hashes + sizes obtained by running the script against the live HuggingFace asset URLs: - needle.safetensors (22,259,039 bytes) - vocab.txt (122,132 bytes) - config.json (320 bytes) With real sizes now in place, the network-fallthrough assertion in `Cactus_Runtime.node.test.ts` is tightened to assert via the downstream integrity rejection (which can only fire if fetch ran), keeping the ENOENT-branch coverage intact. Tests in `Cactus_ModelCatalog.test.ts` verify both env-var-gated states by spawning a child Bun process with and without CACTUS_REQUIRE_REAL_HASHES set.

Copilot

Pull request overview

This PR applies a set of targeted security/correctness fixes across the Cactus and Chrome-AI providers, plus adds release/tooling safeguards around Cactus’ asset hash catalog to ensure integrity checks can’t be silently bypassed in production.

Changes:

Cactus: tighten filesystem error handling to only fall back to network fetch on ENOENT, and add defense-in-depth model_id validation in the browser cache helper.
Chrome-AI: prevent active streaming sessions from being evicted mid-stream by touching the unified session store on each text-delta.
Cactus: add a hash-catalog generation script, populate real hashes/sizes in the catalog, and add a production/env-gated placeholder/size guard with tests.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
providers/chrome-ai/src/ai/common/WebBrowser_Chat.ts	Touches session on each streamed delta to defer idle eviction during long generations.
providers/cactus/src/ai/common/Cactus_Runtime.ts	Restricts cache-read fallthrough to `ENOENT`; rethrows other fs errors with `cause`.
providers/cactus/src/ai/common/Cactus_Runtime.browser.ts	Re-validates `model_id` inside the browser cache helper; threads `model_id` through.
providers/cactus/src/ai/common/Cactus_ModelCatalog.ts	Adds production/env guard against placeholder hashes or non-positive sizes; exports `CATALOG_HAS_PLACEHOLDERS`; updates catalog to real hashes/sizes.
providers/cactus/scripts/hash-catalog.ts	New Bun script to fetch assets, compute sha256/size, and optionally rewrite the catalog.
providers/cactus/package.json	Adds `hash-catalog` script entry.
packages/test/src/test/helpers/cactus-placeholder-guard-runner.ts	Child-process helper to validate placeholder guard behavior under env toggles.
packages/test/src/test/ai-provider/WebBrowser_Chat.idleTouch.test.ts	Regression tests ensuring streaming defers idle eviction while post-stream eviction still occurs.
packages/test/src/test/ai-provider-cactus/Cactus_Runtime.node.test.ts	Regression tests for `fetchAssetBytesNode` cache-read error handling.
packages/test/src/test/ai-provider-cactus/Cactus_ModelCatalog.test.ts	Tests for catalog placeholder detection/export and env-gated placeholder rejection behavior (via runner).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+function rewriteCatalogFile(catalogPath: string, reports: readonly AssetReport[]): number {
+  const source = readFileSync(catalogPath, "utf8");
+  let next = source;
+  let replaced = 0;
+  for (const r of reports) {
+    // Match block:
+    //   filename: "<r.filename>",
+    //   sha256: CACTUS_HASH_PLACEHOLDER,
+    //   size: 0,
+    // (whitespace-tolerant; size may be any number, sha256 may be the constant or quoted hex placeholder)
+    const blockRe = new RegExp(
+      `(filename:\\s*"${escapeRegex(r.filename)}",\\s*\\n\\s*)` +
+        `sha256:\\s*(?:CACTUS_HASH_PLACEHOLDER|"${escapeRegex(CACTUS_HASH_PLACEHOLDER)}"),` +
+        `(\\s*\\n\\s*)size:\\s*\\d+,`,
+      "m"
+    );
+    const updated = next.replace(
+      blockRe,
+      `$1sha256: "${r.sha256}",$2size: ${r.size},`
+    );


+  it("CACTUS_REQUIRE_REAL_HASHES=1 throws when a placeholder is present", () => {
+    // Spawn a child process that:
+    //   - injects a stubbed Cactus_Integrity that exposes the placeholder
+    //     under the constant the catalog imports, then
+    //   - injects a temporary catalog source whose first asset's sha256 is
+    //     the placeholder and size=0, then
+    //   - imports the module under CACTUS_REQUIRE_REAL_HASHES=1.
+    // We assert the child exits non-zero. To avoid file mutation in the
+    // working tree, we use a TS evaluator script that constructs the
+    // placeholder situation inline by re-importing the constants and
+    // running the guard predicate directly.
+    const here = dirname(fileURLToPath(import.meta.url));


+ * The test temporarily mutates the catalog source through Bun's `import()`
+ * resolver. We avoid `vi.mock` because mocking the module from outside while
+ * also exercising its module-load side effects is brittle across runners;
+ * instead we drive the assertions by directly invoking the same validation
+ * predicate (`CATALOG_HAS_PLACEHOLDERS`) and by spawning a child process
+ * with the env var set so the import-time throw is observable.


…ouch test The run-fn only reads input.messages (and optionally input.temperature) at runtime, but AiChatProviderInput requires model+prompt at the schema layer, which the production dispatcher provides but is irrelevant for this unit test. Match the dispatcher convention (any-typed input) via an explicit cast so the test compiles under bun run build:types. Unblocks CI build/CodeQL/publish-preview/vitest jobs that were skipping after the build job failed on this TS2345.

pkg-pr-new · 2026-05-23T18:16:34Z

Open in StackBlitz

@workglow/cli

npm i https://pkg.pr.new/@workglow/cli@531

@workglow/ai

npm i https://pkg.pr.new/@workglow/ai@531

@workglow/browser-control

npm i https://pkg.pr.new/@workglow/browser-control@531

@workglow/indexeddb

npm i https://pkg.pr.new/@workglow/indexeddb@531

@workglow/javascript

npm i https://pkg.pr.new/@workglow/javascript@531

@workglow/job-queue

npm i https://pkg.pr.new/@workglow/job-queue@531

@workglow/knowledge-base

npm i https://pkg.pr.new/@workglow/knowledge-base@531

@workglow/mcp

npm i https://pkg.pr.new/@workglow/mcp@531

@workglow/storage

npm i https://pkg.pr.new/@workglow/storage@531

@workglow/task-graph

npm i https://pkg.pr.new/@workglow/task-graph@531

@workglow/tasks

npm i https://pkg.pr.new/@workglow/tasks@531

@workglow/util

npm i https://pkg.pr.new/@workglow/util@531

workglow

npm i https://pkg.pr.new/workglow@531

@workglow/anthropic

npm i https://pkg.pr.new/@workglow/anthropic@531

@workglow/bun-webview

npm i https://pkg.pr.new/@workglow/bun-webview@531

@workglow/cactus

npm i https://pkg.pr.new/@workglow/cactus@531

@workglow/chrome-ai

npm i https://pkg.pr.new/@workglow/chrome-ai@531

@workglow/electron

npm i https://pkg.pr.new/@workglow/electron@531

@workglow/google-gemini

npm i https://pkg.pr.new/@workglow/google-gemini@531

@workglow/huggingface-inference

npm i https://pkg.pr.new/@workglow/huggingface-inference@531

@workglow/huggingface-transformers

npm i https://pkg.pr.new/@workglow/huggingface-transformers@531

@workglow/node-llama-cpp

npm i https://pkg.pr.new/@workglow/node-llama-cpp@531

@workglow/ollama

npm i https://pkg.pr.new/@workglow/ollama@531

@workglow/openai

npm i https://pkg.pr.new/@workglow/openai@531

@workglow/playwright

npm i https://pkg.pr.new/@workglow/playwright@531

@workglow/postgres

npm i https://pkg.pr.new/@workglow/postgres@531

@workglow/sqlite

npm i https://pkg.pr.new/@workglow/sqlite@531

@workglow/supabase

npm i https://pkg.pr.new/@workglow/supabase@531

@workglow/tf-mediapipe

npm i https://pkg.pr.new/@workglow/tf-mediapipe@531

commit: d49dbbe

github-actions · 2026-05-23T18:19:10Z

Coverage Report

Status	Category	Percentage	Covered / Total
🔵	Lines	61.92%	23851 / 38513
🔵	Statements	61.81%	24683 / 39930
🔵	Functions	63.09%	4536 / 7189
🔵	Branches	50.42%	11624 / 23050

File Coverage

No changed files found.

Generated in workflow #2415 for commit 74c5391 by the Vitest Coverage Report Action

claude added 4 commits May 23, 2026 08:19

sroussey requested a review from Copilot May 23, 2026 18:09

sroussey self-assigned this May 23, 2026

Copilot started reviewing on behalf of sroussey May 23, 2026 18:09 View session

Copilot AI reviewed May 23, 2026

View reviewed changes

Merge branch 'main' into claude/sweet-edison-pScsM

74c5391

sroussey merged commit 891c518 into main May 23, 2026
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(cactus,chrome-ai): security and correctness fixes from review#531

fix(cactus,chrome-ai): security and correctness fixes from review#531
sroussey merged 6 commits into
mainfrom
claude/sweet-edison-pScsM

sroussey commented May 23, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

pkg-pr-new Bot commented May 23, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 23, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

sroussey commented May 23, 2026

Summary

Findings source

Test plan

Open follow-up

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

pkg-pr-new Bot commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Coverage Report

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

pkg-pr-new Bot commented May 23, 2026 •

edited

Loading

github-actions Bot commented May 23, 2026 •

edited

Loading