Skip to content

fix(cactus,chrome-ai): security and correctness fixes from review#531

Merged
sroussey merged 6 commits into
mainfrom
claude/sweet-edison-pScsM
May 23, 2026
Merged

fix(cactus,chrome-ai): security and correctness fixes from review#531
sroussey merged 6 commits into
mainfrom
claude/sweet-edison-pScsM

Conversation

@sroussey
Copy link
Copy Markdown
Collaborator

Summary

Four small fixes surfaced by a security / DX / correctness review of recent commits to the Cactus and Chrome-AI providers (in particular #530 SHA-256 integrity + #514 Chrome AI). One commit per fix, no squashing.

  1. fix(cactus): catch only ENOENT in fetchAssetBytesNode — the outer try/catch around the cache-read path used to swallow every filesystem error and silently refetch over the network, masking real problems like EACCES/EIO/EISDIR. The catch now lets ENOENT fall through (legitimate cache miss) and re-wraps every other code with a cause-carrying error so operators see what actually went wrong.

  2. fix(cactus): validate model_id inside fetchAssetBytesBrowser — defense-in-depth parity with fetchAssetBytesNode. The browser helper now re-asserts assertSafeModelId at the top, matching the Node variant. A future refactor that hits the helper from a new entry point can no longer slip a hostile model_id past the allowlist.

  3. fix(chrome-ai): touch chat session on each delta to defer idle eviction — long-running streams (>30 min) used to have their cached LanguageModel session destroyed mid-flight by the unified store's idle timer. The chat run-fn now calls touchWebBrowserSession(sessionId) on every text-delta, so active output defers eviction; post-stream silence still evicts at the normal 30-min mark.

  4. fix(cactus): add hash-catalog script + production placeholder guard — a Bun-runnable providers/cactus/scripts/hash-catalog.ts fetches every catalog asset URL, computes sha256 + byte length, prints JSON, and (with --write) rewrites the catalog in place. Module-load throws in production / CACTUS_REQUIRE_REAL_HASHES=1 when any asset still uses the placeholder or has non-positive size; exports CATALOG_HAS_PLACEHOLDERS so tooling can opt-in. The script was run against the live HuggingFace URLs and the catalog now ships real hashes + sizes for the three needle-26m assets.

Findings source

These four items came from a focused review of the most-recent merged commits in providers/cactus and providers/chrome-ai. They are all small, locally-scoped, and have regression-test coverage (except commit 2 which mirrors an already-tested pattern from the Node variant).

Test plan

  • npx vitest run packages/test/src/test/ai-provider-cactus/ — 11 files, 31 tests (2 skipped integration)
  • npx vitest run packages/test/src/test/ai-provider/ — 14 files, 163 tests
  • bun test packages/test/src/test/ai-provider-cactus/Cactus_Runtime.node.test.ts packages/test/src/test/ai-provider-cactus/Cactus_ModelCatalog.test.ts packages/test/src/test/ai-provider/WebBrowser_Chat.idleTouch.test.ts — passes under Bun runner
  • bun run build:js — all 61 turbo tasks green
  • bun providers/cactus/scripts/hash-catalog.ts — verified against live HuggingFace asset URLs; real hashes/sizes committed

New / modified files at a glance:

  • providers/cactus/src/ai/common/Cactus_Runtime.ts — ENOENT-only catch
  • providers/cactus/src/ai/common/Cactus_Runtime.browser.ts — model_id re-validation in helper
  • providers/cactus/src/ai/common/Cactus_ModelCatalog.ts — production guard, CATALOG_HAS_PLACEHOLDERS, real hashes
  • providers/cactus/scripts/hash-catalog.ts — new script
  • providers/cactus/package.jsonhash-catalog npm script
  • providers/chrome-ai/src/ai/common/WebBrowser_Chat.ts — touch on text-delta
  • packages/test/src/test/ai-provider-cactus/Cactus_Runtime.node.test.ts — new
  • packages/test/src/test/ai-provider-cactus/Cactus_ModelCatalog.test.ts — new
  • packages/test/src/test/helpers/cactus-placeholder-guard-runner.ts — new (child-process driver for the guard test)
  • packages/test/src/test/ai-provider/WebBrowser_Chat.idleTouch.test.ts — new

Open follow-up

None blocking — the hash-catalog script was successfully run against HuggingFace and the catalog now contains real hashes for needle-26m. Re-running bun run --filter @workglow/cactus hash-catalog -- --write is the documented path after any future revision bump.

https://claude.ai/code/session_014HFcmuoPdbo9YWs29okjyF


Generated by Claude Code

claude added 4 commits May 23, 2026 08:19
Previously the outer try/catch around the cache-read path swallowed every
filesystem error (EACCES, EIO, EISDIR, EMFILE, ...) and silently fell through
to the network refetch. That masked real misconfigurations as innocuous
cache misses, hiding the underlying cause from operators and forcing
unnecessary HuggingFace fetches.

The catch block now distinguishes:
  - CactusIntegrityError re-throws (unreachable; handled inside),
  - ENOENT falls through (the legitimate cache-miss case),
  - any other code wraps the original error with a descriptive message
    and `cause` so callers can see what actually went wrong.

Regression test in Cactus_Runtime.node.test.ts covers both branches by
driving the production code through real fs state — ENOENT via an empty
temp dir, and a non-ENOENT (EISDIR) by making the asset path a directory.
`fetchAssetBytes` (the public entry) already called `assertSafeModelId` once
before delegating, but the helper `fetchAssetBytesBrowser` did not — so a
future refactor that hits the helper from a different code path could
slip a hostile model_id past the allowlist. The Node variant
`fetchAssetBytesNode` already re-validates at the call site; this brings
the browser helper to parity.

The helper now takes `model_id` as an explicit parameter and asserts it
as the first statement, matching the Node helper's pattern. The single
call site in `fetchAssetBytes` threads `model_id` through.
The unified session store evicts idle entries after 30 minutes
(WEB_BROWSER_SESSION_IDLE_MS). Before this fix, a single long-running
chat turn whose stream exceeded the idle window would have its cached
session destroyed mid-stream by the idle timer, even though the model
was actively producing output. The next turn would then look up a
now-missing session.

The fix wires `touchWebBrowserSession(sessionId)` into the chat run-fn's
`trackingEmit` helper alongside the existing `deltaEmitted` bookkeeping,
so every `text-delta` event resets the idle clock. Idle eviction still
fires after 30 minutes of true silence (no further deltas), so the
mechanism is preserved — only active streams defer it.

Regression test in WebBrowser_Chat.idleTouch.test.ts: pre-populates the
cache, drives a controllable ReadableStream through two chunks separated
by 25 simulated minutes (50 min total elapsed), and asserts the session
is still cached. A second test confirms post-stream silence still evicts
at the 30-min mark.
Three changes that close the loop on the SHA-256 integrity machinery
that landed in #530:

1. New `providers/cactus/scripts/hash-catalog.ts` (Bun) fetches every
   asset URL referenced by `CACTUS_CATALOG`, computes `sha256Hex` and
   byte length, prints a JSON report, and rewrites the catalog file in
   place when `--write` is passed. Wired as `bun run hash-catalog` in
   the cactus package. Re-running after a `revision` bump regenerates
   everything; the matcher is conservative and only replaces
   placeholder blocks so partially-populated catalogs are safe.

2. `Cactus_ModelCatalog.ts` gains a production guard: when
   `NODE_ENV === "production"` or `CACTUS_REQUIRE_REAL_HASHES === "1"`,
   module load throws if any asset still has the placeholder sha256
   or a non-positive size. Dev/test stays permissive (no env var) so
   contributors can iterate against `TODO_FILL_AT_RELEASE`. Exposes
   `CATALOG_HAS_PLACEHOLDERS` so release tooling can opt-in to the
   same check.

3. Catalog values populated with real hashes + sizes obtained by
   running the script against the live HuggingFace asset URLs:
     - needle.safetensors  (22,259,039 bytes)
     - vocab.txt           (122,132 bytes)
     - config.json         (320 bytes)
   With real sizes now in place, the network-fallthrough assertion in
   `Cactus_Runtime.node.test.ts` is tightened to assert via the
   downstream integrity rejection (which can only fire if fetch ran),
   keeping the ENOENT-branch coverage intact.

Tests in `Cactus_ModelCatalog.test.ts` verify both env-var-gated
states by spawning a child Bun process with and without
CACTUS_REQUIRE_REAL_HASHES set.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR applies a set of targeted security/correctness fixes across the Cactus and Chrome-AI providers, plus adds release/tooling safeguards around Cactus’ asset hash catalog to ensure integrity checks can’t be silently bypassed in production.

Changes:

  • Cactus: tighten filesystem error handling to only fall back to network fetch on ENOENT, and add defense-in-depth model_id validation in the browser cache helper.
  • Chrome-AI: prevent active streaming sessions from being evicted mid-stream by touching the unified session store on each text-delta.
  • Cactus: add a hash-catalog generation script, populate real hashes/sizes in the catalog, and add a production/env-gated placeholder/size guard with tests.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
providers/chrome-ai/src/ai/common/WebBrowser_Chat.ts Touches session on each streamed delta to defer idle eviction during long generations.
providers/cactus/src/ai/common/Cactus_Runtime.ts Restricts cache-read fallthrough to ENOENT; rethrows other fs errors with cause.
providers/cactus/src/ai/common/Cactus_Runtime.browser.ts Re-validates model_id inside the browser cache helper; threads model_id through.
providers/cactus/src/ai/common/Cactus_ModelCatalog.ts Adds production/env guard against placeholder hashes or non-positive sizes; exports CATALOG_HAS_PLACEHOLDERS; updates catalog to real hashes/sizes.
providers/cactus/scripts/hash-catalog.ts New Bun script to fetch assets, compute sha256/size, and optionally rewrite the catalog.
providers/cactus/package.json Adds hash-catalog script entry.
packages/test/src/test/helpers/cactus-placeholder-guard-runner.ts Child-process helper to validate placeholder guard behavior under env toggles.
packages/test/src/test/ai-provider/WebBrowser_Chat.idleTouch.test.ts Regression tests ensuring streaming defers idle eviction while post-stream eviction still occurs.
packages/test/src/test/ai-provider-cactus/Cactus_Runtime.node.test.ts Regression tests for fetchAssetBytesNode cache-read error handling.
packages/test/src/test/ai-provider-cactus/Cactus_ModelCatalog.test.ts Tests for catalog placeholder detection/export and env-gated placeholder rejection behavior (via runner).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +73 to +92
function rewriteCatalogFile(catalogPath: string, reports: readonly AssetReport[]): number {
const source = readFileSync(catalogPath, "utf8");
let next = source;
let replaced = 0;
for (const r of reports) {
// Match block:
// filename: "<r.filename>",
// sha256: CACTUS_HASH_PLACEHOLDER,
// size: 0,
// (whitespace-tolerant; size may be any number, sha256 may be the constant or quoted hex placeholder)
const blockRe = new RegExp(
`(filename:\\s*"${escapeRegex(r.filename)}",\\s*\\n\\s*)` +
`sha256:\\s*(?:CACTUS_HASH_PLACEHOLDER|"${escapeRegex(CACTUS_HASH_PLACEHOLDER)}"),` +
`(\\s*\\n\\s*)size:\\s*\\d+,`,
"m"
);
const updated = next.replace(
blockRe,
`$1sha256: "${r.sha256}",$2size: ${r.size},`
);
Comment on lines +62 to +73
it("CACTUS_REQUIRE_REAL_HASHES=1 throws when a placeholder is present", () => {
// Spawn a child process that:
// - injects a stubbed Cactus_Integrity that exposes the placeholder
// under the constant the catalog imports, then
// - injects a temporary catalog source whose first asset's sha256 is
// the placeholder and size=0, then
// - imports the module under CACTUS_REQUIRE_REAL_HASHES=1.
// We assert the child exits non-zero. To avoid file mutation in the
// working tree, we use a TS evaluator script that constructs the
// placeholder situation inline by re-importing the constants and
// running the guard predicate directly.
const here = dirname(fileURLToPath(import.meta.url));
Comment on lines +18 to +23
* The test temporarily mutates the catalog source through Bun's `import()`
* resolver. We avoid `vi.mock` because mocking the module from outside while
* also exercising its module-load side effects is brittle across runners;
* instead we drive the assertions by directly invoking the same validation
* predicate (`CATALOG_HAS_PLACEHOLDERS`) and by spawning a child process
* with the env var set so the import-time throw is observable.
…ouch test

The run-fn only reads input.messages (and optionally input.temperature) at
runtime, but AiChatProviderInput requires model+prompt at the schema layer,
which the production dispatcher provides but is irrelevant for this unit test.
Match the dispatcher convention (any-typed input) via an explicit cast so the
test compiles under bun run build:types.

Unblocks CI build/CodeQL/publish-preview/vitest jobs that were skipping after
the build job failed on this TS2345.
@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented May 23, 2026

Open in StackBlitz

@workglow/cli

npm i https://pkg.pr.new/@workglow/cli@531

@workglow/ai

npm i https://pkg.pr.new/@workglow/ai@531

@workglow/browser-control

npm i https://pkg.pr.new/@workglow/browser-control@531

@workglow/indexeddb

npm i https://pkg.pr.new/@workglow/indexeddb@531

@workglow/javascript

npm i https://pkg.pr.new/@workglow/javascript@531

@workglow/job-queue

npm i https://pkg.pr.new/@workglow/job-queue@531

@workglow/knowledge-base

npm i https://pkg.pr.new/@workglow/knowledge-base@531

@workglow/mcp

npm i https://pkg.pr.new/@workglow/mcp@531

@workglow/storage

npm i https://pkg.pr.new/@workglow/storage@531

@workglow/task-graph

npm i https://pkg.pr.new/@workglow/task-graph@531

@workglow/tasks

npm i https://pkg.pr.new/@workglow/tasks@531

@workglow/util

npm i https://pkg.pr.new/@workglow/util@531

workglow

npm i https://pkg.pr.new/workglow@531

@workglow/anthropic

npm i https://pkg.pr.new/@workglow/anthropic@531

@workglow/bun-webview

npm i https://pkg.pr.new/@workglow/bun-webview@531

@workglow/cactus

npm i https://pkg.pr.new/@workglow/cactus@531

@workglow/chrome-ai

npm i https://pkg.pr.new/@workglow/chrome-ai@531

@workglow/electron

npm i https://pkg.pr.new/@workglow/electron@531

@workglow/google-gemini

npm i https://pkg.pr.new/@workglow/google-gemini@531

@workglow/huggingface-inference

npm i https://pkg.pr.new/@workglow/huggingface-inference@531

@workglow/huggingface-transformers

npm i https://pkg.pr.new/@workglow/huggingface-transformers@531

@workglow/node-llama-cpp

npm i https://pkg.pr.new/@workglow/node-llama-cpp@531

@workglow/ollama

npm i https://pkg.pr.new/@workglow/ollama@531

@workglow/openai

npm i https://pkg.pr.new/@workglow/openai@531

@workglow/playwright

npm i https://pkg.pr.new/@workglow/playwright@531

@workglow/postgres

npm i https://pkg.pr.new/@workglow/postgres@531

@workglow/sqlite

npm i https://pkg.pr.new/@workglow/sqlite@531

@workglow/supabase

npm i https://pkg.pr.new/@workglow/supabase@531

@workglow/tf-mediapipe

npm i https://pkg.pr.new/@workglow/tf-mediapipe@531

commit: d49dbbe

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 23, 2026

Coverage Report

Status Category Percentage Covered / Total
🔵 Lines 61.92% 23851 / 38513
🔵 Statements 61.81% 24683 / 39930
🔵 Functions 63.09% 4536 / 7189
🔵 Branches 50.42% 11624 / 23050
File CoverageNo changed files found.
Generated in workflow #2415 for commit 74c5391 by the Vitest Coverage Report Action

@sroussey sroussey merged commit 891c518 into main May 23, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants