Skip to content

fix(cactus): SHA-256 integrity verification + cross-bundle runtime singleton#529

Closed
sroussey wants to merge 11 commits into
mainfrom
claude/sweet-edison-NSph2-cactus
Closed

fix(cactus): SHA-256 integrity verification + cross-bundle runtime singleton#529
sroussey wants to merge 11 commits into
mainfrom
claude/sweet-edison-NSph2-cactus

Conversation

@sroussey
Copy link
Copy Markdown
Collaborator

Fixes 2 HIGH-priority findings in the Cactus provider (PR #524 merged).

Summary

  • HIGH-2: cross-bundle state desync. The package emits four bundles (./ai, ./ai.browser, ./ai-runtime, ./ai-runtime.browser), each with its own compiled copy of Cactus_Runtime.ts. Module-level Maps duplicated across bundles -> double engine loads, leaked engines on dispose, stale-negative isModelLoaded reads. Fixed by routing all state through a globalThis[Symbol.for("@workglow/cactus.runtime.v1")] singleton in a new Cactus_RuntimeState.ts module.
  • HIGH-1: model weights fetched and WASM-loaded with no integrity check. Trust boundary was effectively "anything HF serves is good." Added per-asset SHA-256 verification in both Node (fs) and browser (Cache Storage) paths. Hashes verified BEFORE atomic rename / cache write. Corrupt cached assets unlinked + refetched on next call.

Maintainer action required

The catalog entries (CACTUS_NEEDLE_26M.assets.*.sha256 and .size) are placeholder values ("TODO_FILL_AT_RELEASE") in this PR. You must populate them with real hashes of the pinned-revision artifacts. While the placeholder is present, verifySha256 skips the check and logs a one-time warning per file so the gap cannot ship silently. A follow-up providers/cactus/scripts/hash-catalog.ts script would automate this; out of scope for this PR.

The pinned revision in Cactus_Constants.ts is currently "main" (a mutable ref). To make the hash trust boundary meaningful, this should be moved to an immutable commit SHA in the same release that populates the hashes.

Recommended follow-up: require signed commits on Cactus_ModelCatalog.ts so the trust boundary is anchored at "the commit that updated the catalog."

Notable design choices / deviations

  • The _testOnly.cactusEngines and _testOnly.cactusConfigJson exports changed from being the Map instances themselves to accessor functions (() => Map). This is required so that __resetRuntimeForTests() produces fresh state instead of stale captures. Existing call sites in packages/test/src/test/ai-provider-cactus/ were updated to invoke the accessors. Source-level imports of cactusEngines/cactusConfigJson from Cactus_Runtime are preserved as accessor re-exports (getCactusEngines, etc.) for the few internal callers; the legacy bare-Map re-export was dropped because it would have re-introduced the desync.
  • fetchAssetBytes now accepts either a CactusAssetSpec or a bare filename (resolved against the catalog). Hot callers in getOrLoadEngine and the download run-fns pass the full spec.
  • Browser cache write was changed from cache.put(url, resp.clone()) to cache.put(url, new Response(bytes, { headers })) after verification. This is intentional: cloning the original response would persist unverified bytes if the digest check then failed.
  • Integration tests in packages/test/src/test/ai-provider-cactus/ already gate themselves behind RUN_CACTUS_TESTS=1 and will keep working since they don't touch the catalog shape; once real hashes are populated they will also exercise the new verification path against live HF traffic.

Test plan

  • Cactus_Integrity.test.ts - hash computation, mismatch rejection, malformed-expected rejection, placeholder bypass with warning
  • Cactus_RuntimeState.test.ts - lazy init, singleton identity, version mismatch, reset
  • Cactus_Runtime.crossBundle.test.ts - singleton dedup across import paths, version mismatch error path
  • Existing provider-cactus tests pass after the spec plumbing + accessor change
  • Maintainer manually populates real hashes and runs an end-to-end download against HF

https://claude.ai/code/session_01B2fT1tf7YoTrGzukoHnrCK


Generated by Claude Code

Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.browser.ts Fixed
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the @workglow/cactus provider by (1) ensuring runtime state is shared across separately-bundled entrypoints and (2) adding SHA-256 integrity verification for downloaded/cached model assets in both Node/Bun and browser paths.

Changes:

  • Introduces a globalThis[Symbol.for("@workglow/cactus.runtime.v1")] singleton (Cactus_RuntimeState.ts) and routes all runtime Maps/Sets through it to prevent cross-bundle desync/leaks.
  • Adds asset integrity verification (Cactus_Integrity.ts) and plumbs CactusAssetSpec (filename/sha256/size) through the catalog, runtime fetchers, and download run-fns.
  • Updates _testOnly and integration tests to use accessor-style runtime state, and adds focused unit tests for integrity and runtime singleton behavior.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
providers/cactus/src/ai/common/Cactus_RuntimeState.ts New globalThis-keyed singleton backing runtime state across bundles.
providers/cactus/src/ai/common/Cactus_Runtime.ts Node/Bun runtime now uses singleton state + verifies asset integrity before cache promotion.
providers/cactus/src/ai/common/Cactus_Runtime.browser.ts Browser runtime now uses singleton state + verifies asset integrity before Cache Storage writes.
providers/cactus/src/ai/common/Cactus_ModelCatalog.ts Catalog assets move from filenames to {filename, sha256, size} specs; adds helpers.
providers/cactus/src/ai/common/Cactus_Integrity.ts New SHA-256 hashing/verification + placeholder bypass with warning.
providers/cactus/src/ai/common/Cactus_Download.ts Download run-fn now iterates asset specs and surfaces integrity failures.
providers/cactus/src/ai/common/Cactus_Download.browser.ts Browser download run-fn mirrors spec-based download + integrity error surfacing.
providers/cactus/src/ai/common/Cactus_JobRunFns.ts Re-exports updated runtime accessors instead of bare Maps.
providers/cactus/src/ai/common/Cactus_JobRunFns.browser.ts Browser run-fn registry updated to re-export runtime accessors.
providers/cactus/src/ai/common/tests/Cactus_RuntimeState.test.ts New unit coverage for singleton init/reset/version mismatch.
providers/cactus/src/ai/common/tests/Cactus_Runtime.crossBundle.test.ts New test modeling cross-bundle singleton behavior via globalThis.
providers/cactus/src/ai/common/tests/Cactus_Integrity.test.ts New unit coverage for SHA-256 vectors, mismatch handling, placeholder bypass.
providers/cactus/src/ai.ts _testOnly switches to accessor functions; docs updated to reflect singleton state.
providers/cactus/src/ai.browser.ts Browser _testOnly switches to accessor functions; docs updated.
packages/test/src/test/ai-provider-cactus/Cactus_ToolCalling.integration.test.ts Updates to call accessor functions for runtime state cleanup.
packages/test/src/test/ai-provider-cactus/Cactus_DownloadRemove.test.ts Updates to call accessor functions and validate cleared state via accessors.
packages/test/src/test/ai-provider-cactus/Cactus_Download.integration.test.ts Updates to call accessor functions for per-test cleanup.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +19 to +26
import {
getCactusCachedModelIds,
getCactusConfigJson,
getCactusEngineLoadsInFlight,
getCactusEngines,
getCactusSessions,
getRuntime,
} from "./Cactus_RuntimeState";
Comment on lines +24 to +27
getCactusCachedModelIds,
getCactusConfigJson,
getCactusEngineLoadsInFlight,
getCactusEngines,
constructor(opts: { url: string; filename: string; expected: string; actual: string }) {
super(
`Integrity check failed for ${opts.filename} from ${opts.url}: ` +
`expected sha256 ${opts.expected}, got ${opts.actual}`
if (err instanceof CactusIntegrityError) {
emit({
type: "phase",
message: `Integrity check failed for ${spec.filename}: expected sha256 ${err.expected}, got ${err.actual}`,
if (err instanceof CactusIntegrityError) {
emit({
type: "phase",
message: `Integrity check failed for ${spec.filename}: expected sha256 ${err.expected}, got ${err.actual}`,
Comment on lines +21 to +25
export interface CactusAssetSpec {
readonly filename: string;
/** Lowercase hex SHA-256, exactly 64 characters. */
readonly sha256: string;
/** Expected byte length — used as a cheap pre-check before hashing. */
Comment on lines +43 to +48
/**
* Asserts that `s` is a lowercase hex SHA-256 (64 hex chars).
*
* Used at catalog load time to surface malformed entries before any
* verification call sees them.
*/
sroussey added 6 commits May 22, 2026 08:27
…e in Download

CactusIntegrityError previously hard-coded "expected sha256 ..." in its
message, producing confusing output like "expected sha256 22000000 bytes"
for the size-mismatch callers. Make the constructor message label-agnostic
so it reads naturally for both hash and byte-length mismatches.

Cactus_Download(.browser) was reformatting the error message under the
assumption it was always a SHA mismatch; just propagate err.message,
which now phrases itself correctly for both cases.
Document the placeholder semantics on CactusAssetSpec.sha256 so the
"TODO_FILL_AT_RELEASE" contract is discoverable from the type itself,
and put assertHexSha256 to work by validating every non-placeholder
entry the moment the catalog module loads. Catalog-author bugs
(wrong length, uppercase, non-hex chars) now surface immediately at
import time instead of on first verification.
…uffer types

build-types was failing with two distinct shape mismatches:

1. emit({ type: "phase", message }) in Cactus_Download(.browser).ts omits
   `progress`, but StreamPhase declares `progress: number | undefined`
   (required, not optional). Re-emit with `progress: undefined` for the
   error path so the discriminated-union members keep their explicit
   shape.

2. sha256Hex / verifySha256 pass a `Uint8Array<ArrayBufferLike>` (the
   default of the unparameterized `Uint8Array` since the recent lib.dom
   tightening) into crypto.subtle.digest, which now expects
   `ArrayBufferView<ArrayBuffer>` / a concrete `BufferSource`. Copy
   incoming bytes into a fresh `ArrayBuffer` and pass that buffer
   directly to digest, then read the result back through a Uint8Array
   view. The hash result is unchanged.
…d fix)

The "accepts ArrayBuffer input" test was passing `asciiBytes("abc").buffer`
to sha256Hex. `Uint8Array.prototype.buffer` is typed `ArrayBufferLike` in
the lib.dom.d.ts version this repo pulls in — meaning TypeScript treats
it as potentially `SharedArrayBuffer`, which is missing properties
(`resizable`, `resize`, `detached`, `transfer`, `transferToFixedLength`)
that `ArrayBuffer` requires.

The test still wants to exercise sha256Hex's ArrayBuffer branch, so build
a fresh concrete `ArrayBuffer` via `new ArrayBuffer(n)` and copy the
ASCII bytes into it before calling sha256Hex. Same coverage, no
ArrayBufferLike leakage into the call site.
@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented May 22, 2026

Open in StackBlitz

@workglow/cli

npm i https://pkg.pr.new/@workglow/cli@529

@workglow/ai

npm i https://pkg.pr.new/@workglow/ai@529

@workglow/browser-control

npm i https://pkg.pr.new/@workglow/browser-control@529

@workglow/indexeddb

npm i https://pkg.pr.new/@workglow/indexeddb@529

@workglow/javascript

npm i https://pkg.pr.new/@workglow/javascript@529

@workglow/job-queue

npm i https://pkg.pr.new/@workglow/job-queue@529

@workglow/knowledge-base

npm i https://pkg.pr.new/@workglow/knowledge-base@529

@workglow/mcp

npm i https://pkg.pr.new/@workglow/mcp@529

@workglow/storage

npm i https://pkg.pr.new/@workglow/storage@529

@workglow/task-graph

npm i https://pkg.pr.new/@workglow/task-graph@529

@workglow/tasks

npm i https://pkg.pr.new/@workglow/tasks@529

@workglow/util

npm i https://pkg.pr.new/@workglow/util@529

workglow

npm i https://pkg.pr.new/workglow@529

@workglow/anthropic

npm i https://pkg.pr.new/@workglow/anthropic@529

@workglow/bun-webview

npm i https://pkg.pr.new/@workglow/bun-webview@529

@workglow/cactus

npm i https://pkg.pr.new/@workglow/cactus@529

@workglow/chrome-ai

npm i https://pkg.pr.new/@workglow/chrome-ai@529

@workglow/electron

npm i https://pkg.pr.new/@workglow/electron@529

@workglow/google-gemini

npm i https://pkg.pr.new/@workglow/google-gemini@529

@workglow/huggingface-inference

npm i https://pkg.pr.new/@workglow/huggingface-inference@529

@workglow/huggingface-transformers

npm i https://pkg.pr.new/@workglow/huggingface-transformers@529

@workglow/node-llama-cpp

npm i https://pkg.pr.new/@workglow/node-llama-cpp@529

@workglow/ollama

npm i https://pkg.pr.new/@workglow/ollama@529

@workglow/openai

npm i https://pkg.pr.new/@workglow/openai@529

@workglow/playwright

npm i https://pkg.pr.new/@workglow/playwright@529

@workglow/postgres

npm i https://pkg.pr.new/@workglow/postgres@529

@workglow/sqlite

npm i https://pkg.pr.new/@workglow/sqlite@529

@workglow/supabase

npm i https://pkg.pr.new/@workglow/supabase@529

@workglow/tf-mediapipe

npm i https://pkg.pr.new/@workglow/tf-mediapipe@529

commit: 4ae8132

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented May 22, 2026

Open in StackBlitz

@workglow/cli

npm i https://pkg.pr.new/@workglow/cli@529

@workglow/ai

npm i https://pkg.pr.new/@workglow/ai@529

@workglow/browser-control

npm i https://pkg.pr.new/@workglow/browser-control@529

@workglow/indexeddb

npm i https://pkg.pr.new/@workglow/indexeddb@529

@workglow/javascript

npm i https://pkg.pr.new/@workglow/javascript@529

@workglow/job-queue

npm i https://pkg.pr.new/@workglow/job-queue@529

@workglow/knowledge-base

npm i https://pkg.pr.new/@workglow/knowledge-base@529

@workglow/mcp

npm i https://pkg.pr.new/@workglow/mcp@529

@workglow/storage

npm i https://pkg.pr.new/@workglow/storage@529

@workglow/task-graph

npm i https://pkg.pr.new/@workglow/task-graph@529

@workglow/tasks

npm i https://pkg.pr.new/@workglow/tasks@529

@workglow/util

npm i https://pkg.pr.new/@workglow/util@529

workglow

npm i https://pkg.pr.new/workglow@529

@workglow/anthropic

npm i https://pkg.pr.new/@workglow/anthropic@529

@workglow/bun-webview

npm i https://pkg.pr.new/@workglow/bun-webview@529

@workglow/cactus

npm i https://pkg.pr.new/@workglow/cactus@529

@workglow/chrome-ai

npm i https://pkg.pr.new/@workglow/chrome-ai@529

@workglow/electron

npm i https://pkg.pr.new/@workglow/electron@529

@workglow/google-gemini

npm i https://pkg.pr.new/@workglow/google-gemini@529

@workglow/huggingface-inference

npm i https://pkg.pr.new/@workglow/huggingface-inference@529

@workglow/huggingface-transformers

npm i https://pkg.pr.new/@workglow/huggingface-transformers@529

@workglow/node-llama-cpp

npm i https://pkg.pr.new/@workglow/node-llama-cpp@529

@workglow/ollama

npm i https://pkg.pr.new/@workglow/ollama@529

@workglow/openai

npm i https://pkg.pr.new/@workglow/openai@529

@workglow/playwright

npm i https://pkg.pr.new/@workglow/playwright@529

@workglow/postgres

npm i https://pkg.pr.new/@workglow/postgres@529

@workglow/sqlite

npm i https://pkg.pr.new/@workglow/sqlite@529

@workglow/supabase

npm i https://pkg.pr.new/@workglow/supabase@529

@workglow/tf-mediapipe

npm i https://pkg.pr.new/@workglow/tf-mediapipe@529

commit: 809c03b

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

Coverage Report

Status Category Percentage Covered / Total
🔵 Lines 62.37% 23712 / 38016
🔵 Statements 62.25% 24540 / 39420
🔵 Functions 63.48% 4515 / 7112
🔵 Branches 50.82% 11530 / 22686
File CoverageNo changed files found.
Generated in workflow #2396 for commit 4ae8132 by the Vitest Coverage Report Action

…th-injection sanitizer

CodeQL's `js/path-injection` query couldn't recognize the regex-based
allowlist as a sanitizer — `model_id` from `provider_config.model_id`
was still flagged as flowing into `path.resolve(models_dir, model_id)`
and the subsequent `fs.*` calls.

Add `safeJoinUnderRoot(root, ...segments)` that normalizes with
`path.resolve` and verifies containment via `startsWith(resolvedRoot + sep)`
— the canonical sanitizer pattern CodeQL recognizes for the
js/path-injection query. Apply at every fs entry point in this module:

- `resolveModelDir`: now safe-joins `model_id` onto a resolved `models_dir`
  root, throws on escape.
- `getNodeAssetCacheInfo`: each `fs.stat(path.join(resolvedDir, filename))`
  becomes `fs.stat(safeJoinUnderRoot(resolvedDir, filename))`.
- `fetchAssetBytesNode`: `filePath = safeJoinUnderRoot(resolvedDir, spec.filename)`.
  `tmpPath` is a sibling string concat, inheriting the contained `filePath`.
- `removeNodeCacheDir`: explicit `resolveModelDir` call locally so the
  sanitizer trace is visible per-fs-call.

The character allowlists (assertSafeModelId / assertSafeFilename) remain
as fast-fail defense in depth — they reject malformed input at the
boundary before any path operation runs.
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
function resolveAssetSpec(
entry: CactusCatalogEntry,
specOrFilename: CactusAssetSpec | string
): CactusAssetSpec {
CodeQL's js/path-injection query doesn't trace through user-defined
helper functions like safeJoinUnderRoot. The sanitizer pattern
(path.resolve + startsWith containment check) must be visible in the
same function scope as the fs call for the analyzer to recognize it.

Inline the check at each fs call site in getNodeAssetCacheInfo,
fetchAssetBytesNode, and removeNodeCacheDir. The safeJoinUnderRoot
helper stays in place for resolveModelDir and the assertSafe* allowlists
remain as fast-fail defense-in-depth, but every fs call site now has
its own inline sanitizer that CodeQL can trace.
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Dismissed
`Path escape detected: ${JSON.stringify(readPath)} is not within ${JSON.stringify(resolvedDir)}`
);
}
const buf = await fs.readFile(readPath);
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Dismissed
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
Comment thread providers/cactus/src/ai/common/Cactus_Runtime.ts Fixed
…h (build fix)

The inline-sanitizer commit (4214fa0) replaced every call to
`resolveModelDir` and `safeJoinUnderRoot` with the inline
`path.resolve` + `startsWith` containment check, but left both
helpers defined. `noUnusedLocals` then failed the build:

  'resolveModelDir' is declared but its value is never read

Remove both helpers and the now-unused `tmpPath` binding (the inline
`writeTarget` / `renameFrom` recompute the same path). Update the
path-safety section header to reflect the inline-only approach.

Behavior unchanged: the inline sanitizer at every fs call site, plus
the `assertSafeModelId` / `assertSafeFilename` allowlists, remain in
place.
`Path escape detected: ${JSON.stringify(mkdirTarget)} is not within ${JSON.stringify(safeRoot)}`
);
}
await fs.mkdir(mkdirTarget, { recursive: true });
`Path escape detected: ${JSON.stringify(writeTarget)} is not within ${JSON.stringify(resolvedDir)}`
);
}
await fs.writeFile(writeTarget, bytes);
`Path escape detected: ${JSON.stringify(renameTo)} is not within ${JSON.stringify(resolvedDir)}`
);
}
await fs.rename(renameFrom, renameTo);
`Path escape detected: ${JSON.stringify(renameTo)} is not within ${JSON.stringify(resolvedDir)}`
);
}
await fs.rename(renameFrom, renameTo);
if (cleanupTarget !== resolvedDir && !cleanupTarget.startsWith(resolvedDir + path.sep)) {
throw err;
}
await fs.unlink(cleanupTarget).catch(() => {});
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants