fix(cactus): SHA-256 integrity verification + cross-bundle runtime singleton#529
Closed
sroussey wants to merge 11 commits into
Closed
fix(cactus): SHA-256 integrity verification + cross-bundle runtime singleton#529sroussey wants to merge 11 commits into
sroussey wants to merge 11 commits into
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens the @workglow/cactus provider by (1) ensuring runtime state is shared across separately-bundled entrypoints and (2) adding SHA-256 integrity verification for downloaded/cached model assets in both Node/Bun and browser paths.
Changes:
- Introduces a
globalThis[Symbol.for("@workglow/cactus.runtime.v1")]singleton (Cactus_RuntimeState.ts) and routes all runtime Maps/Sets through it to prevent cross-bundle desync/leaks. - Adds asset integrity verification (
Cactus_Integrity.ts) and plumbsCactusAssetSpec(filename/sha256/size) through the catalog, runtime fetchers, and download run-fns. - Updates
_testOnlyand integration tests to use accessor-style runtime state, and adds focused unit tests for integrity and runtime singleton behavior.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| providers/cactus/src/ai/common/Cactus_RuntimeState.ts | New globalThis-keyed singleton backing runtime state across bundles. |
| providers/cactus/src/ai/common/Cactus_Runtime.ts | Node/Bun runtime now uses singleton state + verifies asset integrity before cache promotion. |
| providers/cactus/src/ai/common/Cactus_Runtime.browser.ts | Browser runtime now uses singleton state + verifies asset integrity before Cache Storage writes. |
| providers/cactus/src/ai/common/Cactus_ModelCatalog.ts | Catalog assets move from filenames to {filename, sha256, size} specs; adds helpers. |
| providers/cactus/src/ai/common/Cactus_Integrity.ts | New SHA-256 hashing/verification + placeholder bypass with warning. |
| providers/cactus/src/ai/common/Cactus_Download.ts | Download run-fn now iterates asset specs and surfaces integrity failures. |
| providers/cactus/src/ai/common/Cactus_Download.browser.ts | Browser download run-fn mirrors spec-based download + integrity error surfacing. |
| providers/cactus/src/ai/common/Cactus_JobRunFns.ts | Re-exports updated runtime accessors instead of bare Maps. |
| providers/cactus/src/ai/common/Cactus_JobRunFns.browser.ts | Browser run-fn registry updated to re-export runtime accessors. |
| providers/cactus/src/ai/common/tests/Cactus_RuntimeState.test.ts | New unit coverage for singleton init/reset/version mismatch. |
| providers/cactus/src/ai/common/tests/Cactus_Runtime.crossBundle.test.ts | New test modeling cross-bundle singleton behavior via globalThis. |
| providers/cactus/src/ai/common/tests/Cactus_Integrity.test.ts | New unit coverage for SHA-256 vectors, mismatch handling, placeholder bypass. |
| providers/cactus/src/ai.ts | _testOnly switches to accessor functions; docs updated to reflect singleton state. |
| providers/cactus/src/ai.browser.ts | Browser _testOnly switches to accessor functions; docs updated. |
| packages/test/src/test/ai-provider-cactus/Cactus_ToolCalling.integration.test.ts | Updates to call accessor functions for runtime state cleanup. |
| packages/test/src/test/ai-provider-cactus/Cactus_DownloadRemove.test.ts | Updates to call accessor functions and validate cleared state via accessors. |
| packages/test/src/test/ai-provider-cactus/Cactus_Download.integration.test.ts | Updates to call accessor functions for per-test cleanup. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+19
to
+26
| import { | ||
| getCactusCachedModelIds, | ||
| getCactusConfigJson, | ||
| getCactusEngineLoadsInFlight, | ||
| getCactusEngines, | ||
| getCactusSessions, | ||
| getRuntime, | ||
| } from "./Cactus_RuntimeState"; |
Comment on lines
+24
to
+27
| getCactusCachedModelIds, | ||
| getCactusConfigJson, | ||
| getCactusEngineLoadsInFlight, | ||
| getCactusEngines, |
| constructor(opts: { url: string; filename: string; expected: string; actual: string }) { | ||
| super( | ||
| `Integrity check failed for ${opts.filename} from ${opts.url}: ` + | ||
| `expected sha256 ${opts.expected}, got ${opts.actual}` |
| if (err instanceof CactusIntegrityError) { | ||
| emit({ | ||
| type: "phase", | ||
| message: `Integrity check failed for ${spec.filename}: expected sha256 ${err.expected}, got ${err.actual}`, |
| if (err instanceof CactusIntegrityError) { | ||
| emit({ | ||
| type: "phase", | ||
| message: `Integrity check failed for ${spec.filename}: expected sha256 ${err.expected}, got ${err.actual}`, |
Comment on lines
+21
to
+25
| export interface CactusAssetSpec { | ||
| readonly filename: string; | ||
| /** Lowercase hex SHA-256, exactly 64 characters. */ | ||
| readonly sha256: string; | ||
| /** Expected byte length — used as a cheap pre-check before hashing. */ |
Comment on lines
+43
to
+48
| /** | ||
| * Asserts that `s` is a lowercase hex SHA-256 (64 hex chars). | ||
| * | ||
| * Used at catalog load time to surface malformed entries before any | ||
| * verification call sees them. | ||
| */ |
…e in Download CactusIntegrityError previously hard-coded "expected sha256 ..." in its message, producing confusing output like "expected sha256 22000000 bytes" for the size-mismatch callers. Make the constructor message label-agnostic so it reads naturally for both hash and byte-length mismatches. Cactus_Download(.browser) was reformatting the error message under the assumption it was always a SHA mismatch; just propagate err.message, which now phrases itself correctly for both cases.
Document the placeholder semantics on CactusAssetSpec.sha256 so the "TODO_FILL_AT_RELEASE" contract is discoverable from the type itself, and put assertHexSha256 to work by validating every non-placeholder entry the moment the catalog module loads. Catalog-author bugs (wrong length, uppercase, non-hex chars) now surface immediately at import time instead of on first verification.
…uffer types
build-types was failing with two distinct shape mismatches:
1. emit({ type: "phase", message }) in Cactus_Download(.browser).ts omits
`progress`, but StreamPhase declares `progress: number | undefined`
(required, not optional). Re-emit with `progress: undefined` for the
error path so the discriminated-union members keep their explicit
shape.
2. sha256Hex / verifySha256 pass a `Uint8Array<ArrayBufferLike>` (the
default of the unparameterized `Uint8Array` since the recent lib.dom
tightening) into crypto.subtle.digest, which now expects
`ArrayBufferView<ArrayBuffer>` / a concrete `BufferSource`. Copy
incoming bytes into a fresh `ArrayBuffer` and pass that buffer
directly to digest, then read the result back through a Uint8Array
view. The hash result is unchanged.
…d fix)
The "accepts ArrayBuffer input" test was passing `asciiBytes("abc").buffer`
to sha256Hex. `Uint8Array.prototype.buffer` is typed `ArrayBufferLike` in
the lib.dom.d.ts version this repo pulls in — meaning TypeScript treats
it as potentially `SharedArrayBuffer`, which is missing properties
(`resizable`, `resize`, `detached`, `transfer`, `transferToFixedLength`)
that `ArrayBuffer` requires.
The test still wants to exercise sha256Hex's ArrayBuffer branch, so build
a fresh concrete `ArrayBuffer` via `new ArrayBuffer(n)` and copy the
ASCII bytes into it before calling sha256Hex. Same coverage, no
ArrayBufferLike leakage into the call site.
@workglow/cli
@workglow/ai
@workglow/browser-control
@workglow/indexeddb
@workglow/javascript
@workglow/job-queue
@workglow/knowledge-base
@workglow/mcp
@workglow/storage
@workglow/task-graph
@workglow/tasks
@workglow/util
workglow
@workglow/anthropic
@workglow/bun-webview
@workglow/cactus
@workglow/chrome-ai
@workglow/electron
@workglow/google-gemini
@workglow/huggingface-inference
@workglow/huggingface-transformers
@workglow/node-llama-cpp
@workglow/ollama
@workglow/openai
@workglow/playwright
@workglow/postgres
@workglow/sqlite
@workglow/supabase
@workglow/tf-mediapipe
commit: |
@workglow/cli
@workglow/ai
@workglow/browser-control
@workglow/indexeddb
@workglow/javascript
@workglow/job-queue
@workglow/knowledge-base
@workglow/mcp
@workglow/storage
@workglow/task-graph
@workglow/tasks
@workglow/util
workglow
@workglow/anthropic
@workglow/bun-webview
@workglow/cactus
@workglow/chrome-ai
@workglow/electron
@workglow/google-gemini
@workglow/huggingface-inference
@workglow/huggingface-transformers
@workglow/node-llama-cpp
@workglow/ollama
@workglow/openai
@workglow/playwright
@workglow/postgres
@workglow/sqlite
@workglow/supabase
@workglow/tf-mediapipe
commit: |
Coverage Report
File CoverageNo changed files found. |
…th-injection sanitizer CodeQL's `js/path-injection` query couldn't recognize the regex-based allowlist as a sanitizer — `model_id` from `provider_config.model_id` was still flagged as flowing into `path.resolve(models_dir, model_id)` and the subsequent `fs.*` calls. Add `safeJoinUnderRoot(root, ...segments)` that normalizes with `path.resolve` and verifies containment via `startsWith(resolvedRoot + sep)` — the canonical sanitizer pattern CodeQL recognizes for the js/path-injection query. Apply at every fs entry point in this module: - `resolveModelDir`: now safe-joins `model_id` onto a resolved `models_dir` root, throws on escape. - `getNodeAssetCacheInfo`: each `fs.stat(path.join(resolvedDir, filename))` becomes `fs.stat(safeJoinUnderRoot(resolvedDir, filename))`. - `fetchAssetBytesNode`: `filePath = safeJoinUnderRoot(resolvedDir, spec.filename)`. `tmpPath` is a sibling string concat, inheriting the contained `filePath`. - `removeNodeCacheDir`: explicit `resolveModelDir` call locally so the sanitizer trace is visible per-fs-call. The character allowlists (assertSafeModelId / assertSafeFilename) remain as fast-fail defense in depth — they reject malformed input at the boundary before any path operation runs.
| function resolveAssetSpec( | ||
| entry: CactusCatalogEntry, | ||
| specOrFilename: CactusAssetSpec | string | ||
| ): CactusAssetSpec { |
CodeQL's js/path-injection query doesn't trace through user-defined helper functions like safeJoinUnderRoot. The sanitizer pattern (path.resolve + startsWith containment check) must be visible in the same function scope as the fs call for the analyzer to recognize it. Inline the check at each fs call site in getNodeAssetCacheInfo, fetchAssetBytesNode, and removeNodeCacheDir. The safeJoinUnderRoot helper stays in place for resolveModelDir and the assertSafe* allowlists remain as fast-fail defense-in-depth, but every fs call site now has its own inline sanitizer that CodeQL can trace.
| `Path escape detected: ${JSON.stringify(readPath)} is not within ${JSON.stringify(resolvedDir)}` | ||
| ); | ||
| } | ||
| const buf = await fs.readFile(readPath); |
…h (build fix) The inline-sanitizer commit (4214fa0) replaced every call to `resolveModelDir` and `safeJoinUnderRoot` with the inline `path.resolve` + `startsWith` containment check, but left both helpers defined. `noUnusedLocals` then failed the build: 'resolveModelDir' is declared but its value is never read Remove both helpers and the now-unused `tmpPath` binding (the inline `writeTarget` / `renameFrom` recompute the same path). Update the path-safety section header to reflect the inline-only approach. Behavior unchanged: the inline sanitizer at every fs call site, plus the `assertSafeModelId` / `assertSafeFilename` allowlists, remain in place.
| `Path escape detected: ${JSON.stringify(mkdirTarget)} is not within ${JSON.stringify(safeRoot)}` | ||
| ); | ||
| } | ||
| await fs.mkdir(mkdirTarget, { recursive: true }); |
| `Path escape detected: ${JSON.stringify(writeTarget)} is not within ${JSON.stringify(resolvedDir)}` | ||
| ); | ||
| } | ||
| await fs.writeFile(writeTarget, bytes); |
| `Path escape detected: ${JSON.stringify(renameTo)} is not within ${JSON.stringify(resolvedDir)}` | ||
| ); | ||
| } | ||
| await fs.rename(renameFrom, renameTo); |
| `Path escape detected: ${JSON.stringify(renameTo)} is not within ${JSON.stringify(resolvedDir)}` | ||
| ); | ||
| } | ||
| await fs.rename(renameFrom, renameTo); |
| if (cleanupTarget !== resolvedDir && !cleanupTarget.startsWith(resolvedDir + path.sep)) { | ||
| throw err; | ||
| } | ||
| await fs.unlink(cleanupTarget).catch(() => {}); |
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes 2 HIGH-priority findings in the Cactus provider (PR #524 merged).
Summary
./ai,./ai.browser,./ai-runtime,./ai-runtime.browser), each with its own compiled copy ofCactus_Runtime.ts. Module-level Maps duplicated across bundles -> double engine loads, leaked engines on dispose, stale-negativeisModelLoadedreads. Fixed by routing all state through aglobalThis[Symbol.for("@workglow/cactus.runtime.v1")]singleton in a newCactus_RuntimeState.tsmodule.fs) and browser (Cache Storage) paths. Hashes verified BEFORE atomic rename / cache write. Corrupt cached assets unlinked + refetched on next call.Maintainer action required
The catalog entries (
CACTUS_NEEDLE_26M.assets.*.sha256and.size) are placeholder values ("TODO_FILL_AT_RELEASE") in this PR. You must populate them with real hashes of the pinned-revision artifacts. While the placeholder is present,verifySha256skips the check and logs a one-time warning per file so the gap cannot ship silently. A follow-upproviders/cactus/scripts/hash-catalog.tsscript would automate this; out of scope for this PR.The pinned
revisioninCactus_Constants.tsis currently"main"(a mutable ref). To make the hash trust boundary meaningful, this should be moved to an immutable commit SHA in the same release that populates the hashes.Recommended follow-up: require signed commits on
Cactus_ModelCatalog.tsso the trust boundary is anchored at "the commit that updated the catalog."Notable design choices / deviations
_testOnly.cactusEnginesand_testOnly.cactusConfigJsonexports changed from being theMapinstances themselves to accessor functions (() => Map). This is required so that__resetRuntimeForTests()produces fresh state instead of stale captures. Existing call sites inpackages/test/src/test/ai-provider-cactus/were updated to invoke the accessors. Source-level imports ofcactusEngines/cactusConfigJsonfromCactus_Runtimeare preserved as accessor re-exports (getCactusEngines, etc.) for the few internal callers; the legacy bare-Map re-export was dropped because it would have re-introduced the desync.fetchAssetBytesnow accepts either aCactusAssetSpecor a bare filename (resolved against the catalog). Hot callers ingetOrLoadEngineand the download run-fns pass the full spec.cache.put(url, resp.clone())tocache.put(url, new Response(bytes, { headers }))after verification. This is intentional: cloning the original response would persist unverified bytes if the digest check then failed.packages/test/src/test/ai-provider-cactus/already gate themselves behindRUN_CACTUS_TESTS=1and will keep working since they don't touch the catalog shape; once real hashes are populated they will also exercise the new verification path against live HF traffic.Test plan
Cactus_Integrity.test.ts- hash computation, mismatch rejection, malformed-expected rejection, placeholder bypass with warningCactus_RuntimeState.test.ts- lazy init, singleton identity, version mismatch, resetCactus_Runtime.crossBundle.test.ts- singleton dedup across import paths, version mismatch error pathspecplumbing + accessor changehttps://claude.ai/code/session_01B2fT1tf7YoTrGzukoHnrCK
Generated by Claude Code