Chrome-ai by sroussey · Pull Request #514 · workglow-dev/libs

sroussey · 2026-05-20T04:47:31Z

Added support for @types/dom-chromium-ai to improve type definitions for Chrome AI APIs.
Introduced a new WebBrowser_Chat function to handle multi-turn chat sessions, allowing for better session management and context retention.
Implemented structured generation capabilities with WebBrowser_StructuredGeneration, enabling JSON output from the AI model.
Enhanced tool-calling functionality with WebBrowser_ToolCalling, allowing the model to invoke tools and handle their results seamlessly.
Updated existing capabilities to include new features such as json-mode and tool-use, expanding the range of tasks the AI can perform.
Refactored session management with WebBrowser_Sessions to cache and manage AI sessions effectively, improving performance and resource utilization.
Removed deprecated type definitions from WebBrowser_ChromeAI.d.ts to streamline the codebase.

This commit significantly enhances the functionality and usability of the Chrome AI provider, paving the way for more complex interactions and improved user experience.

pkg-pr-new · 2026-05-20T04:48:58Z

@workglow/cli

npm i https://pkg.pr.new/@workglow/cli@514

@workglow/ai

npm i https://pkg.pr.new/@workglow/ai@514

@workglow/browser-control

npm i https://pkg.pr.new/@workglow/browser-control@514

@workglow/indexeddb

npm i https://pkg.pr.new/@workglow/indexeddb@514

@workglow/javascript

npm i https://pkg.pr.new/@workglow/javascript@514

@workglow/job-queue

npm i https://pkg.pr.new/@workglow/job-queue@514

@workglow/knowledge-base

npm i https://pkg.pr.new/@workglow/knowledge-base@514

@workglow/mcp

npm i https://pkg.pr.new/@workglow/mcp@514

@workglow/storage

npm i https://pkg.pr.new/@workglow/storage@514

@workglow/task-graph

npm i https://pkg.pr.new/@workglow/task-graph@514

@workglow/tasks

npm i https://pkg.pr.new/@workglow/tasks@514

@workglow/util

npm i https://pkg.pr.new/@workglow/util@514

workglow

npm i https://pkg.pr.new/workglow@514

@workglow/anthropic

npm i https://pkg.pr.new/@workglow/anthropic@514

@workglow/bun-webview

npm i https://pkg.pr.new/@workglow/bun-webview@514

@workglow/cactus

npm i https://pkg.pr.new/@workglow/cactus@514

@workglow/chrome-ai

npm i https://pkg.pr.new/@workglow/chrome-ai@514

@workglow/electron

npm i https://pkg.pr.new/@workglow/electron@514

@workglow/google-gemini

npm i https://pkg.pr.new/@workglow/google-gemini@514

@workglow/huggingface-inference

npm i https://pkg.pr.new/@workglow/huggingface-inference@514

@workglow/huggingface-transformers

npm i https://pkg.pr.new/@workglow/huggingface-transformers@514

@workglow/node-llama-cpp

npm i https://pkg.pr.new/@workglow/node-llama-cpp@514

@workglow/ollama

npm i https://pkg.pr.new/@workglow/ollama@514

@workglow/openai

npm i https://pkg.pr.new/@workglow/openai@514

@workglow/playwright

npm i https://pkg.pr.new/@workglow/playwright@514

@workglow/postgres

npm i https://pkg.pr.new/@workglow/postgres@514

@workglow/sqlite

npm i https://pkg.pr.new/@workglow/sqlite@514

@workglow/supabase

npm i https://pkg.pr.new/@workglow/supabase@514

@workglow/tf-mediapipe

npm i https://pkg.pr.new/@workglow/tf-mediapipe@514

commit: b382050

github-actions · 2026-05-20T04:51:34Z

Coverage Report

Status	Category	Percentage	Covered / Total
🔵	Lines	62.52%	23684 / 37877
🔵	Statements	62.4%	24511 / 39278
🔵	Functions	63.56%	4509 / 7093
🔵	Branches	50.99%	11506 / 22561

File Coverage

No changed files found.

Generated in workflow #2390 for commit b382050 by the Vitest Coverage Report Action

… (#520)

* feat(chrome-ai): probe-gate tool-use and json-mode capabilities (C1)

Chrome's `LanguageModel.create` did not universally accept `tools` or `responseConstraint` options, yet `inferWebBrowserCapabilities` always advertised `tool-use` + `json-mode` for `chrome-prompt`/`gemini-nano`. This caused the dispatcher to route json-mode and tool-use tasks to the WebBrowser provider on Chrome builds that would reject them at runtime.

Adds a one-shot capability probe (`probeWebBrowserCapabilities`) that smoke-tests `factory.create({ responseConstraint })` and `factory.create({ tools })`, with module-level coalescing so concurrent callers share one probe round-trip. `WebBrowserProvider` kicks the probe off in its constructor; until it resolves, `inferCapabilities` returns the conservative subset (no `json-mode`, no `tool-use`).

Tests cover all four probe outcome combinations, coalescing, and pre/post-ready inference.

https://claude.ai/code/session_013PqntVCfKgKmJ5396w7BPC

* feat(chrome-ai): StructuredGeneration accepts sessionId with schema fingerprint (H1)

The structured-generation run-fn dropped `sessionId` from its signature, so successive calls with the same id always rebuilt the underlying Chrome `LanguageModel` even though the surface supports session reuse. This matched the pre-session-cache behaviour rather than the post-cache shape adopted by `WebBrowser_Chat`.

Accept `sessionId` as the 6th positional parameter, mirroring chat. Cache reuse is gated on a canonical schema fingerprint stored on the cache entry; a schema change forces a rebuild because Chrome's `responseConstraint` state is bound at first-prompt and re-feeding a different schema is undefined behaviour. On stream failure the entry is dropped + destroyed via the same `cacheWritten` / `dropChromeSessionEntry` dance as chat.

`ChromeChatSessionState` grows an optional `schemaFingerprint` field.

https://claude.ai/code/session_013PqntVCfKgKmJ5396w7BPC

* feat(chrome-ai): ToolCalling accepts sessionId with tools fingerprint (H2)

`WebBrowser_ToolCalling` ignored both `outputSchema` and `sessionId` (the 5th and 6th positional parameters of the run-fn contract), so multi-turn tool-calling rebuilt the `LanguageModel` each turn.

Accept both parameters. Cache reuse keys on a sorted-tool-name fingerprint (Chrome binds `tools` at `create()` time and can't hot-swap them per turn). We only cache when the orchestrator drives via `input.messages` because Chrome's tool-calling loop appends tool-result turns to the session's internal state opaquely; reusing a cached session across a turn the orchestrator hasn't fully replayed would double-feed those results. Bare-prompt callers always rebuild.

On any error we drop + destroy the cache entry: Chrome's internal state may be mid-tool-call-cycle. `ChromeChatSessionState` grows an optional `toolsFingerprint` field.

https://claude.ai/code/session_013PqntVCfKgKmJ5396w7BPC

* fix(chrome-ai): validate tool-call arguments against tool inputSchema (H3)

Chrome's `LanguageModel` invokes our stub `execute` callback with whatever arguments the model emits. `filterValidToolCalls` only checked the tool name, so a hallucinated arg shape was forwarded to the orchestrator verbatim, leaving the downstream tool runner to either fail or silently produce garbage.

Compile each tool's `inputSchema` once via `compileSchema` (cached by name) before the stream starts. After streaming we validate every captured call's `input` against its tool's validator; failures are dropped + warn-logged in the same shape as `filterValidToolCalls`'s existing name-only warning. Tools whose `inputSchema` fails to compile emit a single warning and fall through to the name-only check rather than failing the whole run.

https://claude.ai/code/session_013PqntVCfKgKmJ5396w7BPC

* fix(chrome-ai): validate StructuredGeneration final JSON against schema (H4)

Chrome's `responseConstraint` is best-effort, not a hard guarantee; the model can still produce a partial or shape-mismatched payload. The existing fallback (`parsePartialJson(...) ?? {}`) handed downstream code an empty object cast to the output type, indistinguishable from a legitimate empty payload. Worse, that path emitted a `finish` event, so `StructuredGenerationTask`'s retry loop had no signal to retry on.

Compile the validator once via `compileSchema`. After streaming:
- If neither `JSON.parse` nor `parsePartialJson` produces a value: throw `PermanentJobError("Chrome AI returned unparseable JSON")`.
- If validation fails: throw with the first validator error message.
- Only on success do we emit `finish` and write the cache entry.

`StructuredGenerationTask.executeStream` catches per-attempt errors and retries, so throwing here is the correct signal: no `finish` so the loop knows this attempt failed. Schema compile failures are also surfaced as `PermanentJobError` (so retries don't burn through quota on a malformed schema).

https://claude.ai/code/session_013PqntVCfKgKmJ5396w7BPC

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
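The probe-gating idea from the (C1) commit can be sketched roughly as follows. This is a minimal illustration, not the code in `WebBrowser_CapabilityProbe.ts`: the names `ProbeResult`, `ProbeFactory`, `createAccepts`, `probeCapabilitiesOnce`, and `inferCapabilities`, the stub tool, and the `"text.generation"` baseline capability are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; not the repo's actual types.
interface ProbeResult {
  jsonMode: boolean;
  toolUse: boolean;
}

interface ProbeFactory {
  create(options: Record<string, unknown>): Promise<{ destroy(): void }>;
}

// Module-level slot: concurrent callers share this one in-flight promise.
let inflightProbe: Promise<ProbeResult> | undefined;

// Smoke-test one option set: create a session, tear it down, and report
// whether create() accepted the options.
async function createAccepts(
  factory: ProbeFactory,
  options: Record<string, unknown>
): Promise<boolean> {
  try {
    const session = await factory.create(options);
    session.destroy();
    return true;
  } catch {
    return false;
  }
}

function probeCapabilitiesOnce(factory: ProbeFactory): Promise<ProbeResult> {
  if (inflightProbe === undefined) {
    inflightProbe = (async () => {
      const jsonMode = await createAccepts(factory, {
        responseConstraint: { type: "object" },
      });
      const toolUse = await createAccepts(factory, {
        // Assumed stub tool; the real probe payload is not shown in this thread.
        tools: [
          {
            name: "noop",
            description: "probe-only stub",
            inputSchema: { type: "object", properties: {} },
            execute: async () => "",
          },
        ],
      });
      return { jsonMode, toolUse };
    })();
  }
  return inflightProbe;
}

// Until the probe resolves (probe === undefined), advertise only the
// conservative subset: no json-mode, no tool-use.
function inferCapabilities(probe: ProbeResult | undefined): string[] {
  const caps = ["text.generation"];
  if (probe?.jsonMode) caps.push("json-mode");
  if (probe?.toolUse) caps.push("tool-use");
  return caps;
}
```

Caching the promise rather than the resolved value is what gives the coalescing: a second caller that arrives while the first probe is still running receives the same promise instead of starting another round-trip.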

sroussey · 2026-05-21T08:36:13Z

HIGH-priority review findings — fix plans

These plans follow an end-of-day automated review of PRs updated in the last 24h, which produced 5 HIGH findings and 0 CRITICAL. Recommend addressing H1–H4 before merge; H5 is a dead-code cleanup that can be deferred. Full detail per finding below.

H1 — `WebBrowser_StructuredGeneration` defeats `StructuredGenerationTask.maxRetries`

Where: providers/chrome-ai/src/ai/common/WebBrowser_StructuredGeneration.ts:175-195

Why: The run-fn throws PermanentJobError on JSON.parse or schema-validation failure. The inline comment claims StructuredGenerationTask runs inside a retry loop that catches per-attempt errors — but packages/ai/src/task/StructuredGenerationTask.ts:177 consumes the inner stream via for await (const event of super.executeStream(...)), which lets exceptions propagate straight out of the generator. The task's retry/repair pipeline only runs when the run-fn emits a finish event whose data.object parses but fails validation. Chrome's path therefore silently disables maxRetries.

Fix: Mirror Anthropic_StructuredGeneration_Stream / HFT_StructuredGeneration_Stream. On parse failure, fall back to parsePartialJson(accumulatedJson) ?? {}. On validation failure, do NOT throw — still emit { type: "finish", data: { object: finalObject } } and let the task's compiled validator decide retry. Compile-time invalid-schema (the PermanentJobError at lines 113-118) legitimately stays as throw.

Effort: S (~25 lines).
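A rough sketch of the H1 fix shape, under stated assumptions: `buildFinishEvent` and `isPlainRecord` are hypothetical names, and the partial-JSON parser is passed in as a parameter because the real `parsePartialJson` signature is not shown here. The point is only that parse failure degrades to `{}` and a `finish` event is always produced, leaving the retry decision to the task's own validator.

```typescript
type FinishEvent = { type: "finish"; data: { object: Record<string, unknown> } };

function isPlainRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Never throws on bad model output: falls back to the partial parser, then to {}.
function buildFinishEvent(
  accumulatedJson: string,
  parsePartial: (text: string) => unknown // stand-in for the repo's parsePartialJson
): FinishEvent {
  let parsed: unknown;
  try {
    parsed = JSON.parse(accumulatedJson);
  } catch {
    parsed = parsePartial(accumulatedJson);
  }
  let finalObject: Record<string, unknown> = {};
  if (isPlainRecord(parsed)) finalObject = parsed;
  // No schema validation here: the caller (the task) validates and decides on retry.
  return { type: "finish", data: { object: finalObject } };
}
```

In this sketch a schema that fails to compile would still be handled separately, since that case is the one the finding says should keep throwing.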

H2 — `WebBrowser_ToolCalling` cache reuse double-feeds tool results

Where: providers/chrome-ai/src/ai/common/WebBrowser_ToolCalling.ts:200-260 (cache check), :102-118 (trade-off comment)

Why: The run-fn caches LanguageModel and reuses across orchestrator turns when input.messages is present, gated on toolsFingerprint and priorMessageCount. Chrome's promptStreaming() internally appends tool-result turns to the session's history. On the orchestrator's next turn, input.messages re-supplies the tool result as a text frame, while the cached session also has its own appended copy → model sees the same tool result twice. The trade-off comment acknowledges the failure mode; the chosen priorMessageCount watermark does not actually defend against it.

Fix: Delete the cache-reuse path for tool-calling entirely. Match WebBrowser_TextGeneration.ts's pattern (always factory.create + session.destroy in finally). Remove cached / usedCachedSession / cacheable / cacheWritten locals and the getChromeSession / setChromeSession / dropChromeSessionEntry imports. Chrome session-create is cheap on a warm model; the correctness win outweighs the amortization. If amortization matters later, gate on capturedCalls.length === 0 from the previous turn (only safe when the prior turn was text-only).

Effort: S (~40 lines deletion).
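The "always `factory.create` + `session.destroy` in `finally`" pattern that H2 asks for might look like the sketch below. `DisposableSession`, `SessionFactory`, and `withFreshSession` are hypothetical names, and the session interface is simplified to a single `prompt` method for illustration.

```typescript
interface DisposableSession {
  prompt(input: string): Promise<string>;
  destroy(): void;
}

interface SessionFactory {
  create(options: Record<string, unknown>): Promise<DisposableSession>;
}

// Create a session for exactly one run and destroy it whether the run
// succeeds or throws, so no state can leak into the next orchestrator turn.
async function withFreshSession<T>(
  factory: SessionFactory,
  options: Record<string, unknown>,
  run: (session: DisposableSession) => Promise<T>
): Promise<T> {
  const session = await factory.create(options);
  try {
    return await run(session);
  } finally {
    session.destroy();
  }
}
```

Because nothing outlives the call, the double-fed tool-result scenario cannot arise: each turn's session only ever sees what `input.messages` supplies for that turn.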

H3 — `WebBrowser_Chat` history-replay watermark drifts from Chrome's effective initial prompt

Where: providers/chrome-ai/src/ai/common/WebBrowser_Chat.ts:67-78, providers/chrome-ai/src/ai/common/WebBrowser_ChatHistory.ts:38-67

Why: Cache watermark is priorHistory.length (raw count of ChatMessages). But buildInitialPromptsFromHistory silently filters out empty-text messages, tool-role messages, and mid-history system messages. Two histories that produce identical initialPrompts but differ in dropped-frame count force a needless rebuild (cache miss). Conversely, a mid-conversation mutation of the leading system text — also silently swallowed by the filter — keeps the cache hot with the old system prompt baked in.

Fix: Compute the watermark over the FILTERED history that actually got fed to Chrome. Change buildInitialPromptsFromHistory to return { initialPrompts, fingerprint } where fingerprint is canonicalStringify(initialPrompts) (or a hash). Gate cache reuse on the fingerprint match, not raw length. Add a separate leading-system fingerprint check so mid-conversation system-prompt mutations force a rebuild. Update ChromeChatSessionState (WebBrowser_Sessions.ts:17-34) to carry historyFingerprint?: string alongside the existing messageCount (preserved for tool-calling/structured-gen which don't change in this fix).

Effort: M (~80 lines + interface touch).
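To make the H3 proposal concrete, here is a minimal sketch of fingerprinting the filtered history. The message shapes, the exact filter rules, and the names `canonicalStringifySketch` and `buildInitialPromptsSketch` are assumptions for illustration; the real `buildInitialPromptsFromHistory` may filter differently.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  text: string;
}

interface InitialPrompt {
  role: "system" | "user" | "assistant";
  content: string;
}

// Deterministic JSON: object keys are sorted so equal values stringify equally.
function canonicalStringifySketch(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => canonicalStringifySketch(item)).join(",")}]`;
  }
  if (typeof value === "object" && value !== null) {
    const record = value as Record<string, unknown>;
    const body = Object.keys(record)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalStringifySketch(record[key])}`)
      .join(",");
    return `{${body}}`;
  }
  return value === undefined ? "null" : JSON.stringify(value);
}

// Assumed filter: drop tool-role frames, empty text, and any system message
// that is not the leading prompt. The fingerprint covers only what survives.
function buildInitialPromptsSketch(history: ChatMessage[]): {
  initialPrompts: InitialPrompt[];
  fingerprint: string;
} {
  const initialPrompts: InitialPrompt[] = [];
  for (const message of history) {
    const role = message.role;
    if (role === "tool") continue;
    if (message.text.trim() === "") continue;
    if (role === "system" && initialPrompts.length > 0) continue;
    initialPrompts.push({ role, content: message.text });
  }
  return { initialPrompts, fingerprint: canonicalStringifySketch(initialPrompts) };
}
```

Since the leading system text is part of `initialPrompts` in this sketch, a changed system prompt changes the fingerprint too; the separate leading-system check in the plan would be an additional, more explicit guard.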

H4 — Bare `LanguageModel` (and friends) references can throw at module load on non-Chrome

Where: Eight files — WebBrowser_Chat.ts:40-44, WebBrowser_StructuredGeneration.ts:100-104, WebBrowser_ToolCalling.ts:159-163, WebBrowser_TextGeneration.ts:42-46, WebBrowser_TextLanguageDetection.ts, WebBrowser_TextRewriter.ts, WebBrowser_TextSummary.ts, WebBrowser_TextTranslation.ts

Why: Each file uses typeof LanguageModel !== "undefined" ? LanguageModel : undefined. The typeof X form is correct in classic scripts, but in strict-mode ES modules under some bundlers the second reference to LanguageModel (after the ?) is treated as a binding lookup and can throw ReferenceError at module evaluation on Firefox/Safari/Node. WebBrowser_CapabilityProbe.ts:80-90 already routes through globalThis and got this right — the probe fix just didn't propagate to the run-fns.

Fix: Add a tiny helper in WebBrowser_ChromeHelpers.ts:

export function getChromeGlobal<T = unknown>(name: string): T | undefined {
  return (globalThis as unknown as Record<string, T | undefined>)[name];
}

Replace each call site:

// before
getApi("LanguageModel", typeof LanguageModel !== "undefined" ? LanguageModel : undefined)
// after
getApi("LanguageModel", getChromeGlobal<typeof LanguageModel>("LanguageModel"))

Apply to LanguageModel, Summarizer, Rewriter, Translator, LanguageDetector across all 8 files.

Effort: S (~10 lines per site × 8 + helper).
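As a small usage illustration of the helper proposed in H4: the lookup goes through a property read on `globalThis`, which yields `undefined` for a missing name instead of depending on a bare identifier. The names `DefinitelyNotAChromeApi` and `FakeSummarizer` below are made up for the demonstration.

```typescript
// Same helper as in the fix plan above, repeated so the example is self-contained.
function getChromeGlobal<T = unknown>(name: string): T | undefined {
  return (globalThis as unknown as Record<string, T | undefined>)[name];
}

// A name that does not exist: a plain property read, so no ReferenceError.
const missingApi = getChromeGlobal("DefinitelyNotAChromeApi");

// Simulate a runtime that does expose an API object (hypothetical global).
(globalThis as unknown as Record<string, unknown>)["FakeSummarizer"] = {
  create: () => "ok",
};
const presentApi = getChromeGlobal<{ create(): string }>("FakeSummarizer");

console.log(missingApi === undefined, presentApi?.create());
```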

H5 — `WebBrowser_StructuredGeneration` cache code is dead in production

Where: providers/chrome-ai/src/ai/common/WebBrowser_StructuredGeneration.ts:125-160 (cache branch); packages/ai/src/task/StructuredGenerationTask.ts (no sessionId plumbing)

Why: The fingerprint and cache infrastructure adds ~80 lines plus three live failure modes (cache poisoning, schema-mismatch invalidation, watermark drift). But StructuredGenerationTask doesn't allocate a sessionId and the JobInput plumbing that produces one is owned by AiChatTask. No production caller passes sessionId. Dead in production.

Fix: Delete the cache code in this file (always create + destroy like WebBrowser_TextGeneration). Leave a TODO comment pointing at the follow-up: wiring StructuredGenerationTask to allocate a sessionId for json-mode providers is a cross-provider contract change and belongs in a separate PR. If canonicalStringify is needed by the H3 fix, promote it to a shared helper; otherwise inline-delete.

Effort: S (~90 lines deletion).

Sequencing

H1, H2, H4 are independent — any order.
H1 + H5 both touch WebBrowser_StructuredGeneration.ts — land H5 first (deletes more), then H1 on the simplified file.
H3 touches a shared ChromeChatSessionState interface that H2 reads (messageCount); H3's plan keeps messageCount and adds historyFingerprint as an optional sibling, so the two don't conflict.
If H4 lands first, H1/H5 pick up getChromeGlobal for free.

Generated by Claude Code

sroussey · 2026-05-22T06:15:34Z

@copilot Some of this was recently added to main branch, but not all. It needs to be rebased and checked carefully. main has some work on sessions and making sure that streaming does not use snapshot.

Copilot · 2026-05-22T06:32:40Z

Rechecked this against main and updated the Chrome AI paths in 789c0e4. The branch now uses globalThis-safe Chrome API lookup, text-delta-only streaming fallback, chat history fingerprinting for session reuse, and fresh sessions for tool-calling / structured generation so the main-branch session and streaming concerns are covered.

Warning

Firewall rules blocked me from connecting to the following address:

huggingface.co
- Triggering command (three occurrences): /opt/hostedtoolcache/node/24.15.0/x64/bin/node running Vitest 4.1.6 worker forks (vitest/dist/workers/forks.js) (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

Copilot

Pull request overview

This PR expands the Chrome Built-in AI provider to support capability probing, multi-turn chat/session handling, structured JSON generation (json-mode), and tool-calling (tool-use). It also introduces persistent, machine-readable error codes for FetchUrlTask failures (propagated through the job queue/client), plus accompanying test refactors/utilities.

Changes:

Chrome AI provider: add capability probing + new run-fns for chat, structured generation, and tool calling; refactor streaming helpers and session management.
FetchUrl: introduce FetchUrlErrorCode + helpers, persist error_code reliably, and update SafeFetch/FetchUrlTask logic + tests.
Tests/tooling: add cross-runner fake-timer helper and expand provider + fetch error-code coverage.

Reviewed changes

Copilot reviewed 36 out of 38 changed files in this pull request and generated 3 comments.

File	Description
scripts/lib/preload-credentials.ts	Add warning log when credential unlock/hydration fails.
providers/chrome-ai/tsconfig.json	Include `dom-chromium-ai` ambient types for Chrome AI globals.
providers/chrome-ai/src/ai/WebBrowserProvider.ts	Add capability probe lifecycle (`ready()`), gated inference, and session disposal hook.
providers/chrome-ai/src/ai/index.ts	Expand `_testOnly` exports to cover new run-fns, probing, sessions, and chat helpers.
providers/chrome-ai/src/ai/common/WebBrowser_ToolCalling.ts	New tool-calling run-fn bridging Chrome’s internal tool loop to Workglow toolCalls protocol.
providers/chrome-ai/src/ai/common/WebBrowser_TextTranslation.ts	Refactor Translator access via `getChromeGlobal` + add download-progress monitoring.
providers/chrome-ai/src/ai/common/WebBrowser_TextSummary.ts	Add download-progress monitoring, signal plumbing, and normalize `tl;dr` → `tldr`.
providers/chrome-ai/src/ai/common/WebBrowser_TextRewriter.ts	Add download-progress monitoring and signal plumbing.
providers/chrome-ai/src/ai/common/WebBrowser_TextLanguageDetection.ts	Add download-progress monitoring, signal plumbing, and defensive mapping of detection results.
providers/chrome-ai/src/ai/common/WebBrowser_TextGeneration.ts	Add download-progress monitoring, signal plumbing, and unify delta streaming helper usage.
providers/chrome-ai/src/ai/common/WebBrowser_StructuredGeneration.ts	New json-mode structured generation run-fn using `responseConstraint` with partial JSON streaming.
providers/chrome-ai/src/ai/common/WebBrowser_Sessions.ts	New `LanguageModel` session cache utilities keyed by `AiChatTask` sessionId.
providers/chrome-ai/src/ai/common/WebBrowser_JobRunFns.ts	Register new capability sets and add unified text.generation dispatcher (chat vs one-shot).
providers/chrome-ai/src/ai/common/WebBrowser_ChromeHelpers.ts	Add global lookup helper, download monitor, canonical stringify, and adjust streaming snapshot→delta conversion.
providers/chrome-ai/src/ai/common/WebBrowser_ChromeAI.d.ts	Remove deprecated in-repo ambient Chrome AI type declarations (replaced by `@types/dom-chromium-ai`).
providers/chrome-ai/src/ai/common/WebBrowser_ChatHistory.ts	New helpers to map Workglow chat history into Chrome `initialPrompts` + fingerprinting.
providers/chrome-ai/src/ai/common/WebBrowser_Chat.ts	New multi-turn chat run-fn with session caching and failure-path cache hygiene.
providers/chrome-ai/src/ai/common/WebBrowser_CapabilitySets.ts	Add `json-mode` and `tool-use` capability sets.
providers/chrome-ai/src/ai/common/WebBrowser_CapabilityProbe.ts	New module-level cached probe for json/tool support via `LanguageModel.create()` smoke tests.
providers/chrome-ai/src/ai/common/WebBrowser_Capabilities.ts	Gate `json-mode`/`tool-use` capability inference behind probe results (+ async helper).
providers/chrome-ai/package.json	Add `@types/dom-chromium-ai` dev dependency.
packages/test/src/test/util/WorkerManager.idle.test.ts	Refactor timer advancement to shared helper for Vitest/Bun compatibility.
packages/test/src/test/task/FetchTask.test.ts	Assert `FetchUrlErrorCode` propagation and persisted `errorCode` behavior through queue.
packages/test/src/test/resource/DisposeStrategy.test.ts	Refactor timer advancement to shared helper.
packages/test/src/test/helpers/advanceFakeTimers.ts	New helper to advance timers portably across Vitest/Bun (with optional microtask flush).
packages/test/src/test/browser-control/SequentialTasks.test.ts	Refactor timer advancement to shared helper.
packages/test/src/test/ai-provider/WebBrowserProvider.test.ts	Add extensive tests for probing, unified dispatcher, structured generation, tool calling, sessions, and helpers.
packages/test/src/test/ai-provider/DownloadModelAbort.integration.test.ts	Refactor model setup + improve abort-error classification helper.
packages/tasks/src/util/SafeFetch.ts	Replace generic errors with structured FetchUrl error codes for SSRF/redirect failures.
packages/tasks/src/util/SafeFetch.server.ts	Replace generic errors with structured FetchUrl error codes for DNS/SSRF/redirect failures.
packages/tasks/src/task/FetchUrlTask.ts	Emit structured FetchUrl errors, classify parse failures, and route HTTP errors through code-bearing helpers.
packages/tasks/src/task/FetchUrlJobError.ts	New centralized FetchUrl error-code definitions + constructors/wrappers.
packages/tasks/src/common.ts	Export `FetchUrlJobError` and reorder/re-export image + media filter utilities.
packages/task-graph/src/task/TaskError.ts	Propagate underlying `JobError.code` onto `JobTaskFailedError.code` when available.
packages/job-queue/src/job/JobQueueWorker.ts	Persist `error_code` via `jobErrorPersistedCode()` and emit consistent event payloads.
packages/job-queue/src/job/JobQueueClient.ts	Rehydrate `FETCH_*` persisted codes into retryable/permanent errors and preserve `code`.
packages/job-queue/src/job/JobError.ts	Add `JobError.code` and helper to derive persisted error codes.
bun.lock	Lockfile updates for new dependency and trustedDependencies ordering.

      } else {
-        previousSnapshot = value;
-        yield { type: "snapshot", data: buildFallbackOutput(value) };
+        accumulatedText += value;


+  // Cache hygiene: only reuse the cached session if its watermark exactly
+  // matches the history we'd otherwise re-feed. Out-of-sync caches (task
+  // reset mid-conversation, retroactive edits to `messages`) are torn down
+  // and rebuilt.
+  let cached = sessionId ? getChromeSession(sessionId) : undefined;
+  if (sessionId !== undefined && cached && cached.historyFingerprint !== historyFingerprint) {


+    if (err instanceof Error && err.name === "AbortSignalJobError") {
+      throw err;


sroussey · 2026-05-22T14:28:21Z

@claude Some of this was recently added to main branch, but not all. It needs to be rebased and checked carefully. main has some work on chrome sessions too and making sure that chrome ai streaming does not send snapshot when they aren't actually snapshots. Double check docs at google. Text generation and translation may be different.

…totype pollution, snapshot reset (#528)

* fix(chrome-ai): repair WebBrowser_Chat session-cache reuse (HIGH)

The previous fingerprint-based cache key recomputed the fingerprint from the *prior* history on every turn, so turn 2's cache lookup always missed and rebuilt the session from scratch. Switch to a messageCount high-water mark: cache hits when cached.messageCount === lastUserIdx (i.e., the session has already heard everything before the trailing user message). After a successful turn the session has heard messages.length + 1 messages (history + new assistant reply), which we record for the next call.

* fix(chrome-ai): sanitize tool-call arguments to prevent prototype pollution (HIGH)

Many tool input schemas don't set `additionalProperties: false`, so a hallucinated `{__proto__: {polluted: true}, ok: true}` payload would pass validation and propagate through to consumers. Add a `sanitizeToolArgs` helper that recursively rebuilds the value with a plain Object.prototype, dropping `__proto__`, `constructor`, and `prototype` keys at every depth. Sanitize BEFORE validation so the validator sees the cleaned object.

* fix(chrome-ai): reset accumulator on non-prefix snapshots (HIGH)

`snapshotStreamToTextDeltas` was concatenating instead of resetting when a snapshot was not a prefix-extension of the previously accumulated text. For self-correction snapshots (Chrome replacing, not extending, prior text) this corrupted consumer state with duplicated content like `"hello worldhello sailor"`. Reset the accumulator to the new snapshot and emit it as the delta so consumers treat the non-prefix boundary as a replace, matching the documented streaming-convention exception.

Also add `snapshotStreamToTextDeltas` to `_testOnly` so the helper is testable from the test package, and add coverage for:
- HIGH-1: chat cache reuse and rebuild-on-divergence
- HIGH-2: __proto__/constructor/prototype scrubbing (top-level + recursive)
- HIGH-3: prefix-extend, non-prefix-reset, identical-snapshot semantics

Also fix a stale comment in the existing tool-calling lifecycle test that claimed cache reuse; tool-calling intentionally rebuilds per turn.

* docs(chrome-ai): align test comment with actual shrink-rebuild behavior
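The prefix-extend versus non-prefix-reset behaviour described in the snapshot fix can be illustrated with a small stateful tracker. This is a simplified sketch, not the repo's `snapshotStreamToTextDeltas`; `createDeltaTracker` is a hypothetical name and it works on plain strings rather than a stream.

```typescript
// Returns a function that converts successive snapshots into deltas.
function createDeltaTracker(): (snapshot: string) => string {
  let accumulated = "";
  return (snapshot: string): string => {
    if (snapshot.startsWith(accumulated)) {
      // Prefix-extension: emit only the newly appended tail
      // (an identical snapshot yields an empty delta).
      const delta = snapshot.slice(accumulated.length);
      accumulated = snapshot;
      return delta;
    }
    // Non-prefix snapshot (self-correction): reset instead of concatenating,
    // and emit the whole snapshot so the consumer can treat it as a replace.
    accumulated = snapshot;
    return snapshot;
  };
}

const nextDelta = createDeltaTracker();
const deltas = ["hello", "hello world", "hello world", "hello sailor"].map((s) =>
  nextDelta(s)
);
console.log(deltas); // ["hello", " world", "", "hello sailor"]
```

A consumer that blindly appends deltas still needs to know about the replace case; the sketch only shows that the tracker's own accumulator stays consistent with the latest snapshot.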

…apability probe

Integrates the chrome-ai branch (7 commits, PR #514/#520/#528) with main's parallel chrome-ai work (model.download, model.dispose, ApiBinding):
- Chat-session cache keyed by AiChatTask sessionId, with messageCount high-water mark for reuse (replaces fingerprint-based invalidation)
- StructuredGeneration + ToolCalling run-fns gated by an async capability probe; pre-probe state advertises a conservative subset (no json-mode, no tool-use) so the provider never claims a capability it can't fulfil
- ChatHistory helpers + WebBrowser_TextGeneration_Unified dispatcher (text.generation shared by AiChatTask + TextGenerationTask)
- ChromeHelpers ships both assertAvailability and ensureAvailable; both session APIs (chrome-chat cache + idle-evict store) coexist
- Drops main's WebBrowser_Chat.test.ts (chrome-ai's WebBrowserProvider.test already covers chat behavior under the new cache semantics)
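The messageCount high-water mark used for chat-session reuse can be sketched as two small functions. The rule itself (`cached.messageCount === lastUserIdx`, then record `messages.length + 1` after a successful turn) comes from the #528 commit message; the type and function names below are hypothetical.

```typescript
interface TurnMessage {
  role: "system" | "user" | "assistant";
  text: string;
}

interface CachedChat {
  messageCount: number; // how many messages the cached session has already heard
}

function lastUserIndex(messages: TurnMessage[]): number {
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i]?.role === "user") return i;
  }
  return -1;
}

// Reuse only if the session has heard exactly everything before the trailing
// user message; any other count means the history diverged, so rebuild.
function canReuseChatSession(
  cached: CachedChat | undefined,
  messages: TurnMessage[]
): boolean {
  const idx = lastUserIndex(messages);
  return cached !== undefined && idx >= 0 && cached.messageCount === idx;
}

// After a successful turn: the supplied history plus the new assistant reply.
function watermarkAfterTurn(messages: TurnMessage[]): number {
  return messages.length + 1;
}
```

For example, a first turn of `[system, user]` records a watermark of 3; the next call arrives with `[system, user, assistant, user]`, whose trailing user message sits at index 3, so the cached session is reused.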

…rn streams

Tool calling utilities (packages/ai/src/task/ToolCallingUtils.ts):
- sanitizeToolArgs: recursive __proto__/constructor/prototype scrubbing for model-supplied tool args (prototype-pollution defence)
- compileToolValidators + validateToolCallArgs: per-tool inputSchema validation with graceful fallback for tools whose schema fails to compile

Stream helpers converted from generators to emit-callback so run-fns no longer need a for-await/yield pump:
- snapshotStreamToTextDeltas / snapshotStreamToSnapshots (chrome-ai)
- accumulateOpenAIStream (@workglow/ai provider-utils, used by OpenAI + HFI)

Run-fns updated to call helpers with emit directly and emit their own final 'finish' event. chrome-ai's WebBrowser_ToolCalling drops its private sanitization + validation copy and reuses the shared utils.
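A minimal sketch of the recursive scrubbing that `sanitizeToolArgs` is described as doing. It is an illustration of the technique, not the implementation in `ToolCallingUtils.ts`; the name `sanitizeToolArgsSketch` is used to make that explicit.

```typescript
const FORBIDDEN_KEYS = new Set(["__proto__", "constructor", "prototype"]);

// Rebuild the value from scratch with plain objects, dropping dangerous keys
// at every depth. Primitives pass through unchanged.
function sanitizeToolArgsSketch(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => sanitizeToolArgsSketch(item));
  }
  if (typeof value === "object" && value !== null) {
    const clean: Record<string, unknown> = {};
    for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
      if (FORBIDDEN_KEYS.has(key)) continue;
      clean[key] = sanitizeToolArgsSketch(child);
    }
    return clean;
  }
  return value;
}

// JSON.parse creates "__proto__" as an own property, which is how a
// hallucinated payload could carry it past a permissive schema.
const hostile = JSON.parse(
  '{"__proto__":{"polluted":true},"ok":true,"nested":{"constructor":1,"x":2}}'
);
const cleaned = sanitizeToolArgsSketch(hostile);
console.log(JSON.stringify(cleaned)); // {"ok":true,"nested":{"x":2}}
```

Running the sanitizer before schema validation, as the commit describes, means the validator and every downstream consumer see the same cleaned object.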

…viders

Addresses review of #514/#520/#528 rebase:

CRITICAL fix: `model.dispose` now reaches chat-cached sessions. The post-rebase chrome-ai branch had two parallel session maps (`chromeSessions` for chat reuse, `sessions` for idle-evict + ModelDispose lookup) but only the chat map was populated by runtime code, making `model.dispose` a functional no-op in production. Unified into a single Map<sessionId, WebBrowserSessionEntry> with both chat-cache fields (messageCount, fingerprints) and lifecycle fields (modelKey, lastUsedAt, idleTimer). `ChromeChatSessionState` now requires `modelKey`. `disposeWebBrowserSessionsForModel(modelKey)` iterates the unified store, so model.dispose destroys chat-cached sessions. Chat sessions become subject to idle eviction (free bonus).

IMPORTANT: sanitizeToolArgs applied across the codebase per intent of the prior refactor:
- OpenAIShapedChat (parseOpenAIToolCallMessage + accumulateOpenAIStream) → covers OpenAI + HFI
- ToolCallParsers (adaptParserResult + parseToolCallsFromText) → covers llama.cpp Hermes/Liquid/Qwen35/Llama paths + HFT
- Anthropic_ToolCalling (input_json_delta + content_block_stop)
- Gemini_ToolCalling (functionCall.args)
- Ollama_ToolCalling (parsed function.arguments)
- LlamaCpp_ToolCalling (extractNativeFunctionCalls)
- Cactus_ToolCalling[.browser] (JSON-parse parseToolCalls paths)

Every model-supplied tool-arg payload now passes through sanitizeToolArgs before reaching downstream consumers, closing the prototype-pollution vector across the provider matrix.

Also:
- Added packages/test/src/test/ai/ToolCallingUtils.test.ts (14 unit tests for sanitizeToolArgs, compileToolValidators, validateToolCallArgs, plus a sanitize→validate→name-check integration test).
- Added WebBrowser_Sessions.test regression for the unified-store behavior (disposeWebBrowserSessionsForModel sees chat-cached entries).
- Documented WebBrowser_Chat's rebuild-on-next-turn recovery model (vs the in-fn retry that main's now-deleted test exercised).
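The unified store described in this commit can be pictured roughly as follows. This is a sketch under stated assumptions: the field names come from the commit message, while `DestroyableSession`, `registerSession`, `touchSession`, `evict`, the `IDLE_MS` value, and the numeric return value are hypothetical and not taken from the repository.

```typescript
// Hypothetical stand-in for Chrome's LanguageModel session; only destroy() is needed here.
interface DestroyableSession {
  destroy(): void;
}

// One entry carries both chat-cache fields and lifecycle fields.
interface WebBrowserSessionEntry {
  session: DestroyableSession;
  messageCount: number;
  fingerprints: string[];
  modelKey: string;
  lastUsedAt: number;
  idleTimer: ReturnType<typeof setTimeout> | undefined;
}

const IDLE_MS = 300000; // assumed idle timeout, not from the source

const sessions = new Map<string, WebBrowserSessionEntry>();

function evict(sessionId: string): void {
  const entry = sessions.get(sessionId);
  if (!entry) return;
  if (entry.idleTimer !== undefined) clearTimeout(entry.idleTimer);
  sessions.delete(sessionId);
  entry.session.destroy();
}

function registerSession(sessionId: string, session: DestroyableSession, modelKey: string): void {
  evict(sessionId); // replace any previous entry under the same id
  sessions.set(sessionId, {
    session,
    messageCount: 0,
    fingerprints: [],
    modelKey,
    lastUsedAt: Date.now(),
    idleTimer: setTimeout(() => evict(sessionId), IDLE_MS),
  });
}

function touchSession(sessionId: string): void {
  const entry = sessions.get(sessionId);
  if (!entry) return;
  if (entry.idleTimer !== undefined) clearTimeout(entry.idleTimer);
  entry.lastUsedAt = Date.now();
  entry.idleTimer = setTimeout(() => evict(sessionId), IDLE_MS); // restart the idle clock
}

// Because chat-cached sessions live in the same map, a model-level dispose reaches them too.
function disposeWebBrowserSessionsForModel(modelKey: string): number {
  let disposed = 0;
  for (const [sessionId, entry] of Array.from(sessions.entries())) {
    if (entry.modelKey === modelKey) {
      evict(sessionId);
      disposed++;
    }
  }
  return disposed;
}
```

With a single map there is no second registry to forget to populate, which is the failure mode the commit message describes.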

…n is destroyed

Chrome can destroy a `LanguageModel` session out from under us (tab backgrounding, GPU process restart, memory pressure). When a cached session's `promptStreaming` throws DOMException("...destroyed...", "InvalidStateError") we now rebuild the session from full history via `initialPrompts` and retry the prompt once.

Retry is gated on three conditions, all required:
- We were using a CACHED session (a fresh-session failure means the model is broken; retrying won't help).
- No text-delta has reached the consumer yet (we can't unsend deltas).
- The error name is `InvalidStateError` (matches Chrome's InvalidStateError DOMException; tolerant of message-text changes).

Tests:
- "retries once with a fresh session when a cached session is destroyed" seeds the cache on turn 1, has the cached session's promptStreaming throw on turn 2's reuse, asserts rebuild + retry + cache replacement.
- "does not retry when a fresh (non-cached) session fails" guards the first gate.
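The three retry gates above can be sketched as a small predicate plus a wrapper that retries at most once. All names here (`RetryContext`, `shouldRetryAfterDestroy`, `promptWithRebuild`, `promptOnce`) are hypothetical and chosen for illustration; the actual control flow in WebBrowser_Chat may be organized differently.

```typescript
interface RetryContext {
  usedCachedSession: boolean; // gate 1: only cached sessions are worth rebuilding
  deltasEmitted: number;      // gate 2: nothing may have reached the consumer yet
}

function shouldRetryAfterDestroy(err: unknown, ctx: RetryContext): boolean {
  // Gate 3: match on the error name, not the message text.
  const name =
    typeof err === "object" && err !== null && "name" in err
      ? (err as { name: unknown }).name
      : undefined;
  return ctx.usedCachedSession && ctx.deltasEmitted === 0 && name === "InvalidStateError";
}

// promptOnce is an assumed callback: it prompts either the cached session or a
// session rebuilt from full history, forwarding text deltas through emit.
async function promptWithRebuild(
  promptOnce: (useCached: boolean, emit: (delta: string) => void) => Promise<void>,
  hasCachedSession: boolean,
  emit: (delta: string) => void
): Promise<void> {
  let deltasEmitted = 0;
  const countingEmit = (delta: string): void => {
    deltasEmitted++;
    emit(delta);
  };
  try {
    await promptOnce(hasCachedSession, countingEmit);
  } catch (err) {
    const ctx: RetryContext = { usedCachedSession: hasCachedSession, deltasEmitted };
    if (!shouldRetryAfterDestroy(err, ctx)) throw err;
    // Single retry against a rebuilt session; a second failure propagates.
    await promptOnce(false, countingEmit);
  }
}
```

Counting deltas in the wrapper, rather than trusting the caller, keeps the second gate accurate even if the first attempt fails partway through streaming.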

sroussey mentioned this pull request May 20, 2026

fix(chrome-ai): probe-gate caps + session/validation correctness (#514) #520

Merged

5 tasks

Copilot started work on behalf of sroussey May 22, 2026 06:15

Copilot finished work on behalf of sroussey May 22, 2026 06:33

Copilot started work on behalf of sroussey May 22, 2026 06:37

sroussey requested a review from Copilot May 22, 2026 06:39

sroussey self-assigned this May 22, 2026

Copilot started reviewing on behalf of sroussey May 22, 2026 06:39

Copilot AI reviewed May 22, 2026

sroussey mentioned this pull request May 22, 2026

fix(chrome-ai): three HIGH-priority bugs in PR #514 — chat cache, prototype pollution, snapshot reset #528

Merged

2 tasks

sroussey added 2 commits May 22, 2026 16:03

sroussey force-pushed the chrome-ai branch from 31d1abc to de0fc0d Compare May 22, 2026 17:37

sroussey added 2 commits May 22, 2026 18:33

sroussey merged commit ab2ceec into main May 22, 2026
16 checks passed

sroussey deleted the chrome-ai branch May 22, 2026 20:07

sroussey merged 4 commits into main from chrome-ai


if (err instanceof Error && err.name === "AbortSignalJobError") {
	throw err;
}

Conversation

sroussey commented May 20, 2026

pkg-pr-new Bot commented May 20, 2026 • edited

github-actions Bot commented May 20, 2026 • edited

Coverage Report

sroussey commented May 21, 2026

HIGH-priority review findings — fix plans

H1 — WebBrowser_StructuredGeneration defeats StructuredGenerationTask.maxRetries

H2 — WebBrowser_ToolCalling cache reuse double-feeds tool results

H3 — WebBrowser_Chat history-replay watermark drifts from Chrome's effective initial prompt

H4 — Bare LanguageModel (and friends) references can throw at module load on non-Chrome

H5 — WebBrowser_StructuredGeneration cache code is dead in production

Sequencing

sroussey commented May 22, 2026

Copilot AI commented May 22, 2026 • edited

I tried to connect to the following addresses, but was blocked by firewall rules:

Copilot AI left a comment

Pull request overview

Reviewed changes

sroussey commented May 22, 2026

3 participants
