Skip to content

fix(security): 6 browse bugs found via sqry semantic code graph#664

Open
mr-k-man wants to merge 20 commits intogarrytan:mainfrom
mr-k-man:fix/security-audit-sqry-findings
Open

fix(security): 6 browse bugs found via sqry semantic code graph#664
mr-k-man wants to merge 20 commits intogarrytan:mainfrom
mr-k-man:fix/security-audit-sqry-findings

Conversation

@mr-k-man
Copy link
Copy Markdown

@mr-k-man mr-k-man commented Mar 30, 2026

Summary

10 security/correctness bugs in the gstack codebase, discovered using sqry AST-based semantic code graph analysis. Cross-validated by Claude Opus 4.6, OpenAI Codex (GPT-5.4), and Gemini Flash — all findings verified with 0 false positives.

Round 1 — Browse subsystem (6 bugs)

  • validateOutputPath symlink bypass — added realpathSync to match validateReadPath pattern
  • upload command file exfiltration — added safe-directory validation with realpathSync before setInputFiles
  • cookie-import symlink bypass — replaced normalize+includes("..") with realpathSync-based validation
  • DNS rebinding IPv6 gapresolvesToBlockedIp now checks both resolve4 and resolve6
  • killAgent() no-op — server writes cancel file, sidebar-agent polls and kills subprocess
  • cookies command secret leak — now redacts sensitive values matching same patterns as storage command
  • TOCTOU hardening (caught by Codex review): upload and cookie-import use resolved paths for actual operations

Round 2 — Cross-subsystem (4 bugs)

  • design/src/serve.ts path traversal/api/reload accepted arbitrary file paths; now anchored to initial HTML's parent directory with realpathSync
  • extension auth token broadcast — token removed from chrome.runtime.sendMessage health broadcast; sidepanel now requests via targeted getToken message
  • browser-manager restoreState URL validationpage.goto(saved.url) now runs through validateNavigationUrl first
  • telemetry-ingest RLS bypass — replaced SUPABASE_SERVICE_ROLE_KEY with caller's anon key; RLS enforces access control

Changed files

File Change
browse/src/meta-commands.ts validateOutputPath resolves symlinks via realpathSync
browse/src/write-commands.ts upload + cookie-import path validation with TOCTOU fix
browse/src/url-validation.ts resolvesToBlockedIp checks IPv4 + IPv6, export blocklist
browse/src/server.ts killAgent writes cancel file for sidebar-agent
browse/src/sidebar-agent.ts Polls cancel file, kills active subprocess
browse/src/read-commands.ts Cookie redaction + exported regex constants
browse/src/browser-manager.ts restoreState validates URLs before navigation
design/src/serve.ts /api/reload path traversal fix with realpathSync
extension/background.js Token removed from broadcast, getToken handler added
extension/sidepanel.js Requests token via getToken instead of broadcast
supabase/functions/telemetry-ingest/index.ts Anon key client replaces service role key
browse/test/path-validation.test.ts 10 new tests (symlink, redaction, DNS blocklist)
browse/test/url-validation.test.ts 5 new tests (restoreState URL patterns)
design/test/serve.test.ts 3 integration tests (path traversal via HTTP)

Test plan

  • bun test browse/test/path-validation.test.ts — 26/26 pass
  • bun test browse/test/url-validation.test.ts — 22/22 pass
  • bun test design/test/serve.test.ts — 14/14 pass
  • Total: 62 tests, all pass
  • Tests import production constants (no duplicated literals)
  • Codex (GPT-5.4) unconditional APPROVE on all fixes + tests
  • Gemini Flash cross-validation on all findings
  • bun run test:evals — needs ANTHROPIC_API_KEY (contributor environment)

Discovery methodology

All bugs found via sqry complexity_metrics → manual inspection of highest-complexity functions, plus find_unused, get_insights, and deep explain_code / pattern_search across 174 indexed files. Three AI auditors (Claude, Codex, Gemini) independently analyzed the codebase and cross-validated findings.

Issues

Round 1: #665, #666, #667, #668, #669, #670
Round 2: #672, #673, #674, #675

validateOutputPath used path.resolve() but never fs.realpathSync(),
unlike validateReadPath. A symlink under /tmp or cwd could redirect
screenshot/PDF writes outside safe directories.

Now matches the symlink-aware pattern from validateReadPath including
realpathSync on safe directories themselves (macOS /tmp → /private/tmp).

Found via sqry AST-based semantic code graph analysis.
upload command had no safe-directory check — any file the process could
read was exfiltrable via setInputFiles() to an attacker-controlled page.
Now validates paths with realpathSync against safe directories.

cookie-import used path.normalize() to check for ".." but never resolved
symlinks. A relative symlink could bypass validation. Now uses
realpathSync like validateReadPath.

Found via sqry AST-based semantic code graph analysis.
resolvesToBlockedIp() only called dns.resolve4() — AAAA records pointing
to blocked IPv6 metadata addresses (e.g., fd00::) bypassed protection.
Now checks both resolve4 and resolve6 in parallel.

Found via sqry AST-based semantic code graph analysis.
killAgent() signaled agentProcess which was never assigned after the
queue-based refactor. Kill/stop endpoints reset local state but the
real claude subprocess kept running.

Server now writes a cancel file that sidebar-agent.ts polls for. On
detection, the agent kills its active claude subprocess and cleans up.

Found via sqry AST-based semantic code graph analysis.
The cookies command returned the full cookie jar verbatim while the
storage command redacted secrets. Now applies the same pattern-based
redaction (sensitive name patterns + known token prefixes) to cookie
values for consistency.

Found via sqry AST-based semantic code graph analysis.
Codex review caught that upload and cookie-import validated the resolved
path but then used the original path for the actual operation. A symlink
could be swapped between validation and use.

Now upload passes resolvedPaths to setInputFiles, and cookie-import reads
from realFilePath instead of filePath.

Found via sqry AST-based semantic code graph analysis.
…NS IPv6

Tests for all new security fixes:
- validateOutputPath symlink resolution (3 tests)
- Cookie value redaction patterns (4 tests)
- DNS rebinding IPv6 blocklist coverage (3 tests)

Found via sqry AST-based semantic code graph analysis.
Codex review caught that cookie redaction and DNS blocklist tests
duplicated constants locally, creating false-confidence tests that
would stay green even if production code drifted.

Now exports BLOCKED_METADATA_HOSTS from url-validation.ts and
SENSITIVE_COOKIE_NAME/VALUE from read-commands.ts, and tests
import them directly.

Found via sqry AST-based semantic code graph analysis.
…raversal

handleReload accepted arbitrary file paths via POST body.html with only
an existsSync check. Any readable file could be served back through /.

Now anchors all file reads to the initial HTML file's parent directory
using realpathSync, blocking symlink and traversal attacks.

Found via sqry AST-based semantic code graph analysis.
Auth token was included in chrome.runtime.sendMessage health broadcast,
exposing it to all extension components. If sidepanel had an XSS, the
token would be immediately compromised.

Now the token is excluded from broadcasts. Sidepanel requests it via
a targeted getToken message with sendResponse, which only delivers
to the requesting component.

Found via sqry AST-based semantic code graph analysis.
restoreState() called page.goto(saved.url) without validateNavigationUrl.
State files loaded from disk could contain file://, chrome://, or cloud
metadata URLs that would bypass the scheme/host blocklist.

Now validates each saved URL through the same validateNavigationUrl used
by newTab(), skipping navigation for blocked URLs.

Found via sqry AST-based semantic code graph analysis.
…elemetry-ingest

The edge function used SUPABASE_SERVICE_ROLE_KEY which bypasses RLS
entirely. This was unnecessary — RLS policies already allow anon INSERT
and SELECT on telemetry_events, plus INSERT/SELECT/UPDATE on
installations (for upsert).

Now requires the apikey header (returns 401 if missing) and uses it to
create the Supabase client. RLS enforces access control. No server-side
fallback — the header is mandatory.

The client (gstack-telemetry-sync) already sends the anon key in the
apikey header — no client changes needed.

Found via sqry AST-based semantic code graph analysis.
- design/test/serve.test.ts: 3 integration tests for /api/reload path
  traversal — uses inline Bun.serve with production validation logic:
  blocks path outside allowedDir (403), blocks symlink to /etc (403),
  allows file within allowedDir (200 + content served)
- browse/test/url-validation.test.ts: 5 tests for restoreState URL
  patterns — blocks file://, chrome://, metadata IPs; allows https,
  localhost

All 62 tests pass across 3 files (26 path-validation + 22 url-validation
+ 14 serve).

Found via sqry AST-based semantic code graph analysis.
verivusai and others added 4 commits March 30, 2026 13:46
Migration 002 dropped anon_update_last_seen on installations, but the
telemetry-ingest fix (which switched from service role key to anon key)
needs UPDATE for the installations upsert. Without this policy, the
upsert silently fails after 002 is applied.

Found via sqry AST-based semantic code graph analysis.
* feat: CDP inspector module — persistent sessions, CSS cascade, style modification

New browse/src/cdp-inspector.ts with full CDP inspection engine:
- inspectElement() via CSS.getMatchedStylesForNode + DOM.getBoxModel
- modifyStyle() via CSS.setStyleTexts with headless page.evaluate fallback
- Persistent CDP session lifecycle (create, reuse, detach on nav, re-create)
- Specificity sorting, overridden property detection, UA rule filtering
- Modification history with undo support
- formatInspectorResult() for CLI output


* feat: browse server inspector endpoints + inspect/style/cleanup/prettyscreenshot CLI

Server endpoints: POST /inspector/pick, GET /inspector, POST /inspector/apply,
POST /inspector/reset, GET /inspector/history, GET /inspector/events (SSE).
CLI commands: inspect (CDP cascade), style (live CSS mod), cleanup (page clutter
removal), prettyscreenshot (clean screenshot pipeline).


* feat: sidebar CSS inspector — element picker, box model, rule cascade, quick edit

Extension changes for the visual CSS inspector:
- inspector.js: element picker with hover highlight, CSS selector generation,
  basic mode fallback (getComputedStyle + CSSOM), page alteration handlers
- inspector.css: picker overlay styles (blue highlight + tooltip)
- background.js: inspector message routing (picker <-> server <-> sidepanel)
- sidepanel: Inspector tab with box model viz (gstack palette), matched rules
  with specificity badges, computed styles, click-to-edit quick edit,
  Send to Agent/Code button, empty/loading/error states


* docs: document inspect, style, cleanup, prettyscreenshot browse commands


* feat: auto-track user-created tabs and handle tab close

browser-manager.ts changes:
- context.on('page') listener: automatically tracks tabs opened by the user
  (Cmd+T, right-click open in new tab, window.open). Previously only
  programmatic newTab() was tracked, so user tabs were invisible.
- page.on('close') handler in wirePageEvents: removes closed tabs from the
  pages map and switches activeTabId to the last remaining tab.
- syncActiveTabByUrl: match Chrome extension's active tab URL to the correct
  Playwright page for accurate tab identity.


* feat: per-tab agent isolation via BROWSE_TAB environment variable

Prevents parallel sidebar agents from interfering with each other's tab context.

Three-layer fix:
- sidebar-agent.ts: passes BROWSE_TAB=<tabId> env var to each claude process,
  per-tab processing set allows concurrent agents across tabs
- cli.ts: reads process.env.BROWSE_TAB and includes tabId in command request body
- server.ts: handleCommand() temporarily switches activeTabId when tabId is present,
  restores after command completes (safe: Bun event loop is single-threaded)

Also: per-tab agent state (TabAgentState map), per-tab message queuing,
per-tab chat buffers, verbose streaming narration, stop button endpoint.


* feat: sidebar per-tab chat context, tab bar sync, stop button, UX polish

Extension changes:
- sidepanel.js: per-tab chat history (tabChatHistories map), switchChatTab()
  swaps entire chat view, browserTabActivated handler for instant tab sync,
  stop button wired to /sidebar-agent/stop, pollTabs renders tab bar
- sidepanel.html: updated banner text ("Browser co-pilot"), stop button markup,
  input placeholder "Ask about this page..."
- sidepanel.css: tab bar styles, stop button styles, loading state fixes
- background.js: chrome.tabs.onActivated sends browserTabActivated to sidepanel
  with tab URL for instant tab switch detection


* test: per-tab isolation, BROWSE_TAB pinning, tab tracking, sidebar UX

sidebar-agent.test.ts (new tests):
- BROWSE_TAB env var passed to claude process
- CLI reads BROWSE_TAB and sends tabId in body
- handleCommand accepts tabId, saves/restores activeTabId
- Tab pinning only activates when tabId provided
- Per-tab agent state, queue, concurrency
- processingTabs set for parallel agents

sidebar-ux.test.ts (new tests):
- context.on('page') tracks user-created tabs
- page.on('close') removes tabs from pages map
- Tab isolation uses BROWSE_TAB not system prompt hack
- Per-tab chat context in sidepanel
- Tab bar rendering, stop button, banner text


* fix: resolve merge conflicts — keep security defenses + per-tab isolation

Merged main's security improvements (XML escaping, prompt injection defense,
allowed commands whitelist, --model opus, Write tool, stderr capture) with
our branch's per-tab isolation (BROWSE_TAB env var, processingTabs set,
no --resume). Updated test expectations for expanded system prompt.


* chore: bump version and changelog (v0.13.9.0)


* fix: add inspector message types to background.js allowlist

Pre-existing bug found by Codex: ALLOWED_TYPES in background.js was missing
all inspector message types (startInspector, stopInspector, elementPicked,
pickerCancelled, applyStyle, toggleClass, injectCSS, resetAll, inspectResult).
Messages were silently rejected, making the inspector broken on ALL pages.

Also: separate executeScript and insertCSS into individual try blocks in
injectInspector(), store inspectorMode for routing, and add content.js
fallback when script injection fails (CSP, chrome:// pages).


* feat: basic element picker in content.js for CSP-restricted pages

When inspector.js can't be injected (CSP, chrome:// pages), content.js
provides a basic picker using getComputedStyle + CSSOM:
- startBasicPicker/stopBasicPicker message handlers
- captureBasicData() with ~30 key CSS properties, box model, matched rules
- Hover highlight with outline save/restore (never leaves artifacts)
- Click uses e.target directly (no re-querying by selector)
- Sends inspectResult with mode:'basic' for sidebar rendering
- Escape key cancels picker and restores outlines


* feat: cleanup + screenshot buttons in sidebar inspector toolbar

Two action buttons in the inspector toolbar:
- Cleanup (🧹): POSTs cleanup --all to server, shows spinner, chat
  notification on success, resets inspector state (element may be removed)
- Screenshot (📸): POSTs screenshot to server, shows spinner, chat
  notification with saved file path

Shared infrastructure:
- .inspector-action-btn CSS with loading spinner via ::after pseudo-element
- chat-notification type in addChatEntry() for system messages
- package.json version bump to 0.13.9.0


* test: inspector allowlist, CSP fallback, cleanup/screenshot buttons

16 new tests in sidebar-ux.test.ts:
- Inspector message allowlist includes all inspector types
- content.js basic picker (startBasicPicker, captureBasicData, CSSOM,
  outline save/restore, inspectResult with mode basic, Escape cleanup)
- background.js CSP fallback (separate try blocks, inspectorMode, fallback)
- Cleanup button (POST /command, inspector reset after success)
- Screenshot button (POST /command, notification rendering)
- Chat notification type and CSS styles


* docs: update project documentation for v0.13.9.0


* feat: cleanup + screenshot buttons in chat toolbar (not just inspector)

Quick actions toolbar (🧹 Cleanup, 📸 Screenshot) now appears above the chat
input, always visible. Both inspector and chat buttons share runCleanup() and
runScreenshot() helper functions. Clicking either set shows loading state on
both simultaneously.


* test: chat toolbar buttons, shared helpers, quick-action-btn styles

Tests that chat toolbar exists (chat-cleanup-btn, chat-screenshot-btn,
quick-actions container), CSS styles (.quick-action-btn, .quick-action-btn.loading),
shared runCleanup/runScreenshot helper functions, and cleanup inspector reset.


* feat: aggressive cleanup heuristics — overlays, scroll unlock, blur removal

Massively expanded CLEANUP_SELECTORS with patterns from uBlock Origin and
Readability.js research:
- ads: 30+ selectors (Google, Amazon, Outbrain, Taboola, Criteo, etc.)
- cookies: OneTrust, Cookiebot, TrustArc, Quantcast + generic patterns
- overlays (NEW): paywalls, newsletter popups, interstitials, push prompts,
  app download banners, survey modals
- social: follow prompts, share tools
- Cleanup now defaults to --all when no args (sidebar button fix)
- Uses !important on all display:none (overrides inline styles)
- Unlocks body/html scroll (overflow:hidden from modal lockout)
- Removes blur/filter effects (paywall content blur)
- Removes max-height truncation (article teaser truncation)
- Collapses empty ad placeholder whitespace (empty divs after ad removal)
- Skips gstack-ctrl indicator in sticky removal


* fix: disable action buttons when disconnected, no error spam

- setActionButtonsEnabled() toggles .disabled class on all cleanup/screenshot
  buttons (both chat toolbar and inspector toolbar)
- Called with false in updateConnection when server URL is null
- Called with true when connection established
- runCleanup/runScreenshot silently return when disconnected instead of
  showing 'Not connected' error notifications
- CSS .disabled style: pointer-events:none, opacity:0.3, cursor:not-allowed


* test: cleanup heuristics, button disabled state, overlay selectors

17 new tests:
- cleanup defaults to --all on empty args
- CLEANUP_SELECTORS overlays category (paywall, newsletter, interstitial)
- Major ad networks in selectors (doubleclick, taboola, criteo, etc.)
- Major consent frameworks (OneTrust, Cookiebot, TrustArc, Quantcast)
- !important override for inline styles
- Scroll unlock (body overflow:hidden)
- Blur removal (paywall content blur)
- Article truncation removal (max-height)
- Empty placeholder collapse
- gstack-ctrl indicator skip in sticky cleanup
- setActionButtonsEnabled function
- Buttons disabled when disconnected
- No error spam from cleanup/screenshot when disconnected
- CSS disabled styles for action buttons


* feat: LLM-based page cleanup — agent analyzes page semantically

Instead of brittle CSS selectors, the cleanup button now sends a prompt to
the sidebar agent (which IS an LLM). The agent:
1. Runs deterministic $B cleanup --all as a quick first pass
2. Takes a snapshot to see what's left
3. Analyzes the page semantically to identify remaining clutter
4. Removes elements intelligently, preserving site branding

This means cleanup works correctly on any site without site-specific selectors.
The LLM understands that "Your Daily Puzzles" is clutter, "ADVERTISEMENT" is
junk, but the SF Chronicle masthead should stay.


* feat: aggressive cleanup heuristics + preserve top nav bar

Deterministic cleanup improvements (used as first pass before LLM analysis):
- New 'clutter' category: audio players, podcast widgets, sidebar puzzles/games,
  recirculation widgets (taboola, outbrain, nativo), cross-promotion banners
- Text-content detection: removes "ADVERTISEMENT", "Article continues below",
  "Sponsored", "Paid content" labels and their parent wrappers
- Sticky fix: preserves the topmost full-width element near viewport top (site
  nav bar) instead of hiding all sticky/fixed elements. Sorts by vertical
  position, preserves the first one that spans >80% viewport width.

Tests: clutter category, ad label removal, nav bar preservation logic.


* test: LLM-based cleanup architecture, deterministic heuristics, sticky nav

22 new tests covering:
- Cleanup button uses /sidebar-command (agent) not /command (deterministic)
- Cleanup prompt includes deterministic first pass + agent snapshot analysis
- Cleanup prompt lists specific clutter categories for agent guidance
- Cleanup prompt preserves site identity (masthead, headline, body, byline)
- Cleanup prompt instructs scroll unlock and $B eval removal
- Loading state management (async agent, setTimeout)
- Deterministic clutter: audio/podcast, games/puzzles, recirculation
- Ad label text patterns (ADVERTISEMENT, Sponsored, Article continues)
- Ad label parent wrapper hiding for small containers
- Sticky nav preservation (sort by position, first full-width near top)


* fix: prevent repeat chat message rendering on reconnect/replay

Root cause: server persists chat to disk (chat.jsonl) and replays on restart.
Client had no dedup, so every reconnect re-rendered the entire history.
Messages from an old HN session would repeat endlessly on the SF Chronicle tab.

Fix: renderedEntryIds Set tracks which entry IDs have been rendered. addChatEntry
skips entries already in the set. Entries without an id (local notifications)
bypass the check. Clear chat resets the set.


* fix: agent stops when done, no focus stealing, opus for prompt injection safety

Three fixes for sidebar agent UX:
- System prompt: "Be CONCISE. STOP as soon as the task is done. Do NOT keep
  exploring or doing bonus work." Prevents agent from endlessly taking
  screenshots and highlighting elements after answering the question.
- switchTab(id, opts): new bringToFront option. Internal tab pinning
  (BROWSE_TAB) uses bringToFront: false so agent commands never steal
  window focus from the user's active app.
- Keep opus model (not sonnet) for prompt injection resistance on untrusted
  web pages. Remove Write from allowedTools (agent only needs Bash for $B).


* test: agent conciseness, focus stealing, opus model, switchTab opts

Tests for the three UX fixes:
- System prompt contains STOP/CONCISE/Do NOT keep exploring
- sidebar agent uses opus (not sonnet) for prompt injection resistance
- switchTab has bringToFront option, defaults to true (opt-out)
- handleCommand tab pinning uses bringToFront: false (no focus steal)
- Updated stale tests: switchTab signature, allowedTools excludes Write,
  narration -> conciseness, tab pinning restore calls


* test: sidebar CSS interaction E2E — HN comment highlight round-trip

New E2E test (periodic tier, ~$2/run) that exercises the full sidebar
agent pipeline with CSS interaction:
1. Agent navigates to Hacker News
2. Clicks into the top story's comments
3. Reads comments and identifies the most insightful one
4. Highlights it with a 4px solid orange outline via style injection

Tests: navigation, snapshot, text reading, LLM judgment, CSS modification.
Requires real browser + real Claude (ANTHROPIC_API_KEY).


* fix: sidebar CSS E2E test — correct idle timeout (ms not s), pipe stdio

Root cause of test failure: BROWSE_IDLE_TIMEOUT is in milliseconds, not
seconds. '600' = 0.6 seconds, server died immediately after health check.
Fixed to '600000' (10 minutes).

Also: use 'pipe' stdio instead of file descriptors (closing fds kills child
on macOS/bun), catch ConnectionRefused on poll retry, 4 min poll timeout
for the multi-step opus task.

Test passes: agent navigates to HN, reads comments, identifies most
insightful one, highlights it with orange CSS, stops. 114s, $0.00.


---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- sidebar-agent.ts: keep both per-tab concurrency (main) and activeProc tracking (security fix)
- background.js: keep both getToken (security fix) and inspector message types (main)
Migration 003 re-opened the broad UPDATE policy that 002 explicitly
closed. Use PostgreSQL column-level GRANT to restrict anon UPDATE
to (last_seen, gstack_version, os) — the only columns the edge
function writes. Any UPDATE touching first_seen or installation_id
is now rejected at the query level.

Addresses feedback from garagon on garrytan#675.
@mr-k-man mr-k-man force-pushed the fix/security-audit-sqry-findings branch from 30579f7 to 46e18a3 Compare March 30, 2026 22:05
1. IPv6 ULA: prefix-match fc00::/7 range instead of exact fd00:: literal.
   Blocks fd00::1, fd12::*, fc00::*, etc. — not just fd00:: itself.

2. killAgent race: per-tab cancel files (sidebar-agent-cancel-{tabId})
   instead of single global file. Prevents cross-tab signal theft in
   concurrent multi-agent mode.

3. Telemetry upsert: check and surface errors from installations.upsert()
   instead of silently dropping failures. Returns upsertErrors in response
   when column-level GRANT or migration issues cause failures.
1. IPv6 ULA: require colon in address before prefix-matching — prevents
   false positives on hostnames like fd.example.com and fcustomer.com.

2. killAgent targeting: accept tabId parameter, read from request body
   in kill/stop endpoints. Per-tab cancel files now correctly target
   the intended tab's agent in concurrent mode.

3. getToken scoping: reject requests from content scripts (sender.tab
   set) — only extension pages (sidepanel/popup) can retrieve the token.

4. Telemetry upsert: surface upsertErrors in edge function response,
   log warnings in sync client when installation upserts fail.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants