fix(security): 6 browse bugs found via sqry semantic code graph#664
Open
mr-k-man wants to merge 20 commits intogarrytan:mainfrom
Open
fix(security): 6 browse bugs found via sqry semantic code graph#664mr-k-man wants to merge 20 commits intogarrytan:mainfrom
mr-k-man wants to merge 20 commits intogarrytan:mainfrom
Conversation
validateOutputPath used path.resolve() but never fs.realpathSync(), unlike validateReadPath. A symlink under /tmp or cwd could redirect screenshot/PDF writes outside safe directories. Now matches the symlink-aware pattern from validateReadPath including realpathSync on safe directories themselves (macOS /tmp → /private/tmp). Found via sqry AST-based semantic code graph analysis.
upload command had no safe-directory check — any file the process could read was exfiltrable via setInputFiles() to an attacker-controlled page. Now validates paths with realpathSync against safe directories. cookie-import used path.normalize() to check for ".." but never resolved symlinks. A relative symlink could bypass validation. Now uses realpathSync like validateReadPath. Found via sqry AST-based semantic code graph analysis.
resolvesToBlockedIp() only called dns.resolve4() — AAAA records pointing to blocked IPv6 metadata addresses (e.g., fd00::) bypassed protection. Now checks both resolve4 and resolve6 in parallel. Found via sqry AST-based semantic code graph analysis.
killAgent() signaled agentProcess which was never assigned after the queue-based refactor. Kill/stop endpoints reset local state but the real claude subprocess kept running. Server now writes a cancel file that sidebar-agent.ts polls for. On detection, the agent kills its active claude subprocess and cleans up. Found via sqry AST-based semantic code graph analysis.
The cookies command returned the full cookie jar verbatim while the storage command redacted secrets. Now applies the same pattern-based redaction (sensitive name patterns + known token prefixes) to cookie values for consistency. Found via sqry AST-based semantic code graph analysis.
Codex review caught that upload and cookie-import validated the resolved path but then used the original path for the actual operation. A symlink could be swapped between validation and use. Now upload passes resolvedPaths to setInputFiles, and cookie-import reads from realFilePath instead of filePath. Found via sqry AST-based semantic code graph analysis.
…NS IPv6 Tests for all new security fixes: - validateOutputPath symlink resolution (3 tests) - Cookie value redaction patterns (4 tests) - DNS rebinding IPv6 blocklist coverage (3 tests) Found via sqry AST-based semantic code graph analysis.
Codex review caught that cookie redaction and DNS blocklist tests duplicated constants locally, creating false-confidence tests that would stay green even if production code drifted. Now exports BLOCKED_METADATA_HOSTS from url-validation.ts and SENSITIVE_COOKIE_NAME/VALUE from read-commands.ts, and tests import them directly. Found via sqry AST-based semantic code graph analysis.
This was referenced Mar 30, 2026
Open
…raversal handleReload accepted arbitrary file paths via POST body.html with only an existsSync check. Any readable file could be served back through /. Now anchors all file reads to the initial HTML file's parent directory using realpathSync, blocking symlink and traversal attacks. Found via sqry AST-based semantic code graph analysis.
Auth token was included in chrome.runtime.sendMessage health broadcast, exposing it to all extension components. If sidepanel had an XSS, the token would be immediately compromised. Now the token is excluded from broadcasts. Sidepanel requests it via a targeted getToken message with sendResponse, which only delivers to the requesting component. Found via sqry AST-based semantic code graph analysis.
restoreState() called page.goto(saved.url) without validateNavigationUrl. State files loaded from disk could contain file://, chrome://, or cloud metadata URLs that would bypass the scheme/host blocklist. Now validates each saved URL through the same validateNavigationUrl used by newTab(), skipping navigation for blocked URLs. Found via sqry AST-based semantic code graph analysis.
…elemetry-ingest The edge function used SUPABASE_SERVICE_ROLE_KEY which bypasses RLS entirely. This was unnecessary — RLS policies already allow anon INSERT and SELECT on telemetry_events, plus INSERT/SELECT/UPDATE on installations (for upsert). Now requires the apikey header (returns 401 if missing) and uses it to create the Supabase client. RLS enforces access control. No server-side fallback — the header is mandatory. The client (gstack-telemetry-sync) already sends the anon key in the apikey header — no client changes needed. Found via sqry AST-based semantic code graph analysis.
- design/test/serve.test.ts: 3 integration tests for /api/reload path traversal — uses inline Bun.serve with production validation logic: blocks path outside allowedDir (403), blocks symlink to /etc (403), allows file within allowedDir (200 + content served) - browse/test/url-validation.test.ts: 5 tests for restoreState URL patterns — blocks file://, chrome://, metadata IPs; allows https, localhost All 62 tests pass across 3 files (26 path-validation + 22 url-validation + 14 serve). Found via sqry AST-based semantic code graph analysis.
This was referenced Mar 30, 2026
Migration 002 dropped anon_update_last_seen on installations, but the telemetry-ingest fix (which switched from service role key to anon key) needs UPDATE for the installations upsert. Without this policy, the upsert silently fails after 002 is applied. Found via sqry AST-based semantic code graph analysis.
* feat: CDP inspector module — persistent sessions, CSS cascade, style modification
New browse/src/cdp-inspector.ts with full CDP inspection engine:
- inspectElement() via CSS.getMatchedStylesForNode + DOM.getBoxModel
- modifyStyle() via CSS.setStyleTexts with headless page.evaluate fallback
- Persistent CDP session lifecycle (create, reuse, detach on nav, re-create)
- Specificity sorting, overridden property detection, UA rule filtering
- Modification history with undo support
- formatInspectorResult() for CLI output
* feat: browse server inspector endpoints + inspect/style/cleanup/prettyscreenshot CLI
Server endpoints: POST /inspector/pick, GET /inspector, POST /inspector/apply,
POST /inspector/reset, GET /inspector/history, GET /inspector/events (SSE).
CLI commands: inspect (CDP cascade), style (live CSS mod), cleanup (page clutter
removal), prettyscreenshot (clean screenshot pipeline).
* feat: sidebar CSS inspector — element picker, box model, rule cascade, quick edit
Extension changes for the visual CSS inspector:
- inspector.js: element picker with hover highlight, CSS selector generation,
basic mode fallback (getComputedStyle + CSSOM), page alteration handlers
- inspector.css: picker overlay styles (blue highlight + tooltip)
- background.js: inspector message routing (picker <-> server <-> sidepanel)
- sidepanel: Inspector tab with box model viz (gstack palette), matched rules
with specificity badges, computed styles, click-to-edit quick edit,
Send to Agent/Code button, empty/loading/error states
* docs: document inspect, style, cleanup, prettyscreenshot browse commands
* feat: auto-track user-created tabs and handle tab close
browser-manager.ts changes:
- context.on('page') listener: automatically tracks tabs opened by the user
(Cmd+T, right-click open in new tab, window.open). Previously only
programmatic newTab() was tracked, so user tabs were invisible.
- page.on('close') handler in wirePageEvents: removes closed tabs from the
pages map and switches activeTabId to the last remaining tab.
- syncActiveTabByUrl: match Chrome extension's active tab URL to the correct
Playwright page for accurate tab identity.
* feat: per-tab agent isolation via BROWSE_TAB environment variable
Prevents parallel sidebar agents from interfering with each other's tab context.
Three-layer fix:
- sidebar-agent.ts: passes BROWSE_TAB=<tabId> env var to each claude process,
per-tab processing set allows concurrent agents across tabs
- cli.ts: reads process.env.BROWSE_TAB and includes tabId in command request body
- server.ts: handleCommand() temporarily switches activeTabId when tabId is present,
restores after command completes (safe: Bun event loop is single-threaded)
Also: per-tab agent state (TabAgentState map), per-tab message queuing,
per-tab chat buffers, verbose streaming narration, stop button endpoint.
* feat: sidebar per-tab chat context, tab bar sync, stop button, UX polish
Extension changes:
- sidepanel.js: per-tab chat history (tabChatHistories map), switchChatTab()
swaps entire chat view, browserTabActivated handler for instant tab sync,
stop button wired to /sidebar-agent/stop, pollTabs renders tab bar
- sidepanel.html: updated banner text ("Browser co-pilot"), stop button markup,
input placeholder "Ask about this page..."
- sidepanel.css: tab bar styles, stop button styles, loading state fixes
- background.js: chrome.tabs.onActivated sends browserTabActivated to sidepanel
with tab URL for instant tab switch detection
* test: per-tab isolation, BROWSE_TAB pinning, tab tracking, sidebar UX
sidebar-agent.test.ts (new tests):
- BROWSE_TAB env var passed to claude process
- CLI reads BROWSE_TAB and sends tabId in body
- handleCommand accepts tabId, saves/restores activeTabId
- Tab pinning only activates when tabId provided
- Per-tab agent state, queue, concurrency
- processingTabs set for parallel agents
sidebar-ux.test.ts (new tests):
- context.on('page') tracks user-created tabs
- page.on('close') removes tabs from pages map
- Tab isolation uses BROWSE_TAB not system prompt hack
- Per-tab chat context in sidepanel
- Tab bar rendering, stop button, banner text
* fix: resolve merge conflicts — keep security defenses + per-tab isolation
Merged main's security improvements (XML escaping, prompt injection defense,
allowed commands whitelist, --model opus, Write tool, stderr capture) with
our branch's per-tab isolation (BROWSE_TAB env var, processingTabs set,
no --resume). Updated test expectations for expanded system prompt.
* chore: bump version and changelog (v0.13.9.0)
* fix: add inspector message types to background.js allowlist
Pre-existing bug found by Codex: ALLOWED_TYPES in background.js was missing
all inspector message types (startInspector, stopInspector, elementPicked,
pickerCancelled, applyStyle, toggleClass, injectCSS, resetAll, inspectResult).
Messages were silently rejected, making the inspector broken on ALL pages.
Also: separate executeScript and insertCSS into individual try blocks in
injectInspector(), store inspectorMode for routing, and add content.js
fallback when script injection fails (CSP, chrome:// pages).
* feat: basic element picker in content.js for CSP-restricted pages
When inspector.js can't be injected (CSP, chrome:// pages), content.js
provides a basic picker using getComputedStyle + CSSOM:
- startBasicPicker/stopBasicPicker message handlers
- captureBasicData() with ~30 key CSS properties, box model, matched rules
- Hover highlight with outline save/restore (never leaves artifacts)
- Click uses e.target directly (no re-querying by selector)
- Sends inspectResult with mode:'basic' for sidebar rendering
- Escape key cancels picker and restores outlines
* feat: cleanup + screenshot buttons in sidebar inspector toolbar
Two action buttons in the inspector toolbar:
- Cleanup (🧹): POSTs cleanup --all to server, shows spinner, chat
notification on success, resets inspector state (element may be removed)
- Screenshot (📸): POSTs screenshot to server, shows spinner, chat
notification with saved file path
Shared infrastructure:
- .inspector-action-btn CSS with loading spinner via ::after pseudo-element
- chat-notification type in addChatEntry() for system messages
- package.json version bump to 0.13.9.0
* test: inspector allowlist, CSP fallback, cleanup/screenshot buttons
16 new tests in sidebar-ux.test.ts:
- Inspector message allowlist includes all inspector types
- content.js basic picker (startBasicPicker, captureBasicData, CSSOM,
outline save/restore, inspectResult with mode basic, Escape cleanup)
- background.js CSP fallback (separate try blocks, inspectorMode, fallback)
- Cleanup button (POST /command, inspector reset after success)
- Screenshot button (POST /command, notification rendering)
- Chat notification type and CSS styles
* docs: update project documentation for v0.13.9.0
* feat: cleanup + screenshot buttons in chat toolbar (not just inspector)
Quick actions toolbar (🧹 Cleanup, 📸 Screenshot) now appears above the chat
input, always visible. Both inspector and chat buttons share runCleanup() and
runScreenshot() helper functions. Clicking either set shows loading state on
both simultaneously.
* test: chat toolbar buttons, shared helpers, quick-action-btn styles
Tests that chat toolbar exists (chat-cleanup-btn, chat-screenshot-btn,
quick-actions container), CSS styles (.quick-action-btn, .quick-action-btn.loading),
shared runCleanup/runScreenshot helper functions, and cleanup inspector reset.
* feat: aggressive cleanup heuristics — overlays, scroll unlock, blur removal
Massively expanded CLEANUP_SELECTORS with patterns from uBlock Origin and
Readability.js research:
- ads: 30+ selectors (Google, Amazon, Outbrain, Taboola, Criteo, etc.)
- cookies: OneTrust, Cookiebot, TrustArc, Quantcast + generic patterns
- overlays (NEW): paywalls, newsletter popups, interstitials, push prompts,
app download banners, survey modals
- social: follow prompts, share tools
- Cleanup now defaults to --all when no args (sidebar button fix)
- Uses !important on all display:none (overrides inline styles)
- Unlocks body/html scroll (overflow:hidden from modal lockout)
- Removes blur/filter effects (paywall content blur)
- Removes max-height truncation (article teaser truncation)
- Collapses empty ad placeholder whitespace (empty divs after ad removal)
- Skips gstack-ctrl indicator in sticky removal
* fix: disable action buttons when disconnected, no error spam
- setActionButtonsEnabled() toggles .disabled class on all cleanup/screenshot
buttons (both chat toolbar and inspector toolbar)
- Called with false in updateConnection when server URL is null
- Called with true when connection established
- runCleanup/runScreenshot silently return when disconnected instead of
showing 'Not connected' error notifications
- CSS .disabled style: pointer-events:none, opacity:0.3, cursor:not-allowed
* test: cleanup heuristics, button disabled state, overlay selectors
17 new tests:
- cleanup defaults to --all on empty args
- CLEANUP_SELECTORS overlays category (paywall, newsletter, interstitial)
- Major ad networks in selectors (doubleclick, taboola, criteo, etc.)
- Major consent frameworks (OneTrust, Cookiebot, TrustArc, Quantcast)
- !important override for inline styles
- Scroll unlock (body overflow:hidden)
- Blur removal (paywall content blur)
- Article truncation removal (max-height)
- Empty placeholder collapse
- gstack-ctrl indicator skip in sticky cleanup
- setActionButtonsEnabled function
- Buttons disabled when disconnected
- No error spam from cleanup/screenshot when disconnected
- CSS disabled styles for action buttons
* feat: LLM-based page cleanup — agent analyzes page semantically
Instead of brittle CSS selectors, the cleanup button now sends a prompt to
the sidebar agent (which IS an LLM). The agent:
1. Runs deterministic $B cleanup --all as a quick first pass
2. Takes a snapshot to see what's left
3. Analyzes the page semantically to identify remaining clutter
4. Removes elements intelligently, preserving site branding
This means cleanup works correctly on any site without site-specific selectors.
The LLM understands that "Your Daily Puzzles" is clutter, "ADVERTISEMENT" is
junk, but the SF Chronicle masthead should stay.
* feat: aggressive cleanup heuristics + preserve top nav bar
Deterministic cleanup improvements (used as first pass before LLM analysis):
- New 'clutter' category: audio players, podcast widgets, sidebar puzzles/games,
recirculation widgets (taboola, outbrain, nativo), cross-promotion banners
- Text-content detection: removes "ADVERTISEMENT", "Article continues below",
"Sponsored", "Paid content" labels and their parent wrappers
- Sticky fix: preserves the topmost full-width element near viewport top (site
nav bar) instead of hiding all sticky/fixed elements. Sorts by vertical
position, preserves the first one that spans >80% viewport width.
Tests: clutter category, ad label removal, nav bar preservation logic.
* test: LLM-based cleanup architecture, deterministic heuristics, sticky nav
22 new tests covering:
- Cleanup button uses /sidebar-command (agent) not /command (deterministic)
- Cleanup prompt includes deterministic first pass + agent snapshot analysis
- Cleanup prompt lists specific clutter categories for agent guidance
- Cleanup prompt preserves site identity (masthead, headline, body, byline)
- Cleanup prompt instructs scroll unlock and $B eval removal
- Loading state management (async agent, setTimeout)
- Deterministic clutter: audio/podcast, games/puzzles, recirculation
- Ad label text patterns (ADVERTISEMENT, Sponsored, Article continues)
- Ad label parent wrapper hiding for small containers
- Sticky nav preservation (sort by position, first full-width near top)
* fix: prevent repeat chat message rendering on reconnect/replay
Root cause: server persists chat to disk (chat.jsonl) and replays on restart.
Client had no dedup, so every reconnect re-rendered the entire history.
Messages from an old HN session would repeat endlessly on the SF Chronicle tab.
Fix: renderedEntryIds Set tracks which entry IDs have been rendered. addChatEntry
skips entries already in the set. Entries without an id (local notifications)
bypass the check. Clear chat resets the set.
* fix: agent stops when done, no focus stealing, opus for prompt injection safety
Three fixes for sidebar agent UX:
- System prompt: "Be CONCISE. STOP as soon as the task is done. Do NOT keep
exploring or doing bonus work." Prevents agent from endlessly taking
screenshots and highlighting elements after answering the question.
- switchTab(id, opts): new bringToFront option. Internal tab pinning
(BROWSE_TAB) uses bringToFront: false so agent commands never steal
window focus from the user's active app.
- Keep opus model (not sonnet) for prompt injection resistance on untrusted
web pages. Remove Write from allowedTools (agent only needs Bash for $B).
* test: agent conciseness, focus stealing, opus model, switchTab opts
Tests for the three UX fixes:
- System prompt contains STOP/CONCISE/Do NOT keep exploring
- sidebar agent uses opus (not sonnet) for prompt injection resistance
- switchTab has bringToFront option, defaults to true (opt-out)
- handleCommand tab pinning uses bringToFront: false (no focus steal)
- Updated stale tests: switchTab signature, allowedTools excludes Write,
narration -> conciseness, tab pinning restore calls
* test: sidebar CSS interaction E2E — HN comment highlight round-trip
New E2E test (periodic tier, ~$2/run) that exercises the full sidebar
agent pipeline with CSS interaction:
1. Agent navigates to Hacker News
2. Clicks into the top story's comments
3. Reads comments and identifies the most insightful one
4. Highlights it with a 4px solid orange outline via style injection
Tests: navigation, snapshot, text reading, LLM judgment, CSS modification.
Requires real browser + real Claude (ANTHROPIC_API_KEY).
* fix: sidebar CSS E2E test — correct idle timeout (ms not s), pipe stdio
Root cause of test failure: BROWSE_IDLE_TIMEOUT is in milliseconds, not
seconds. '600' = 0.6 seconds, server died immediately after health check.
Fixed to '600000' (10 minutes).
Also: use 'pipe' stdio instead of file descriptors (closing fds kills child
on macOS/bun), catch ConnectionRefused on poll retry, 4 min poll timeout
for the multi-step opus task.
Test passes: agent navigates to HN, reads comments, identifies most
insightful one, highlights it with orange CSS, stops. 114s, $0.00.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- sidebar-agent.ts: keep both per-tab concurrency (main) and activeProc tracking (security fix) - background.js: keep both getToken (security fix) and inspector message types (main)
Migration 003 re-opened the broad UPDATE policy that 002 explicitly closed. Use PostgreSQL column-level GRANT to restrict anon UPDATE to (last_seen, gstack_version, os) — the only columns the edge function writes. Any UPDATE touching first_seen or installation_id is now rejected at the query level. Addresses feedback from garagon on garrytan#675.
30579f7 to
46e18a3
Compare
1. IPv6 ULA: prefix-match fc00::/7 range instead of exact fd00:: literal.
Blocks fd00::1, fd12::*, fc00::*, etc. — not just fd00:: itself.
2. killAgent race: per-tab cancel files (sidebar-agent-cancel-{tabId})
instead of single global file. Prevents cross-tab signal theft in
concurrent multi-agent mode.
3. Telemetry upsert: check and surface errors from installations.upsert()
instead of silently dropping failures. Returns upsertErrors in response
when column-level GRANT or migration issues cause failures.
1. IPv6 ULA: require colon in address before prefix-matching — prevents false positives on hostnames like fd.example.com and fcustomer.com. 2. killAgent targeting: accept tabId parameter, read from request body in kill/stop endpoints. Per-tab cancel files now correctly target the intended tab's agent in concurrent mode. 3. getToken scoping: reject requests from content scripts (sender.tab set) — only extension pages (sidepanel/popup) can retrieve the token. 4. Telemetry upsert: surface upsertErrors in edge function response, log warnings in sync client when installation upserts fail.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
10 security/correctness bugs in the gstack codebase, discovered using sqry AST-based semantic code graph analysis. Cross-validated by Claude Opus 4.6, OpenAI Codex (GPT-5.4), and Gemini Flash — all findings verified with 0 false positives.
Round 1 — Browse subsystem (6 bugs)
realpathSyncto matchvalidateReadPathpatternrealpathSyncbeforesetInputFilesnormalize+includes("..")withrealpathSync-based validationresolvesToBlockedIpnow checks bothresolve4andresolve6storagecommandRound 2 — Cross-subsystem (4 bugs)
/api/reloadaccepted arbitrary file paths; now anchored to initial HTML's parent directory withrealpathSyncchrome.runtime.sendMessagehealth broadcast; sidepanel now requests via targetedgetTokenmessagepage.goto(saved.url)now runs throughvalidateNavigationUrlfirstSUPABASE_SERVICE_ROLE_KEYwith caller's anon key; RLS enforces access controlChanged files
browse/src/meta-commands.tsvalidateOutputPathresolves symlinks viarealpathSyncbrowse/src/write-commands.tsupload+cookie-importpath validation with TOCTOU fixbrowse/src/url-validation.tsresolvesToBlockedIpchecks IPv4 + IPv6, export blocklistbrowse/src/server.tskillAgentwrites cancel file for sidebar-agentbrowse/src/sidebar-agent.tsbrowse/src/read-commands.tsbrowse/src/browser-manager.tsrestoreStatevalidates URLs before navigationdesign/src/serve.ts/api/reloadpath traversal fix withrealpathSyncextension/background.jsgetTokenhandler addedextension/sidepanel.jsgetTokeninstead of broadcastsupabase/functions/telemetry-ingest/index.tsbrowse/test/path-validation.test.tsbrowse/test/url-validation.test.tsdesign/test/serve.test.tsTest plan
bun test browse/test/path-validation.test.ts— 26/26 passbun test browse/test/url-validation.test.ts— 22/22 passbun test design/test/serve.test.ts— 14/14 passbun run test:evals— needs ANTHROPIC_API_KEY (contributor environment)Discovery methodology
All bugs found via sqry
complexity_metrics→ manual inspection of highest-complexity functions, plusfind_unused,get_insights, and deepexplain_code/pattern_searchacross 174 indexed files. Three AI auditors (Claude, Codex, Gemini) independently analyzed the codebase and cross-validated findings.Issues
Round 1: #665, #666, #667, #668, #669, #670
Round 2: #672, #673, #674, #675