feat: universal proxy (new approach) #837
Conversation
Semgrep Security Scan: No security issues found.
PR Metrics
Updated Wed, 13 May 2026 20:58:13 GMT · run #1491
…from ending up in server logs
- inject DnsLookup into url-validation, proxy, and preview routes
- inject SearchExaClient into search route via createApp deps
- inject email senders into createAuth to capture OTPs per test
- avoids mock.module leaking across test files
- store test PGlite instance on globalThis so it survives module reloads
- close on process beforeExit instead of afterAll, so reruns reuse one instance rather than reinitializing (and re-running migrations) each time
Preview environment deployed 🚀
Stack: Auto-destroys on PR close/merge. Login via the bundled Keycloak realm —
- Eliminates a sign-in race where the bearer is returned before the session row is visible, which surfaced as a confusing 401 on the first authenticated request in tests.
- move cleanup into bun:test afterAll so it runs inside the test runner lifecycle, before Bun terminates worker threads
- drop the globalThis-shared PGlite instance; one instance per process is sufficient now that cleanup is reliable
@@ -0,0 +1,119 @@
/* This Source Code Form is subject to the terms of the Mozilla Public
Why is this .spec instead of .test? All other tests are .test (or should be!). If the other e2e tests are not, then let's leave it for consistency.
import { fetch as tauriFetch } from '@tauri-apps/plugin-http'
import {
  PASSTHROUGH_PREFIX,
These should all be camelCase.
ital0 left a comment:
Took a careful pass through this. The overall shape is solid.
I left a few inline comments where I think the new approach trades away things we agreed on in [thunderbird/thunderbolt-spec#2](https://github.com/thunderbird/thunderbolt-spec/pull/2): the <img> proxy contract (we lose require-corp COEP), cache headers, the proxy_enabled user toggle, observability error classes, and a small perf note on content-encoding. Most of these already landed in #816 in their final shape, so the diffs over there might save us a round of trial and error.
None of this is meant to block your direction.
configureServer: (server) => {
  server.middlewares.use((req, res, next) => {
    res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp')
    res.setHeader('Cross-Origin-Embedder-Policy', 'credentialless')
COEP got dropped from require-corp to credentialless here, which I'm guessing was the workaround for favicons going direct to upstream.
The downside is that it weakens cross origin isolation for the whole app.
If favicons route through /v1/proxy instead, you can keep require-corp on and still show every image.
I took that route in #816 using a small useProxyUrl() hook on the <img> callers.
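For concreteness, here is a minimal sketch of the rewrite such a hook could perform (the `?url=` query shape and the `toProxyUrl` name are illustrative assumptions, not the actual #816 API):

```typescript
// Hypothetical helper: rewrite a direct upstream image URL so it loads via
// the proxy, letting COEP stay at require-corp while every image still renders.
const toProxyUrl = (upstreamUrl: string, cloudUrl: string): string =>
  `${cloudUrl.replace(/\/$/, '')}/v1/proxy?url=${encodeURIComponent(upstreamUrl)}`
```

A React useProxyUrl() hook would then just wrap this with the configured cloud URL from context.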
// Proxy-set headers (NOT prefixed): describe the proxy's own response framing
// and security posture. Forced — override anything the upstream might have sent.
out.set('Content-Security-Policy', 'sandbox')
On caching: upstream's Cache-Control flows through unchanged, so images either get cached too eagerly at CDN layers or not at all by the browser.
Setting Cache-Control: private, max-age=600 and CDN-Cache-Control: no-store here gives the browser a 10 minute cache while keeping CDNs out.
I added both in #816 at [routes.ts:376-377](https://github.com/thunderbird/thunderbolt/blob/6a1e1c9f/backend/src/proxy/routes.ts#L376-L377).
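As a sketch of the suggested header pair (assuming the Headers-based response framing shown in the diff above):

```typescript
// Force a proxy-controlled cache policy regardless of what upstream sent:
// the browser may cache privately for 10 minutes, shared CDN layers never do.
const applyProxyCacheHeaders = (out: Headers): Headers => {
  out.set('Cache-Control', 'private, max-age=600')
  out.set('CDN-Cache-Control', 'no-store')
  return out
}
```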
 * upstream describe the wrong thing. Set-Cookie family is dropped to preserve
 * cookie isolation: the response's *origin* is Thunderbolt, not the upstream. */
export const DROPPED_RESPONSE_HEADERS = new Set([
  'content-length',
Wondering about content-encoding being on the dropped list.
It works, but Bun decompresses every gzipped body by default, so the proxy ends up holding the uncompressed bytes in memory before sending them back.
If you pass decompress: false to the upstream fetch and forward content-encoding through, the browser does its own decode and you skip the rebuffer.
That's what #816 does at routes.ts:271-276. Perf note, not a correctness one.
export const createProxyFetch = (options: ProxyFetchOptions): typeof fetch => {
  const proxyUrl = `${options.cloudUrl.replace(/\/$/, '')}/proxy`
  return (async (input, init) => {
    const standalone = (options.isStandalone ?? isTauri)()
Quick thought: this picks between direct fetch and the proxy, but there's no Settings toggle anywhere.
Tauri users on hosted mode end up routing every Anthropic / OpenAI call through the proxy with no way to opt out.
THU 471 and spec PR #2 introduced a proxy_enabled flag plus a Network section in Preferences for exactly this.
I kept it in #816 at preferences.tsx:104.
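Something like this could sit in front of the existing selection (a sketch; `proxyEnabled` mirrors the proposed proxy_enabled preference, and all names are assumptions):

```typescript
// Gate the direct-vs-proxy decision on both runtime and user preference:
// standalone (Tauri) users who opt out keep direct upstream fetches, while
// hosted/browser users always go through the proxy for CORS.
const chooseFetch = <F>(opts: {
  isStandalone: boolean
  proxyEnabled: boolean
  directFetch: F
  proxiedFetch: F
}): F =>
  opts.isStandalone && !opts.proxyEnabled ? opts.directFetch : opts.proxiedFetch
```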
}
return {
  proxyRequest(fields) {
Curious about the recorder shape.
There's an error field, but nothing categorizing the failure (SSRF block, DNS timeout, idle, cap, upstream 5xx, auth reject), so PostHog dashboards can't filter by type. A small ProxyErrorType enum tagged on every failure fixes that. Pairing it with [OpenTelemetry span attributes](https://opentelemetry.io/docs/concepts/signals/traces/#attributes) (when trace.getActiveSpan() returns one) makes proxy events show up inside the trace already running on the request.
Check how I did it: #816
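A sketch of the shape (the span is narrowed to the one method used here so the snippet stays dependency-free; in the real code the getter would be trace.getActiveSpan from @opentelemetry/api):

```typescript
// Categorize every proxy failure so dashboards can filter by type, and tag
// the active request span (if any) so the event lands inside the trace.
type ProxyErrorType =
  | 'ssrf_block' | 'dns_timeout' | 'idle_timeout'
  | 'cap_exceeded' | 'upstream_5xx' | 'auth_reject'

type SpanLike = { setAttributes(attrs: Record<string, string>): void }

const recordProxyError = (
  errorType: ProxyErrorType,
  host: string,
  getActiveSpan: () => SpanLike | undefined,
): void => {
  getActiveSpan()?.setAttributes({
    'proxy.error_type': errorType,
    'proxy.upstream_host': host, // hostname only, matching the recorder's policy
  })
}
```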
Export getOrCreateProxyFetch + __resetProxyFetchCacheForTests and add a unit test that asserts same-cloudUrl returns a stable reference and different cloudUrls produce different references (incl. lazy eviction of the prior entry).
Drop redundant defensive fallback chain in ProxyFetchProvider; useSettings already applies the schema default. Clarify in JSDoc that the React context and the module-scoped cache in src/ai/fetch.ts are independent.
…hrough)
- observability: strip PostHog (Pino + OTel only); add ProxyErrorType
union and `trace.getActiveSpan().setAttributes(...)` for proxy.* attrs.
Wire real status / duration_ms / bytes_in / bytes_out from the response
and capStream (fixes bot MEDIUM findings: hardcoded status, zero
duration/bytes). Tag error_type on every failure path.
- content-encoding passthrough: pass `decompress: false` to Bun fetch,
remove `content-encoding` from DROPPED_RESPONSE_HEADERS. Browser now
decodes; the proxy holds compressed bytes only. Comment near 10MB cap
documents the zip-bomb risk shift to the caller.
- streaming: capStream returns { stream, bytesRead() } and fires a
single `onComplete(bytes)` so observability emits exactly once after
stream drain (graceful, cap-hit, idle, error, or cancel).
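The described contract could look roughly like this (a sketch over Web Streams; the real signature may differ):

```typescript
// Wrap a response stream: count bytes, fail past the cap, and emit a single
// onComplete(bytes) when the stream finishes. (A sketch: the real version
// also hooks the cancel/error/idle paths so onComplete still fires once.)
const capStream = (
  source: ReadableStream<Uint8Array>,
  cap: number,
  onComplete: (bytes: number) => void,
): { stream: ReadableStream<Uint8Array>; bytesRead: () => number } => {
  let bytes = 0
  let completed = false
  const finish = () => {
    if (!completed) {
      completed = true
      onComplete(bytes)
    }
  }
  const stream = source.pipeThrough(
    new TransformStream<Uint8Array, Uint8Array>({
      transform(chunk, controller) {
        bytes += chunk.byteLength
        if (bytes > cap) {
          finish() // cap-hit still emits exactly one completion
          controller.error(new Error('byte cap exceeded'))
          return
        }
        controller.enqueue(chunk)
      },
      flush() {
        finish() // graceful end
      },
    }),
  )
  return { stream, bytesRead: () => bytes }
}
```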
For streaming POSTs (no follow-redirects), bytesIn was captured at the moment fetchFn() returned response headers — before the request body had fully drained through capStream. Replace the captured number with a late-read getter so the observability emission (which fires from the response stream's onComplete) sees the final upload byte count.
- routes.ts: replace unbounded arrayBuffer() in buffered-body path with bounded streaming accumulator that early-terminates at maxBodyBytes. Closes a DoS vector where a chunked upload (no Content-Length) would materialise the full body in memory before the 413 check fired.
- routes.ts: document Bun's decompress:false semantics (Bun >=1.3 keeps content-encoding on the Response even after auto-decode, so the cast is load-bearing for correctness — not just a perf flag).
- routes.test.ts: add streaming-POST bytes_in regression test exercising the validator's deferred-getter fix (commit 7d5ddada) directly.
- routes.test.ts: add chunked-upload DoS test proving early-termination.
  return { error: 'Search service is not configured' }
}
const limit = query.limit ? Math.min(Math.max(query.limit, 1), 25) : 10
Truthy check on numeric limit treats zero as absent
Low Severity
The query.limit ? ... : 10 check uses a truthy test, but t.Numeric() coerces the string to a number — so ?limit=0 becomes 0 (falsy) and silently falls through to the default of 10 instead of being clamped to 1 by the Math.max logic. The correct check is query.limit != null to distinguish "not provided" from "provided as zero."
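The suggested fix in sketch form (function name is illustrative):

```typescript
// Use a null check so limit=0 counts as "provided" (and clamps to 1) while a
// missing limit still falls back to the default of 10.
const clampLimit = (limit: number | null | undefined): number =>
  limit != null ? Math.min(Math.max(limit, 1), 25) : 10
```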
Reviewed by Cursor Bugbot for commit 3507897.
const rateLimitSettings = { enabled: settings.rateLimitEnabled }
const ipRateLimitSettings = { ...rateLimitSettings, trustedProxy: settings.trustedProxy }
const proRateLimit = createProRateLimit(database, rateLimitSettings)
Single shared rate limiter changes effective per-group limits
Medium Severity
createProRateLimit is now called once and the same instance is shared across createProToolsRoutes, createUniversalProxyRoutes, createUniversalProxyWsRoutes, createSearchRoutes, and createPreviewRoutes. Previously each route group received its own createProRateLimit(...) call. If the rate limiter is stateful (per-user token-bucket or sliding window), all five groups now compete for the same quota, so a burst of proxy requests can starve weather/search/preview and vice versa.
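If per-group quotas are the intent, keeping the instances separate is a one-liner per group (a sketch; factory and group names mirror the description above, and the settings shape is assumed):

```typescript
// One limiter per route group: a burst against the proxy can no longer
// drain the quota shared with search/preview/tools.
const makeGroupRateLimits = <S, L>(create: (settings: S) => L, settings: S) => ({
  proTools: create(settings),
  proxy: create(settings),
  search: create(settings),
  preview: create(settings),
})
```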
Reviewed by Cursor Bugbot for commit 4c6aeb6.
- Remove backend/scripts/dev.sh (Chris flagged out-of-scope)
- Revert backend/src/db/client.ts dev rewrite
- Revert playwright.config.ts tweak
- Revert powersync localhost defaults in settings.ts (closes HIGH bot finding)

These items moved to italomenezes/cjroth-proxy-dev-env (stacked PR).
# Conflicts:
#   backend/.env.example
#   backend/eslint.config.js
#   backend/src/auth/auth.ts
#   backend/src/matchers.d.ts
#   backend/src/pro/link-preview.ts
#   backend/src/proxy/routes.ts
#   backend/src/utils/streaming.ts
#   backend/src/utils/url-validation.ts
#   powersync-service/docker-compose.yml
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 3 total unresolved issues (including 2 from previous reviews).
Reviewed by Cursor Bugbot for commit 94342ac.
  .string()
  .default(
    'Content-Type,Authorization,Accept,Accept-Encoding,Accept-Language,Cache-Control,User-Agent,X-Requested-With,X-Client-Platform,X-Device-ID,X-Device-Name,X-Challenge-Token,X-Mcp-Target-Url,Mcp-Authorization,Mcp-Session-Id,Mcp-Protocol-Version',
    'set-auth-token,X-Proxy-Final-Url,X-Proxy-Passthrough-Content-Type,X-Proxy-Passthrough-Mcp-Session-Id,X-Proxy-Passthrough-Mcp-Protocol-Version,X-Proxy-Passthrough-Location,X-Proxy-Passthrough-Anthropic-Version',
Static expose-headers list breaks dynamic passthrough headers
Medium Severity
The corsExposeHeaders default is a hardcoded list of specific X-Proxy-Passthrough-* header names (e.g. Content-Type, Mcp-Session-Id, Location, Anthropic-Version). But the proxy dynamically prefixes every upstream response header with X-Proxy-Passthrough-, meaning any header not in this static list (e.g. X-Proxy-Passthrough-X-Request-Id, X-Proxy-Passthrough-OpenAI-Organization, etc.) will be invisible to browser-side Response.headers in cross-origin mode. The frontend's unwrapHostedResponse in proxy-fetch.ts relies on reading all passthrough headers to reconstruct the upstream response. This silently drops upstream headers that aren't in the allowlist, breaking provider SDK responses that use non-enumerated headers.
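One way out is computing the expose-headers value per response instead of from static config (a sketch; the lowercase prefix literal is assumed to match the proxy's PASSTHROUGH_PREFIX):

```typescript
// Build Access-Control-Expose-Headers from what the proxy actually set on
// this response, so dynamically prefixed passthrough headers stay readable
// cross-origin instead of being limited to a static allowlist.
const buildExposeHeaders = (out: Headers, staticList: string[]): string => {
  const dynamic: string[] = []
  out.forEach((_value, name) => {
    if (name.startsWith('x-proxy-passthrough-')) dynamic.push(name)
  })
  return [...staticList, ...dynamic].join(',')
}
```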
Reviewed by Cursor Bugbot for commit 94342ac.
Note
High Risk
High risk because it rewires proxying/CORS behavior and removes legacy proxy/link-preview/MCP proxy routes, affecting security-sensitive SSRF protection, header handling, and client compatibility.
Overview
Introduces a new universal proxy at POST/GET /v1/proxy (plus a WS relay) with SSRF-safe URL validation/pinning, passthrough header rewriting, redirect handling, streaming body caps, and structured observability that records only the target hostname and sets OTel span attributes.
Adds authenticated Pro endpoints GET /v1/search (Exa-backed, HTTPS-normalised results) and POST /v1/preview (OG metadata extraction with size/time caps, private per-user caching, and HTTPS-upgraded images), with new e2e coverage.
Removes legacy pro/proxy, pro/link-preview, and mcp-proxy route implementations/tests, updates server-wide CORS to allowedHeaders: true, and revises settings/env docs to rely on corsExposeHeaders instead of a static allowlist; also tweaks the auth factory for injectable email senders and sets Better Auth baseURL.
Reviewed by Cursor Bugbot for commit a9c2d2b. Bugbot is set up for automated code reviews on this repo.