fix: re-validate redirects against SSRF protection in API Request component#13394
Conversation
…ponent The API Request component validated only the initial URL and pinned DNS for the initial host, then let httpx auto-follow redirects when follow_redirects was enabled. With SSRF protection enabled, a public URL that redirected to an internal address (loopback, RFC1918, link-local / cloud metadata) was followed without re-validation, bypassing the SSRF block. When SSRF protection is enabled, redirects are now followed manually and every hop's Location is re-validated with the same private/loopback/link-local denylist and DNS pinning applied to the initial request. Redirects are capped (MAX_REDIRECTS=20), non-http(s) schemes are blocked, relative locations are resolved, and sensitive headers (Authorization / Cookie) are dropped on cross-host redirects. make_request now defaults to follow_redirects=False (fail-safe). Behavior is unchanged when SSRF protection is disabled. Adds regression tests for public->localhost, public->RFC1918, public->link-local metadata, scheme changes, chained redirects, redirect to a hostname resolving internal, per-hop DNS pinning, the redirect cap, and cross-host header stripping. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
WalkthroughAPIRequestComponent now supports manual redirect following with per-hop SSRF revalidation when SSRF protection is enabled. Response handling is refactored into ChangesSSRF Protection with Manual Redirect Following
🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 8 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (8 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ Test Coverage AdvisorNo source changes detected without accompanying tests. Thanks for keeping coverage up! 🎉
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lfx/src/lfx/components/data_source/api_request.py`:
- Around line 643-656: The _headers_for_redirect function currently compares
only hostname and must be updated to compare the full origin (scheme + hostname
+ port) like httpx does; compute the origin for current_url and next_url using
parsed.scheme, parsed.hostname, and parsed.port (treat missing ports as the
default for the scheme, e.g., 80 for http, 443 for https), then only return
headers unchanged when the two origins are identical—otherwise filter out
sensitive keys ("authorization","proxy-authorization","cookie") as before.
- Around line 629-641: Update _method_for_redirect to treat HTTP_FOUND (302)
like HTTP_SEE_OTHER (303) by downgrading any non-"HEAD" method to "GET": change
the logic in _method_for_redirect to first uppercase method, then if status_code
== HTTP_SEE_OTHER and method != "HEAD" OR status_code == HTTP_FOUND and method
!= "HEAD" return "GET"; keep the special-case for HTTP_MOVED_PERMANENTLY (301)
to only change POST→GET (i.e., if status_code == HTTP_MOVED_PERMANENTLY and
method == "POST" return "GET"), otherwise return the original method.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f0210c9b-ead3-427d-88d8-e2202163ed56
📒 Files selected for processing (3)
src/backend/tests/unit/components/data_source/test_api_request_component.pysrc/backend/tests/unit/components/data_source/test_dns_rebinding.pysrc/lfx/src/lfx/components/data_source/api_request.py
| @staticmethod | ||
| def _method_for_redirect(method: str, status_code: int) -> str: | ||
| """Return the HTTP method to use after a redirect, mirroring httpx semantics. | ||
|
|
||
| A 303 (See Other) always becomes GET; 301/302 downgrade POST to GET for | ||
| browser compatibility; 307/308 preserve the original method (and body). | ||
| """ | ||
| method = method.upper() | ||
| if status_code == HTTP_SEE_OTHER and method != "HEAD": | ||
| return "GET" | ||
| if status_code in (HTTP_MOVED_PERMANENTLY, HTTP_FOUND) and method == "POST": | ||
| return "GET" | ||
| return method |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In httpx 0.24, for a 302 Found redirect, does the client change a PUT/PATCH/DELETE request method to GET, or only POST?
💡 Result:
In httpx 0.24, when follow_redirects=True and the response is 302 Found, httpx changes the method to GET for non-GET methods (including PUT/PATCH/DELETE), not only POST. This matches the “post/redirect/get” legacy behavior that many clients apply for 301/302; httpx added an (as of 2026) option to preserve the original method specifically for 301/302 redirects because the default behavior could change it and that broke some server expectations.[1][2] RFC 7231 notes that for historical reasons user agents MAY change POST to GET when following 302; it also explicitly states that only 307 guarantees method preservation.[3]
Citations:
- 1: HTTPX not flexible enough with preservation of HTTP method for 301 Moved Permanently redirect encode/httpx#2582
- 2: Add option to keep the same method for 301/302 redirects encode/httpx#3783
- 3: https://datatracker.ietf.org/doc/html/rfc7231
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show surrounding implementation + callers of _method_for_redirect
python3 - <<'PY'
import re, pathlib
path = pathlib.Path("src/lfx/src/lfx/components/data_source/api_request.py")
text = path.read_text(encoding="utf-8")
for m in re.finditer(r"def _method_for_redirect", text):
start = max(0, m.start()-800)
end = min(len(text), m.start()+8000)
print("=== around definition ===")
print(text[start:end])
PY
# Find any mention of follow_redirects or httpx redirect-method preservation options
rg -n "follow_redirects|redirect|_method_for_redirect|preserve|301|302|303|307|308" src/lfx/src/lfx/components/data_source/api_request.py
# Locate where httpx is used in this module (to see whether method-preservation options are set)
rg -n "httpx\." src/lfx/src/lfx/components/data_source/api_request.pyRepository: langflow-ai/langflow
Length of output: 14550
🌐 Web query:
httpx 0.24 redirect method change 302 PUT PATCH DELETE behavior follow_redirects
💡 Result:
In httpx, the 0.24-era change relevant to redirects is: redirects are NOT followed automatically; you must opt in with follow_redirects=True (client-level or per-request). This applies regardless of request method (GET/PUT/PATCH/DELETE etc.)—if you don’t enable following, you’ll just get the 3xx response back. 1) What changed around httpx 0.24 - HTTPX no longer automatically follows redirect responses; redirect following is opt-in via follow_redirects=True on the request or Client, instead of being the default behavior [1][2]. 2) follow_redirects parameter and defaults (including for PUT/PATCH/DELETE) - The public API for httpx.request/httpx.Client includes follow_redirects=False by default [3][4]. - The per-method helpers also expose follow_redirects (default False), e.g. httpx.put(..., follow_redirects=False), httpx.patch(..., follow_redirects=False), httpx.delete(..., follow_redirects=False) [3]. So, for a PUT/PATCH/DELETE request that receives a 302: - If follow_redirects=False (default), HTTPX returns the 302 response and does not send the redirected request. - If follow_redirects=True, HTTPX will send the redirect request(s). 3) 302 redirect method handling (what method httpx uses when following) - When follow_redirects is enabled and httpx constructs the redirect request, it may change the method based on response status code: for a 302 (FOUND) it “turn[s] 302s into GETs” (unless the original method is HEAD) [5]. Therefore, when following a 302 from PUT/PATCH/DELETE: - httpx will change the method to GET for the redirected request [5]. 4) How to detect / manually follow redirects instead of follow_redirects - HTTPX exposes response.next_request when follow_redirects=False, which you can use to build the next request yourself [4]. Practical takeaway - If you’re relying on “preserve PUT/PATCH/DELETE across 302”, httpx (when following redirects) will not do that by default; a 302 causes the redirected request method to become GET (legacy browser-like behavior) [5]. - If you want to prevent this, keep follow_redirects=False (default), and handle the Location yourself or adjust your API design to return 307/308 (which are defined to preserve method semantics) rather than 302. If you tell me your exact httpx callsite (e.g., httpx.Client(..., follow_redirects=...), or httpx.put(..., follow_redirects=...) plus the redirect status code), I can give a concrete expected request sequence for your specific case.
Citations:
- 1: https://github.com/encode/httpx/blob/master/CHANGELOG.md
- 2: https://github.com/encode/httpx/blob/master/docs/compatibility.md
- 3: https://www.python-httpx.org/api/
- 4: https://www.python-httpx.org/compatibility/
- 5: https://github.com/encode/httpx/blob/4f8068a7ade1f4d10403e325ed2354109df1724f/httpx/_client.py
Fix _method_for_redirect to match httpx redirect-method semantics for 302 Found
The current mapping only converts POST→GET for both 301 and 302, but httpx changes the method to GET for any non-HEAD method when following a 302 Found. This means PUT/PATCH/DELETE redirects will keep their original method (and body) instead of being downgraded to GET during the manual redirect-following path.
Locations: src/lfx/src/lfx/components/data_source/api_request.py (lines 629-641)
♻️ Align with httpx redirect-method semantics
method = method.upper()
- if status_code == HTTP_SEE_OTHER and method != "HEAD":
- return "GET"
- if status_code in (HTTP_MOVED_PERMANENTLY, HTTP_FOUND) and method == "POST":
- return "GET"
+ if status_code in (HTTP_SEE_OTHER, HTTP_FOUND) and method != "HEAD":
+ return "GET"
+ if status_code == HTTP_MOVED_PERMANENTLY and method == "POST":
+ return "GET"
return method🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lfx/src/lfx/components/data_source/api_request.py` around lines 629 -
641, Update _method_for_redirect to treat HTTP_FOUND (302) like HTTP_SEE_OTHER
(303) by downgrading any non-"HEAD" method to "GET": change the logic in
_method_for_redirect to first uppercase method, then if status_code ==
HTTP_SEE_OTHER and method != "HEAD" OR status_code == HTTP_FOUND and method !=
"HEAD" return "GET"; keep the special-case for HTTP_MOVED_PERMANENTLY (301) to
only change POST→GET (i.e., if status_code == HTTP_MOVED_PERMANENTLY and method
== "POST" return "GET"), otherwise return the original method.
| @staticmethod | ||
| def _headers_for_redirect(headers: dict | None, current_url: str, next_url: str) -> dict | None: | ||
| """Drop sensitive headers when a redirect crosses to a different host. | ||
|
|
||
| Mirrors httpx's auto-follow behavior so manually following redirects does not | ||
| leak credentials (Authorization / Cookie) to a different host than the one the | ||
| caller intended them for. Same-host redirects keep all headers. | ||
| """ | ||
| if not headers: | ||
| return headers | ||
| if urlparse(current_url).hostname == urlparse(next_url).hostname: | ||
| return headers | ||
| sensitive = {"authorization", "proxy-authorization", "cookie"} | ||
| return {k: v for k, v in headers.items() if k.lower() not in sensitive} |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
How does httpx 0.24 decide when to strip the Authorization header on redirects, and does it consider scheme and port (same-origin) or only hostname?
💡 Result:
httpx decides based on whether the redirect changes the request’s “origin”; it preserves the Authorization header only for same-origin redirects and strips it for cross-origin redirects. It does consider scheme and port as part of origin (not just hostname). In httpx, when following redirects, any Authorization header is automatically stripped on cross-domain redirects to prevent leaking credentials [1]. The httpx discussion indicates the stripping happens when the redirect goes to a different host/scheme, i.e., origin changes [1]. The definition of “origin” used in this context includes scheme, hostname, and port (scheme://hostname:port) [2]. Therefore, a same hostname but different scheme or port is treated as cross-origin, and Authorization will be stripped. Answering your specific question: it is not “only hostname”; it evaluates same-origin-ness including scheme and port [2], and strips Authorization when the redirect is cross-origin [1].
Citations:
- 1: follow_redirects dropping Bearer token header encode/httpx#3291
- 2: https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/Headers/Origin
Fix credentials stripping to match httpx “origin” (scheme + host + port), not hostname-only.
_headers_for_redirect only checks urlparse(...).hostname, so Authorization/Cookie are incorrectly preserved for same-host redirects that change scheme (https→http) or port (e.g., :443→:8443). httpx strips Authorization on redirects when the redirect is cross-origin; origin includes scheme and port, not just hostname.
🛡️ Proposed fix: compare full origin
`@staticmethod`
def _headers_for_redirect(headers: dict | None, current_url: str, next_url: str) -> dict | None:
"""Drop sensitive headers when a redirect crosses to a different host.
Mirrors httpx's auto-follow behavior so manually following redirects does not
leak credentials (Authorization / Cookie) to a different host than the one the
caller intended them for. Same-host redirects keep all headers.
"""
if not headers:
return headers
- if urlparse(current_url).hostname == urlparse(next_url).hostname:
- return headers
+
+ def _origin(u: str) -> tuple[str | None, str | None, int | None]:
+ parsed = urlparse(u)
+ default_port = {"https": 443, "http": 80}.get(parsed.scheme)
+ return (parsed.scheme, parsed.hostname, parsed.port or default_port)
+
+ if _origin(current_url) == _origin(next_url):
+ return headers
sensitive = {"authorization", "proxy-authorization", "cookie"}
return {k: v for k, v in headers.items() if k.lower() not in sensitive}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lfx/src/lfx/components/data_source/api_request.py` around lines 643 -
656, The _headers_for_redirect function currently compares only hostname and
must be updated to compare the full origin (scheme + hostname + port) like httpx
does; compute the origin for current_url and next_url using parsed.scheme,
parsed.hostname, and parsed.port (treat missing ports as the default for the
scheme, e.g., 80 for http, 443 for https), then only return headers unchanged
when the two origins are identical—otherwise filter out sensitive keys
("authorization","proxy-authorization","cookie") as before.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## release-1.10.0 #13394 +/- ##
==================================================
- Coverage 57.68% 57.56% -0.12%
==================================================
Files 2264 2264
Lines 216811 216083 -728
Branches 31663 30530 -1133
==================================================
- Hits 125065 124394 -671
+ Misses 90330 90273 -57
Partials 1416 1416
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
Adam-Aghili
left a comment
There was a problem hiding this comment.
I think tis covers the use cases outlined in the JIRA/PVR
Summary
The API Request component could be used to bypass SSRF protection via HTTP redirects. When SSRF protection is enabled (
LANGFLOW_SSRF_PROTECTION_ENABLED=true), the component validated only the initial URL and pinned DNS for the initial host, then passed the advancedfollow_redirectsflag straight tohttpx, which auto-followed redirects to new hosts that were never re-validated. A public URL that redirects to an internal address (loopback, RFC1918 private ranges, or link-local / cloud metadata endpoints) was therefore followed, returning the internal service's response.Root cause
APIRequestComponent.make_api_request()calledvalidate_and_resolve_url()on the initial URL only, then lethttpx.AsyncClient.request(..., follow_redirects=True)auto-follow redirects. DNS pinning applied to the initial hostname only, so a redirect to a different host (e.g.127.0.0.1) connected without any SSRF check — the warning on the UI field did not enforce a security boundary.Fix
When SSRF protection is enabled, redirects are no longer auto-followed by httpx. Instead the component follows them manually and re-validates every hop:
follow_redirects=False.Location, the target is resolved (relative locations included viaurljoin) and re-validated withvalidate_and_resolve_url()— the same private/loopback/link-local denylist and DNS pinning used for the initial request. A blocked hop raises an error instead of being followed.file://) are blocked.MAX_REDIRECTS=20.Authorization,Cookie) are dropped on cross-host redirects, matching httpx's own behavior.make_request()now defaults tofollow_redirects=False(fail-safe).When SSRF protection is disabled, behavior is unchanged (httpx handles redirects as before).
Test plan
Added regression tests covering each vector:
127.0.0.1(loopback) blocked192.168.x,10.x,172.16.x) blocked169.254.169.254) blocked0.0.0.0blockedfile://) blockedAuthorizationstripped on cross-host redirect; non-sensitive headers preservedSummary by CodeRabbit