fix(proxy): detect fake 200 responses (200 + HTML body) and trigger failover (#749) #763
tesgth032 wants to merge 5 commits into ding113:dev
Conversation
Summary of Changes
Hello @tesgth032, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a key issue in the proxy service: when an upstream service returns an HTTP 200 status code but the actual content is an HTML error page, the system incorrectly treats it as a successful response. By introducing detection of HTML bodies in 200 responses, these fake successes can now trigger the existing failover logic.

Highlights
Changelog
Activity
📝 Walkthrough
Adds detection and capped reading of non-streaming 200 response bodies in the forwarder (including HTML sniffing). When a fake 200 is identified (FAKE_200_HTML_BODY or an upstream JSON error), it throws a ProxyError, logs the event, and triggers provider switching. Also adds an upstream error detection utility with unit tests and adjusts several test stubs.

Changes
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
No actionable comments were generated in the recent review.
src/app/v1/_lib/proxy/forwarder.ts (outdated)

```ts
if (isHtml || !contentLength) {
  const clonedResponse = response.clone();
  clonedResponseText = await clonedResponse.text();
}
```
Potential double clone when `!contentLength` and `isHtml` are both true — the body is read twice unnecessarily.
Suggested change:

```ts
let clonedResponseText: string | undefined;
if (isHtml || !contentLength) {
  const clonedResponse = response.clone();
  clonedResponseText = await clonedResponse.text();
}
```
Code Review
This PR addresses issues with failover and session binding when upstream services return "fake 200" HTML error pages. While the implementation is clear, efficient, and well-tested, it introduces a potential Denial of Service (DoS) vulnerability: the proxy buffers the entire response body in memory without a size limit when detecting text/html content or when Content-Length is missing. This could lead to memory exhaustion (OOM) or block the event loop if a malicious upstream sends large responses. It is recommended to implement a strict size limit (e.g., 1MB) for response body buffering during error detection. A minor improvement could also be made to the HTML detection regular expression for stricter matching to prevent potential false positives.
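A minimal sketch of what stricter matching could look like, reusing the sniff-window constant this PR later introduces; `looksLikeHtmlDocument` and `HTML_DOC_RE` are hypothetical names, not the PR's actual code:

```ts
const HTML_DOC_SNIFF_MAX_CHARS = 1024;
// Anchor the match near the start of the body so JSON payloads that merely
// contain "<html>" inside a string field are not misclassified.
const HTML_DOC_RE = /^\s*(?:<!doctype\s+html[\s>]|<html[\s>])/i;

function looksLikeHtmlDocument(body: string): boolean {
  // Only sniff the first chunk; a real HTML document opens with its
  // doctype or <html> tag near the beginning.
  return HTML_DOC_RE.test(body.slice(0, HTML_DOC_SNIFF_MAX_CHARS));
}
```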
src/app/v1/_lib/proxy/forwarder.ts (outdated)

```ts
if (isHtml || !contentLength) {
  const clonedResponse = response.clone();
  clonedResponseText = await clonedResponse.text();
}
```
The code reads the entire upstream response body into memory using response.text() without enforcing a maximum size limit. This occurs when the response is identified as HTML or when the Content-Length header is missing. A malicious or compromised upstream provider could send an excessively large response, leading to process-wide memory exhaustion (OOM). Furthermore, the resulting large string is passed to detectUpstreamErrorFromSseOrJsonText, which synchronously parses it as JSON if it appears to be a JSON object. Parsing a very large JSON string blocks the Node.js event loop, causing a denial of service for all concurrent requests handled by the proxy.
To remediate this, implement a size limit when reading the response body for error detection. Use a helper function to read the stream up to a fixed limit (e.g., 1MB) and abort if the limit is exceeded.
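A minimal sketch of such a capped read; the helper name and the 1 MiB cap here are assumptions (the PR itself later adds a fuller `readResponseTextUpTo` helper, shown further down):

```ts
const MAX_INSPECT_BYTES = 1024 * 1024; // assumed 1 MiB inspection cap

async function readBodyCapped(response: Response): Promise<string> {
  const reader = response.body?.getReader();
  if (!reader) return "";
  const decoder = new TextDecoder();
  let text = "";
  let bytes = 0;
  while (bytes < MAX_INSPECT_BYTES) {
    const { done, value } = await reader.read();
    if (done || !value) break;
    // May overshoot the cap by at most one chunk; acceptable for sniffing.
    bytes += value.byteLength;
    text += decoder.decode(value, { stream: true });
  }
  // Stop pulling further data from the upstream once the cap is reached.
  await reader.cancel().catch(() => {});
  return text + decoder.decode();
}
```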
```ts
const HTML_DOC_SNIFF_MAX_CHARS = 1024;
const HTML_DOCTYPE_RE = /^<!doctype\s+html[\s>]/i;
const HTML_HTML_TAG_RE = /<html[\s>]/i;
```
Code Review Summary
No significant issues identified in this PR. The implementation is conservative and well-scoped: it detects fake 200 responses with HTML bodies (e.g., Cloudflare/WAF error pages) in the non-streaming path and correctly triggers the existing retry/failover/circuit-breaker machinery by throwing a ProxyError(502). The detection logic uses strong signals only (doctype/html tags), the response clone strategy is sound, and the refactoring to share clonedResponseText between the HTML check and the existing empty-body check avoids redundant clones.
PR Size: M
- Lines changed: 319 (316 additions, 3 deletions)
- Files changed: 5
Review Coverage
- Logic and correctness - Clean
- Security (OWASP Top 10) - Clean
- Error handling - Clean
- Type safety - Clean
- Documentation accuracy - Clean
- Test coverage - Adequate
- Code clarity - Good
Automated review by Claude AI
src/app/v1/_lib/proxy/forwarder.ts (outdated)

```ts
if (isHtml || !contentLength) {
  const clonedResponse = response.clone();
  const inspected = await readResponseTextUpTo(
    clonedResponse,
    NON_STREAM_BODY_INSPECTION_MAX_BYTES
  );
  inspectedText = inspected.text;
  inspectedTruncated = inspected.truncated;
}
```
When isHtml is false but !contentLength is true, the body is read into inspectedText but never used for HTML detection (line 745 checks isHtml && inspectedText !== undefined). This means non-HTML responses without Content-Length will have their body read twice unnecessarily - once here and again at line 759 where inspectedText is reused only for the empty check.
Consider optimizing to only read when needed:
Suggested change:

```ts
let inspectedText: string | undefined;
let inspectedTruncated = false;
if (!contentLength) {
  const clonedResponse = response.clone();
  const inspected = await readResponseTextUpTo(
    clonedResponse,
    NON_STREAM_BODY_INSPECTION_MAX_BYTES
  );
  inspectedText = inspected.text;
  inspectedTruncated = inspected.truncated;
  // Perform HTML detection on the already-read body if applicable
  if (isHtml) {
    const detected = detectUpstreamErrorFromSseOrJsonText(inspectedText);
    if (detected.isError && detected.code === "FAKE_200_HTML_BODY") {
      throw new ProxyError(detected.code, 502, {
        body: detected.detail ?? "",
        providerId: currentProvider.id,
        providerName: currentProvider.name,
      });
    }
  }
} else if (isHtml) {
  // Has Content-Length but is HTML - need to read for detection
  const clonedResponse = response.clone();
  const inspected = await readResponseTextUpTo(
    clonedResponse,
    NON_STREAM_BODY_INSPECTION_MAX_BYTES
  );
  const detected = detectUpstreamErrorFromSseOrJsonText(inspected.text);
  if (detected.isError && detected.code === "FAKE_200_HTML_BODY") {
    throw new ProxyError(detected.code, 502, {
      body: detected.detail ?? "",
      providerId: currentProvider.id,
      providerName: currentProvider.name,
    });
  }
}
```
src/app/v1/_lib/proxy/forwarder.ts

```ts
async function readResponseTextUpTo(
  response: Response,
  maxBytes: number
): Promise<{ text: string; truncated: boolean }> {
  const reader = response.body?.getReader();
  if (!reader) {
    return { text: "", truncated: false };
  }

  const decoder = new TextDecoder();
  const chunks: string[] = [];
  let bytesRead = 0;
  let truncated = false;

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (!value || value.byteLength === 0) continue;

    const remaining = maxBytes - bytesRead;
    if (remaining <= 0) {
      truncated = true;
      break;
    }

    if (value.byteLength > remaining) {
      chunks.push(decoder.decode(value.subarray(0, remaining), { stream: true }));
      bytesRead += remaining;
      truncated = true;
      break;
    }

    chunks.push(decoder.decode(value, { stream: true }));
    bytesRead += value.byteLength;
  }

  const flushed = decoder.decode();
  if (flushed) chunks.push(flushed);

  if (truncated) {
    try {
      await reader.cancel();
    } catch {
      // ignore
    }
  }

  return { text: chunks.join(""), truncated };
}
```
Added readResponseTextUpTo function to safely read response bodies with a 1MB limit, preventing memory exhaustion from malicious or misconfigured upstream providers.
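A hypothetical call site, assuming the inspection cap constant named in this PR:

```ts
// Clone so the original response body can still be returned to the client.
const { text, truncated } = await readResponseTextUpTo(
  response.clone(),
  NON_STREAM_BODY_INSPECTION_MAX_BYTES // 1 MiB per the PR description
);
if (truncated) {
  // Body exceeded the inspection cap; detection runs on the prefix read so far.
}
```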
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/app/v1/_lib/proxy/forwarder.ts`:
- Around lines 779-831: the `EmptyResponseError` thrown inside the JSON parsing block (see `EmptyResponseError`, `currentProvider`, `logger` in src/app/v1/_lib/proxy/forwarder.ts) is currently swallowed by the surrounding catch. Change the catch to detect and rethrow `EmptyResponseError` (e.g. `if (_parseError instanceof EmptyResponseError) throw _parseError`) and only treat other errors as JSON parse failures (log the debug message about a non-JSON response and skip the content check). This preserves retries/failover for `missing_content` while still handling genuine parse errors (sketched below).
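A minimal sketch of the suggested catch shape; the surrounding parse block and the `ensureHasContent` helper are assumptions based on the comment, not the actual code:

```ts
try {
  const parsed: unknown = JSON.parse(inspectedText ?? "");
  ensureHasContent(parsed); // hypothetical check that may throw EmptyResponseError
} catch (parseError) {
  if (parseError instanceof EmptyResponseError) {
    // Rethrow so retries/failover still fire for missing_content.
    throw parseError;
  }
  // Anything else is a genuine JSON parse failure: log and skip the check.
  logger.debug("Upstream body is not JSON; skipping content check");
}
```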
🧹 Nitpick comments (1)
src/app/v1/_lib/proxy/forwarder.ts (1)
Line 719: redundant `!isSSE` condition. The `if (isSSE)` branch at line 694 already returns `response`, so by the time execution reaches line 719, `isSSE` is necessarily false and `if (!isSSE)` is always true; the outer `if` can be removed to reduce nesting. A sketch follows.
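A sketch of the simplification (surrounding code assumed):

```ts
if (isSSE) {
  return response; // streaming path exits here (line 694)
}
// Everything below only runs for non-streaming responses,
// so no `if (!isSSE)` wrapper is needed.
```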
dcfac4a to c9bafaf (Compare)
tests/unit/proxy/proxy-forwarder-fake-200-html.test.ts

```ts
import { beforeEach, describe, expect, test, vi } from "vitest";

const mocks = vi.hoisted(() => {
```
Inconsistent line endings
This new test file appears to be committed with CRLF line endings (visible as \r in the diff). In this repo the other TS files shown are LF-only, so this will create noisy diffs and can break tooling that enforces end-of-line normalization. Please re-save/normalize this file to LF and recommit.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Additional Comments (1)

tests/unit/lib/provider-endpoints/probe.test.ts (lines 71-73)

**Missing endpoint-circuit mocks**
Several tests now add `getEndpointCircuitStateSync`/`resetEndpointCircuit` to the `@/lib/endpoint-circuit-breaker` mock, but at least one test case (e.g. around here in the 5xx/4xx section) still mocks only `recordEndpointFailure`. Since `src/lib/provider-endpoints/probe.ts` imports `getEndpointCircuitStateSync` and `resetEndpointCircuit` at module scope, those tests will throw at runtime when `probe.ts` is imported. Make the endpoint-circuit-breaker mock include all required exports in every test that imports `probeEndpointUrl`/`probeProviderEndpointAndRecord`.

Also appears in: tests/unit/lib/provider-endpoints/probe.test.ts:112-114
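A minimal sketch of a complete mock (vitest; the stubbed return values are assumptions):

```ts
import { vi } from "vitest";

vi.mock("@/lib/endpoint-circuit-breaker", () => ({
  recordEndpointFailure: vi.fn(),
  // probe.ts imports these at module scope, so every test file that imports
  // probeEndpointUrl/probeProviderEndpointAndRecord must stub them too:
  getEndpointCircuitStateSync: vi.fn(() => "closed"),
  resetEndpointCircuit: vi.fn(),
}));
```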
Summary
Detects fake HTTP 200 responses that return HTML pages (e.g., Cloudflare/WAF/maintenance pages) instead of valid API responses, and triggers failover/circuit-breaker logic so the proxy correctly retries with another provider.
Problem
Some upstream providers or relay stations return HTTP 200 with `text/html` content (Cloudflare challenge pages, WAF blocks, maintenance pages) instead of a proper API response. These fake successes cause the proxy to record a success and keep the session bound to a broken provider instead of failing over.

Related Issues & PRs:
Solution
Two-pronged detection approach, both conservative to avoid false positives:
1. Non-streaming path (`ProxyForwarder`): when the upstream response has `content-type: text/html` or `application/xhtml+xml`, clone and inspect the body. If it looks like a full HTML document (doctype / `<html>` tag), throw `ProxyError(502)` so the existing retry/failover/circuit-breaker logic kicks in, and `recordSuccess()` / session binding are skipped (see the sketch below).
2. Shared detection utility (`upstream-error-detection.ts`): added a `FAKE_200_HTML_BODY` pattern to `detectUpstreamErrorFromSseOrJsonText()`, reusing the same fake-200 detection pipeline for consistency.
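A minimal sketch of the non-streaming detection flow described above; the helpers are the ones this PR adds, but the surrounding variables and exact wiring are assumptions:

```ts
const contentType = response.headers.get("content-type") ?? "";
const isHtml =
  contentType.includes("text/html") ||
  contentType.includes("application/xhtml+xml");

if (isHtml) {
  // Clone so the original body can still be streamed back to the client.
  const { text } = await readResponseTextUpTo(response.clone(), 1024 * 1024);
  const detected = detectUpstreamErrorFromSseOrJsonText(text);
  if (detected.isError && detected.code === "FAKE_200_HTML_BODY") {
    // Throwing a 502 routes this response through the existing
    // retry/failover/circuit-breaker machinery instead of recordSuccess().
    throw new ProxyError(detected.code, 502, {
      body: detected.detail ?? "",
      providerId: currentProvider.id,
      providerName: currentProvider.name,
    });
  }
}
```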
Changes

Core Changes
- `src/app/v1/_lib/proxy/forwarder.ts` (+29/-3): added an HTML content-type check in the non-streaming success branch; clones the response body for sniffing and throws `ProxyError(502)` on HTML document detection
- `src/lib/utils/upstream-error-detection.ts` (+26/-0): added the `FAKE_200_HTML_BODY` error pattern with HTML document regex detection

Tests
- `tests/unit/proxy/proxy-forwarder-fake-200-html.test.ts` (+224/-0): comprehensive test suite covering HTML fake-200 identification and failover triggering
- `src/lib/utils/upstream-error-detection.test.ts` (+25/-0): unit tests for the new HTML body detection pattern
- `tests/unit/lib/provider-endpoints/probe.test.ts` (+12/-0): additional probe test coverage

Breaking Changes
None. This is a purely additive detection enhancement in the proxy pipeline. No API changes, no schema changes, no removed exports.
Testing
Automated Tests
Manual Testing
Checklist
Description enhanced by Claude AI - original description preserved below
Original Description (translated from Chinese)

Background
Issue #749: some upstreams/relay stations return HTTP 200, but the actual response is a text/html HTML page (commonly Cloudflare/WAF/maintenance pages).
If such responses are treated as successes:

Changes
- In the non-streaming success branch of `ProxyForwarder`: when the `content-type` points to `text/html` / `application/xhtml+xml`, clone and read the body and perform strong-signal sniffing; on detection throw `ProxyError(502)` so the existing retry/switchover/circuit-breaker logic takes effect as usual, and avoid updating `recordSuccess()` / session binding.
- Added HTML document detection (`FAKE_200_HTML_BODY`) to `detectUpstreamErrorFromSseOrJsonText()` to reuse the same "fake 200" detection logic.

Notes

Tests
Fixes #749
Greptile Overview
Greptile Summary
This PR improves proxy failover correctness by detecting “fake 200” upstream responses that are actually HTML documents (e.g. WAF/Cloudflare/maintenance pages) and converting them into a failure signal so existing retry/circuit-breaker/session-binding logic behaves as if the upstream failed.
Key changes:
- `src/app/v1/_lib/proxy/forwarder.ts` adds non-streaming body sniffing (capped at 1MiB) for HTML content-types and throws `ProxyError(502)` when the body looks like a real HTML document.
- `src/lib/utils/upstream-error-detection.ts` adds a shared `FAKE_200_HTML_BODY` detection path (doctype / `<html>` tag sniffing) and returns a sanitized/truncated `detail` for logging/diagnostics.

Main issue to address before merge: at least one `tests/unit/lib/provider-endpoints/probe.test.ts` case still mocks an incomplete `@/lib/endpoint-circuit-breaker` module even though `probe.ts` now imports additional symbols at module scope; this will cause those tests to throw when importing `probe.ts`. Also, the new forwarder test file appears to be committed with CRLF line endings, which can violate repo formatting expectations and create noisy diffs.

Confidence Score: 4/5
Important Files Changed
Sequence Diagram
```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant ProxyForwarder
    participant UpstreamProvider as Provider
    participant UpstreamDetect as detectUpstreamErrorFromSseOrJsonText
    participant CircuitBreaker
    participant ProviderSelector
    Client->>ProxyForwarder: send(session)
    ProxyForwarder->>UpstreamProvider: doForward(request)
    UpstreamProvider-->>ProxyForwarder: Response(200, content-type)
    alt Streaming (text/event-stream)
        ProxyForwarder->>ProxyForwarder: setDeferredStreamingFinalization(...)
        ProxyForwarder-->>Client: stream response
    else Non-streaming
        ProxyForwarder->>ProxyForwarder: if content-length == 0 throw EmptyResponseError
        alt HTML content-type (text/html or application/xhtml+xml)
            ProxyForwarder->>ProxyForwarder: clone response + readResponseTextUpTo(1MiB)
            ProxyForwarder->>UpstreamDetect: detect(text)
            alt detected FAKE_200_HTML_BODY
                ProxyForwarder->>CircuitBreaker: recordFailure(providerId, ProxyError)
                ProxyForwarder->>ProviderSelector: pickRandomProviderWithExclusion(exclude providerId)
                ProxyForwarder->>UpstreamProvider: doForward(request) on next provider
                UpstreamProvider-->>ProxyForwarder: Response(success)
                ProxyForwarder->>CircuitBreaker: recordSuccess(newProviderId)
                ProxyForwarder-->>Client: response
            else not HTML-doc
                ProxyForwarder->>CircuitBreaker: recordSuccess(providerId)
                ProxyForwarder-->>Client: response
            end
        else Missing content-length
            ProxyForwarder->>ProxyForwarder: clone response + readResponseTextUpTo(1MiB)
            ProxyForwarder->>ProxyForwarder: JSON parse + missing_content checks
            ProxyForwarder->>CircuitBreaker: recordSuccess(providerId)
            ProxyForwarder-->>Client: response
        end
    end
```