Conversation
…errides feat(provider): add Anthropic max_tokens and thinking budget override settings
…691) Gemini SSE streams return usageMetadata in every event, but only the final event contains complete token counts (candidatesTokenCount, thoughtsTokenCount). The existing first-wins strategy in applyUsageValue caused output tokens to be missed since early events only have promptTokenCount. This fix introduces last-wins strategy specifically for Gemini SSE usageMetadata while preserving first-wins for other formats (Claude, Codex) where usage is returned complete in a single event. Fixes: Gemini streaming responses showing 0 output tokens in billing
* fix: fix buildProxyUrl appending the version prefix twice * refactor(proxy): optimize regex compilation in buildProxyUrl - Move escapeRegExp helper to module scope to avoid recreation - Remove case-insensitive flag from version endpoint regex * refactor(proxy): optimize endpoint regex compilation in URL builder Move endpoint regex compilation outside the loop to avoid repeated regex creation on every request. Pre-compile endpoint patterns at module load time for better performance.
* Create Dockerfile Signed-off-by: h7ml <h7ml@qq.com> * Update Dockerfile Signed-off-by: h7ml <h7ml@qq.com> * Update Dockerfile Signed-off-by: h7ml <h7ml@qq.com> --------- Signed-off-by: h7ml <h7ml@qq.com>
…ting (#709) * refactor(proxy): remove format converters and enforce same-format routing BREAKING CHANGE: Cross-format conversion is no longer supported. Requests must be routed to providers with matching API formats. - Delete all converters (claude-to-openai, openai-to-claude, codex-*, gemini-cli-*) - Remove Codex CLI adapter, instruction injection, and request sanitizer - Simplify ProxyForwarder to pass-through without format transformation - Update provider-selector to enforce format compatibility - Remove ResponseTransformer conversion logic from response-handler - Clean up session-extractor to remove Codex-specific handling - Delete related test files for removed functionality Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> * chore(ui): remove joinClaudePool and codexInstructionsStrategy from provider forms - Remove legacy pool joining and instruction strategy UI controls - Clean up i18n messages for removed provider form fields (all 5 languages) - Update provider actions and form context/types - Remove unused routing section options - Update related test mocks Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> * docs: update README to reflect strict same-format routing - Remove outdated claims about format conversion and Codex CLI injection - Clarify that proxy enforces same-format routing with no cross-format conversion - Add format-compatibility unit tests for provider-selector Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> * chore: remove unused imports (lint fix) Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai> --------- Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
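📝 Walkthrough
Overview
This PR implements the Anthropic budget rectifier mechanism, removes the format converter system, adds provider-level parameter overrides, and extends read-only access support. It includes a multi-stage Docker build, database migrations, new retry logic, a UI refactor, and multi-language updates.
Changes
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~75 minutes
Reason:
Possibly related PRs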
🚥 Pre-merge checks | ❌ 3❌ Failed checks (3 warnings)
🧪 Test results
Overall result: ✅ All tests passed |
| if (currentMaxTokens !== null && budgetTokens >= currentMaxTokens) { | ||
| budgetTokens = currentMaxTokens - 1; |
Anthropic API requires budget_tokens < max_tokens (strict less than), but this clamping uses budgetTokens = currentMaxTokens - 1. If currentMaxTokens is exactly 1024, this results in budgetTokens = 1023 which fails the >= 1024 minimum requirement.
| if (currentMaxTokens !== null && budgetTokens >= currentMaxTokens) { | |
| budgetTokens = currentMaxTokens - 1; | |
| if (currentMaxTokens !== null && budgetTokens >= currentMaxTokens) { | |
| budgetTokens = Math.max(MIN_BUDGET_TOKENS, currentMaxTokens - 1); | |
| } |
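The two constraints named in this comment (a budget of at least 1024 and a budget strictly below `max_tokens`) cannot both hold when `max_tokens` is 1024 or lower; in that case `Math.max(MIN_BUDGET_TOKENS, currentMaxTokens - 1)` yields a budget equal to `max_tokens`. A minimal sketch of one way to handle that range, assuming a hypothetical helper `clampThinkingBudget` in which `null` means "skip the override" (the helper name and the skip behaviour are illustrative assumptions, not the project's code):

```typescript
// Hypothetical helper for illustration; not taken from provider-overrides.ts.
const MIN_BUDGET_TOKENS = 1024; // minimum cited in the review comment

/**
 * Returns a budget that satisfies MIN_BUDGET_TOKENS <= budget < maxTokens,
 * or null when no such value exists and the override should be skipped.
 */
function clampThinkingBudget(
  budgetTokens: number,
  maxTokens: number | null,
): number | null {
  // A requested budget below the minimum is never valid.
  if (budgetTokens < MIN_BUDGET_TOKENS) return null;
  // Without a known max_tokens there is nothing to clamp against.
  if (maxTokens === null) return budgetTokens;
  // With max_tokens <= 1024, "budget >= 1024" and "budget < max_tokens" conflict.
  if (maxTokens <= MIN_BUDGET_TOKENS) return null;
  // Otherwise keep the budget strictly below max_tokens.
  return Math.min(budgetTokens, maxTokens - 1);
}
```

Returning `null` rather than a clamped number leaves it to the caller to decide whether to drop the thinking override or raise `max_tokens` first.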
| COPY --from=builder /app/.next ./.next | ||
| COPY --from=builder /app/node_modules ./node_modules | ||
| COPY --from=builder /app/package.json ./package.json | ||
| COPY --from=builder /app/drizzle ./drizzle |
trailing whitespace after ./drizzle
| ALTER TABLE "notification_target_bindings" ALTER COLUMN "schedule_timezone" DROP DEFAULT;--> statement-breakpoint | ||
| ALTER TABLE "providers" ADD COLUMN "anthropic_max_tokens_preference" varchar(20);--> statement-breakpoint | ||
| ALTER TABLE "providers" ADD COLUMN "anthropic_thinking_budget_preference" varchar(20); No newline at end of file |
verify this migration includes DROP COLUMN statements for removed join_claude_pool and codex_instructions_strategy fields - current migration only adds new columns without cleaning up deprecated ones
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/app/[locale]/dashboard/logs/_components/error-details-dialog/components/LogicTraceTab.tsx (1)
51-58: ⚠️ Potential issue | 🟠 Major
PR target branch should be dev
The current PR target branch is main, which is inconsistent with the repository convention. Please change the target branch to dev before continuing the review process.
Based on learnings: All pull requests must target the dev branch (https://github.com/ding113/claude-code-hub).
src/app/[locale]/dashboard/availability/_components/endpoint/latency-curve.tsx (1)
16-21: ⚠️ Potential issue | 🟠 Major
User-visible text must use i18n
The chart label and the status text are still hardcoded, which violates the i18n convention. Use t(...) instead.
Suggested change
-const chartConfig = {
-  latency: {
-    label: "Latency",
-    color: "var(--chart-1)",
-  },
-} satisfies ChartConfig;
-
-export function LatencyCurve({ logs, className }: LatencyCurveProps) {
-  const t = useTranslations("dashboard.availability.latencyCurve");
+export function LatencyCurve({ logs, className }: LatencyCurveProps) {
+  const t = useTranslations("dashboard.availability.latencyCurve");
+  const chartConfig = {
+    latency: {
+      label: t("latency"),
+      color: "var(--chart-1)",
+    },
+  } satisfies ChartConfig;
@@
-  <div className={cn("text-xs", data?.ok ? "text-emerald-500" : "text-rose-500")}>
-    {data?.statusCode || (data?.ok ? "OK" : "FAIL")}
+  <div className={cn("text-xs", data?.ok ? "text-emerald-500" : "text-rose-500")}>
+    {data?.statusCode || (data?.ok ? t("ok") : t("fail"))}
   </div>
Also applies to: 121-126
src/app/[locale]/dashboard/availability/_components/provider/latency-chart.tsx (1)
16-28: ⚠️ Potential issue | 🟠 Major
Legend and tooltip text must be internationalized
P50/P95/P99 and the tooltip dataKey are still hardcoded or raw values when shown as display text; they need to go through i18n.
Suggested change
-const chartConfig = {
-  p50: {
-    label: "P50",
-    color: "var(--chart-2)",
-  },
-  p95: {
-    label: "P95",
-    color: "var(--chart-4)",
-  },
-  p99: {
-    label: "P99",
-    color: "var(--chart-1)",
-  },
-} satisfies ChartConfig;
-
-export function LatencyChart({ providers, className }: LatencyChartProps) {
-  const t = useTranslations("dashboard.availability.latencyChart");
+export function LatencyChart({ providers, className }: LatencyChartProps) {
+  const t = useTranslations("dashboard.availability.latencyChart");
+  const chartConfig = {
+    p50: {
+      label: t("p50"),
+      color: "var(--chart-2)",
+    },
+    p95: {
+      label: t("p95"),
+      color: "var(--chart-4)",
+    },
+    p99: {
+      label: t("p99"),
+      color: "var(--chart-1)",
+    },
+  } satisfies ChartConfig;
@@
-  {chartConfig[item.dataKey as keyof typeof chartConfig]?.label ||
-    item.dataKey}
+  {chartConfig[item.dataKey as keyof typeof chartConfig]?.label ??
+    t("unknownSeries")}
Also applies to: 153-162
🤖 Fix all issues with AI agents
In `@Dockerfile`:
- Around line 18-19: The Dockerfile sets the container port to 8080 but the app
and health checks expect port 3000; update the ENV PORT value and the EXPOSE
directive so they match the runtime configuration by changing ENV PORT to 3000
and EXPOSE to 3000 (adjust the ENV PORT and EXPOSE lines accordingly).
In `@src/app/v1/_lib/proxy/thinking-budget-rectifier.test.ts`:
- Around line 2-5: The test imports detectThinkingBudgetRectifierTrigger and
rectifyThinkingBudget using a relative path; update the import statement to use
the project path alias (start imports with "@/") so modules are resolved from
src via the alias—for example replace the relative import of
"./thinking-budget-rectifier" with
"@/app/v1/_lib/proxy/thinking-budget-rectifier" (keeping the same exported
symbols detectThinkingBudgetRectifierTrigger and rectifyThinkingBudget) to
follow the coding guideline.
In `@src/app/v1/_lib/proxy/thinking-budget-rectifier.ts`:
- Around line 73-78: Current check treats arrays as objects and will mutate
them; change the guard so message.thinking is only reused when it's a non-null
plain object (i.e., typeof message.thinking === "object" && message.thinking !==
null && !Array.isArray(message.thinking)), otherwise reassign message.thinking =
{}; then cast to thinkingObj and set thinkingObj.type = "enabled" as before
(referencing message.thinking and thinkingObj).
In `@src/lib/utils/special-settings.ts`:
- Around line 78-90: The deduplication key for the "thinking_budget_rectifier"
case omits thinkingType, which can merge distinct events; update the key
generation in the switch branch handling "thinking_budget_rectifier" (in
special-settings.ts) to include setting.before.thinkingType and
setting.after.thinkingType in the JSON.stringify array alongside the existing
fields (e.g., setting.before.maxTokens, setting.before.thinkingBudgetTokens,
setting.after.maxTokens, setting.after.thinkingBudgetTokens) so events that
differ only by thinkingType remain distinct.
🧹 Nitpick comments (5)
src/app/[locale]/dashboard/logs/_components/error-details-dialog/components/LogicTraceTab.tsx (1)
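51-58: Consider removing the unused statusCode parameter or using it in the component
The parameter is currently only renamed to _statusCode to silence the unused-variable warning, but the field is still exposed externally, which can mislead callers. If it is confirmed to be no longer needed, remove it from LogicTraceTabProps and from the call sites as well; if it is needed, actually consume it when rendering.
Optional change (this file only)
-export function LogicTraceTab({
-  statusCode: _statusCode,
+export function LogicTraceTab({
   providerChain,
   blockedBy,
   blockedReason,
   requestSequence,
   initialExpandedChainIndex,
 }: LogicTraceTabProps) {
src/app/[locale]/dashboard/availability/_components/endpoint/probe-terminal.tsx (1)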
43-65: The icon property in levelConfig is dead code.
According to the AI summary, the use of const Icon = config.icon; has been removed, but levelConfig still defines the icon property (lines 45, 52, 59). These properties are now unused and should be cleaned up to keep the code tidy.
In addition, the AI summary claims that XCircle was removed from the imports, but it is still present in the import statement on line 4 and is referenced on line 52.
♻️ Suggested removal of the unused icon property
 const levelConfig = {
   success: {
-    icon: CheckCircle2,
     label: "OK",
     color: "text-emerald-500",
     bgColor: "bg-emerald-500/5",
     borderColor: "border-l-emerald-500",
   },
   error: {
-    icon: XCircle,
     label: "FAIL",
     color: "text-rose-500",
     bgColor: "bg-rose-500/5",
     borderColor: "border-l-rose-500",
   },
   warn: {
-    icon: AlertCircle,
     label: "WARN",
     color: "text-amber-500",
     bgColor: "bg-amber-500/5",
     borderColor: "border-l-amber-500",
   },
 };
After removing the icon property, the unused imports also need to be cleaned up:
-import { AlertCircle, CheckCircle2, Download, XCircle } from "lucide-react";
+import { Download } from "lucide-react";
src/app/[locale]/settings/prices/_components/price-list.tsx (1)
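73-73: Consider removing the unused hook call
The variable _locale is obtained from useLocale() but is never used in the component. If the locale information is really not needed, remove this line to keep the code concise.
♻️ Suggested change
-  const _locale = useLocale();
If locale is needed in the future, it can be added back at any time.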
tests/unit/app/v1/url.test.ts (1)
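58-83: Tests cover the new regex matching logic
The three new test cases cover version prefix handling well:
- v1beta: standard beta version
- v1internal: internal version identifier
- v2: future version compatibility
Consider adding tests for the following edge cases:
📝 Optional: additional edge-case tests
test("handles request paths with a trailing slash", () => {
  const out = buildProxyUrl(
    "https://example.com/api/models",
    new URL("https://dummy.com/v1/models/")
  );
  expect(out).toBe("https://example.com/api/models/");
});

test("does not match invalid version prefixes such as vbeta", () => {
  const out = buildProxyUrl(
    "https://example.com/api/models",
    new URL("https://dummy.com/vbeta/models/some-model")
  );
  // Should fall through to the standard concatenation logic
  expect(out).toBe("https://example.com/api/models/vbeta/models/some-model");
});
tests/integration/readonly-access-endpoints.test.ts (1)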
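41-100: The test helper functions and cleanup logic are well implemented.
- createTestUser and createTestKey provide clear error messages
- afterAll uses soft deletion to avoid foreign key constraint problems
- beforeEach correctly resets the authentication state
One suggestion: consider extracting these helpers into a shared test-utils module, because tests/api/my-usage-readonly.test.ts contains an almost identical implementation.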
| ENV PORT=8080 | ||
| EXPOSE 8080 |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find all port-related configuration
rg -n "PORT|8080|23000" --type-list | head -5
fd -type f -name "README*" -o -name "docker-compose*" -o -name ".env*" -o -name "Dockerfile*" | head -20
Repository: ding113/claude-code-hub
Length of output: 627
🏁 Script executed:
#!/bin/bash
# Look for port configuration in config files
rg "8080|23000|APP_PORT|PORT" . --type md --type yml --type yaml --type dockerfile -A 2 -B 2 | head -50
Repository: ding113/claude-code-hub
Length of output: 94
🏁 Script executed:
#!/bin/bash
# Locate configuration-related files
fd -type f \( -name "README*" -o -name "docker-compose*" -o -name ".env*" -o -name "Dockerfile*" \)
Repository: ding113/claude-code-hub
Length of output: 237
🏁 Script executed:
#!/bin/bash
# Look for port configuration in the Dockerfile
cat Dockerfile | grep -n "PORT\|EXPOSE\|8080\|23000"
Repository: ding113/claude-code-hub
Length of output: 98
🏁 Script executed:
#!/bin/bash
# Look for ports in the README and other config files
rg "PORT|8080|23000" . -A 2 -B 2 2>/dev/null | head -100
Repository: ding113/claude-code-hub
Length of output: 7556
🏁 Script executed:
#!/bin/bash
# Find README files
fd -type f -name "README*" | head -10
Repository: ding113/claude-code-hub
Length of output: 237
🏁 Script executed:
#!/bin/bash
# Find all files that mention APP_PORT
rg "APP_PORT" . 2>/dev/null | head -20
Repository: ding113/claude-code-hub
Length of output: 1499
🏁 Script executed:
#!/bin/bash
# Find all files that mention 23000
rg "23000" . 2>/dev/null | head -20
Repository: ding113/claude-code-hub
Length of output: 2016
🏁 Script executed:
#!/bin/bash
# Read docker-compose.yaml to understand the port configuration
cat docker-compose.yaml | head -100
Repository: ding113/claude-code-hub
Length of output: 2149
🏁 Script executed:
#!/bin/bash
# Check how PORT is used in the application code
rg "process.env.PORT|process.env.APP_PORT" src --type ts --type tsx -B 2 -A 2
Repository: ding113/claude-code-hub
Length of output: 94
🏁 Script executed:
#!/bin/bash
# View the full Dockerfile
cat Dockerfile
Repository: ding113/claude-code-hub
Length of output: 774
🏁 Script executed:
#!/bin/bash
# View the env schema configuration under src
cat src/lib/config/env.schema.ts | grep -A 5 -B 5 "PORT"
Repository: ding113/claude-code-hub
Length of output: 721
Dockerfile port configuration error: should be 3000, not 8080
Lines 18-19 of the Dockerfile set PORT=8080, which is inconsistent with the actual deployment configuration. According to docker-compose.yaml, the container should listen on port 3000 internally (mapped externally to 23000 via APP_PORT). The health check also uses http://localhost:3000/api/actions/health. Suggested change:
ENV PORT=3000
EXPOSE 3000
| import { | ||
| detectThinkingBudgetRectifierTrigger, | ||
| rectifyThinkingBudget, | ||
| } from "./thinking-budget-rectifier"; |
Use the @/ path alias to import src files
This test lives under src/; use the @/ path alias for consistency and to reduce the risk of breakage when relative paths move.
Suggested change
 import {
   detectThinkingBudgetRectifierTrigger,
   rectifyThinkingBudget,
-} from "./thinking-budget-rectifier";
+} from "@/app/v1/_lib/proxy/thinking-budget-rectifier";
As per coding guidelines: Use path alias @/ to reference files in ./src/ directory.
| if (!message.thinking || typeof message.thinking !== "object") { | ||
| message.thinking = {}; | ||
| } | ||
|
|
||
| const thinkingObj = message.thinking as Record<string, unknown>; | ||
| thinkingObj.type = "enabled"; |
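When thinking is an array, it should be treated as invalid and rebuilt
Arrays are also of type object, so the current logic treats an array as an object and writes fields onto it; exclude arrays explicitly.
Suggested change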
- if (!message.thinking || typeof message.thinking !== "object") {
+ if (
+ !message.thinking ||
+ typeof message.thinking !== "object" ||
+ Array.isArray(message.thinking)
+ ) {
message.thinking = {};
}
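The guard discussed in this comment can be sketched in isolation as follows. This is a minimal illustration, assuming hypothetical helper names `isPlainRecord` and `ensureThinkingEnabled`; only the `message.thinking` / `thinkingObj` handling mirrors the snippet under review.

```typescript
// Hypothetical helpers for illustration; names are not from the repository.

/** True only for non-null, non-array objects. */
function isPlainRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

/** Reuses message.thinking when it is a plain object, otherwise rebuilds it. */
function ensureThinkingEnabled(
  message: Record<string, unknown>,
): Record<string, unknown> {
  if (!isPlainRecord(message.thinking)) {
    // Arrays, primitives, null and undefined are all replaced.
    message.thinking = {};
  }
  const thinkingObj = message.thinking as Record<string, unknown>;
  thinkingObj.type = "enabled";
  return thinkingObj;
}
```

With this guard, an array value such as `thinking: ["x"]` is replaced by `{ type: "enabled" }` instead of having a `type` field written onto the array, while an existing object keeps its other fields.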
| case "thinking_budget_rectifier": | ||
| return JSON.stringify([ | ||
| setting.type, | ||
| setting.hit, | ||
| setting.providerId ?? null, | ||
| setting.trigger, | ||
| setting.attemptNumber, | ||
| setting.retryAttemptNumber, | ||
| setting.before.maxTokens, | ||
| setting.before.thinkingBudgetTokens, | ||
| setting.after.maxTokens, | ||
| setting.after.thinkingBudgetTokens, | ||
| ]); |
The deduplication key omits thinkingType and may merge distinct events.
If only thinkingType changes while the token values stay the same, the current key merges the two events and audit information is lost. Include the before/after thinkingType in the key.
Suggested fix
case "thinking_budget_rectifier":
return JSON.stringify([
setting.type,
setting.hit,
setting.providerId ?? null,
setting.trigger,
setting.attemptNumber,
setting.retryAttemptNumber,
setting.before.maxTokens,
+ setting.before.thinkingType,
setting.before.thinkingBudgetTokens,
setting.after.maxTokens,
+ setting.after.thinkingType,
setting.after.thinkingBudgetTokens,
]);
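The collision described in this comment can be reproduced with a small sketch. The `RectifierSnapshot` shape and the `dedupKey` function below are simplified, hypothetical stand-ins for the real setting type and key builder; only the three field names come from the snippet under review.

```typescript
// Hypothetical, trimmed-down shape for illustration only.
interface RectifierSnapshot {
  maxTokens: number | null;
  thinkingType: string | null;
  thinkingBudgetTokens: number | null;
}

/** Builds a JSON key; thinkingType is included only when includeType is true. */
function dedupKey(
  before: RectifierSnapshot,
  after: RectifierSnapshot,
  includeType: boolean,
): string {
  const parts: unknown[] = [
    before.maxTokens,
    before.thinkingBudgetTokens,
    after.maxTokens,
    after.thinkingBudgetTokens,
  ];
  if (includeType) {
    parts.push(before.thinkingType, after.thinkingType);
  }
  return JSON.stringify(parts);
}

// Two events that differ only in the "before" thinkingType.
const after: RectifierSnapshot = { maxTokens: 64000, thinkingType: "enabled", thinkingBudgetTokens: 32000 };
const eventA: RectifierSnapshot = { maxTokens: 4096, thinkingType: null, thinkingBudgetTokens: 2000 };
const eventB: RectifierSnapshot = { maxTokens: 4096, thinkingType: "disabled", thinkingBudgetTokens: 2000 };

const collides = dedupKey(eventA, after, false) === dedupKey(eventB, after, false);
const distinct = dedupKey(eventA, after, true) !== dedupKey(eventB, after, true);
```

Here `collides` and `distinct` are both true: without `thinkingType` the two events share a key, and with it they stay separate.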
Summary of Changes
Hello @ding113, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This release focuses on simplifying the core API gateway by removing complex cross-format request/response conversion layers, promoting a more direct and native API interaction model. Concurrently, it introduces advanced control mechanisms for Anthropic providers through configurable parameter overrides and an automated error rectification system for thinking budget issues. Additionally, the update enhances the utility of read-only API keys by extending access to several informational endpoints, improving integration capabilities for external clients.
Code Review
This pull request introduces a major architectural refactoring by removing the complex format converters, which significantly simplifies the codebase and improves maintainability. It also adds several valuable features, including Anthropic provider overrides and a thinking budget rectifier, along with important bug fixes and expanded readonly API access. The changes are extensive but well-structured and thoroughly tested. My review found one area for improvement in the new Dockerfile to optimize the production image size.
| # Important: make sure all required files are copied, especially the drizzle folder | ||
| COPY --from=builder /app/public ./public | ||
| COPY --from=builder /app/.next ./.next | ||
| COPY --from=builder /app/node_modules ./node_modules |
The runner stage copies the node_modules directory from the builder stage. The builder stage installs all dependencies, including devDependencies, by running bun install. This means the final production image contains development dependencies, which increases its size and potential security surface.
It's a best practice to only include production dependencies in the final image. You can achieve this by adding a new stage specifically for production dependencies:
# After the 'builder' stage
FROM oven/bun:debian AS prod-deps
WORKDIR /app
COPY package.json bun.lockb* ./
RUN bun install --frozen-lockfile --production
# In the 'runner' stage
# ...
COPY --from=prod-deps /app/node_modules ./node_modules
# ...
This change will result in a smaller and more secure production image.
Code Review Summary
This is a major release (v0.5.3) that removes the format conversion layer (~8,800 lines) and introduces Anthropic provider parameter overrides with a thinking budget rectifier. The code changes are well-structured with comprehensive test coverage (209 tests for rectifier, 36 for overrides, 358 lines for readonly access).
PR Size: XL
- Lines changed: 18,341 (8,908 additions + 9,433 deletions)
- Files changed: 114
Split Suggestion for XL PR: This PR combines multiple features that could have been separate PRs:
- Format converter removal (breaking change)
- Anthropic provider overrides feature
- Thinking budget rectifier feature
- Readonly API access extension
- Zeabur deployment support
For future releases, consider splitting breaking changes from new features.
Issues Found
| Category | Critical | High | Medium | Low |
|---|---|---|---|---|
| Logic/Bugs | 0 | 0 | 0 | 0 |
| Security | 0 | 0 | 0 | 0 |
| Error Handling | 0 | 0 | 0 | 0 |
| Types | 0 | 0 | 0 | 0 |
| Comments/Docs | 0 | 0 | 1 | 0 |
| Tests | 0 | 0 | 0 | 0 |
| Simplification | 0 | 0 | 0 | 0 |
Medium Priority Issues (Should Fix)
[COMMENT-OUTDATED] CLAUDE.md not updated after converter removal
CLAUDE.md (lines 89-90, 117) still references converters/ and codex/ directories that were removed in this PR. While CLAUDE.md is not part of the diff, it should be updated to maintain documentation accuracy:
- Line 89: `├── converters/ # Format converters (claude/openai/codex/gemini)` - directory removed
- Line 90: `└── codex/ # Codex CLI adapter` - directory removed
- Line 117: `- **Format Converters** (converters/): Bidirectional format translation` - feature removed
Suggested action: Update CLAUDE.md in a follow-up commit or separate PR.
Review Coverage
- Logic and correctness - Clean
- Security (OWASP Top 10) - Clean
- Error handling - Clean (non-blocking persistence failures are intentional)
- Type safety - Clean
- Documentation accuracy - 1 issue (CLAUDE.md outdated)
- Test coverage - Excellent (209 + 36 + 358 new test lines)
- Code clarity - Good
Positive Observations (for context only)
The new code demonstrates good practices:
- Thinking budget rectifier has single-retry enforcement to prevent infinite loops
- Provider overrides include audit trail for debugging
- Clamping logic correctly handles edge cases (budget_tokens < 1024 skip)
- Comprehensive test coverage for all new features
Automated review by Claude AI
| z | ||
| .string() | ||
| .regex(/^\d+$/, "max_tokens must be 'inherit' or a numeric string") | ||
| .refine( |
[MEDIUM] [STANDARD-VIOLATION] Non-i18n validation strings added for Anthropic override preferences
Why this is a problem: Repo rule requires: "i18n Required - All user-facing strings must use i18n (5 languages supported). Never hardcode display text". These .regex(..., "max_tokens must be ...") / .refine(..., { message: "..." }) messages can reach the UI (provider form shows res.error in a toast), so users will see hardcoded English regardless of locale.
Suggested fix (use error codes + existing zod helpers, consistent with other actions):
// src/lib/validation/schemas.ts
import { ERROR_CODES } from "@/lib/utils/error-messages";
const ANTHROPIC_MAX_TOKENS_PREFERENCE = z.union([
z.literal("inherit"),
z
.string()
.regex(/^\d+$/, ERROR_CODES.INVALID_FORMAT)
.refine(
(val) => {
const num = Number.parseInt(val, 10);
return num >= 1 && num <= 64000;
},
{ message: ERROR_CODES.INVALID_RANGE }
),
]);
// src/actions/providers.ts
import { extractZodErrorCode, formatZodError } from "@/lib/utils/zod-i18n";
const parsed = UpdateProviderSchema.safeParse(data);
if (!parsed.success) {
return {
ok: false,
error: formatZodError(parsed.error),
errorCode: extractZodErrorCode(parsed.error),
};
}
| // Gemini usageMetadata: switch to a last-wins strategy | ||
| // Skip applyUsageValue (which is first-wins) and update directly | ||
| if (data.usageMetadata && typeof data.usageMetadata === "object") { | ||
| const extracted = extractUsageMetrics(data.usageMetadata); |
[HIGH] [TEST-MISSING-CRITICAL] Gemini SSE usageMetadata last-wins logic has no regression test
Why this is a problem: This PR changes SSE usage parsing to treat Gemini usageMetadata as last-wins (to avoid missing output tokens in streaming). Without a unit test, it is easy to regress back to first-wins and silently undercount output_tokens again (billing/usage impact).
Suggested fix (add a focused test case):
// tests/unit/proxy/extract-usage-metrics.test.ts
it("Gemini SSE: uses the last usageMetadata event (last-wins)", () => {
const sse = [
'data: {"usageMetadata":{"promptTokenCount":1000}}',
"",
'data: {"usageMetadata":{"promptTokenCount":1000,"candidatesTokenCount":500,"thoughtsTokenCount":100}}',
"",
].join("\n");
const result = parseUsageFromResponseText(sse, "gemini");
expect(result.usageMetrics?.output_tokens).toBe(600);
});
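To make the first-wins versus last-wins difference concrete, here is a minimal, self-contained sketch. The functions `lastUsageMetadata` and `outputTokens` are hypothetical and are not the project's `parseUsageFromResponseText` or `extractUsageMetrics`; summing `candidatesTokenCount` and `thoughtsTokenCount` into the output count is an assumption taken from the expected value of 600 in the suggested test.

```typescript
// Hypothetical sketch; not the project's parser.
interface UsageMetadata {
  promptTokenCount?: number;
  candidatesTokenCount?: number;
  thoughtsTokenCount?: number;
}

/** Scans SSE "data:" lines and keeps the LAST usageMetadata object seen. */
function lastUsageMetadata(sseText: string): UsageMetadata | null {
  let last: UsageMetadata | null = null;
  for (const line of sseText.split("\n")) {
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    if (payload === "" || payload === "[DONE]") continue;
    try {
      const parsed: unknown = JSON.parse(payload);
      if (typeof parsed === "object" && parsed !== null && "usageMetadata" in parsed) {
        const usage = (parsed as { usageMetadata: unknown }).usageMetadata;
        if (typeof usage === "object" && usage !== null) {
          last = usage as UsageMetadata; // later events overwrite earlier ones
        }
      }
    } catch {
      // Ignore lines whose payload is not valid JSON.
    }
  }
  return last;
}

/** Assumed output-token definition: candidates plus thoughts. */
function outputTokens(usage: UsageMetadata | null): number {
  return (usage?.candidatesTokenCount ?? 0) + (usage?.thoughtsTokenCount ?? 0);
}
```

Applied to the two-event stream from the suggested test, the last event carries both counts and the sum is 600; a first-wins reader would stop at the first event, which only has `promptTokenCount`, and report 0 output tokens.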
Code Review Summary
This is an XL PR spanning multiple independent areas (proxy behavior, permissions, billing usage parsing, provider UI/schema/migrations). The breadth and size make it difficult to validate critical paths thoroughly in one review and increase regression risk.
PR Size: XL
- Lines changed: 18341
- Files changed: 114
- Split suggestions:
- Split the proxy “same-format routing / converter removal” refactor into its own PR.
- Split the read-only key access expansion (+ integration tests) into its own PR.
- Split Anthropic provider overrides + thinking budget rectifier + DB migrations + UI changes into its own PR.
- Split URL builder changes (buildProxyUrl) + unit tests + docs/Dockerfile into a separate PR.
Issues Found
| Category | Critical | High | Medium | Low |
|---|---|---|---|---|
| Logic/Bugs | 0 | 0 | 0 | 0 |
| Security | 0 | 0 | 0 | 0 |
| Error Handling | 0 | 0 | 1 | 0 |
| Types | 0 | 0 | 0 | 0 |
| Comments/Docs | 0 | 0 | 0 | 0 |
| Tests | 0 | 1 | 0 | 0 |
| Simplification | 0 | 0 | 0 | 0 |
Critical Issues (Must Fix)
None.
High Priority Issues (Should Fix)
- (Confidence 90) src/app/v1/_lib/proxy/response-handler.ts:1556 - Gemini SSE usageMetadata last-wins logic was added/changed but lacks a regression unit test to prevent output token undercount in streaming.
- (Confidence 80) src/lib/validation/schemas.ts:34 - New hardcoded English validation messages for Anthropic override preferences can surface to UI error toasts, violating the i18n requirement.
Review Coverage
- Logic and correctness
- Security (OWASP Top 10)
- Error handling
- Type safety
- Documentation accuracy
- Test coverage
- Code clarity
Automated review by Codex AI
- Identified PR #712 (release v0.5.3) and applied label size/XL.
- Posted 2 inline review comments (diff-only):
  - src/app/v1/_lib/proxy/response-handler.ts (Gemini SSE usageMetadata last-wins): missing regression unit test.
  - src/lib/validation/schemas.ts (Anthropic override preference schemas): hardcoded (non-i18n) validation strings that can surface to UI toasts.
- Submitted the required PR review summary via gh pr review --comment.
Release v0.5.3
This release includes major architectural improvements, new features, bug fixes, and deployment enhancements.
🎯 Highlights
Major Refactoring
New Features
- `max_tokens` and `thinking.budget_tokens` at provider level
Bug Fixes
📦 What's Included
This release merges the following PRs from `dev` branch:
🔧 Refactoring & Architecture
✨ Features
feat(provider): add Anthropic max_tokens and thinking budget override settings #689 - Anthropic provider token overrides
- `max_tokens` override (1-64,000)
- `thinking.budget_tokens` override (1,024-32,000)
- `enable_thinking_budget_rectifier`
feat(api): extend read-only key access permissions to support more endpoints #704 - Extended readonly API key permissions
- `allowReadOnlyAccess: true` to 6 endpoints:
  - `getUsers`, `getUserLimitUsage`, `getUserStatistics`
  - `getUsageLogs`, `getOverviewData`, `getActiveSessions`
Support Zeabur deployment #679 - Zeabur deployment support
🐛 Bug Fixes
fix: fix buildProxyUrl appending the version prefix twice (Gemini) #693 - Fix Gemini URL version prefix duplication
- `/v1`, `/v1beta`, `/v1internal`, `/v2` prefixes
fix(billing): use last-wins for Gemini SSE usageMetadata extraction #691 - Fix Gemini SSE billing calculation
- `usageMetadata` extraction
Fix provider exhaustion after model redirect (refs #629) #633 - Fix provider exhaustion after model redirect
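The version prefixes listed for #693 can be recognized with a single pattern. The sketch below is only an illustration of such a pattern; `VERSION_PREFIX` and `hasVersionPrefix` are hypothetical names, and the real `buildProxyUrl` regex may differ. It has no case-insensitive flag, in line with the commit note about removing that flag.

```typescript
// Hypothetical pattern: "/v", one or more digits, an optional lowercase
// alphanumeric suffix, then either "/" or the end of the path.
const VERSION_PREFIX = /^\/v\d+[a-z0-9]*(?=\/|$)/;

/** Returns true when the path starts with a version segment such as /v1beta. */
function hasVersionPrefix(pathname: string): boolean {
  return VERSION_PREFIX.test(pathname);
}

// Paths that should and should not count as version-prefixed.
const matching = ["/v1/models", "/v1beta/models/x", "/v1internal/x", "/v2"];
const nonMatching = ["/vbeta/models", "/models/v1", "/version/1"];

const allMatch = matching.every(hasVersionPrefix);
const noneMatch = nonMatching.every((p) => !hasVersionPrefix(p));
```

Requiring at least one digit after `v` is what keeps a segment like `/vbeta` out, and anchoring at the start of the path keeps `/models/v1` out.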
🗄️ Database Migrations
Two new migrations included:
- `anthropic_max_tokens_preference` and `anthropic_thinking_budget_preference` columns to providers table
- `enable_thinking_budget_rectifier` boolean to system_settings table
Migration required: Run `bun run db:migrate` or set `AUTO_MIGRATE=true`
Format Converter Removal (PR #709)
Cross-format conversion is no longer supported.
Requests must be routed to providers with matching API formats:
Migration Path:
- `providerType` matches the API format you're sending
Removed Features:
- `joinClaudePool` provider configuration option
- `codexInstructionsStrategy` provider configuration option
📊 Statistics
🔗 Related Issues
🧪 Testing
📝 Upgrade Notes
- `bun run db:migrate` or use `AUTO_MIGRATE=true`
🙏 Contributors
Release description generated by Claude AI
Greptile Overview
Greptile Summary
This release (v0.5.3) introduces major breaking changes by removing cross-format API conversion capabilities and enforcing same-format routing between clients and providers.
Major Changes
BREAKING: Format Converters Removed
Provider Configuration Changes
New Features
UI & Configuration
Testing
Confidence Score: 4/5
Important Files Changed
Sequence Diagram
sequenceDiagram
    participant Client
    participant Forwarder as ProxyForwarder
    participant Selector as ProviderSelector
    participant Overrides as AnthropicOverrides
    participant Provider as Upstream Provider
    participant ResponseHandler
    Client->>Forwarder: Request (format: openai/claude/response)
    Forwarder->>Selector: Pick provider for request
    Note over Selector: NEW: Strict format matching<br/>openai to openai-compatible only<br/>claude to claude/claude-auth only<br/>response to codex only
    Selector-->>Forwarder: Selected provider
    alt Anthropic Provider (claude/claude-auth)
        Forwarder->>Overrides: Apply parameter overrides
        Note over Overrides: Override max_tokens<br/>Override thinking budget<br/>(if configured)
        Overrides-->>Forwarder: Modified request
    end
    Forwarder->>Provider: Forward request
    Provider-->>Forwarder: Response or Error
    alt Anthropic budget validation error
        Note over Forwarder: Detect budget validation failure
        Forwarder->>Forwarder: Rectify: set budget to 32000<br/>set max_tokens to 64000 if needed
        Forwarder->>Provider: Retry with rectified request
        Provider-->>Forwarder: Response (success)
    end
    Forwarder->>ResponseHandler: Process response
    Note over ResponseHandler: REMOVED: Format conversion<br/>Pass-through only<br/>Gemini: last-wins usage extraction
    ResponseHandler-->>Client: Final response (same format as request)
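The strict format matching noted in the diagram can be sketched as a lookup table. This is an illustrative assumption rather than the provider-selector's actual code: the type names, the `COMPATIBLE` table, and `isFormatCompatible` are hypothetical, and only the three pairings named in the diagram are covered.

```typescript
// Hypothetical types covering only the formats named in the diagram.
type ClientFormat = "openai" | "claude" | "response";
type ProviderType = "openai-compatible" | "claude" | "claude-auth" | "codex";

// Each client format may only be routed to these provider types.
const COMPATIBLE: Record<ClientFormat, readonly ProviderType[]> = {
  openai: ["openai-compatible"],
  claude: ["claude", "claude-auth"],
  response: ["codex"],
};

/** True when a request in the given format may be routed to the provider type. */
function isFormatCompatible(format: ClientFormat, providerType: ProviderType): boolean {
  return COMPATIBLE[format].includes(providerType);
}

/** Filters a candidate list down to providers that match the request format. */
function filterCompatible<T extends { providerType: ProviderType }>(
  format: ClientFormat,
  candidates: readonly T[],
): T[] {
  return candidates.filter((c) => isFormatCompatible(format, c.providerType));
}
```

A table keeps the rule declarative: with conversion removed, a provider that does not appear under the request's format is never a candidate.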