refactor: convert AWS and GCP Terraform stacks into reusable modules …#28103
Conversation
|
|
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
Greptile SummaryThis PR refactors the AWS and GCP Terraform stacks into provider-free reusable modules, adds
Confidence Score: 4/5The refactor is structurally sound, but the GCP state migration guide tells operators to expect no plan changes when there will actually be a managed SSL certificate replacement for any TLS-enabled stack. The AWS changes are clean: every taggable resource receives local.tags, the provider deletion is correct, and the migration guide accurately claims No Changes after address migration. The GCP Cloud SQL ignore_changes fix and the cert create_before_destroy rename are both correct improvements. The sole defect is that the GCP README migration guide claims terraform plan expect No Changes, which will not hold for any deployment with TLS enabled. terraform/litellm/gcp/README.md — the state migration guide needs a note that one managed SSL certificate replacement is expected for TLS-enabled stacks and that it is safe due to create_before_destroy.
|
| Filename | Overview |
|---|---|
| terraform/litellm/aws/locals.tf | Adds local.tags combining litellm:stack, managed-by, and var.tags; replaces what the now-deleted provider default_tags was doing. |
| terraform/litellm/aws/providers.tf | Deleted — provider block moved to examples/default/providers.tf so the module can be consumed with count/for_each/aliased providers. |
| terraform/litellm/aws/examples/default/main.tf | New thin-root wrapper that wires the AWS provider to the module with a curated variable surface; all looks correct. |
| terraform/litellm/aws/examples/default/providers.tf | Root-level AWS provider with commented-out default_tags slot; managed-by redundancy from previous thread is addressed. |
| terraform/litellm/gcp/load_balancer.tf | Adds hashed cert name and create_before_destroy to fix resourceInUseByAnotherResource on domain changes; the name change is a breaking rename for existing TLS deployments that the migration guide's "No changes" claim does not account for. |
| terraform/litellm/gcp/cloudsql.tf | Adds lifecycle { ignore_changes = [settings[0].disk_size] } to both writer and reader instances to prevent disk autoresize triggering a destroy/recreate; correct fix. |
| terraform/litellm/gcp/providers.tf | Deleted — google/google-beta provider blocks moved to gcp/examples/default/providers.tf. |
| terraform/litellm/gcp/examples/default/main.tf | New thin-root wrapper for GCP; correctly notes the configuration_aliases limitation for for_each multi-project deployments. |
| terraform/litellm/gcp/README.md | Adds module-first design docs, for_each multi-tenant pattern, and state migration guide; the guide's "expect: No changes" assertion is incorrect for TLS-enabled stacks due to the cert rename. |
| terraform/litellm/aws/README.md | Adds module-first docs, for_each multi-tenant pattern, and AWS state migration guide; terraform plan # expect: No changes is accurate for AWS (no managed certs are renamed here). |
| terraform/litellm/aws/alb.tf | Threads local.tags onto all ALB resources (lb, target groups, listeners, listener rules); coverage is complete. |
| terraform/litellm/aws/network.tf | Replaces bare { Name = ... } tags with merge(local.tags, { Name = ... }) and adds local.tags to security groups; correct approach. |
Reviews (2): Last reviewed commit: "docs(terraform): address review — state-..." | Re-trigger Greptile
ab3feba to
b8d6ff7
Compare
|
Generated by Claude Code |
| done | ||
| terraform plan # expect: No changes | ||
| ``` | ||
|
|
There was a problem hiding this comment.
GCP state migration "No changes" claim is incorrect for TLS-enabled stacks
This PR renames google_compute_managed_ssl_certificate from ${local.name}-cert to ${local.name}-cert-${substr(sha1(join(",", var.lb_domains)), 0, 8)} (in load_balancer.tf). After running the scripted state mv loop, Terraform will still see this resource as needing replacement because the name stored in state (acme-litellm-prod-cert) doesn't match the newly computed name. The create_before_destroy lifecycle makes the replacement safe, but the guide's terraform plan # expect: No changes assertion will fail for anyone who has lb_domains set — leaving operators unsure whether the replacement is expected or a sign that the migration went wrong.
The migration guide should note that one resource replacement is expected for TLS-enabled stacks, and that the create_before_destroy lifecycle handles it safely.
There was a problem hiding this comment.
Good catch on the cert-rename edge case. Rather than patch the wording, I've removed the state-migration guides from both the AWS and GCP READMEs entirely (and the corresponding note from the docs PR).
Rationale: these Terraform stacks have never been published or tagged for external consumption, so there are no existing deployments sitting on the old root-module addresses to migrate from. The terraform state mv guidance — and therefore the inaccurate "expect: No changes" assertion that wouldn't hold once the managed SSL cert is renamed — only applied to a hypothetical upgrade path that doesn't exist. Dropping it removes the incorrect claim at the source.
Done in d28283a (cert note) → superseded by 4c00811 (guides removed).
Generated by Claude Code
…with examples/default entry point - Remove `provider` blocks from both AWS and GCP stack roots so the modules can be consumed with `count`, `for_each`, `depends_on`, assumed-role or aliased providers — patterns that are forbidden when a module owns its own provider configuration - Add `examples/default/` thin-root wrappers for both stacks that wire the provider (AWS) / providers (google + google-beta) and call the module with a curated variable surface, preserving the one-command deploy experience - Move `terraform.tfvars.example` files into `examples/default/` alongside the new roots; update example comments to reflect the curated variable surface - Thread `local.tags` (containing `litellm:stack`, `managed-by`, and `var.tags`) explicitly onto every taggable AWS resource since the module no longer controls the provider's `default_tags`; GCP resource labels already flow through the module's `labels` input - Add `examples/default/variables.tf` and `outputs.tf` for both stacks, exposing the most-used knobs and re-exporting all module outputs - Commit provider lock files for both examples so `terraform init` is reproducible without a network fetch - Update top-level and per-stack READMEs to document the module-first design, the `for_each` multi-tenant pattern, and the `examples/default/` quick-start path
…for_each note - Add 'Migrating an existing deployment' section to AWS & GCP READMEs documenting the required terraform state mv step (resource addresses now gain a module.litellm. prefix under the examples/default root) - Remove redundant managed-by tag from the AWS example providers.tf; reserve default_tags there for org-wide tags only - Document the for_each single-provider limitation for GCP (no configuration_aliases) in the README and example main.tf Resolves LIT-3504
…ation guide The managed SSL cert is named with a hash of lb_domains, so TLS-enabled stacks that migrated from the old un-hashed name will see one create_before_destroy cert replacement after terraform state mv — not a clean 'No changes'. Document that this single replacement is expected and safe.
The AWS/GCP stacks have never been published, so there are no existing deployments to migrate from the old root-module layout. Remove the 'Migrating an existing deployment' sections from both READMEs.
…click The GCP stack's default image_registry points at ghcr.io, which Cloud Run won't authenticate against, so any real deploy (HCP Terraform no-code or otherwise) must override it. Document that as a hard requirement on the GCP README rather than a side note, and add a top-level HCP Terraform 1-click section enumerating the required inputs per stack and the migration-task caveat for HCP-hosted runners.
…y v2 proxy_config Drop the inline LITELLM_PROXY_CONFIG_B64 env var. Upload the YAML to S3 at config/litellm-config.yaml; gateway and backend container entrypoints download it to /tmp/litellm-config.yaml via boto3 before exec'ing uvicorn. The S3 object etag is wired into the task definition so a config edit produces a new task-def revision and a rolling redeploy. The existing s3_access policy already grants the task role s3:GetObject on this bucket, so no IAM changes were needed for the mount itself. OpenTelemetry v2 New variables otel_endpoint, otel_exporter, otel_service_name, and otel_headers_secret_arn. Setting otel_endpoint to a non-empty value adds LITELLM_OTEL_V2=true plus OTEL_EXPORTER / OTEL_ENDPOINT / OTEL_SERVICE_NAME / OTEL_ENVIRONMENT_NAME to the shared env block; an optional Secrets Manager ARN backs OTEL_HEADERS for collectors that need an auth header. Execution role auto-gains GetSecretValue on that ARN. Empty endpoint = nothing added, so existing deployments are unchanged.
4794582 to
ab2c928
Compare
Wires up a Cloud Shell "Open in Cloud Shell" badge backed by the GoogleCloudPlatform DeployStack flow so examples/default can be installed from a click in the README without a local terraform setup. - examples/default/deploystack.json drives project/region collection plus prompts for tenant, env, image_tag, and allow_plaintext_lb. Complex inputs (proxy_config, *_extra_secrets, lb_domains) and sensitive vars (litellm_master_key, litellm_license, ui_password) stay tfvars / env only so they never land in a committed file. - examples/default/TUTORIAL.md is a Cloud Shell walkthrough that enables required APIs, creates the GHCR-passthrough Artifact Registry repo, optionally exports the TF_VAR_* secrets, runs `deploystack install`, and shows how to fetch the master key plus migrate from plaintext LB to TLS. - Renames var.project to var.project_id across the module and the examples/default wrapper to match the variable DeployStack injects from `collect_project: true`. Breaking rename for anyone with a `project = ...` line in terraform.tfvars; the fix is one line.
…ry v2
proxy_config
Drop the inline LITELLM_PROXY_CONFIG_B64 env var and the python-decode
startup fragment. Upload the YAML to a dedicated GCS bucket as
config.yaml, then mount it read-only into the gateway and backend at
/etc/litellm via Cloud Run v2's gcsfuse volume. CONFIG_FILE_PATH points
at the mount; an md5 of the YAML rides along as PROXY_CONFIG_HASH so a
config-only edit forces a new Cloud Run revision (gcsfuse only surfaces
new objects on container restart, so without the hash an updated
proxy_config would sit in the bucket unread).
The config bucket is separate from the data-plane bucket so the runtime
SA can hold objectViewer here (read-only at runtime) while keeping
objectAdmin on the data-plane bucket. Both bucket and IAM binding are
gated on proxy_config != {}; an empty config skips bucket creation and
mounts nothing.
OpenTelemetry v2
LITELLM_OTEL_V2=true is now wired into shared_env_kv unconditionally so
both the gateway and backend boot with the integration enabled. It's
dormant until otel_endpoint is non-empty; setting it injects
OTEL_EXPORTER / OTEL_ENDPOINT / OTEL_ENVIRONMENT_NAME plus a
per-component OTEL_SERVICE_NAME (\${tenant}-litellm-\${env}-{gateway,backend})
so spans land tagged with the right hop. otel_headers_secret takes a
Secret Manager resource ID for OTEL_HEADERS (collector auth); the
runtime SA auto-gains roles/secretmanager.secretAccessor on it.
otel_capture_message_content defaults to no_content matching the litellm
default. Any OTEL_* key set in *_extra_env wins over the defaults so
Cloud Run doesn't reject the apply on the duplicate-env-name check.
| { | ||
| "name": "allow_plaintext_lb", | ||
| "description": "Skip TLS on the load balancer (HTTP-only). Set true for trial/dev. For production, leave false and add lb_domains to terraform.tfvars after the first apply", | ||
| "default": "true", |
There was a problem hiding this comment.
High: Plaintext load balancer by default
allow_plaintext_lb defaults to true for the one-click GCP deploy, which publishes the proxy and UI over http://<lb-ip> and the tutorial has users sign in with the master key on that URL. A network-path attacker can capture the master key or bearer tokens from users who follow the default DeployStack flow; make TLS the default path by collecting lb_domains before apply, or require an explicit opt-in for plaintext rather than defaulting to it.
There was a problem hiding this comment.
Thanks — this is a deliberate, documented tradeoff for the one-click DeployStack path, so we're keeping the allow_plaintext_lb = "true" default here rather than flipping it.
Rationale: GCP's external HTTPS LB uses a Google-managed SSL certificate, which is only issued once the domain in lb_domains resolves to the LB's anycast IP. That IP doesn't exist until after the first apply, so TLS is inherently a two-step flow (apply → point DNS at lb_ip → set lb_domains and re-apply). A true single-click deploy can't provision managed TLS up front, so the installer defaults to HTTP to keep the zero-input path working for trial/dev.
Guardrails already in place:
- The module is TLS-first by default outside DeployStack:
terraform planrefuses to build an HTTP-only LB unlessallow_plaintext_lb = trueis set explicitly (precondition inload_balancer.tf). The plaintext default lives only in the one-clickdeploystack.json, which is explicitly the trial/dev entry point. - The setting is surfaced as a first-class toggle in the installer with the description telling users to leave it off and set
lb_domainsfor production. - The README "TLS" section documents the production two-step flow and the 301→HTTPS upgrade behavior.
We'll make the production-hardening path more prominent in the DeployStack docs, but the HTTP default for the one-click trial flow is intended. Flagging the master-key-over-HTTP exposure for the trial path is fair, and that's why it's gated behind an explicit dev/trial opt-in everywhere except this installer.
Generated by Claude Code
There was a problem hiding this comment.
This is working as intended. The DeployStack one-click is a trial/dev convenience path, not the production entry point; that's why the prompt is explicitly labeled "Set true for trial/dev. For production, leave false and add lb_domains to terraform.tfvars after the first apply", and the tutorial has a dedicated "Going to TLS" step covering DNS + a managed cert.
The module and the examples/default root both default allow_plaintext_lb to false, so the production path fails closed: a plan without lb_domains errors out until the operator either provides a managed-cert domain or consciously opts into plaintext. The end user owns their own infrastructure and is responsible for standing it up securely in prod; we give them a secure-by-default module plus a clearly-scoped trial shortcut, which is the right split.
Not making a change here.
Generated by Claude Code
PR overviewThis PR refactors the AWS and GCP Terraform stacks into reusable modules, including example deployment configuration for the LiteLLM GCP stack. The changed GCP DeployStack example defines the default one-click deployment inputs for publishing the proxy and UI behind a load balancer. There is one open security issue: the default GCP one-click deployment currently enables a plaintext load balancer, causing users who follow the default flow to access the proxy and UI over HTTP. Because the tutorial flow involves signing in with the master key, a network-path attacker could capture credentials or bearer tokens unless TLS is made the default or plaintext is made an explicit opt-in. No issues have been addressed yet, so the PR still carries a clear deployment-default exposure. Open issues (1)
Fixed/addressed: 0 · PR risk: 7/10 |
Bring both modules to the same surface and the same runtime behavior so
swapping clouds (or reading either README) is symmetric.
Labels and tags. GCP previously stamped var.labels onto only the two GCS
buckets, leaving Cloud Run, Cloud SQL, Memorystore, Secret Manager, and
the LB resources unlabeled; the variable description claimed full
coverage. Now the module computes local.labels (litellm-stack +
managed-by + var.labels, mirroring AWS's local.tags) and threads it onto
every label-supporting resource: Cloud Run services and the migrations
job, Cloud SQL writer and reader (via user_labels), Memorystore, Secret
Manager entries (master_key, license, ui_password, db_password), both
GCS buckets, the global LB address, and the http/https forwarding rules.
GCP keys use 'litellm-stack' instead of AWS's 'litellm:stack' because
GCP label keys forbid colons; var.labels now defaults to {}.
OpenTelemetry v2 is opt-in on both stacks. AWS already gated everything
on otel_endpoint; GCP previously stamped LITELLM_OTEL_V2=true into
shared_env unconditionally and only ungated the OTEL_* block. Both
stacks now do the same thing: leave otel_endpoint empty and nothing
OTel-related lands in the container env; set it and gateway and backend
get LITELLM_OTEL_V2=true plus OTEL_EXPORTER, OTEL_ENDPOINT,
OTEL_ENVIRONMENT_NAME, OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT,
and a per-component OTEL_SERVICE_NAME (${tenant}-litellm-${env}-gateway
or -backend) so spans land tagged with the right hop. AWS picks up the
richer GCP surface: otel_environment_name (defaults to var.env),
otel_capture_message_content (defaults to no_content), and *_extra_env
override filtering so a caller-set OTEL_* key wins over the default for
that service (ECS allows duplicates, but the filter gives the same
predictable last-wins shape Cloud Run enforces). var.otel_service_name
on AWS is gone, replaced by the per-component naming.
uvicorn workers. GCP gains gateway_num_workers, matching AWS; threads
into the gateway args as --workers ${var.gateway_num_workers}.
Docs reflect the parity: each README's OTel section, the GCP 'Using as
a module' Labels paragraph, and a new feature-parity table in the
top-level README that lays out the AWS/GCP input mapping side by side.
…ample The example wrapper already exposed `s3_force_destroy` so ephemeral / CI stacks could destroy the S3 bucket without manual cleanup, but the matching Aurora knob (`skip_final_snapshot`) was hidden behind the module surface. That meant a `terraform destroy` on a trial stack still produced a `<cluster>-final-<short-sha>` snapshot, with no opt-out short of editing the module call. Adds `var.skip_final_snapshot` to the example (default `false`, preserving the data-loss tripwire) and threads it through to the module input, mirroring the existing `s3_force_destroy` pattern. Documented alongside it in the tfvars example. Verified by deploying the example end-to-end against a clean AWS account (VPC + Aurora w/ IAM auth + Redis + ALB + 3 ECS services), confirming all services reach steady state and the data plane serves traffic, then running `terraform destroy` with `skip_final_snapshot = true` to a clean teardown (93 destroyed, no Aurora snapshot left behind, no leftover billable resources).
1cff02f
into
litellm_internal_staging
…9.0) (#93) This PR contains the following updates: | Package | Update | Change | |---|---|---| | [ghcr.io/berriai/litellm](https://images.chainguard.dev/directory/image/wolfi-base/overview) ([source](https://github.com/BerriAI/litellm)) | minor | `v1.88.1` → `v1.89.0` | --- ### Release Notes <details> <summary>BerriAI/litellm (ghcr.io/berriai/litellm)</summary> ### [`v1.89.0`](https://github.com/BerriAI/litellm/releases/tag/v1.89.0) [Compare Source](https://github.com/BerriAI/litellm/compare/v1.89.0...v1.89.0) ##### Verify Docker Image Signature All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0). **Verify using the pinned commit hash (recommended):** A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key: ```bash cosign verify \ --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \ ghcr.io/berriai/litellm:v1.89.0 ``` **Verify using the release tag (convenience):** Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules: ```bash cosign verify \ --key https://raw.githubusercontent.com/BerriAI/litellm/v1.89.0/cosign.pub \ ghcr.io/berriai/litellm:v1.89.0 ``` Expected output: ``` The following checks were performed on each of these signatures: - The cosign claims were validated - The signatures were verified against the specified public key ``` *** ##### What's Changed - test(responses): bump deprecated gemini-3-pro-preview to gemini-3.1-pro-preview by [@​mateo-berri](https://github.com/mateo-berri) in [#​29433](https://github.com/BerriAI/litellm/pull/29433) - fix: map mistral/ministral-8b-latest in model price map by [@​mateo-berri](https://github.com/mateo-berri) in [#​29453](https://github.com/BerriAI/litellm/pull/29453) - fix(datadog): split oversized batches on 413 instead of re-queueing forever by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29444](https://github.com/BerriAI/litellm/pull/29444) - feat(otel): allowlist team\_metadata sub-keys promoted to baggage by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29442](https://github.com/BerriAI/litellm/pull/29442) - fix: stop use\_chat\_completions\_api flag from leaking into provider request body by [@​mateo-berri](https://github.com/mateo-berri) in [#​29447](https://github.com/BerriAI/litellm/pull/29447) - fix(anthropic, fireworks): inline legacy $ref defs in tool schemas by [@​milan-berri](https://github.com/milan-berri) in [#​28646](https://github.com/BerriAI/litellm/pull/28646) - fix(proxy): omit OpenAI \[DONE] on google-genai streamGenerateContent by [@​Sameerlite](https://github.com/Sameerlite) in [#​29426](https://github.com/BerriAI/litellm/pull/29426) - ci(release): create stable/X.Y.x line branch on X.Y.0 tags by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29457](https://github.com/BerriAI/litellm/pull/29457) - fix(vector-stores): support engines URL for Vertex AI Search by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​27885](https://github.com/BerriAI/litellm/pull/27885) - fix(ui): render caller-supplied filter options in caller order by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29462](https://github.com/BerriAI/litellm/pull/29462) - fix(batches): skip unnecessary batch input file reads by [@​Sameerlite](https://github.com/Sameerlite) in [#​29114](https://github.com/BerriAI/litellm/pull/29114) - docs(agents): clarify when to create new test files by [@​Sameerlite](https://github.com/Sameerlite) in [#​29472](https://github.com/BerriAI/litellm/pull/29472) - Litellm OSS Staging by [@​Sameerlite](https://github.com/Sameerlite) in [#​29161](https://github.com/BerriAI/litellm/pull/29161) - fix(mcp): clear allowed\_tools and tool overrides on MCP server edit by [@​Sameerlite](https://github.com/Sameerlite) in [#​29411](https://github.com/BerriAI/litellm/pull/29411) - Litellm OSS Staging 010626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29422](https://github.com/BerriAI/litellm/pull/29422) - fix(ci): make CircleCI rerun-failed-tests collect tests when 2+ test files fail by [@​mateo-berri](https://github.com/mateo-berri) in [#​29475](https://github.com/BerriAI/litellm/pull/29475) - feat(a2a): watsonx Orchestrate agent provider by [@​Sameerlite](https://github.com/Sameerlite) in [#​29410](https://github.com/BerriAI/litellm/pull/29410) - fix(azure\_ai): strip tool-level extra fields on 400 and retry by [@​Sameerlite](https://github.com/Sameerlite) in [#​29479](https://github.com/BerriAI/litellm/pull/29479) - fix(docs): remove fixed dimensions from README hero image by [@​mateo-berri](https://github.com/mateo-berri) in [#​29496](https://github.com/BerriAI/litellm/pull/29496) - Litellm oss staging by [@​Sameerlite](https://github.com/Sameerlite) in [#​29492](https://github.com/BerriAI/litellm/pull/29492) - fix: small CLAUDE.md nits by [@​mateo-berri](https://github.com/mateo-berri) in [#​29504](https://github.com/BerriAI/litellm/pull/29504) - Add MCP semantic conventions to otelv2 by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29468](https://github.com/BerriAI/litellm/pull/29468) - fix(passthrough): emit otel guardrail span when a guardrail blocks by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29470](https://github.com/BerriAI/litellm/pull/29470) - fix(proxy): strip NUL bytes from spend log payloads to prevent PostgreSQL 22P05 by [@​milan-berri](https://github.com/milan-berri) in [#​29515](https://github.com/BerriAI/litellm/pull/29515) - \[internal copy of [#​28008](https://github.com/BerriAI/litellm/issues/28008)] Support MCP OAuth passthrough and issuer-scoped JWT auth by [@​mateo-berri](https://github.com/mateo-berri) in [#​28356](https://github.com/BerriAI/litellm/pull/28356) - feat(vector-stores): forward per-request params to Vertex AI Search by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29459](https://github.com/BerriAI/litellm/pull/29459) - feat(proxy): add per-MCP-server RPM rate limiting for keys and teams by [@​Sameerlite](https://github.com/Sameerlite) in [#​29482](https://github.com/BerriAI/litellm/pull/29482) - fix(tests): drop module-level test calls that break local\_testing collection by [@​mateo-berri](https://github.com/mateo-berri) in [#​29520](https://github.com/BerriAI/litellm/pull/29520) - feat(agents): add LangFlow agent provider with A2A session bridging by [@​Sameerlite](https://github.com/Sameerlite) in [#​28963](https://github.com/BerriAI/litellm/pull/28963) - fix(ui/agents): make A2A skill tags enterable and validated by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29512](https://github.com/BerriAI/litellm/pull/29512) - \[internal copy of [#​29232](https://github.com/BerriAI/litellm/issues/29232)] feat: route future Claude models to Anthropic provider via pattern matching by [@​mateo-berri](https://github.com/mateo-berri) in [#​29239](https://github.com/BerriAI/litellm/pull/29239) - fix(tests): drop import-time completion call in test\_register\_model by [@​mateo-berri](https://github.com/mateo-berri) in [#​29521](https://github.com/BerriAI/litellm/pull/29521) - test: stabilize batch VCR coverage and stop live upload/network leaks by [@​mateo-berri](https://github.com/mateo-berri) in [#​29477](https://github.com/BerriAI/litellm/pull/29477) - \[internal copy of [#​29003](https://github.com/BerriAI/litellm/issues/29003)] fix(vertex\_ai): use user-supplied api\_base as is for Model Garden OpenAI-compat path by [@​mateo-berri](https://github.com/mateo-berri) in [#​29530](https://github.com/BerriAI/litellm/pull/29530) - feat(proxy): native /health/drain preStop hook for graceful shutdown by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29439](https://github.com/BerriAI/litellm/pull/29439) - fix(auth): preserve 401 status for expired JWTs in OTel traces by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29510](https://github.com/BerriAI/litellm/pull/29510) - fix(otel): capture 401 error details in management endpoint spans by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29535](https://github.com/BerriAI/litellm/pull/29535) - test(proxy/utils): pin bottom-of-file helper behavior by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29509](https://github.com/BerriAI/litellm/pull/29509) - test(proxy/utils): pin PrismaClient and spend-update behavior by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29488](https://github.com/BerriAI/litellm/pull/29488) - test(proxy/utils): pin ProxyLogging behavior by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29485](https://github.com/BerriAI/litellm/pull/29485) - fix: missing span for guardrail passthrough by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29552](https://github.com/BerriAI/litellm/pull/29552) - fix(auth): let internal users view search tools by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29542](https://github.com/BerriAI/litellm/pull/29542) - fix: missing mcp otel attributes by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29554](https://github.com/BerriAI/litellm/pull/29554) - fix(proxy): resolve managed video model ids for auth by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29545](https://github.com/BerriAI/litellm/pull/29545) - fix(key\_generate): allow team members to create keys on org-scoped teams by [@​milan-berri](https://github.com/milan-berri) in [#​29310](https://github.com/BerriAI/litellm/pull/29310) - test(pass-through): move Gemini pass-through tests to gemini-3.1-flash-lite by [@​mateo-berri](https://github.com/mateo-berri) in [#​29595](https://github.com/BerriAI/litellm/pull/29595) - Litellm oss staging 030626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29578](https://github.com/BerriAI/litellm/pull/29578) - Fix : a2a bugs 030626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29566](https://github.com/BerriAI/litellm/pull/29566) - \[internal copy of [#​29533](https://github.com/BerriAI/litellm/issues/29533)] fix(anthropic/adapter): emit thinking block for reasoning\_content-only streaming chunks by [@​mateo-berri](https://github.com/mateo-berri) in [#​29600](https://github.com/BerriAI/litellm/pull/29600) - ci: reproduce default-Windows wheel install to guard MAX\_PATH by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29597](https://github.com/BerriAI/litellm/pull/29597) - fix(vertex): strip output\_config.effort for Vertex Claude models that reject it (Haiku 4.5) by [@​mateo-berri](https://github.com/mateo-berri) in [#​29585](https://github.com/BerriAI/litellm/pull/29585) - Litellm websocket improvements by [@​Sameerlite](https://github.com/Sameerlite) in [#​29563](https://github.com/BerriAI/litellm/pull/29563) - feat(arize/phoenix): OpenInference rendering parity — tool\_calls, cost, passthrough I/O, session/user, multimodal, cache tokens by [@​milan-berri](https://github.com/milan-berri) in [#​28800](https://github.com/BerriAI/litellm/pull/28800) - \[internal copy of [#​29550](https://github.com/BerriAI/litellm/issues/29550)] fix: passthrough endpoints duplicate logs by [@​mateo-berri](https://github.com/mateo-berri) in [#​29598](https://github.com/BerriAI/litellm/pull/29598) - fix(ci): keep coverage rename green when a parallel node runs no tests by [@​mateo-berri](https://github.com/mateo-berri) in [#​29608](https://github.com/BerriAI/litellm/pull/29608) - test(vcr): close out the remaining VCR live-call leaks by [@​mateo-berri](https://github.com/mateo-berri) in [#​29603](https://github.com/BerriAI/litellm/pull/29603) - fix(key\_generate): exempt UI/CLI session tokens from the budget ceiling for team keys by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29612](https://github.com/BerriAI/litellm/pull/29612) - fix(realtime): allow null transcripts in stream logging payloads by [@​milan-berri](https://github.com/milan-berri) in [#​29625](https://github.com/BerriAI/litellm/pull/29625) - build(ui): migrate eslint to flat config + bump eslint-config-next to 16 by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29626](https://github.com/BerriAI/litellm/pull/29626) - fix(key\_generate): scope session-token team-key budget exemption to caller-supplied team\_id by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29641](https://github.com/BerriAI/litellm/pull/29641) - fix(proxy): disable proxy buffering on streaming SSE responses by [@​mateo-berri](https://github.com/mateo-berri) in [#​29557](https://github.com/BerriAI/litellm/pull/29557) - fix(mcp): gate /public/mcp\_hub strictly on litellm.public\_mcp\_servers by [@​michelligabriele](https://github.com/michelligabriele) in [#​27764](https://github.com/BerriAI/litellm/pull/27764) - ci(ui): frontend-lint job enforcing prettier + eslint on changed files by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29633](https://github.com/BerriAI/litellm/pull/29633) - fix(gemini): googleSearch + server-side tools and googleMaps JSON schema by [@​Sameerlite](https://github.com/Sameerlite) in [#​29582](https://github.com/BerriAI/litellm/pull/29582) - fix(proxy): passthrough 404 when SERVER\_ROOT\_PATH is set by [@​Sameerlite](https://github.com/Sameerlite) in [#​29658](https://github.com/BerriAI/litellm/pull/29658) - fix(gemini-realtime): use GA event names for Pipecat 1.3.x compatibility by [@​Sameerlite](https://github.com/Sameerlite) in [#​29662](https://github.com/BerriAI/litellm/pull/29662) - Litellm oss staging 040626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29671](https://github.com/BerriAI/litellm/pull/29671) - style(ui): prettier formatting pass over the dashboard by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29622](https://github.com/BerriAI/litellm/pull/29622) - chore: ignore prettier dashboard reformat in git blame by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29695](https://github.com/BerriAI/litellm/pull/29695) - fix(helm): Enable Backend Deployment to mount Gateway config.yaml by [@​tin-berri](https://github.com/tin-berri) in [#​29605](https://github.com/BerriAI/litellm/pull/29605) - \[internal copy of [#​29277](https://github.com/BerriAI/litellm/issues/29277)] fix(proxy): add default=None to LiteLLM\_TeamMembership.litellm\_budget\_table by [@​mateo-berri](https://github.com/mateo-berri) in [#​29684](https://github.com/BerriAI/litellm/pull/29684) - test: make custom\_tokenizer proxy tests hermetic by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29643](https://github.com/BerriAI/litellm/pull/29643) - test(proxy): stop running real-DB tests in GitHub Actions unit jobs by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29700](https://github.com/BerriAI/litellm/pull/29700) - chore(ui): remove the bare-fetch lint rule by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29712](https://github.com/BerriAI/litellm/pull/29712) - Litellm jwt mapping virtualkeys by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​28510](https://github.com/BerriAI/litellm/pull/28510) - refactor(ui): shared HTTP client + location-pinned fetch() lint rule by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29723](https://github.com/BerriAI/litellm/pull/29723) - fix(proxy): stop team BYOK model name corruption on model edit by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29731](https://github.com/BerriAI/litellm/pull/29731) - \[internal copy of [#​29511](https://github.com/BerriAI/litellm/issues/29511)] feat(guardrails): add sensitive data routing to on-premise models by [@​mateo-berri](https://github.com/mateo-berri) in [#​29531](https://github.com/BerriAI/litellm/pull/29531) - fix(proxy/hooks): populate llm\_provider on internal rate-limit errors by [@​mateo-berri](https://github.com/mateo-berri) in [#​27707](https://github.com/BerriAI/litellm/pull/27707) - fix(vertex/anthropic): handle namespace tools and strip client\_metadata for codex compatibility by [@​Sameerlite](https://github.com/Sameerlite) in [#​29489](https://github.com/BerriAI/litellm/pull/29489) - Support OAuth M2M for Databricks Apps A2A agents by [@​mateo-berri](https://github.com/mateo-berri) in [#​29586](https://github.com/BerriAI/litellm/pull/29586) - fix: small CLAUDE.md nit by [@​mateo-berri](https://github.com/mateo-berri) in [#​29749](https://github.com/BerriAI/litellm/pull/29749) - fix(anthropic): route Claude Opus 4.8 through adaptive thinking by [@​mateo-berri](https://github.com/mateo-berri) in [#​29702](https://github.com/BerriAI/litellm/pull/29702) - fix(proxy): persist oauth2\_flow on MCP server registration by [@​michelligabriele](https://github.com/michelligabriele) in [#​29690](https://github.com/BerriAI/litellm/pull/29690) - \[internal copy of [#​27491](https://github.com/BerriAI/litellm/issues/27491)] fix(realtime): Fix Realtime Audio Token Cost Tracking by [@​mateo-berri](https://github.com/mateo-berri) in [#​29722](https://github.com/BerriAI/litellm/pull/29722) - fix(galileo): use ingest traces API and standard logging payload by [@​Sameerlite](https://github.com/Sameerlite) in [#​29651](https://github.com/BerriAI/litellm/pull/29651) - fix(auth): expand all-team-models sentinel in can\_key\_call\_model for batch validation by [@​Sameerlite](https://github.com/Sameerlite) in [#​29746](https://github.com/BerriAI/litellm/pull/29746) - test(vcr): stop refreshing cassette TTL on read so cassettes lapse after 24h by [@​mateo-berri](https://github.com/mateo-berri) in [#​29784](https://github.com/BerriAI/litellm/pull/29784) - test(ci): record/replay OpenAI image gen so the spend E2E isn't outage-bound by [@​mateo-berri](https://github.com/mateo-berri) in [#​29787](https://github.com/BerriAI/litellm/pull/29787) - fix(ui): route MCP playground auth by oauth2 mode instead of token\_url by [@​tin-berri](https://github.com/tin-berri) in [#​29714](https://github.com/BerriAI/litellm/pull/29714) - refactor(ui): centralize proxy base URL resolution into tested resolver by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29793](https://github.com/BerriAI/litellm/pull/29793) - Litellm oss staging 050626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29774](https://github.com/BerriAI/litellm/pull/29774) - test(google): add google-genai SDK proxy integration tests by [@​Sameerlite](https://github.com/Sameerlite) in [#​29781](https://github.com/BerriAI/litellm/pull/29781) - fix(jwt): use resolved DB user\_id for spend on legacy email match by [@​milan-berri](https://github.com/milan-berri) in [#​29217](https://github.com/BerriAI/litellm/pull/29217) - feat(ui): generate dashboard API types from the proxy OpenAPI spec by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29816](https://github.com/BerriAI/litellm/pull/29816) - fix(proxy): drop deleted team BYOK model name from team.models by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29820](https://github.com/BerriAI/litellm/pull/29820) - feat(mcp): per-server env vars with global + per-user scopes by [@​mateo-berri](https://github.com/mateo-berri) in [#​28917](https://github.com/BerriAI/litellm/pull/28917) - refactor(ui): route behavior-preserving networking calls through apiClient by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29806](https://github.com/BerriAI/litellm/pull/29806) - fix(mcp): persist Tools-tab MCP OAuth token to DB by [@​tin-berri](https://github.com/tin-berri) in [#​29809](https://github.com/BerriAI/litellm/pull/29809) - fix(ui): require new expiration when regenerating an expired key by [@​milan-berri](https://github.com/milan-berri) in [#​29838](https://github.com/BerriAI/litellm/pull/29838) - refactor(ui): route query-building networking calls through apiClient by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29815](https://github.com/BerriAI/litellm/pull/29815) - Make the image-gen record/replay proxy report cache mode and per-request HIT/MISS by [@​mateo-berri](https://github.com/mateo-berri) in [#​29802](https://github.com/BerriAI/litellm/pull/29802) - feat(proxy): hot-reload .env in dev when running with --reload by [@​mateo-berri](https://github.com/mateo-berri) in [#​29783](https://github.com/BerriAI/litellm/pull/29783) - fix(ui): stop MCP playground tool calls from sending twice by [@​tin-berri](https://github.com/tin-berri) in [#​29821](https://github.com/BerriAI/litellm/pull/29821) - feat(fal\_ai): add Nano Banana / Gemini 2.5 Flash Image generation support by [@​mateo-berri](https://github.com/mateo-berri) in [#​29798](https://github.com/BerriAI/litellm/pull/29798) - Title: Fix managed batch cancel credential resolution by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29734](https://github.com/BerriAI/litellm/pull/29734) - Title: fix(proxy): resolve vector store file list credentials from team deployments by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29739](https://github.com/BerriAI/litellm/pull/29739) - refactor: convert AWS and GCP Terraform stacks into reusable modules … by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28103](https://github.com/BerriAI/litellm/pull/28103) - chore(ui): build ui for release by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29853](https://github.com/BerriAI/litellm/pull/29853) - fix(terraform/gcp): prompt for image\_registry in DeployStack one-click by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29852](https://github.com/BerriAI/litellm/pull/29852) - fix(terraform/gcp): abandon SQL user on destroy by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29855](https://github.com/BerriAI/litellm/pull/29855) - Extend the record/replay proxy to chat, embeddings, moderations, rerank, and Anthropic by [@​mateo-berri](https://github.com/mateo-berri) in [#​29847](https://github.com/BerriAI/litellm/pull/29847) - chore(deps): bump deps by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29860](https://github.com/BerriAI/litellm/pull/29860) - chore(ci): promote internal staging to main by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29861](https://github.com/BerriAI/litellm/pull/29861) - fix: 400 on Anthropic context overflow; seed identity on failed auth by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29848](https://github.com/BerriAI/litellm/pull/29848) - chore(ci): promote internal staging to main by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29862](https://github.com/BerriAI/litellm/pull/29862) - chore(release): patch v1.89.0-rc.1 with [#​30064](https://github.com/BerriAI/litellm/issues/30064) (Claude Fable 5) for v1.89.0-rc.2 by [@​mateo-berri](https://github.com/mateo-berri) in [#​30143](https://github.com/BerriAI/litellm/pull/30143) **Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.0...v1.89.0> ### [`v1.89.0`](https://github.com/BerriAI/litellm/releases/tag/v1.89.0) [Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.2...v1.89.0) ##### Verify Docker Image Signature All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0). **Verify using the pinned commit hash (recommended):** A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key: ```bash cosign verify \ --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \ ghcr.io/berriai/litellm:v1.89.0 ``` **Verify using the release tag (convenience):** Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules: ```bash cosign verify \ --key https://raw.githubusercontent.com/BerriAI/litellm/v1.89.0/cosign.pub \ ghcr.io/berriai/litellm:v1.89.0 ``` Expected output: ``` The following checks were performed on each of these signatures: - The cosign claims were validated - The signatures were verified against the specified public key ``` *** ##### What's Changed - test(responses): bump deprecated gemini-3-pro-preview to gemini-3.1-pro-preview by [@​mateo-berri](https://github.com/mateo-berri) in [#​29433](https://github.com/BerriAI/litellm/pull/29433) - fix: map mistral/ministral-8b-latest in model price map by [@​mateo-berri](https://github.com/mateo-berri) in [#​29453](https://github.com/BerriAI/litellm/pull/29453) - fix(datadog): split oversized batches on 413 instead of re-queueing forever by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29444](https://github.com/BerriAI/litellm/pull/29444) - feat(otel): allowlist team\_metadata sub-keys promoted to baggage by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29442](https://github.com/BerriAI/litellm/pull/29442) - fix: stop use\_chat\_completions\_api flag from leaking into provider request body by [@​mateo-berri](https://github.com/mateo-berri) in [#​29447](https://github.com/BerriAI/litellm/pull/29447) - fix(anthropic, fireworks): inline legacy $ref defs in tool schemas by [@​milan-berri](https://github.com/milan-berri) in [#​28646](https://github.com/BerriAI/litellm/pull/28646) - fix(proxy): omit OpenAI \[DONE] on google-genai streamGenerateContent by [@​Sameerlite](https://github.com/Sameerlite) in [#​29426](https://github.com/BerriAI/litellm/pull/29426) - ci(release): create stable/X.Y.x line branch on X.Y.0 tags by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29457](https://github.com/BerriAI/litellm/pull/29457) - fix(vector-stores): support engines URL for Vertex AI Search by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​27885](https://github.com/BerriAI/litellm/pull/27885) - fix(ui): render caller-supplied filter options in caller order by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29462](https://github.com/BerriAI/litellm/pull/29462) - fix(batches): skip unnecessary batch input file reads by [@​Sameerlite](https://github.com/Sameerlite) in [#​29114](https://github.com/BerriAI/litellm/pull/29114) - docs(agents): clarify when to create new test files by [@​Sameerlite](https://github.com/Sameerlite) in [#​29472](https://github.com/BerriAI/litellm/pull/29472) - Litellm OSS Staging by [@​Sameerlite](https://github.com/Sameerlite) in [#​29161](https://github.com/BerriAI/litellm/pull/29161) - fix(mcp): clear allowed\_tools and tool overrides on MCP server edit by [@​Sameerlite](https://github.com/Sameerlite) in [#​29411](https://github.com/BerriAI/litellm/pull/29411) - Litellm OSS Staging 010626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29422](https://github.com/BerriAI/litellm/pull/29422) - fix(ci): make CircleCI rerun-failed-tests collect tests when 2+ test files fail by [@​mateo-berri](https://github.com/mateo-berri) in [#​29475](https://github.com/BerriAI/litellm/pull/29475) - feat(a2a): watsonx Orchestrate agent provider by [@​Sameerlite](https://github.com/Sameerlite) in [#​29410](https://github.com/BerriAI/litellm/pull/29410) - fix(azure\_ai): strip tool-level extra fields on 400 and retry by [@​Sameerlite](https://github.com/Sameerlite) in [#​29479](https://github.com/BerriAI/litellm/pull/29479) - fix(docs): remove fixed dimensions from README hero image by [@​mateo-berri](https://github.com/mateo-berri) in [#​29496](https://github.com/BerriAI/litellm/pull/29496) - Litellm oss staging by [@​Sameerlite](https://github.com/Sameerlite) in [#​29492](https://github.com/BerriAI/litellm/pull/29492) - fix: small CLAUDE.md nits by [@​mateo-berri](https://github.com/mateo-berri) in [#​29504](https://github.com/BerriAI/litellm/pull/29504) - Add MCP semantic conventions to otelv2 by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29468](https://github.com/BerriAI/litellm/pull/29468) - fix(passthrough): emit otel guardrail span when a guardrail blocks by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29470](https://github.com/BerriAI/litellm/pull/29470) - fix(proxy): strip NUL bytes from spend log payloads to prevent PostgreSQL 22P05 by [@​milan-berri](https://github.com/milan-berri) in [#​29515](https://github.com/BerriAI/litellm/pull/29515) - \[internal copy of [#​28008](https://github.com/BerriAI/litellm/issues/28008)] Support MCP OAuth passthrough and issuer-scoped JWT auth by [@​mateo-berri](https://github.com/mateo-berri) in [#​28356](https://github.com/BerriAI/litellm/pull/28356) - feat(vector-stores): forward per-request params to Vertex AI Search by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29459](https://github.com/BerriAI/litellm/pull/29459) - feat(proxy): add per-MCP-server RPM rate limiting for keys and teams by [@​Sameerlite](https://github.com/Sameerlite) in [#​29482](https://github.com/BerriAI/litellm/pull/29482) - fix(tests): drop module-level test calls that break local\_testing collection by [@​mateo-berri](https://github.com/mateo-berri) in [#​29520](https://github.com/BerriAI/litellm/pull/29520) - feat(agents): add LangFlow agent provider with A2A session bridging by [@​Sameerlite](https://github.com/Sameerlite) in [#​28963](https://github.com/BerriAI/litellm/pull/28963) - fix(ui/agents): make A2A skill tags enterable and validated by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29512](https://github.com/BerriAI/litellm/pull/29512) - \[internal copy of [#​29232](https://github.com/BerriAI/litellm/issues/29232)] feat: route future Claude models to Anthropic provider via pattern matching by [@​mateo-berri](https://github.com/mateo-berri) in [#​29239](https://github.com/BerriAI/litellm/pull/29239) - fix(tests): drop import-time completion call in test\_register\_model by [@​mateo-berri](https://github.com/mateo-berri) in [#​29521](https://github.com/BerriAI/litellm/pull/29521) - test: stabilize batch VCR coverage and stop live upload/network leaks by [@​mateo-berri](https://github.com/mateo-berri) in [#​29477](https://github.com/BerriAI/litellm/pull/29477) - \[internal copy of [#​29003](https://github.com/BerriAI/litellm/issues/29003)] fix(vertex\_ai): use user-supplied api\_base as is for Model Garden OpenAI-compat path by [@​mateo-berri](https://github.com/mateo-berri) in [#​29530](https://github.com/BerriAI/litellm/pull/29530) - feat(proxy): native /health/drain preStop hook for graceful shutdown by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29439](https://github.com/BerriAI/litellm/pull/29439) - fix(auth): preserve 401 status for expired JWTs in OTel traces by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29510](https://github.com/BerriAI/litellm/pull/29510) - fix(otel): capture 401 error details in management endpoint spans by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29535](https://github.com/BerriAI/litellm/pull/29535) - test(proxy/utils): pin bottom-of-file helper behavior by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29509](https://github.com/BerriAI/litellm/pull/29509) - test(proxy/utils): pin PrismaClient and spend-update behavior by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29488](https://github.com/BerriAI/litellm/pull/29488) - test(proxy/utils): pin ProxyLogging behavior by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29485](https://github.com/BerriAI/litellm/pull/29485) - fix: missing span for guardrail passthrough by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29552](https://github.com/BerriAI/litellm/pull/29552) - fix(auth): let internal users view search tools by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29542](https://github.com/BerriAI/litellm/pull/29542) - fix: missing mcp otel attributes by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29554](https://github.com/BerriAI/litellm/pull/29554) - fix(proxy): resolve managed video model ids for auth by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29545](https://github.com/BerriAI/litellm/pull/29545) - fix(key\_generate): allow team members to create keys on org-scoped teams by [@​milan-berri](https://github.com/milan-berri) in [#​29310](https://github.com/BerriAI/litellm/pull/29310) - test(pass-through): move Gemini pass-through tests to gemini-3.1-flash-lite by [@​mateo-berri](https://github.com/mateo-berri) in [#​29595](https://github.com/BerriAI/litellm/pull/29595) - Litellm oss staging 030626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29578](https://github.com/BerriAI/litellm/pull/29578) - Fix : a2a bugs 030626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29566](https://github.com/BerriAI/litellm/pull/29566) - \[internal copy of [#​29533](https://github.com/BerriAI/litellm/issues/29533)] fix(anthropic/adapter): emit thinking block for reasoning\_content-only streaming chunks by [@​mateo-berri](https://github.com/mateo-berri) in [#​29600](https://github.com/BerriAI/litellm/pull/29600) - ci: reproduce default-Windows wheel install to guard MAX\_PATH by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29597](https://github.com/BerriAI/litellm/pull/29597) - fix(vertex): strip output\_config.effort for Vertex Claude models that reject it (Haiku 4.5) by [@​mateo-berri](https://github.com/mateo-berri) in [#​29585](https://github.com/BerriAI/litellm/pull/29585) - Litellm websocket improvements by [@​Sameerlite](https://github.com/Sameerlite) in [#​29563](https://github.com/BerriAI/litellm/pull/29563) - feat(arize/phoenix): OpenInference rendering parity — tool\_calls, cost, passthrough I/O, session/user, multimodal, cache tokens by [@​milan-berri](https://github.com/milan-berri) in [#​28800](https://github.com/BerriAI/litellm/pull/28800) - \[internal copy of [#​29550](https://github.com/BerriAI/litellm/issues/29550)] fix: passthrough endpoints duplicate logs by [@​mateo-berri](https://github.com/mateo-berri) in [#​29598](https://github.com/BerriAI/litellm/pull/29598) - fix(ci): keep coverage rename green when a parallel node runs no tests by [@​mateo-berri](https://github.com/mateo-berri) in [#​29608](https://github.com/BerriAI/litellm/pull/29608) - test(vcr): close out the remaining VCR live-call leaks by [@​mateo-berri](https://github.com/mateo-berri) in [#​29603](https://github.com/BerriAI/litellm/pull/29603) - fix(key\_generate): exempt UI/CLI session tokens from the budget ceiling for team keys by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29612](https://github.com/BerriAI/litellm/pull/29612) - fix(realtime): allow null transcripts in stream logging payloads by [@​milan-berri](https://github.com/milan-berri) in [#​29625](https://github.com/BerriAI/litellm/pull/29625) - build(ui): migrate eslint to flat config + bump eslint-config-next to 16 by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29626](https://github.com/BerriAI/litellm/pull/29626) - fix(key\_generate): scope session-token team-key budget exemption to caller-supplied team\_id by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29641](https://github.com/BerriAI/litellm/pull/29641) - fix(proxy): disable proxy buffering on streaming SSE responses by [@​mateo-berri](https://github.com/mateo-berri) in [#​29557](https://github.com/BerriAI/litellm/pull/29557) - fix(mcp): gate /public/mcp\_hub strictly on litellm.public\_mcp\_servers by [@​michelligabriele](https://github.com/michelligabriele) in [#​27764](https://github.com/BerriAI/litellm/pull/27764) - ci(ui): frontend-lint job enforcing prettier + eslint on changed files by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29633](https://github.com/BerriAI/litellm/pull/29633) - fix(gemini): googleSearch + server-side tools and googleMaps JSON schema by [@​Sameerlite](https://github.com/Sameerlite) in [#​29582](https://github.com/BerriAI/litellm/pull/29582) - fix(proxy): passthrough 404 when SERVER\_ROOT\_PATH is set by [@​Sameerlite](https://github.com/Sameerlite) in [#​29658](https://github.com/BerriAI/litellm/pull/29658) - fix(gemini-realtime): use GA event names for Pipecat 1.3.x compatibility by [@​Sameerlite](https://github.com/Sameerlite) in [#​29662](https://github.com/BerriAI/litellm/pull/29662) - Litellm oss staging 040626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29671](https://github.com/BerriAI/litellm/pull/29671) - style(ui): prettier formatting pass over the dashboard by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29622](https://github.com/BerriAI/litellm/pull/29622) - chore: ignore prettier dashboard reformat in git blame by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29695](https://github.com/BerriAI/litellm/pull/29695) - fix(helm): Enable Backend Deployment to mount Gateway config.yaml by [@​tin-berri](https://github.com/tin-berri) in [#​29605](https://github.com/BerriAI/litellm/pull/29605) - \[internal copy of [#​29277](https://github.com/BerriAI/litellm/issues/29277)] fix(proxy): add default=None to LiteLLM\_TeamMembership.litellm\_budget\_table by [@​mateo-berri](https://github.com/mateo-berri) in [#​29684](https://github.com/BerriAI/litellm/pull/29684) - test: make custom\_tokenizer proxy tests hermetic by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29643](https://github.com/BerriAI/litellm/pull/29643) - test(proxy): stop running real-DB tests in GitHub Actions unit jobs by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29700](https://github.com/BerriAI/litellm/pull/29700) - chore(ui): remove the bare-fetch lint rule by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29712](https://github.com/BerriAI/litellm/pull/29712) - Litellm jwt mapping virtualkeys by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​28510](https://github.com/BerriAI/litellm/pull/28510) - refactor(ui): shared HTTP client + location-pinned fetch() lint rule by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29723](https://github.com/BerriAI/litellm/pull/29723) - fix(proxy): stop team BYOK model name corruption on model edit by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29731](https://github.com/BerriAI/litellm/pull/29731) - \[internal copy of [#​29511](https://github.com/BerriAI/litellm/issues/29511)] feat(guardrails): add sensitive data routing to on-premise models by [@​mateo-berri](https://github.com/mateo-berri) in [#​29531](https://github.com/BerriAI/litellm/pull/29531) - fix(proxy/hooks): populate llm\_provider on internal rate-limit errors by [@​mateo-berri](https://github.com/mateo-berri) in [#​27707](https://github.com/BerriAI/litellm/pull/27707) - fix(vertex/anthropic): handle namespace tools and strip client\_metadata for codex compatibility by [@​Sameerlite](https://github.com/Sameerlite) in [#​29489](https://github.com/BerriAI/litellm/pull/29489) - Support OAuth M2M for Databricks Apps A2A agents by [@​mateo-berri](https://github.com/mateo-berri) in [#​29586](https://github.com/BerriAI/litellm/pull/29586) - fix: small CLAUDE.md nit by [@​mateo-berri](https://github.com/mateo-berri) in [#​29749](https://github.com/BerriAI/litellm/pull/29749) - fix(anthropic): route Claude Opus 4.8 through adaptive thinking by [@​mateo-berri](https://github.com/mateo-berri) in [#​29702](https://github.com/BerriAI/litellm/pull/29702) - fix(proxy): persist oauth2\_flow on MCP server registration by [@​michelligabriele](https://github.com/michelligabriele) in [#​29690](https://github.com/BerriAI/litellm/pull/29690) - \[internal copy of [#​27491](https://github.com/BerriAI/litellm/issues/27491)] fix(realtime): Fix Realtime Audio Token Cost Tracking by [@​mateo-berri](https://github.com/mateo-berri) in [#​29722](https://github.com/BerriAI/litellm/pull/29722) - fix(galileo): use ingest traces API and standard logging payload by [@​Sameerlite](https://github.com/Sameerlite) in [#​29651](https://github.com/BerriAI/litellm/pull/29651) - fix(auth): expand all-team-models sentinel in can\_key\_call\_model for batch validation by [@​Sameerlite](https://github.com/Sameerlite) in [#​29746](https://github.com/BerriAI/litellm/pull/29746) - test(vcr): stop refreshing cassette TTL on read so cassettes lapse after 24h by [@​mateo-berri](https://github.com/mateo-berri) in [#​29784](https://github.com/BerriAI/litellm/pull/29784) - test(ci): record/replay OpenAI image gen so the spend E2E isn't outage-bound by [@​mateo-berri](https://github.com/mateo-berri) in [#​29787](https://github.com/BerriAI/litellm/pull/29787) - fix(ui): route MCP playground auth by oauth2 mode instead of token\_url by [@​tin-berri](https://github.com/tin-berri) in [#​29714](https://github.com/BerriAI/litellm/pull/29714) - refactor(ui): centralize proxy base URL resolution into tested resolver by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29793](https://github.com/BerriAI/litellm/pull/29793) - Litellm oss staging 050626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29774](https://github.com/BerriAI/litellm/pull/29774) - test(google): add google-genai SDK proxy integration tests by [@​Sameerlite](https://github.com/Sameerlite) in [#​29781](https://github.com/BerriAI/litellm/pull/29781) - fix(jwt): use resolved DB user\_id for spend on legacy email match by [@​milan-berri](https://github.com/milan-berri) in [#​29217](https://github.com/BerriAI/litellm/pull/29217) - feat(ui): generate dashboard API types from the proxy OpenAPI spec by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29816](https://github.com/BerriAI/litellm/pull/29816) - fix(proxy): drop deleted team BYOK model name from team.models by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29820](https://github.com/BerriAI/litellm/pull/29820) - feat(mcp): per-server env vars with global + per-user scopes by [@​mateo-berri](https://github.com/mateo-berri) in [#​28917](https://github.com/BerriAI/litellm/pull/28917) - refactor(ui): route behavior-preserving networking calls through apiClient by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29806](https://github.com/BerriAI/litellm/pull/29806) - fix(mcp): persist Tools-tab MCP OAuth token to DB by [@​tin-berri](https://github.com/tin-berri) in [#​29809](https://github.com/BerriAI/litellm/pull/29809) - fix(ui): require new expiration when regenerating an expired key by [@​milan-berri](https://github.com/milan-berri) in [#​29838](https://github.com/BerriAI/litellm/pull/29838) - refactor(ui): route query-building networking calls through apiClient by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29815](https://github.com/BerriAI/litellm/pull/29815) - Make the image-gen record/replay proxy report cache mode and per-request HIT/MISS by [@​mateo-berri](https://github.com/mateo-berri) in [#​29802](https://github.com/BerriAI/litellm/pull/29802) - feat(proxy): hot-reload .env in dev when running with --reload by [@​mateo-berri](https://github.com/mateo-berri) in [#​29783](https://github.com/BerriAI/litellm/pull/29783) - fix(ui): stop MCP playground tool calls from sending twice by [@​tin-berri](https://github.com/tin-berri) in [#​29821](https://github.com/BerriAI/litellm/pull/29821) - feat(fal\_ai): add Nano Banana / Gemini 2.5 Flash Image generation support by [@​mateo-berri](https://github.com/mateo-berri) in [#​29798](https://github.com/BerriAI/litellm/pull/29798) - Title: Fix managed batch cancel credential resolution by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29734](https://github.com/BerriAI/litellm/pull/29734) - Title: fix(proxy): resolve vector store file list credentials from team deployments by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29739](https://github.com/BerriAI/litellm/pull/29739) - refactor: convert AWS and GCP Terraform stacks into reusable modules … by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28103](https://github.com/BerriAI/litellm/pull/28103) - chore(ui): build ui for release by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29853](https://github.com/BerriAI/litellm/pull/29853) - fix(terraform/gcp): prompt for image\_registry in DeployStack one-click by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29852](https://github.com/BerriAI/litellm/pull/29852) - fix(terraform/gcp): abandon SQL user on destroy by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29855](https://github.com/BerriAI/litellm/pull/29855) - Extend the record/replay proxy to chat, embeddings, moderations, rerank, and Anthropic by [@​mateo-berri](https://github.com/mateo-berri) in [#​29847](https://github.com/BerriAI/litellm/pull/29847) - chore(deps): bump deps by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29860](https://github.com/BerriAI/litellm/pull/29860) - chore(ci): promote internal staging to main by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29861](https://github.com/BerriAI/litellm/pull/29861) - fix: 400 on Anthropic context overflow; seed identity on failed auth by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29848](https://github.com/BerriAI/litellm/pull/29848) - chore(ci): promote internal staging to main by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29862](https://github.com/BerriAI/litellm/pull/29862) - chore(release): patch v1.89.0-rc.1 with [#​30064](https://github.com/BerriAI/litellm/issues/30064) (Claude Fable 5) for v1.89.0-rc.2 by [@​mateo-berri](https://github.com/mateo-berri) in [#​30143](https://github.com/BerriAI/litellm/pull/30143) **Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.0...v1.89.0> ### [`v1.88.2`](https://github.com/BerriAI/litellm/releases/tag/v1.88.2) [Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.2...v1.88.2) ##### Verify Docker Image Signature All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0). **Verify using the pinned commit hash (recommended):** A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key: ```bash cosign verify \ --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \ ghcr.io/berriai/litellm:v1.88.2 ``` **Verify using the release tag (convenience):** Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules: ```bash cosign verify \ --key https://raw.githubusercontent.com/BerriAI/litellm/v1.88.2/cosign.pub \ ghcr.io/berriai/litellm:v1.88.2 ``` Expected output: ``` The following checks were performed on each of these signatures: - The cosign claims were validated - The signatures were verified against the specified public key ``` *** ##### What's Changed - chore(release): backport Fable 5, batch-file auth, CrowdStrike AIDR, Mantle Responses SigV4, and NetApp streaming-cost fix to stable/1.88.x and cut 1.88.2 by [@​mateo-berri](https://github.com/mateo-berri) in [#​30144](https://github.com/BerriAI/litellm/pull/30144) - chore(release): backport DB-resilience, passthrough, model-info, budget, and deps fixes to stable/1.88.x by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​30408](https://github.com/BerriAI/litellm/pull/30408) **Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.1...v1.88.2> ### [`v1.88.2`](https://github.com/BerriAI/litellm/releases/tag/v1.88.2) [Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.1...v1.88.2) ##### Verify Docker Image Signature All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0). **Verify using the pinned commit hash (recommended):** A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key: ```bash cosign verify \ --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \ ghcr.io/berriai/litellm:v1.88.2 ``` **Verify using the release tag (convenience):** Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules: ```bash cosign verify \ --key https://raw.githubusercontent.com/BerriAI/litellm/v1.88.2/cosign.pub \ ghcr.io/berriai/litellm:v1.88.2 ``` Expected output: ``` The following checks were performed on each of these signatures: - The cosign claims were validated - The signatures were verified against the specified public key ``` *** ##### What's Changed - chore(release): backport Fable 5, batch-file auth, CrowdStrike AIDR, Mantle Responses SigV4, and NetApp streaming-cost fix to stable/1.88.x and cut 1.88.2 by [@​mateo-berri](https://github.com/mateo-berri) in [#​30144](https://github.com/BerriAI/litellm/pull/30144) - chore(release): backport DB-resilience, passthrough, model-info, budget, and deps fixes to stable/1.88.x by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​30408](https://github.com/BerriAI/litellm/pull/30408) **Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.1...v1.88.2> </details> --- ### Configuration 📅 **Schedule**: (in timezone Europe/London) - Branch creation - At any time (no schedule defined) - Automerge - At any time (no schedule defined) 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about these updates again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Mend Renovate](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4yMTkuMCIsInVwZGF0ZWRJblZlciI6IjQzLjIxOS4wIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJyZW5vdmF0ZS9jb250YWluZXIiLCJ0eXBlL21pbm9yIl19--> Reviewed-on: https://forgejo.hayden.moe/hayden/phoebe/pulls/93
…to v1.89.0 (#200)
This PR contains the following updates:
| Package | Update | Change |
|---|---|---|
| [https://github.com/BerriAI/litellm.git](https://github.com/BerriAI/litellm) | minor | `v1.85.1` → `v1.89.0` |
---
> ⚠️ **Warning**
>
> Some dependencies could not be looked up. Check the [Dependency Dashboard](issues/155) for more information.
---
### Release Notes
<details>
<summary>BerriAI/litellm (https://github.com/BerriAI/litellm.git)</summary>
### [`v1.89.0`](https://github.com/BerriAI/litellm/releases/tag/v1.89.0)
[Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.2...v1.89.0)
#### Verify Docker Image Signature
All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).
**Verify using the pinned commit hash (recommended):**
A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:
```bash
cosign verify \
--key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
ghcr.io/berriai/litellm:v1.89.0
```
**Verify using the release tag (convenience):**
Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:
```bash
cosign verify \
--key https://raw.githubusercontent.com/BerriAI/litellm/v1.89.0/cosign.pub \
ghcr.io/berriai/litellm:v1.89.0
```
Expected output:
```
The following checks were performed on each of these signatures:
- The cosign claims were validated
- The signatures were verified against the specified public key
```
***
#### What's Changed
- test(responses): bump deprecated gemini-3-pro-preview to gemini-3.1-pro-preview by [@​mateo-berri](https://github.com/mateo-berri) in [#​29433](https://github.com/BerriAI/litellm/pull/29433)
- fix: map mistral/ministral-8b-latest in model price map by [@​mateo-berri](https://github.com/mateo-berri) in [#​29453](https://github.com/BerriAI/litellm/pull/29453)
- fix(datadog): split oversized batches on 413 instead of re-queueing forever by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29444](https://github.com/BerriAI/litellm/pull/29444)
- feat(otel): allowlist team\_metadata sub-keys promoted to baggage by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29442](https://github.com/BerriAI/litellm/pull/29442)
- fix: stop use\_chat\_completions\_api flag from leaking into provider request body by [@​mateo-berri](https://github.com/mateo-berri) in [#​29447](https://github.com/BerriAI/litellm/pull/29447)
- fix(anthropic, fireworks): inline legacy $ref defs in tool schemas by [@​milan-berri](https://github.com/milan-berri) in [#​28646](https://github.com/BerriAI/litellm/pull/28646)
- fix(proxy): omit OpenAI \[DONE] on google-genai streamGenerateContent by [@​Sameerlite](https://github.com/Sameerlite) in [#​29426](https://github.com/BerriAI/litellm/pull/29426)
- ci(release): create stable/X.Y.x line branch on X.Y.0 tags by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29457](https://github.com/BerriAI/litellm/pull/29457)
- fix(vector-stores): support engines URL for Vertex AI Search by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​27885](https://github.com/BerriAI/litellm/pull/27885)
- fix(ui): render caller-supplied filter options in caller order by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29462](https://github.com/BerriAI/litellm/pull/29462)
- fix(batches): skip unnecessary batch input file reads by [@​Sameerlite](https://github.com/Sameerlite) in [#​29114](https://github.com/BerriAI/litellm/pull/29114)
- docs(agents): clarify when to create new test files by [@​Sameerlite](https://github.com/Sameerlite) in [#​29472](https://github.com/BerriAI/litellm/pull/29472)
- Litellm OSS Staging by [@​Sameerlite](https://github.com/Sameerlite) in [#​29161](https://github.com/BerriAI/litellm/pull/29161)
- fix(mcp): clear allowed\_tools and tool overrides on MCP server edit by [@​Sameerlite](https://github.com/Sameerlite) in [#​29411](https://github.com/BerriAI/litellm/pull/29411)
- Litellm OSS Staging 010626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29422](https://github.com/BerriAI/litellm/pull/29422)
- fix(ci): make CircleCI rerun-failed-tests collect tests when 2+ test files fail by [@​mateo-berri](https://github.com/mateo-berri) in [#​29475](https://github.com/BerriAI/litellm/pull/29475)
- feat(a2a): watsonx Orchestrate agent provider by [@​Sameerlite](https://github.com/Sameerlite) in [#​29410](https://github.com/BerriAI/litellm/pull/29410)
- fix(azure\_ai): strip tool-level extra fields on 400 and retry by [@​Sameerlite](https://github.com/Sameerlite) in [#​29479](https://github.com/BerriAI/litellm/pull/29479)
- fix(docs): remove fixed dimensions from README hero image by [@​mateo-berri](https://github.com/mateo-berri) in [#​29496](https://github.com/BerriAI/litellm/pull/29496)
- Litellm oss staging by [@​Sameerlite](https://github.com/Sameerlite) in [#​29492](https://github.com/BerriAI/litellm/pull/29492)
- fix: small CLAUDE.md nits by [@​mateo-berri](https://github.com/mateo-berri) in [#​29504](https://github.com/BerriAI/litellm/pull/29504)
- Add MCP semantic conventions to otelv2 by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29468](https://github.com/BerriAI/litellm/pull/29468)
- fix(passthrough): emit otel guardrail span when a guardrail blocks by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29470](https://github.com/BerriAI/litellm/pull/29470)
- fix(proxy): strip NUL bytes from spend log payloads to prevent PostgreSQL 22P05 by [@​milan-berri](https://github.com/milan-berri) in [#​29515](https://github.com/BerriAI/litellm/pull/29515)
- \[internal copy of [#​28008](https://github.com/BerriAI/litellm/issues/28008)] Support MCP OAuth passthrough and issuer-scoped JWT auth by [@​mateo-berri](https://github.com/mateo-berri) in [#​28356](https://github.com/BerriAI/litellm/pull/28356)
- feat(vector-stores): forward per-request params to Vertex AI Search by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29459](https://github.com/BerriAI/litellm/pull/29459)
- feat(proxy): add per-MCP-server RPM rate limiting for keys and teams by [@​Sameerlite](https://github.com/Sameerlite) in [#​29482](https://github.com/BerriAI/litellm/pull/29482)
- fix(tests): drop module-level test calls that break local\_testing collection by [@​mateo-berri](https://github.com/mateo-berri) in [#​29520](https://github.com/BerriAI/litellm/pull/29520)
- feat(agents): add LangFlow agent provider with A2A session bridging by [@​Sameerlite](https://github.com/Sameerlite) in [#​28963](https://github.com/BerriAI/litellm/pull/28963)
- fix(ui/agents): make A2A skill tags enterable and validated by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29512](https://github.com/BerriAI/litellm/pull/29512)
- \[internal copy of [#​29232](https://github.com/BerriAI/litellm/issues/29232)] feat: route future Claude models to Anthropic provider via pattern matching by [@​mateo-berri](https://github.com/mateo-berri) in [#​29239](https://github.com/BerriAI/litellm/pull/29239)
- fix(tests): drop import-time completion call in test\_register\_model by [@​mateo-berri](https://github.com/mateo-berri) in [#​29521](https://github.com/BerriAI/litellm/pull/29521)
- test: stabilize batch VCR coverage and stop live upload/network leaks by [@​mateo-berri](https://github.com/mateo-berri) in [#​29477](https://github.com/BerriAI/litellm/pull/29477)
- \[internal copy of [#​29003](https://github.com/BerriAI/litellm/issues/29003)] fix(vertex\_ai): use user-supplied api\_base as is for Model Garden OpenAI-compat path by [@​mateo-berri](https://github.com/mateo-berri) in [#​29530](https://github.com/BerriAI/litellm/pull/29530)
- feat(proxy): native /health/drain preStop hook for graceful shutdown by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29439](https://github.com/BerriAI/litellm/pull/29439)
- fix(auth): preserve 401 status for expired JWTs in OTel traces by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29510](https://github.com/BerriAI/litellm/pull/29510)
- fix(otel): capture 401 error details in management endpoint spans by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29535](https://github.com/BerriAI/litellm/pull/29535)
- test(proxy/utils): pin bottom-of-file helper behavior by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29509](https://github.com/BerriAI/litellm/pull/29509)
- test(proxy/utils): pin PrismaClient and spend-update behavior by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29488](https://github.com/BerriAI/litellm/pull/29488)
- test(proxy/utils): pin ProxyLogging behavior by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29485](https://github.com/BerriAI/litellm/pull/29485)
- fix: missing span for guardrail passthrough by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29552](https://github.com/BerriAI/litellm/pull/29552)
- fix(auth): let internal users view search tools by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29542](https://github.com/BerriAI/litellm/pull/29542)
- fix: missing mcp otel attributes by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29554](https://github.com/BerriAI/litellm/pull/29554)
- fix(proxy): resolve managed video model ids for auth by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29545](https://github.com/BerriAI/litellm/pull/29545)
- fix(key\_generate): allow team members to create keys on org-scoped teams by [@​milan-berri](https://github.com/milan-berri) in [#​29310](https://github.com/BerriAI/litellm/pull/29310)
- test(pass-through): move Gemini pass-through tests to gemini-3.1-flash-lite by [@​mateo-berri](https://github.com/mateo-berri) in [#​29595](https://github.com/BerriAI/litellm/pull/29595)
- Litellm oss staging 030626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29578](https://github.com/BerriAI/litellm/pull/29578)
- Fix : a2a bugs 030626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29566](https://github.com/BerriAI/litellm/pull/29566)
- \[internal copy of [#​29533](https://github.com/BerriAI/litellm/issues/29533)] fix(anthropic/adapter): emit thinking block for reasoning\_content-only streaming chunks by [@​mateo-berri](https://github.com/mateo-berri) in [#​29600](https://github.com/BerriAI/litellm/pull/29600)
- ci: reproduce default-Windows wheel install to guard MAX\_PATH by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29597](https://github.com/BerriAI/litellm/pull/29597)
- fix(vertex): strip output\_config.effort for Vertex Claude models that reject it (Haiku 4.5) by [@​mateo-berri](https://github.com/mateo-berri) in [#​29585](https://github.com/BerriAI/litellm/pull/29585)
- Litellm websocket improvements by [@​Sameerlite](https://github.com/Sameerlite) in [#​29563](https://github.com/BerriAI/litellm/pull/29563)
- feat(arize/phoenix): OpenInference rendering parity — tool\_calls, cost, passthrough I/O, session/user, multimodal, cache tokens by [@​milan-berri](https://github.com/milan-berri) in [#​28800](https://github.com/BerriAI/litellm/pull/28800)
- \[internal copy of [#​29550](https://github.com/BerriAI/litellm/issues/29550)] fix: passthrough endpoints duplicate logs by [@​mateo-berri](https://github.com/mateo-berri) in [#​29598](https://github.com/BerriAI/litellm/pull/29598)
- fix(ci): keep coverage rename green when a parallel node runs no tests by [@​mateo-berri](https://github.com/mateo-berri) in [#​29608](https://github.com/BerriAI/litellm/pull/29608)
- test(vcr): close out the remaining VCR live-call leaks by [@​mateo-berri](https://github.com/mateo-berri) in [#​29603](https://github.com/BerriAI/litellm/pull/29603)
- fix(key\_generate): exempt UI/CLI session tokens from the budget ceiling for team keys by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29612](https://github.com/BerriAI/litellm/pull/29612)
- fix(realtime): allow null transcripts in stream logging payloads by [@​milan-berri](https://github.com/milan-berri) in [#​29625](https://github.com/BerriAI/litellm/pull/29625)
- build(ui): migrate eslint to flat config + bump eslint-config-next to 16 by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29626](https://github.com/BerriAI/litellm/pull/29626)
- fix(key\_generate): scope session-token team-key budget exemption to caller-supplied team\_id by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29641](https://github.com/BerriAI/litellm/pull/29641)
- fix(proxy): disable proxy buffering on streaming SSE responses by [@​mateo-berri](https://github.com/mateo-berri) in [#​29557](https://github.com/BerriAI/litellm/pull/29557)
- fix(mcp): gate /public/mcp\_hub strictly on litellm.public\_mcp\_servers by [@​michelligabriele](https://github.com/michelligabriele) in [#​27764](https://github.com/BerriAI/litellm/pull/27764)
- ci(ui): frontend-lint job enforcing prettier + eslint on changed files by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29633](https://github.com/BerriAI/litellm/pull/29633)
- fix(gemini): googleSearch + server-side tools and googleMaps JSON schema by [@​Sameerlite](https://github.com/Sameerlite) in [#​29582](https://github.com/BerriAI/litellm/pull/29582)
- fix(proxy): passthrough 404 when SERVER\_ROOT\_PATH is set by [@​Sameerlite](https://github.com/Sameerlite) in [#​29658](https://github.com/BerriAI/litellm/pull/29658)
- fix(gemini-realtime): use GA event names for Pipecat 1.3.x compatibility by [@​Sameerlite](https://github.com/Sameerlite) in [#​29662](https://github.com/BerriAI/litellm/pull/29662)
- Litellm oss staging 040626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29671](https://github.com/BerriAI/litellm/pull/29671)
- style(ui): prettier formatting pass over the dashboard by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29622](https://github.com/BerriAI/litellm/pull/29622)
- chore: ignore prettier dashboard reformat in git blame by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29695](https://github.com/BerriAI/litellm/pull/29695)
- fix(helm): Enable Backend Deployment to mount Gateway config.yaml by [@​tin-berri](https://github.com/tin-berri) in [#​29605](https://github.com/BerriAI/litellm/pull/29605)
- \[internal copy of [#​29277](https://github.com/BerriAI/litellm/issues/29277)] fix(proxy): add default=None to LiteLLM\_TeamMembership.litellm\_budget\_table by [@​mateo-berri](https://github.com/mateo-berri) in [#​29684](https://github.com/BerriAI/litellm/pull/29684)
- test: make custom\_tokenizer proxy tests hermetic by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29643](https://github.com/BerriAI/litellm/pull/29643)
- test(proxy): stop running real-DB tests in GitHub Actions unit jobs by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29700](https://github.com/BerriAI/litellm/pull/29700)
- chore(ui): remove the bare-fetch lint rule by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29712](https://github.com/BerriAI/litellm/pull/29712)
- Litellm jwt mapping virtualkeys by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​28510](https://github.com/BerriAI/litellm/pull/28510)
- refactor(ui): shared HTTP client + location-pinned fetch() lint rule by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29723](https://github.com/BerriAI/litellm/pull/29723)
- fix(proxy): stop team BYOK model name corruption on model edit by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29731](https://github.com/BerriAI/litellm/pull/29731)
- \[internal copy of [#​29511](https://github.com/BerriAI/litellm/issues/29511)] feat(guardrails): add sensitive data routing to on-premise models by [@​mateo-berri](https://github.com/mateo-berri) in [#​29531](https://github.com/BerriAI/litellm/pull/29531)
- fix(proxy/hooks): populate llm\_provider on internal rate-limit errors by [@​mateo-berri](https://github.com/mateo-berri) in [#​27707](https://github.com/BerriAI/litellm/pull/27707)
- fix(vertex/anthropic): handle namespace tools and strip client\_metadata for codex compatibility by [@​Sameerlite](https://github.com/Sameerlite) in [#​29489](https://github.com/BerriAI/litellm/pull/29489)
- Support OAuth M2M for Databricks Apps A2A agents by [@​mateo-berri](https://github.com/mateo-berri) in [#​29586](https://github.com/BerriAI/litellm/pull/29586)
- fix: small CLAUDE.md nit by [@​mateo-berri](https://github.com/mateo-berri) in [#​29749](https://github.com/BerriAI/litellm/pull/29749)
- fix(anthropic): route Claude Opus 4.8 through adaptive thinking by [@​mateo-berri](https://github.com/mateo-berri) in [#​29702](https://github.com/BerriAI/litellm/pull/29702)
- fix(proxy): persist oauth2\_flow on MCP server registration by [@​michelligabriele](https://github.com/michelligabriele) in [#​29690](https://github.com/BerriAI/litellm/pull/29690)
- \[internal copy of [#​27491](https://github.com/BerriAI/litellm/issues/27491)] fix(realtime): Fix Realtime Audio Token Cost Tracking by [@​mateo-berri](https://github.com/mateo-berri) in [#​29722](https://github.com/BerriAI/litellm/pull/29722)
- fix(galileo): use ingest traces API and standard logging payload by [@​Sameerlite](https://github.com/Sameerlite) in [#​29651](https://github.com/BerriAI/litellm/pull/29651)
- fix(auth): expand all-team-models sentinel in can\_key\_call\_model for batch validation by [@​Sameerlite](https://github.com/Sameerlite) in [#​29746](https://github.com/BerriAI/litellm/pull/29746)
- test(vcr): stop refreshing cassette TTL on read so cassettes lapse after 24h by [@​mateo-berri](https://github.com/mateo-berri) in [#​29784](https://github.com/BerriAI/litellm/pull/29784)
- test(ci): record/replay OpenAI image gen so the spend E2E isn't outage-bound by [@​mateo-berri](https://github.com/mateo-berri) in [#​29787](https://github.com/BerriAI/litellm/pull/29787)
- fix(ui): route MCP playground auth by oauth2 mode instead of token\_url by [@​tin-berri](https://github.com/tin-berri) in [#​29714](https://github.com/BerriAI/litellm/pull/29714)
- refactor(ui): centralize proxy base URL resolution into tested resolver by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29793](https://github.com/BerriAI/litellm/pull/29793)
- Litellm oss staging 050626 by [@​Sameerlite](https://github.com/Sameerlite) in [#​29774](https://github.com/BerriAI/litellm/pull/29774)
- test(google): add google-genai SDK proxy integration tests by [@​Sameerlite](https://github.com/Sameerlite) in [#​29781](https://github.com/BerriAI/litellm/pull/29781)
- fix(jwt): use resolved DB user\_id for spend on legacy email match by [@​milan-berri](https://github.com/milan-berri) in [#​29217](https://github.com/BerriAI/litellm/pull/29217)
- feat(ui): generate dashboard API types from the proxy OpenAPI spec by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29816](https://github.com/BerriAI/litellm/pull/29816)
- fix(proxy): drop deleted team BYOK model name from team.models by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29820](https://github.com/BerriAI/litellm/pull/29820)
- feat(mcp): per-server env vars with global + per-user scopes by [@​mateo-berri](https://github.com/mateo-berri) in [#​28917](https://github.com/BerriAI/litellm/pull/28917)
- refactor(ui): route behavior-preserving networking calls through apiClient by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29806](https://github.com/BerriAI/litellm/pull/29806)
- fix(mcp): persist Tools-tab MCP OAuth token to DB by [@​tin-berri](https://github.com/tin-berri) in [#​29809](https://github.com/BerriAI/litellm/pull/29809)
- fix(ui): require new expiration when regenerating an expired key by [@​milan-berri](https://github.com/milan-berri) in [#​29838](https://github.com/BerriAI/litellm/pull/29838)
- refactor(ui): route query-building networking calls through apiClient by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29815](https://github.com/BerriAI/litellm/pull/29815)
- Make the image-gen record/replay proxy report cache mode and per-request HIT/MISS by [@​mateo-berri](https://github.com/mateo-berri) in [#​29802](https://github.com/BerriAI/litellm/pull/29802)
- feat(proxy): hot-reload .env in dev when running with --reload by [@​mateo-berri](https://github.com/mateo-berri) in [#​29783](https://github.com/BerriAI/litellm/pull/29783)
- fix(ui): stop MCP playground tool calls from sending twice by [@​tin-berri](https://github.com/tin-berri) in [#​29821](https://github.com/BerriAI/litellm/pull/29821)
- feat(fal\_ai): add Nano Banana / Gemini 2.5 Flash Image generation support by [@​mateo-berri](https://github.com/mateo-berri) in [#​29798](https://github.com/BerriAI/litellm/pull/29798)
- Title: Fix managed batch cancel credential resolution by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29734](https://github.com/BerriAI/litellm/pull/29734)
- Title: fix(proxy): resolve vector store file list credentials from team deployments by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29739](https://github.com/BerriAI/litellm/pull/29739)
- refactor: convert AWS and GCP Terraform stacks into reusable modules … by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28103](https://github.com/BerriAI/litellm/pull/28103)
- chore(ui): build ui for release by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29853](https://github.com/BerriAI/litellm/pull/29853)
- fix(terraform/gcp): prompt for image\_registry in DeployStack one-click by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29852](https://github.com/BerriAI/litellm/pull/29852)
- fix(terraform/gcp): abandon SQL user on destroy by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29855](https://github.com/BerriAI/litellm/pull/29855)
- Extend the record/replay proxy to chat, embeddings, moderations, rerank, and Anthropic by [@​mateo-berri](https://github.com/mateo-berri) in [#​29847](https://github.com/BerriAI/litellm/pull/29847)
- chore(deps): bump deps by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29860](https://github.com/BerriAI/litellm/pull/29860)
- chore(ci): promote internal staging to main by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29861](https://github.com/BerriAI/litellm/pull/29861)
- fix: 400 on Anthropic context overflow; seed identity on failed auth by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29848](https://github.com/BerriAI/litellm/pull/29848)
- chore(ci): promote internal staging to main by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29862](https://github.com/BerriAI/litellm/pull/29862)
- chore(release): patch v1.89.0-rc.1 with [#​30064](https://github.com/BerriAI/litellm/issues/30064) (Claude Fable 5) for v1.89.0-rc.2 by [@​mateo-berri](https://github.com/mateo-berri) in [#​30143](https://github.com/BerriAI/litellm/pull/30143)
**Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.0...v1.89.0>
### [`v1.88.2`](https://github.com/BerriAI/litellm/releases/tag/v1.88.2)
[Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.1...v1.88.2)
#### Verify Docker Image Signature
All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).
**Verify using the pinned commit hash (recommended):**
A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:
```bash
cosign verify \
--key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
ghcr.io/berriai/litellm:v1.88.2
```
**Verify using the release tag (convenience):**
Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:
```bash
cosign verify \
--key https://raw.githubusercontent.com/BerriAI/litellm/v1.88.2/cosign.pub \
ghcr.io/berriai/litellm:v1.88.2
```
Expected output:
```
The following checks were performed on each of these signatures:
- The cosign claims were validated
- The signatures were verified against the specified public key
```
***
#### What's Changed
- chore(release): backport Fable 5, batch-file auth, CrowdStrike AIDR, Mantle Responses SigV4, and NetApp streaming-cost fix to stable/1.88.x and cut 1.88.2 by [@​mateo-berri](https://github.com/mateo-berri) in [#​30144](https://github.com/BerriAI/litellm/pull/30144)
- chore(release): backport DB-resilience, passthrough, model-info, budget, and deps fixes to stable/1.88.x by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​30408](https://github.com/BerriAI/litellm/pull/30408)
**Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.1...v1.88.2>
### [`v1.88.1`](https://github.com/BerriAI/litellm/releases/tag/v1.88.1)
[Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.0...v1.88.1)
#### Verify Docker Image Signature
All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).
**Verify using the pinned commit hash (recommended):**
A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:
```bash
cosign verify \
--key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
ghcr.io/berriai/litellm:v1.88.1
```
**Verify using the release tag (convenience):**
Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:
```bash
cosign verify \
--key https://raw.githubusercontent.com/BerriAI/litellm/v1.88.1/cosign.pub \
ghcr.io/berriai/litellm:v1.88.1
```
Expected output:
```
The following checks were performed on each of these signatures:
- The cosign claims were validated
- The signatures were verified against the specified public key
```
***
#### What's Changed
- build(deps): bump pyjwt to 2.13.0 and ws override to 8.20.1 (1.88.x) by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29987](https://github.com/BerriAI/litellm/pull/29987)
- chore(release): bump version to 1.88.1 by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29989](https://github.com/BerriAI/litellm/pull/29989)
**Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.0...v1.88.1>
### [`v1.88.0`](https://github.com/BerriAI/litellm/releases/tag/v1.88.0)
[Compare Source](https://github.com/BerriAI/litellm/compare/v1.87.3...v1.88.0)
#### Verify Docker Image Signature
All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).
**Verify using the pinned commit hash (recommended):**
A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:
```bash
cosign verify \
--key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
ghcr.io/berriai/litellm:v1.88.0
```
**Verify using the release tag (convenience):**
Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:
```bash
cosign verify \
--key https://raw.githubusercontent.com/BerriAI/litellm/v1.88.0/cosign.pub \
ghcr.io/berriai/litellm:v1.88.0
```
Expected output:
```
The following checks were performed on each of these signatures:
- The cosign claims were validated
- The signatures were verified against the specified public key
```
***
#### What's Changed
- fix(proxy): gate team allowed\_passthrough\_routes to proxy admins by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28097](https://github.com/BerriAI/litellm/pull/28097)
- fix(tests): stabilize image-edit VCR cassettes to stop live gpt-image-1 spend by [@​mateo-berri](https://github.com/mateo-berri) in [#​28110](https://github.com/BerriAI/litellm/pull/28110)
- fix(bedrock/cohere): send embedding\_types as JSON array, not string by [@​ishaan-berri](https://github.com/ishaan-berri) in [#​28172](https://github.com/BerriAI/litellm/pull/28172)
- fix(tests): migrate realtime + rerank tests off shut-down upstream models by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28191](https://github.com/BerriAI/litellm/pull/28191)
- fix(caching): replay openai/responses bridge cache hits as chat streams by [@​Sameerlite](https://github.com/Sameerlite) in [#​28158](https://github.com/BerriAI/litellm/pull/28158)
- Litellm oss staging by [@​Sameerlite](https://github.com/Sameerlite) in [#​28161](https://github.com/BerriAI/litellm/pull/28161)
- feat(prometheus): add user\_email and user\_alias to user budget metrics by [@​Sameerlite](https://github.com/Sameerlite) in [#​28155](https://github.com/BerriAI/litellm/pull/28155)
- test(callbacks): harden flaky proxy callback-leak detector by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28195](https://github.com/BerriAI/litellm/pull/28195)
- fix(bedrock): sanitize batch metadata to prevent Pydantic ValidationError by [@​mateo-berri](https://github.com/mateo-berri) in [#​28202](https://github.com/BerriAI/litellm/pull/28202)
- fix(deepseek): use native /anthropic/v1/messages endpoint and sanitize tools by [@​mateo-berri](https://github.com/mateo-berri) in [#​28200](https://github.com/BerriAI/litellm/pull/28200)
- feat(ui): add Interactions API endpoint to playground with SSE streaming by [@​Sameerlite](https://github.com/Sameerlite) in [#​28156](https://github.com/BerriAI/litellm/pull/28156)
- fix(proxy): decode bytes and pass-through SSE for Google-native streamGenerateContent ([#​27444](https://github.com/BerriAI/litellm/issues/27444)) by [@​Sameerlite](https://github.com/Sameerlite) in [#​28213](https://github.com/BerriAI/litellm/pull/28213)
- refactor(bedrock/sagemaker): switch to lazy loading for response stre… by [@​harish-berri](https://github.com/harish-berri) in [#​28189](https://github.com/BerriAI/litellm/pull/28189)
- \[Refactor] UI - Spend Logs: consolidate filter state and extract components by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​25847](https://github.com/BerriAI/litellm/pull/25847)
- fix(tests): replace shut-down gpt-4o-audio-preview with gpt-audio-1.5 by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28281](https://github.com/BerriAI/litellm/pull/28281)
- chore(ci): bump versions by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28287](https://github.com/BerriAI/litellm/pull/28287)
- feat: propagate team\_id and team\_alias to all child OTEL spans by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28273](https://github.com/BerriAI/litellm/pull/28273)
- Day 0 support : Gemini 3.5 Flash by [@​Sameerlite](https://github.com/Sameerlite) in [#​28268](https://github.com/BerriAI/litellm/pull/28268)
- Gemini managed agents support by [@​Sameerlite](https://github.com/Sameerlite) in [#​28270](https://github.com/BerriAI/litellm/pull/28270)
- chore(ci): promote internal staging to main by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28292](https://github.com/BerriAI/litellm/pull/28292)
- feat(gemini): add gemini-3.1-flash-lite model cost map by [@​Sameerlite](https://github.com/Sameerlite) in [#​28320](https://github.com/BerriAI/litellm/pull/28320)
- fix(spend\_counter): seed Redis counter via SET NX to prevent cross-pod double-seed by [@​milan-berri](https://github.com/milan-berri) in [#​27854](https://github.com/BerriAI/litellm/pull/27854)
- fix(proxy): normalize batch file IDs before ManagedObjectTable write by [@​Sameerlite](https://github.com/Sameerlite) in [#​28339](https://github.com/BerriAI/litellm/pull/28339)
- fix(router): use forwarded model\_id for native Azure container IDs by [@​Sameerlite](https://github.com/Sameerlite) in [#​27921](https://github.com/BerriAI/litellm/pull/27921)
- fix(ui): restore log filter loading indicator by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28282](https://github.com/BerriAI/litellm/pull/28282)
- test(e2e): migrate runner to uv, add All Proxy Models key test by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28313](https://github.com/BerriAI/litellm/pull/28313)
- feat(ui): team passthrough routes create parity + edit load fix by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28098](https://github.com/BerriAI/litellm/pull/28098)
- fix(mcp): JWT on tools/list and REST tools/call server resolution by [@​Sameerlite](https://github.com/Sameerlite) in [#​28227](https://github.com/BerriAI/litellm/pull/28227)
- feat(interactions): migrate to Google Interactions API steps schema (May 2026) by [@​Sameerlite](https://github.com/Sameerlite) in [#​28153](https://github.com/BerriAI/litellm/pull/28153)
- test(ui-e2e): admin key creation with a specific proxy model by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28365](https://github.com/BerriAI/litellm/pull/28365)
- fix(vertex\_ai): omit function\_call id on Vertex Gemini 3.5+ tool turns by [@​Sameerlite](https://github.com/Sameerlite) in [#​28324](https://github.com/BerriAI/litellm/pull/28324)
- feat(mcp): allow native MCP OAuth support for cursor by [@​Sameerlite](https://github.com/Sameerlite) in [#​28327](https://github.com/BerriAI/litellm/pull/28327)
- fix(interactions): never drop streamed text deltas; always emit terminal completion by [@​mateo-berri](https://github.com/mateo-berri) in [#​28394](https://github.com/BerriAI/litellm/pull/28394)
- fix(proxy): expose Prisma idle/connect timeout + extra DB URL params by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28395](https://github.com/BerriAI/litellm/pull/28395)
- Litellm oss staging 1 by [@​Sameerlite](https://github.com/Sameerlite) in [#​28337](https://github.com/BerriAI/litellm/pull/28337)
- fix: serialize guardrail\_response to JSON in OTEL traces by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28362](https://github.com/BerriAI/litellm/pull/28362)
- chore(ci): merge dev branch by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28314](https://github.com/BerriAI/litellm/pull/28314)
- test(realtime): expect session.created as xAI realtime initial event by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28424](https://github.com/BerriAI/litellm/pull/28424)
- feat(tests): behavior-pinning harness + Key Tier-1 matrix by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28321](https://github.com/BerriAI/litellm/pull/28321)
- fix(proxy): hydrate wildcard discovery credentials ([#​28284](https://github.com/BerriAI/litellm/issues/28284)) - CCI Run by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28419](https://github.com/BerriAI/litellm/pull/28419)
- Litellm oss staging 04 21 2026 2 by [@​Sameerlite](https://github.com/Sameerlite) in [#​26569](https://github.com/BerriAI/litellm/pull/26569)
- chore(ci): merge dev branch by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28290](https://github.com/BerriAI/litellm/pull/28290)
- fix(vertex\_gemma): strip `context_management` from request body by [@​mateo-berri](https://github.com/mateo-berri) in [#​28438](https://github.com/BerriAI/litellm/pull/28438)
- fix(logging): recalculate cost after router retry failures by [@​milan-berri](https://github.com/milan-berri) in [#​28476](https://github.com/BerriAI/litellm/pull/28476)
- fix(otel): emit guardrail span on violation, surface status + categories by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28364](https://github.com/BerriAI/litellm/pull/28364)
- test(proxy): behavior-pinning matrix for team management endpoints by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28441](https://github.com/BerriAI/litellm/pull/28441)
- test(vertex\_ai): tolerate transient 500 in google maps grounding test by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28503](https://github.com/BerriAI/litellm/pull/28503)
- fix(docker): restore npm to non\_root builder image by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28519](https://github.com/BerriAI/litellm/pull/28519)
- chore(ci): bump deps by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28524](https://github.com/BerriAI/litellm/pull/28524)
- build(deps-dev): bump black to 26.3.1 and apply formatting by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28525](https://github.com/BerriAI/litellm/pull/28525)
- chore(deps): bump deps by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28528](https://github.com/BerriAI/litellm/pull/28528)
- test(e2e): forward LITELLM\_LICENSE to UI e2e proxy by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28398](https://github.com/BerriAI/litellm/pull/28398)
- Add granian as a ASGI compliant web server. Provider better throughput stability, by [@​harish-berri](https://github.com/harish-berri) in [#​26027](https://github.com/BerriAI/litellm/pull/26027)
- Fix conflicts and UI by [@​Sameerlite](https://github.com/Sameerlite) in [#​28477](https://github.com/BerriAI/litellm/pull/28477)
- Add error\_description and hint for oauth flows by [@​Sameerlite](https://github.com/Sameerlite) in [#​28471](https://github.com/BerriAI/litellm/pull/28471)
- feat(mcp): Add tool call and tool list support via UI for Oauth mcps by [@​Sameerlite](https://github.com/Sameerlite) in [#​28454](https://github.com/BerriAI/litellm/pull/28454)
- feat(proxy): persist allowlisted OIDC claims in CLI SSO poll by [@​Sameerlite](https://github.com/Sameerlite) in [#​28463](https://github.com/BerriAI/litellm/pull/28463)
- fix(responses): use OpenAI SSEDecoder for Responses API streaming by [@​Sameerlite](https://github.com/Sameerlite) in [#​28566](https://github.com/BerriAI/litellm/pull/28566)
- Litellm oss staging 2 by [@​Sameerlite](https://github.com/Sameerlite) in [#​28582](https://github.com/BerriAI/litellm/pull/28582)
- \[internal copy of [#​28269](https://github.com/BerriAI/litellm/issues/28269)] Codex cli jwt team alias by [@​mateo-berri](https://github.com/mateo-berri) in [#​28621](https://github.com/BerriAI/litellm/pull/28621)
- fix(check\_licenses): read PEP 639 license-expression metadata by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28529](https://github.com/BerriAI/litellm/pull/28529)
- test(proxy): behavior-pinning matrix for tier-2/3 key + team management endpoints by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28620](https://github.com/BerriAI/litellm/pull/28620)
- chore(test): remove dead old Playwright e2e suite by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28632](https://github.com/BerriAI/litellm/pull/28632)
- fix(sagemaker): send native Cohere embed payload to Cohere SageMaker endpoints by [@​milan-berri](https://github.com/milan-berri) in [#​28613](https://github.com/BerriAI/litellm/pull/28613)
- style: apply black formatting to fix lint CI (LIT-3274) ([#​28639](https://github.com/BerriAI/litellm/issues/28639)) by [@​krrish-berri-2](https://github.com/krrish-berri-2) in [#​28641](https://github.com/BerriAI/litellm/pull/28641)
- fix(bedrock): decouple STS region from Bedrock aws\_region\_name by [@​milan-berri](https://github.com/milan-berri) in [#​28245](https://github.com/BerriAI/litellm/pull/28245)
- test(streaming): tolerate Vertex 429 wrapped in MidStreamFallbackError by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28669](https://github.com/BerriAI/litellm/pull/28669)
- feat(guardrails): add Microsoft Purview DLP guardrail by [@​Sameerlite](https://github.com/Sameerlite) in [#​24966](https://github.com/BerriAI/litellm/pull/24966)
- fix(mcp): forward upstream initialize instructions on cold gateway init by [@​milan-berri](https://github.com/milan-berri) in [#​28231](https://github.com/BerriAI/litellm/pull/28231)
- chore(ci): promote internal staging to main by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28680](https://github.com/BerriAI/litellm/pull/28680)
- CI: copy of [#​25177](https://github.com/BerriAI/litellm/issues/25177) (OCI GenAI: embeddings, streaming/reasoning fixes, model catalog) by [@​mateo-berri](https://github.com/mateo-berri) in [#​28223](https://github.com/BerriAI/litellm/pull/28223)
- Encrypt callback\_vars in key/team metadata in DB by [@​Michael-RZ-Berri](https://github.com/Michael-RZ-Berri) in [#​27141](https://github.com/BerriAI/litellm/pull/27141)
- perf: reduce per-request and per-chunk overhead across Anthropic streaming hot paths by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28289](https://github.com/BerriAI/litellm/pull/28289)
- feat(azure): add Speech STT config support by [@​ishaan-berri](https://github.com/ishaan-berri) in [#​27482](https://github.com/BerriAI/litellm/pull/27482)
- test(proxy): phase-4 payload behavior pinning for tier-2/3 key + team management endpoints by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28681](https://github.com/BerriAI/litellm/pull/28681)
- feat(prometheus): emit per-token-type detail metrics (LIT-3220) ([#​28372](https://github.com/BerriAI/litellm/issues/28372)) by [@​ishaan-berri](https://github.com/ishaan-berri) in [#​28378](https://github.com/BerriAI/litellm/pull/28378)
- fix(otel): stamp http.response.status\_code on all error responses by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28405](https://github.com/BerriAI/litellm/pull/28405)
- chore(ui): build ui by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28707](https://github.com/BerriAI/litellm/pull/28707)
- fix(helm): drop main- prefix from default image tag by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28710](https://github.com/BerriAI/litellm/pull/28710)
- test(model\_prices): allow audio\_transcription\_config in schema by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28708](https://github.com/BerriAI/litellm/pull/28708)
- chore(ci): promote internal staging to main by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28709](https://github.com/BerriAI/litellm/pull/28709)
- fix(team): refresh team cache on team\_model\_add/delete (LIT-3244) by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28683](https://github.com/BerriAI/litellm/pull/28683)
- fix(ui/add-model): stop vertex\_ai-anthropic\_models from leaking into Anthropic dropdown by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28723](https://github.com/BerriAI/litellm/pull/28723)
- Fix spend logs v2 route permissions by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28705](https://github.com/BerriAI/litellm/pull/28705)
- fix(proxy): Bedrock Knowledge Base pass-through: preserve SigV4 headers and signed request body by [@​milan-berri](https://github.com/milan-berri) in [#​27526](https://github.com/BerriAI/litellm/pull/27526)
- chore(tests): migrate Bedrock CI to AWS account [`9412775`](https://github.com/BerriAI/litellm/commit/941277531214) by [@​mateo-berri](https://github.com/mateo-berri) in [#​28728](https://github.com/BerriAI/litellm/pull/28728)
- fix(otel): export SERVER span on management-endpoint success without http\_request by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28794](https://github.com/BerriAI/litellm/pull/28794)
- chore(ci): merge dev branch by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28801](https://github.com/BerriAI/litellm/pull/28801)
- chore(ci): merge dev branch by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28657](https://github.com/BerriAI/litellm/pull/28657)
- fix(ui): show 2-decimal precision for max\_budget on key overview by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28809](https://github.com/BerriAI/litellm/pull/28809)
- feat(proxy): allow `llm_api_routes` virtual keys to list MCP servers by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28442](https://github.com/BerriAI/litellm/pull/28442)
- chore(ci): merge dev branch by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28807](https://github.com/BerriAI/litellm/pull/28807)
- fix(team): keep team\_alias cache in sync on \_cache\_team\_object writes by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28737](https://github.com/BerriAI/litellm/pull/28737)
- chore(ci): merge dev branch by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28822](https://github.com/BerriAI/litellm/pull/28822)
- ci: daily oss-agent-shin canonical branch by [@​ishaan-berri](https://github.com/ishaan-berri) in [#​28829](https://github.com/BerriAI/litellm/pull/28829)
- test(proxy): add harness for proxy\_server.py behavior-pinning by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28827](https://github.com/BerriAI/litellm/pull/28827)
- feat(openai): apply regional-processing cost uplift for EU/US data residency by [@​mateo-berri](https://github.com/mateo-berri) in [#​28626](https://github.com/BerriAI/litellm/pull/28626)
- chore(admin-ui): regenerate static export with trailingSlash: true by [@​mateo-berri](https://github.com/mateo-berri) in [#​28112](https://github.com/BerriAI/litellm/pull/28112)
- fix(azure): preserve AD token refresh in v1 OpenAI client path by [@​mateo-berri](https://github.com/mateo-berri) in [#​28627](https://github.com/BerriAI/litellm/pull/28627)
- fix(ui): route API Reference back to query-param page by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28726](https://github.com/BerriAI/litellm/pull/28726)
- fix(model-edit): allow clearing custom pricing on wildcard models by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28719](https://github.com/BerriAI/litellm/pull/28719)
- fix(tests/vcr): make Redis cassette cache replay deterministically (zero VCR misses on consecutive runs) by [@​mateo-berri](https://github.com/mateo-berri) in [#​28826](https://github.com/BerriAI/litellm/pull/28826)
- fix(proxy): strip LiteLLM policy tracking from OpenAI batch metadata by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​28425](https://github.com/BerriAI/litellm/pull/28425)
- Litellm OpenAI double prefix bug by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​28661](https://github.com/BerriAI/litellm/pull/28661)
- Litellm oss staging 250526 by [@​Sameerlite](https://github.com/Sameerlite) in [#​28770](https://github.com/BerriAI/litellm/pull/28770)
- fix(bedrock): align toolUse/toolSpec names and allow hyphens by [@​Sameerlite](https://github.com/Sameerlite) in [#​28874](https://github.com/BerriAI/litellm/pull/28874)
- fix(realtime): send TEXT frames and valid guardrail session.update by [@​Sameerlite](https://github.com/Sameerlite) in [#​28848](https://github.com/BerriAI/litellm/pull/28848)
- fix(mcp): extend key access-group union to MCP servers by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28890](https://github.com/BerriAI/litellm/pull/28890)
- fix(galileo): support hosted v2 spans API and string output extraction by [@​Sameerlite](https://github.com/Sameerlite) in [#​28771](https://github.com/BerriAI/litellm/pull/28771)
- fix(proxy): exclude proxy\_server\_request from its own body snapshot by [@​michelligabriele](https://github.com/michelligabriele) in [#​28618](https://github.com/BerriAI/litellm/pull/28618)
- \[Feat] Add tool calling support for gemini and vertex ai live api by [@​Sameerlite](https://github.com/Sameerlite) in [#​26590](https://github.com/BerriAI/litellm/pull/26590)
- refactor(ui): remove dead App Router scaffolding in (dashboard)/\* by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28891](https://github.com/BerriAI/litellm/pull/28891)
- fix(docker): use system Node in componentized builders + retry apk add by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28888](https://github.com/BerriAI/litellm/pull/28888)
- docs(agents): require consent before writing new third-party names by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​28908](https://github.com/BerriAI/litellm/pull/28908)
- refactor(ui): extract auth state into AuthContext by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28910](https://github.com/BerriAI/litellm/pull/28910)
- fix(mcp): resolve team.access\_group\_ids → MCP servers by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28997](https://github.com/BerriAI/litellm/pull/28997)
- test(ui): e2e cover team model edit + admin identity in navbar by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​28652](https://github.com/BerriAI/litellm/pull/28652)
- test(e2e): cover add-fallback flow in Router Settings by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29069](https://github.com/BerriAI/litellm/pull/29069)
- test(e2e): cover Team-BYOK add-model flow as proxy admin by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29068](https://github.com/BerriAI/litellm/pull/29068)
- fix(containers): record ownership for service-account keys + fix Prisma Json serialization by [@​Sameerlite](https://github.com/Sameerlite) in [#​28990](https://github.com/BerriAI/litellm/pull/28990)
- test(e2e): cover add-MCP-server flow via discovery → custom form by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29070](https://github.com/BerriAI/litellm/pull/29070)
- test(e2e): cover AI Hub make-public flow and public model\_hub\_table by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29071](https://github.com/BerriAI/litellm/pull/29071)
- \[internal copy of [#​28877](https://github.com/BerriAI/litellm/issues/28877)] feat: add support for claude code goal mode for bedrock opus output config by [@​mateo-berri](https://github.com/mateo-berri) in [#​28898](https://github.com/BerriAI/litellm/pull/28898)
- feat(guardrails): wire apply\_guardrail into proxy logging callbacks by [@​Sameerlite](https://github.com/Sameerlite) in [#​28970](https://github.com/BerriAI/litellm/pull/28970)
- chore(ci): merge dev brach by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29192](https://github.com/BerriAI/litellm/pull/29192)
- perf(streaming): cut per-chunk overhead \~30% on Anthropic + Bedrock hot path by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28720](https://github.com/BerriAI/litellm/pull/28720)
- fix(proxy): enforce tag budgets for key-level tags by [@​Sameerlite](https://github.com/Sameerlite) in [#​29108](https://github.com/BerriAI/litellm/pull/29108)
- fix(vertex-ai): use DB credentials in video handlers + implement Veo video edit by [@​Sameerlite](https://github.com/Sameerlite) in [#​29098](https://github.com/BerriAI/litellm/pull/29098)
- fix(datadog): drain cost-management queue + opt-in FinOps tag allowlist by [@​michelligabriele](https://github.com/michelligabriele) in [#​28487](https://github.com/BerriAI/litellm/pull/28487)
- feat(helm): split per-component ServiceAccounts for gateway, backend, and UI by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28712](https://github.com/BerriAI/litellm/pull/28712)
- chore(ci): bump deps ([#​29208](https://github.com/BerriAI/litellm/issues/29208)) by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29226](https://github.com/BerriAI/litellm/pull/29226)
- fix(tests/vcr): mint Google OAuth tokens live to prevent stale-token replay by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29229](https://github.com/BerriAI/litellm/pull/29229)
- chore(cookbook): bump Go directive to 1.26.3 in gollem example by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29234](https://github.com/BerriAI/litellm/pull/29234)
- chore(ci): bump version by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29242](https://github.com/BerriAI/litellm/pull/29242)
- feat(anthropic): add Claude Opus 4.8 and prune reasoning-effort flags by [@​mateo-berri](https://github.com/mateo-berri) in [#​29238](https://github.com/BerriAI/litellm/pull/29238)
- chore(ci): promote internal staging to main by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29243](https://github.com/BerriAI/litellm/pull/29243)
- fix(ci): restore real Bedrock batch S3 bucket/role in oai\_misc\_config by [@​mateo-berri](https://github.com/mateo-berri) in [#​29245](https://github.com/BerriAI/litellm/pull/29245)
- fix(guardrails): persist disable\_global\_guardrails on keys by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29233](https://github.com/BerriAI/litellm/pull/29233)
- test(e2e): cover Team Admin view + member + key flows by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29072](https://github.com/BerriAI/litellm/pull/29072)
- docs: hand-written CLAUDE.md; remove AGENTS.md, point GEMINI.md at it by [@​mateo-berri](https://github.com/mateo-berri) in [#​29252](https://github.com/BerriAI/litellm/pull/29252)
- fix(teams): expose keys\_count on /v2/team/list and wire UI Resources badge by [@​michelligabriele](https://github.com/michelligabriele) in [#​28502](https://github.com/BerriAI/litellm/pull/28502)
- fix(anthropic): stop injecting unsupported output\_config.effort=xhigh for Claude Code on Sonnet/Opus 4.6 by [@​mateo-berri](https://github.com/mateo-berri) in [#​29304](https://github.com/BerriAI/litellm/pull/29304)
- test(e2e): cover Internal Viewer nav, key, and team-info gating by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29075](https://github.com/BerriAI/litellm/pull/29075)
- test(e2e): cover Internal User key modal, team info, key page by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29074](https://github.com/BerriAI/litellm/pull/29074)
- test(e2e): cover navbar Logout flow as proxy admin by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29076](https://github.com/BerriAI/litellm/pull/29076)
- fix(mcp): resolve key.access\_group\_ids → MCP servers (ungated) by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29195](https://github.com/BerriAI/litellm/pull/29195)
- fix(router): enforce deployment budgets for dynamically added models by [@​Sameerlite](https://github.com/Sameerlite) in [#​29273](https://github.com/BerriAI/litellm/pull/29273)
- fix(proxy): map stripped batch body.model to proxy alias for auth by [@​Sameerlite](https://github.com/Sameerlite) in [#​29264](https://github.com/BerriAI/litellm/pull/29264)
- feat(mcp): support stateless and stateful clients via session-id routing by [@​Sameerlite](https://github.com/Sameerlite) in [#​26857](https://github.com/BerriAI/litellm/pull/26857)
- fix(bedrock): support tool search results + chat annotations by [@​Sameerlite](https://github.com/Sameerlite) in [#​29120](https://github.com/BerriAI/litellm/pull/29120)
- fix(mcp): ignore stale ids on key save by [@​Sameerlite](https://github.com/Sameerlite) in [#​29128](https://github.com/BerriAI/litellm/pull/29128)
- feat(a2a): well-known agent-card discovery + LangGraph Platform mode by [@​Sameerlite](https://github.com/Sameerlite) in [#​28860](https://github.com/BerriAI/litellm/pull/28860)
- fix(proxy): link passthrough success spans to the SERVER root OTEL span by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29315](https://github.com/BerriAI/litellm/pull/29315)
- \[internal copy of [#​29089](https://github.com/BerriAI/litellm/issues/29089)] fix: duplicate claude code traces by [@​mateo-berri](https://github.com/mateo-berri) in [#​29311](https://github.com/BerriAI/litellm/pull/29311)
- feat(otel): typed semconv-aligned OpenTelemetry instrumentation by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​28909](https://github.com/BerriAI/litellm/pull/28909)
- tests(proxy\_server): surface current behavior in tests by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29309](https://github.com/BerriAI/litellm/pull/29309)
- test(e2e): cover Internal User create-key flow when in no teams by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29083](https://github.com/BerriAI/litellm/pull/29083)
- test(e2e): assert internal-user navbar identity is scoped to that user by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29077](https://github.com/BerriAI/litellm/pull/29077)
- feat(otel): add team\_metadata, http.route, and model names to inference spans by [@​yassin-berriai](https://github.com/yassin-berriai) in [#​29319](https://github.com/BerriAI/litellm/pull/29319)
- feat(context\_management): compact\_20260112 polyfill for non-Anthropic providers by [@​Sameerlite](https://github.com/Sameerlite) in [#​28868](https://github.com/BerriAI/litellm/pull/28868)
- feat(enterprise): add RESEND\_FROM\_EMAIL for self-hosted Resend sends by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​28830](https://github.com/BerriAI/litellm/pull/28830)
- Revert Bedrock CI back to the reactivated AWS account ([`8886022`](https://github.com/BerriAI/litellm/commit/888602223428)) by [@​mateo-berri](https://github.com/mateo-berri) in [#​29326](https://github.com/BerriAI/litellm/pull/29326)
- fix(mcp): preserve source\_url in GET /v1/mcp/server list responses by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29249](https://github.com/BerriAI/litellm/pull/29249)
- fix(mcp): preserve omitted fields on PUT /v1/mcp/server partial updates by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29253](https://github.com/BerriAI/litellm/pull/29253)
- fix(ci): make litellm\_internal\_staging green (logging test + Bedrock Opus 4.7 self-heal) by [@​mateo-berri](https://github.com/mateo-berri) in [#​29344](https://github.com/BerriAI/litellm/pull/29344)
- refactor(proxy/auth): normalize Bearer prefix in safe-hash helper by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29343](https://github.com/BerriAI/litellm/pull/29343)
- test(reasoning-effort-grid): cover Claude Opus 4.8 across provider routes by [@​mateo-berri](https://github.com/mateo-berri) in [#​29327](https://github.com/BerriAI/litellm/pull/29327)
- fix(guardrails): return HTTP 400 for litellm content filter blocks by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​28418](https://github.com/BerriAI/litellm/pull/28418)
- fix(proxy): restrict vector store index create/delete to proxy admins by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29202](https://github.com/BerriAI/litellm/pull/29202)
- feat(pass\_through): extend passthrough\_managed\_object\_ids to Azure by [@​Sameerlite](https://github.com/Sameerlite) in [#​29160](https://github.com/BerriAI/litellm/pull/29160)
- fix(proxy): enforce allowed\_passthrough\_routes for auth=true pass-thr… by [@​shivamrawat1](https://github.com/shivamrawat1) in [#​29256](https://github.com/BerriAI/litellm/pull/29256)
- feat(mcp/auth): additive key access-group grants + opt-in member assignment by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29313](https://github.com/BerriAI/litellm/pull/29313)
- fix(reset\_budget): write only {spend, budget\_reset\_at} and stop pre-zeroing counter by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29358](https://github.com/BerriAI/litellm/pull/29358)
- test(e2e): cover PROXY\_LOGOUT\_URL redirect on Logout by [@​ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#​29080](https://github.com/BerriAI/litellm/pull/29080)
- fix(ui): break logout redirect loop across dev and proxy origins by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29360](https://github.com/BerriAI/litellm/pull/29360)
- fix(openai-moderation): wire streaming flags through to unified dispatcher by [@​michelligabriele](https://github.com/michelligabriele) in [#​27324](https://github.com/BerriAI/litellm/pull/27324)
- chore(ci): build ui by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29366](https://github.com/BerriAI/litellm/pull/29366)
- fix(v3 limiter): cap no-max\_tokens TPM floor at smallest configured limit by [@​michelligabriele](https://github.com/michelligabriele) in [#​28805](https://github.com/BerriAI/litellm/pull/28805)
- fix(e2e): tolerate trailing slash in SERVER\_ROOT\_PATH login redirect by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29369](https://github.com/BerriAI/litellm/pull/29369)
- chore(deps): bump deps by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29373](https://github.com/BerriAI/litellm/pull/29373)
- chore(ci): promote internal staging to main by [@​yuneng-berri](https://github.com/yuneng-berri) in [#​29372](https://github.com/BerriAI/litellm/pull/29372)
- chore(release): patch v1.88.0-rc.1 with four staged fixes by [@​mateo-berri](https://github.com/mateo-berri) in [#​29632](https://github.com/BerriAI/litellm/pull/29632)
- chore(release): patch v1.88.0-rc.1 with [#​29612](https://github.com/BerriAI/litellm/issues/29612) (session-token budget-ceiling exemption) by [@​mateo-berri](https://github.com/mateo-berri) in [#​29637](https://github.com/BerriAI/litellm/pull/29637)
- fix(key\_generate): harden GHSA-q775 …
…with examples/default entry point
providerblocks from both AWS and GCP stack roots so the modules can be consumed withcount,for_each,depends_on, assumed-role or aliased providers; patterns that are forbidden when a module owns its own provider configurationexamples/default/thin-root wrappers for both stacks that wire the provider (AWS) / providers (google + google-beta) and call the module with a curated variable surface, preserving the one-command deploy experienceterraform.tfvars.examplefiles intoexamples/default/alongside the new roots; update example comments to reflect the curated variable surfacelocal.tags(containinglitellm:stack,managed-by, andvar.tags) explicitly onto every taggable AWS resource since the module no longer controls the provider'sdefault_tags; GCP resource labels already flow through the module'slabelsinputexamples/default/variables.tfandoutputs.tffor both stacks, exposing the most-used knobs and re-exporting all module outputsterraform initis reproducible without a network fetchfor_eachmulti-tenant pattern, and theexamples/default/quick-start pathReview follow-ups (Greptile)
managed-bytag from the AWS exampleproviders.tfdefault_tags(the module already stamps it vialocal.tags); reserved that block for pure org-wide tags.for_eachprovider limitation (GCP): documented that the module declares noconfiguration_aliases, so afor_eachover it runs every instance against the same project/region/credentials; noted in the GCP README and the examplemain.tf.terraform state mvupgrade guide (and flagged that the "No changes" assertion wouldn't hold once the managed SSL cert is renamed). These stacks have never been published/tagged, so there's no existing deployment on the old root-module addresses to migrate from; the upgrade path doesn't exist. Rather than document a hypothetical (and partly inaccurate) migration, the guides were dropped from both READMEs and the docs PR.Subsequent additions
Two follow-on commits land matching
proxy_configmount + OpenTelemetry v2 work on both stacks; they sit on top of the refactor so the module-first layout above is what callers see.feat(terraform/aws): mount proxy_config from S3 and wire OpenTelemetry v2(ab2c928) drops the inlineLITELLM_PROXY_CONFIG_B64env var, uploads the YAML to S3 atconfig/litellm-config.yaml, and has the gateway and backend entrypoints pull it to/tmp/litellm-config.yamlvia boto3 before exec'ing uvicorn. The S3 object etag is wired into the task definition so a config-only edit forces a new task-def revision and a rolling redeploy.feat(terraform/gcp): mount proxy_config from GCS and wire OpenTelemetry v2(c1492a5) does the same on the GCP side but uses a true Cloud Run v2 gcsfuse volume mount instead of an entrypoint download. The YAML is uploaded to a dedicated GCS bucket and mounted read-only into both the gateway and backend at/etc/litellm;CONFIG_FILE_PATHpoints at the mount path. An md5 of the YAML rides along asPROXY_CONFIG_HASHso a config-only edit forces a new Cloud Run revision (gcsfuse only surfaces a new object on container restart, so without the hash an updatedproxy_configwould sit in the bucket unread). The config bucket is separate from the data-plane bucket so the runtime SA can holdobjectViewerhere while keepingobjectAdminon the data-plane bucket.AWS / GCP consistency pass
refactor(terraform): make AWS and GCP stacks behave identically(9c04aa1) brings both modules to the same surface and runtime shape so swapping clouds (or reading either README) is symmetric.Labels and tags. GCP previously stamped
var.labelsonto only the two GCS buckets, leaving Cloud Run, Cloud SQL, Memorystore, Secret Manager, and the LB resources unlabeled. The module now computeslocal.labels(litellm-stack+managed-by+var.labels, mirroring AWS'slocal.tags) and threads it onto every label-supporting resource: Cloud Run services and the migrations job, Cloud SQL writer and reader (viauser_labels), Memorystore, every Secret Manager entry, both GCS buckets, the LB global address, and the http/https forwarding rules. GCP useslitellm-stackinstead of AWS'slitellm:stackbecause GCP label keys forbid colons.OTel is opt-in on both stacks. AWS already gated everything on
otel_endpoint; GCP was stampingLITELLM_OTEL_V2=trueinto the shared env unconditionally. Both stacks now do the same thing: leaveotel_endpointempty and nothing OTel-related lands in the container env; set it and gateway and backend getLITELLM_OTEL_V2=trueplusOTEL_EXPORTER,OTEL_ENDPOINT,OTEL_ENVIRONMENT_NAME,OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT, and a per-componentOTEL_SERVICE_NAME(${tenant}-litellm-${env}-{gateway,backend}). AWS gains the richer GCP surface:otel_environment_name(defaults tovar.env),otel_capture_message_content(defaults tono_content), and*_extra_envoverride filtering so a caller-setOTEL_*key wins over the default for that service.var.otel_service_nameon AWS is gone, replaced by the per-component naming.uvicorn workers. GCP gains
gateway_num_workers, matching AWS; threads into the gateway args as--workers ${var.gateway_num_workers}.Docs reflect the parity: each README's OTel section, the GCP "Using as a module" Labels paragraph, and a new feature-parity table in the top-level README (and in the docs PR) that lays out the AWS/GCP input mapping side by side.
AWS 1-click end-to-end verification
fix(terraform/aws): expose skip_final_snapshot through the default example(1edeba1) ships a small follow-up surfaced by an end-to-end test of the AWS 1-click against a fresh account.The example wrapper already exposed
s3_force_destroyso a trial stack could be torn down without pre-emptying the bucket, but the matching Aurora knob (skip_final_snapshot) was buried inside the module call, so destroying a trial stack always left a<cluster>-final-<short-sha>snapshot behind. Both knobs guard the same kind of accidental data loss; both should be reachable from the example. The fix addsvar.skip_final_snapshotto the example wrapper (defaultfalse, tripwire stays intact) and threads it into the module input, mirroring hows3_force_destroyis wired. Tfvars example updated.The verification itself: stood up the full stack from
examples/default/on a clean AWS account inus-east-1withallow_plaintext_alb = true/s3_force_destroy = true/skip_final_snapshot = trueand the rest at defaults.terraform applyprovisioned 93 resources in ~20 minutes (Aurora cluster + writer + reader, ElastiCache, ALB, three ECS services, S3, NAT, the works); the auto-bootstrap created thelitellm_appIAM-authed Postgres user; the migration task applied the full Prisma schema; the three ECS services all reachedrolloutState: COMPLETED/ "steady state".Live curl against the ALB (master key fetched from Secrets Manager, no
proxy_config/ no extra secrets so it's the pure OSS / trial path):db: connectedon/health/readinessconfirms the Aurora-IAM chain (generate_iam_auth_token→DatabaseURLSettings.apply_to_env→ Prisma) works end-to-end./v1/modelsreturning 200 with the master key confirms gateway auth and ALB path-routing (LLM data plane → gateway target group)./key/inforeturning 404 "Key not found" instead of 401 confirms the backend is reachable through its own path-routing rule; the 404 is correct semantics, the master key is not a DB row./and/uireturning Next.js HTML confirms the UI service is healthy and the static-asset routing rules (/,/ui,/_next/*,/litellm-asset-prefix/*) work.terraform destroythen cleanly tore down all 93 resources (no errors, no leftover Aurora snapshot or S3 objects).Relevant issues
Linear ticket
Resolves LIT-3504
Docs
Companion docs PR: BerriAI/litellm-docs#275
Pre-Submission checklist
tests/test_litellm/directory, Adding at least 1 test is a hard requirement; see detailsmake test-unit@greptileaiand received a Confidence Score of at least 4/5 before requesting a maintainer reviewType
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
Changes
Infra/docs-only change to the
terraform/litellm/stacks and the deploy docs. No Python runtime code is touched.