Skip to content

refactor: convert AWS and GCP Terraform stacks into reusable modules …#28103

Merged
yassin-berriai merged 10 commits into
litellm_internal_stagingfrom
litellm_fix/tf-refactor
Jun 6, 2026
Merged

refactor: convert AWS and GCP Terraform stacks into reusable modules …#28103
yassin-berriai merged 10 commits into
litellm_internal_stagingfrom
litellm_fix/tf-refactor

Conversation

@yassin-berriai

@yassin-berriai yassin-berriai commented May 17, 2026

Copy link
Copy Markdown
Contributor

…with examples/default entry point

  • Remove provider blocks from both AWS and GCP stack roots so the modules can be consumed with count, for_each, depends_on, assumed-role or aliased providers; patterns that are forbidden when a module owns its own provider configuration
  • Add examples/default/ thin-root wrappers for both stacks that wire the provider (AWS) / providers (google + google-beta) and call the module with a curated variable surface, preserving the one-command deploy experience
  • Move terraform.tfvars.example files into examples/default/ alongside the new roots; update example comments to reflect the curated variable surface
  • Thread local.tags (containing litellm:stack, managed-by, and var.tags) explicitly onto every taggable AWS resource since the module no longer controls the provider's default_tags; GCP resource labels already flow through the module's labels input
  • Add examples/default/variables.tf and outputs.tf for both stacks, exposing the most-used knobs and re-exporting all module outputs
  • Commit provider lock files for both examples so terraform init is reproducible without a network fetch
  • Update top-level and per-stack READMEs to document the module-first design, the for_each multi-tenant pattern, and the examples/default/ quick-start path

Review follow-ups (Greptile)

  • P2 — redundant tag: removed the managed-by tag from the AWS example providers.tf default_tags (the module already stamps it via local.tags); reserved that block for pure org-wide tags.
  • P2 — for_each provider limitation (GCP): documented that the module declares no configuration_aliases, so a for_each over it runs every instance against the same project/region/credentials; noted in the GCP README and the example main.tf.
  • P1 — state migration: Greptile suggested a terraform state mv upgrade guide (and flagged that the "No changes" assertion wouldn't hold once the managed SSL cert is renamed). These stacks have never been published/tagged, so there's no existing deployment on the old root-module addresses to migrate from; the upgrade path doesn't exist. Rather than document a hypothetical (and partly inaccurate) migration, the guides were dropped from both READMEs and the docs PR.

Subsequent additions

Two follow-on commits land matching proxy_config mount + OpenTelemetry v2 work on both stacks; they sit on top of the refactor so the module-first layout above is what callers see.

feat(terraform/aws): mount proxy_config from S3 and wire OpenTelemetry v2 (ab2c928) drops the inline LITELLM_PROXY_CONFIG_B64 env var, uploads the YAML to S3 at config/litellm-config.yaml, and has the gateway and backend entrypoints pull it to /tmp/litellm-config.yaml via boto3 before exec'ing uvicorn. The S3 object etag is wired into the task definition so a config-only edit forces a new task-def revision and a rolling redeploy.

feat(terraform/gcp): mount proxy_config from GCS and wire OpenTelemetry v2 (c1492a5) does the same on the GCP side but uses a true Cloud Run v2 gcsfuse volume mount instead of an entrypoint download. The YAML is uploaded to a dedicated GCS bucket and mounted read-only into both the gateway and backend at /etc/litellm; CONFIG_FILE_PATH points at the mount path. An md5 of the YAML rides along as PROXY_CONFIG_HASH so a config-only edit forces a new Cloud Run revision (gcsfuse only surfaces a new object on container restart, so without the hash an updated proxy_config would sit in the bucket unread). The config bucket is separate from the data-plane bucket so the runtime SA can hold objectViewer here while keeping objectAdmin on the data-plane bucket.

AWS / GCP consistency pass

refactor(terraform): make AWS and GCP stacks behave identically (9c04aa1) brings both modules to the same surface and runtime shape so swapping clouds (or reading either README) is symmetric.

Labels and tags. GCP previously stamped var.labels onto only the two GCS buckets, leaving Cloud Run, Cloud SQL, Memorystore, Secret Manager, and the LB resources unlabeled. The module now computes local.labels (litellm-stack + managed-by + var.labels, mirroring AWS's local.tags) and threads it onto every label-supporting resource: Cloud Run services and the migrations job, Cloud SQL writer and reader (via user_labels), Memorystore, every Secret Manager entry, both GCS buckets, the LB global address, and the http/https forwarding rules. GCP uses litellm-stack instead of AWS's litellm:stack because GCP label keys forbid colons.

OTel is opt-in on both stacks. AWS already gated everything on otel_endpoint; GCP was stamping LITELLM_OTEL_V2=true into the shared env unconditionally. Both stacks now do the same thing: leave otel_endpoint empty and nothing OTel-related lands in the container env; set it and gateway and backend get LITELLM_OTEL_V2=true plus OTEL_EXPORTER, OTEL_ENDPOINT, OTEL_ENVIRONMENT_NAME, OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT, and a per-component OTEL_SERVICE_NAME (${tenant}-litellm-${env}-{gateway,backend}). AWS gains the richer GCP surface: otel_environment_name (defaults to var.env), otel_capture_message_content (defaults to no_content), and *_extra_env override filtering so a caller-set OTEL_* key wins over the default for that service. var.otel_service_name on AWS is gone, replaced by the per-component naming.

uvicorn workers. GCP gains gateway_num_workers, matching AWS; threads into the gateway args as --workers ${var.gateway_num_workers}.

Docs reflect the parity: each README's OTel section, the GCP "Using as a module" Labels paragraph, and a new feature-parity table in the top-level README (and in the docs PR) that lays out the AWS/GCP input mapping side by side.

AWS 1-click end-to-end verification

fix(terraform/aws): expose skip_final_snapshot through the default example (1edeba1) ships a small follow-up surfaced by an end-to-end test of the AWS 1-click against a fresh account.

The example wrapper already exposed s3_force_destroy so a trial stack could be torn down without pre-emptying the bucket, but the matching Aurora knob (skip_final_snapshot) was buried inside the module call, so destroying a trial stack always left a <cluster>-final-<short-sha> snapshot behind. Both knobs guard the same kind of accidental data loss; both should be reachable from the example. The fix adds var.skip_final_snapshot to the example wrapper (default false, tripwire stays intact) and threads it into the module input, mirroring how s3_force_destroy is wired. Tfvars example updated.

The verification itself: stood up the full stack from examples/default/ on a clean AWS account in us-east-1 with allow_plaintext_alb = true / s3_force_destroy = true / skip_final_snapshot = true and the rest at defaults. terraform apply provisioned 93 resources in ~20 minutes (Aurora cluster + writer + reader, ElastiCache, ALB, three ECS services, S3, NAT, the works); the auto-bootstrap created the litellm_app IAM-authed Postgres user; the migration task applied the full Prisma schema; the three ECS services all reached rolloutState: COMPLETED / "steady state".

Live curl against the ALB (master key fetched from Secrets Manager, no proxy_config / no extra secrets so it's the pure OSS / trial path):

===gateway: /health/readiness===
HTTP 200
{"status":"healthy","db":"connected"}
===backend: /key/info===
HTTP 404
{"error":{"message":"Key not found in database",...}}
===backend: /user/info===
HTTP 404
{"error":{"message":"User default_user_id not found",...}}
===ui: /===
HTTP 200
<!DOCTYPE html>... (Next.js index page)
===ui: /ui===
HTTP 200
<!DOCTYPE html>... (Next.js index page)
===gateway: /v1/models===
HTTP 200
{"data":[],"object":"list"}

db: connected on /health/readiness confirms the Aurora-IAM chain (generate_iam_auth_tokenDatabaseURLSettings.apply_to_env → Prisma) works end-to-end. /v1/models returning 200 with the master key confirms gateway auth and ALB path-routing (LLM data plane → gateway target group). /key/info returning 404 "Key not found" instead of 401 confirms the backend is reachable through its own path-routing rule; the 404 is correct semantics, the master key is not a DB row. / and /ui returning Next.js HTML confirms the UI service is healthy and the static-asset routing rules (/, /ui, /_next/*, /litellm-asset-prefix/*) work.

terraform destroy then cleanly tore down all 93 resources (no errors, no leftover Aurora snapshot or S3 objects).

Relevant issues

Linear ticket

Resolves LIT-3504

Docs

Companion docs PR: BerriAI/litellm-docs#275

Pre-Submission checklist

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement; see details
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🧹 Refactoring
📖 Documentation
🚄 Infrastructure

Changes

Infra/docs-only change to the terraform/litellm/ stacks and the deploy docs. No Python runtime code is touched.

@CLAassistant

CLAassistant commented May 17, 2026

Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 3 committers have signed the CLA.

❌ yassin-berriai
❌ yassinkortam
❌ claude
You have signed the CLA already but the status is still pending? Let us recheck it.

@codecov

codecov Bot commented May 17, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@greptile-apps

greptile-apps Bot commented May 17, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR refactors the AWS and GCP Terraform stacks into provider-free reusable modules, adds examples/default/ thin-root wrappers that own the provider config, threads local.tags explicitly onto every taggable AWS resource, adds a lifecycle { ignore_changes = [settings[0].disk_size] } fix to prevent Cloud SQL autoresize-driven destroy/recreate, and renames the GCP managed SSL cert to include a domain hash (with create_before_destroy) to fix the resourceInUseByAnotherResource failure on domain changes.

  • AWS module: providers.tf deleted; local.tags added to every taggable resource across alb.tf, ecs.tf, iam.tf, network.tf, rds.tf, redis.tf, s3.tf, secrets.tf, bootstrap.tf, and migrations.tf; state migration guide included in README.
  • GCP module: providers.tf deleted; Cloud SQL disk_size ignore_changes fix added; managed cert renamed with domain hash and create_before_destroy; state migration guide included in README — but the guide's "expect: No changes" claim is incorrect for TLS-enabled stacks because the cert rename triggers a replacement even after address migration.
  • Both stacks: New examples/default/ roots (provider config, curated variables, outputs, lock files) and updated READMEs documenting the module-first design and for_each multi-tenant pattern.

Confidence Score: 4/5

The refactor is structurally sound, but the GCP state migration guide tells operators to expect no plan changes when there will actually be a managed SSL certificate replacement for any TLS-enabled stack.

The AWS changes are clean: every taggable resource receives local.tags, the provider deletion is correct, and the migration guide accurately claims No Changes after address migration. The GCP Cloud SQL ignore_changes fix and the cert create_before_destroy rename are both correct improvements. The sole defect is that the GCP README migration guide claims terraform plan expect No Changes, which will not hold for any deployment with TLS enabled.

terraform/litellm/gcp/README.md — the state migration guide needs a note that one managed SSL certificate replacement is expected for TLS-enabled stacks and that it is safe due to create_before_destroy.

Important Files Changed

Filename Overview
terraform/litellm/aws/locals.tf Adds local.tags combining litellm:stack, managed-by, and var.tags; replaces what the now-deleted provider default_tags was doing.
terraform/litellm/aws/providers.tf Deleted — provider block moved to examples/default/providers.tf so the module can be consumed with count/for_each/aliased providers.
terraform/litellm/aws/examples/default/main.tf New thin-root wrapper that wires the AWS provider to the module with a curated variable surface; all looks correct.
terraform/litellm/aws/examples/default/providers.tf Root-level AWS provider with commented-out default_tags slot; managed-by redundancy from previous thread is addressed.
terraform/litellm/gcp/load_balancer.tf Adds hashed cert name and create_before_destroy to fix resourceInUseByAnotherResource on domain changes; the name change is a breaking rename for existing TLS deployments that the migration guide's "No changes" claim does not account for.
terraform/litellm/gcp/cloudsql.tf Adds lifecycle { ignore_changes = [settings[0].disk_size] } to both writer and reader instances to prevent disk autoresize triggering a destroy/recreate; correct fix.
terraform/litellm/gcp/providers.tf Deleted — google/google-beta provider blocks moved to gcp/examples/default/providers.tf.
terraform/litellm/gcp/examples/default/main.tf New thin-root wrapper for GCP; correctly notes the configuration_aliases limitation for for_each multi-project deployments.
terraform/litellm/gcp/README.md Adds module-first design docs, for_each multi-tenant pattern, and state migration guide; the guide's "expect: No changes" assertion is incorrect for TLS-enabled stacks due to the cert rename.
terraform/litellm/aws/README.md Adds module-first docs, for_each multi-tenant pattern, and AWS state migration guide; terraform plan # expect: No changes is accurate for AWS (no managed certs are renamed here).
terraform/litellm/aws/alb.tf Threads local.tags onto all ALB resources (lb, target groups, listeners, listener rules); coverage is complete.
terraform/litellm/aws/network.tf Replaces bare { Name = ... } tags with merge(local.tags, { Name = ... }) and adds local.tags to security groups; correct approach.

Reviews (2): Last reviewed commit: "docs(terraform): address review — state-..." | Re-trigger Greptile

Comment thread terraform/litellm/aws/README.md
Comment thread terraform/litellm/aws/examples/default/providers.tf
Comment thread terraform/litellm/gcp/examples/default/main.tf

Copy link
Copy Markdown
Contributor Author

@greptileai


Generated by Claude Code

Comment thread terraform/litellm/gcp/README.md Outdated
done
terraform plan # expect: No changes
```

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 GCP state migration "No changes" claim is incorrect for TLS-enabled stacks

This PR renames google_compute_managed_ssl_certificate from ${local.name}-cert to ${local.name}-cert-${substr(sha1(join(",", var.lb_domains)), 0, 8)} (in load_balancer.tf). After running the scripted state mv loop, Terraform will still see this resource as needing replacement because the name stored in state (acme-litellm-prod-cert) doesn't match the newly computed name. The create_before_destroy lifecycle makes the replacement safe, but the guide's terraform plan # expect: No changes assertion will fail for anyone who has lb_domains set — leaving operators unsure whether the replacement is expected or a sign that the migration went wrong.

The migration guide should note that one resource replacement is expected for TLS-enabled stacks, and that the create_before_destroy lifecycle handles it safely.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the cert-rename edge case. Rather than patch the wording, I've removed the state-migration guides from both the AWS and GCP READMEs entirely (and the corresponding note from the docs PR).

Rationale: these Terraform stacks have never been published or tagged for external consumption, so there are no existing deployments sitting on the old root-module addresses to migrate from. The terraform state mv guidance — and therefore the inaccurate "expect: No changes" assertion that wouldn't hold once the managed SSL cert is renamed — only applied to a hypothetical upgrade path that doesn't exist. Dropping it removes the incorrect claim at the source.

Done in d28283a (cert note) → superseded by 4c00811 (guides removed).


Generated by Claude Code

yassinkortam and others added 6 commits June 6, 2026 10:22
…with examples/default entry point

- Remove `provider` blocks from both AWS and GCP stack roots so the modules
  can be consumed with `count`, `for_each`, `depends_on`, assumed-role or
  aliased providers — patterns that are forbidden when a module owns its own
  provider configuration
- Add `examples/default/` thin-root wrappers for both stacks that wire the
  provider (AWS) / providers (google + google-beta) and call the module with
  a curated variable surface, preserving the one-command deploy experience
- Move `terraform.tfvars.example` files into `examples/default/` alongside
  the new roots; update example comments to reflect the curated variable surface
- Thread `local.tags` (containing `litellm:stack`, `managed-by`, and
  `var.tags`) explicitly onto every taggable AWS resource since the module no
  longer controls the provider's `default_tags`; GCP resource labels already
  flow through the module's `labels` input
- Add `examples/default/variables.tf` and `outputs.tf` for both stacks,
  exposing the most-used knobs and re-exporting all module outputs
- Commit provider lock files for both examples so `terraform init` is
  reproducible without a network fetch
- Update top-level and per-stack READMEs to document the module-first design,
  the `for_each` multi-tenant pattern, and the `examples/default/` quick-start path
…for_each note

- Add 'Migrating an existing deployment' section to AWS & GCP READMEs
  documenting the required terraform state mv step (resource addresses now
  gain a module.litellm. prefix under the examples/default root)
- Remove redundant managed-by tag from the AWS example providers.tf;
  reserve default_tags there for org-wide tags only
- Document the for_each single-provider limitation for GCP (no
  configuration_aliases) in the README and example main.tf

Resolves LIT-3504
…ation guide

The managed SSL cert is named with a hash of lb_domains, so TLS-enabled
stacks that migrated from the old un-hashed name will see one
create_before_destroy cert replacement after terraform state mv — not a
clean 'No changes'. Document that this single replacement is expected and
safe.
The AWS/GCP stacks have never been published, so there are no existing
deployments to migrate from the old root-module layout. Remove the
'Migrating an existing deployment' sections from both READMEs.
…click

The GCP stack's default image_registry points at ghcr.io, which Cloud
Run won't authenticate against, so any real deploy (HCP Terraform
no-code or otherwise) must override it. Document that as a hard
requirement on the GCP README rather than a side note, and add a
top-level HCP Terraform 1-click section enumerating the required
inputs per stack and the migration-task caveat for HCP-hosted runners.
…y v2

proxy_config

Drop the inline LITELLM_PROXY_CONFIG_B64 env var. Upload the YAML to S3
at config/litellm-config.yaml; gateway and backend container entrypoints
download it to /tmp/litellm-config.yaml via boto3 before exec'ing
uvicorn. The S3 object etag is wired into the task definition so a
config edit produces a new task-def revision and a rolling redeploy. The
existing s3_access policy already grants the task role s3:GetObject on
this bucket, so no IAM changes were needed for the mount itself.

OpenTelemetry v2

New variables otel_endpoint, otel_exporter, otel_service_name, and
otel_headers_secret_arn. Setting otel_endpoint to a non-empty value adds
LITELLM_OTEL_V2=true plus OTEL_EXPORTER / OTEL_ENDPOINT /
OTEL_SERVICE_NAME / OTEL_ENVIRONMENT_NAME to the shared env block; an
optional Secrets Manager ARN backs OTEL_HEADERS for collectors that need
an auth header. Execution role auto-gains GetSecretValue on that ARN.
Empty endpoint = nothing added, so existing deployments are unchanged.
@yassin-berriai yassin-berriai force-pushed the litellm_fix/tf-refactor branch from 4794582 to ab2c928 Compare June 6, 2026 17:23
Wires up a Cloud Shell "Open in Cloud Shell" badge backed by the
GoogleCloudPlatform DeployStack flow so examples/default can be
installed from a click in the README without a local terraform setup.

- examples/default/deploystack.json drives project/region collection
  plus prompts for tenant, env, image_tag, and allow_plaintext_lb.
  Complex inputs (proxy_config, *_extra_secrets, lb_domains) and
  sensitive vars (litellm_master_key, litellm_license, ui_password)
  stay tfvars / env only so they never land in a committed file.
- examples/default/TUTORIAL.md is a Cloud Shell walkthrough that
  enables required APIs, creates the GHCR-passthrough Artifact
  Registry repo, optionally exports the TF_VAR_* secrets, runs
  `deploystack install`, and shows how to fetch the master key plus
  migrate from plaintext LB to TLS.
- Renames var.project to var.project_id across the module and the
  examples/default wrapper to match the variable DeployStack injects
  from `collect_project: true`. Breaking rename for anyone with a
  `project = ...` line in terraform.tfvars; the fix is one line.
…ry v2

proxy_config

Drop the inline LITELLM_PROXY_CONFIG_B64 env var and the python-decode
startup fragment. Upload the YAML to a dedicated GCS bucket as
config.yaml, then mount it read-only into the gateway and backend at
/etc/litellm via Cloud Run v2's gcsfuse volume. CONFIG_FILE_PATH points
at the mount; an md5 of the YAML rides along as PROXY_CONFIG_HASH so a
config-only edit forces a new Cloud Run revision (gcsfuse only surfaces
new objects on container restart, so without the hash an updated
proxy_config would sit in the bucket unread).

The config bucket is separate from the data-plane bucket so the runtime
SA can hold objectViewer here (read-only at runtime) while keeping
objectAdmin on the data-plane bucket. Both bucket and IAM binding are
gated on proxy_config != {}; an empty config skips bucket creation and
mounts nothing.

OpenTelemetry v2

LITELLM_OTEL_V2=true is now wired into shared_env_kv unconditionally so
both the gateway and backend boot with the integration enabled. It's
dormant until otel_endpoint is non-empty; setting it injects
OTEL_EXPORTER / OTEL_ENDPOINT / OTEL_ENVIRONMENT_NAME plus a
per-component OTEL_SERVICE_NAME (\${tenant}-litellm-\${env}-{gateway,backend})
so spans land tagged with the right hop. otel_headers_secret takes a
Secret Manager resource ID for OTEL_HEADERS (collector auth); the
runtime SA auto-gains roles/secretmanager.secretAccessor on it.
otel_capture_message_content defaults to no_content matching the litellm
default. Any OTEL_* key set in *_extra_env wins over the defaults so
Cloud Run doesn't reject the apply on the duplicate-env-name check.
{
"name": "allow_plaintext_lb",
"description": "Skip TLS on the load balancer (HTTP-only). Set true for trial/dev. For production, leave false and add lb_domains to terraform.tfvars after the first apply",
"default": "true",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High: Plaintext load balancer by default

allow_plaintext_lb defaults to true for the one-click GCP deploy, which publishes the proxy and UI over http://<lb-ip> and the tutorial has users sign in with the master key on that URL. A network-path attacker can capture the master key or bearer tokens from users who follow the default DeployStack flow; make TLS the default path by collecting lb_domains before apply, or require an explicit opt-in for plaintext rather than defaulting to it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks — this is a deliberate, documented tradeoff for the one-click DeployStack path, so we're keeping the allow_plaintext_lb = "true" default here rather than flipping it.

Rationale: GCP's external HTTPS LB uses a Google-managed SSL certificate, which is only issued once the domain in lb_domains resolves to the LB's anycast IP. That IP doesn't exist until after the first apply, so TLS is inherently a two-step flow (apply → point DNS at lb_ip → set lb_domains and re-apply). A true single-click deploy can't provision managed TLS up front, so the installer defaults to HTTP to keep the zero-input path working for trial/dev.

Guardrails already in place:

  • The module is TLS-first by default outside DeployStack: terraform plan refuses to build an HTTP-only LB unless allow_plaintext_lb = true is set explicitly (precondition in load_balancer.tf). The plaintext default lives only in the one-click deploystack.json, which is explicitly the trial/dev entry point.
  • The setting is surfaced as a first-class toggle in the installer with the description telling users to leave it off and set lb_domains for production.
  • The README "TLS" section documents the production two-step flow and the 301→HTTPS upgrade behavior.

We'll make the production-hardening path more prominent in the DeployStack docs, but the HTTP default for the one-click trial flow is intended. Flagging the master-key-over-HTTP exposure for the trial path is fair, and that's why it's gated behind an explicit dev/trial opt-in everywhere except this installer.


Generated by Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is working as intended. The DeployStack one-click is a trial/dev convenience path, not the production entry point; that's why the prompt is explicitly labeled "Set true for trial/dev. For production, leave false and add lb_domains to terraform.tfvars after the first apply", and the tutorial has a dedicated "Going to TLS" step covering DNS + a managed cert.

The module and the examples/default root both default allow_plaintext_lb to false, so the production path fails closed: a plan without lb_domains errors out until the operator either provides a managed-cert domain or consciously opts into plaintext. The end user owns their own infrastructure and is responsible for standing it up securely in prod; we give them a secure-by-default module plus a clearly-scoped trial shortcut, which is the right split.

Not making a change here.


Generated by Claude Code

@veria-ai

veria-ai Bot commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

PR overview

This PR refactors the AWS and GCP Terraform stacks into reusable modules, including example deployment configuration for the LiteLLM GCP stack. The changed GCP DeployStack example defines the default one-click deployment inputs for publishing the proxy and UI behind a load balancer.

There is one open security issue: the default GCP one-click deployment currently enables a plaintext load balancer, causing users who follow the default flow to access the proxy and UI over HTTP. Because the tutorial flow involves signing in with the master key, a network-path attacker could capture credentials or bearer tokens unless TLS is made the default or plaintext is made an explicit opt-in. No issues have been addressed yet, so the PR still carries a clear deployment-default exposure.

Open issues (1)

Fixed/addressed: 0 · PR risk: 7/10

Bring both modules to the same surface and the same runtime behavior so
swapping clouds (or reading either README) is symmetric.

Labels and tags. GCP previously stamped var.labels onto only the two GCS
buckets, leaving Cloud Run, Cloud SQL, Memorystore, Secret Manager, and
the LB resources unlabeled; the variable description claimed full
coverage. Now the module computes local.labels (litellm-stack +
managed-by + var.labels, mirroring AWS's local.tags) and threads it onto
every label-supporting resource: Cloud Run services and the migrations
job, Cloud SQL writer and reader (via user_labels), Memorystore, Secret
Manager entries (master_key, license, ui_password, db_password), both
GCS buckets, the global LB address, and the http/https forwarding rules.
GCP keys use 'litellm-stack' instead of AWS's 'litellm:stack' because
GCP label keys forbid colons; var.labels now defaults to {}.

OpenTelemetry v2 is opt-in on both stacks. AWS already gated everything
on otel_endpoint; GCP previously stamped LITELLM_OTEL_V2=true into
shared_env unconditionally and only ungated the OTEL_* block. Both
stacks now do the same thing: leave otel_endpoint empty and nothing
OTel-related lands in the container env; set it and gateway and backend
get LITELLM_OTEL_V2=true plus OTEL_EXPORTER, OTEL_ENDPOINT,
OTEL_ENVIRONMENT_NAME, OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT,
and a per-component OTEL_SERVICE_NAME (${tenant}-litellm-${env}-gateway
or -backend) so spans land tagged with the right hop. AWS picks up the
richer GCP surface: otel_environment_name (defaults to var.env),
otel_capture_message_content (defaults to no_content), and *_extra_env
override filtering so a caller-set OTEL_* key wins over the default for
that service (ECS allows duplicates, but the filter gives the same
predictable last-wins shape Cloud Run enforces). var.otel_service_name
on AWS is gone, replaced by the per-component naming.

uvicorn workers. GCP gains gateway_num_workers, matching AWS; threads
into the gateway args as --workers ${var.gateway_num_workers}.

Docs reflect the parity: each README's OTel section, the GCP 'Using as
a module' Labels paragraph, and a new feature-parity table in the
top-level README that lays out the AWS/GCP input mapping side by side.
…ample

The example wrapper already exposed `s3_force_destroy` so ephemeral / CI
stacks could destroy the S3 bucket without manual cleanup, but the matching
Aurora knob (`skip_final_snapshot`) was hidden behind the module surface.
That meant a `terraform destroy` on a trial stack still produced a
`<cluster>-final-<short-sha>` snapshot, with no opt-out short of editing the
module call.

Adds `var.skip_final_snapshot` to the example (default `false`, preserving
the data-loss tripwire) and threads it through to the module input,
mirroring the existing `s3_force_destroy` pattern. Documented alongside it
in the tfvars example.

Verified by deploying the example end-to-end against a clean AWS account
(VPC + Aurora w/ IAM auth + Redis + ALB + 3 ECS services), confirming all
services reach steady state and the data plane serves traffic, then running
`terraform destroy` with `skip_final_snapshot = true` to a clean teardown
(93 destroyed, no Aurora snapshot left behind, no leftover billable
resources).
@yassin-berriai yassin-berriai merged commit 1cff02f into litellm_internal_staging Jun 6, 2026
116 of 117 checks passed
@yassin-berriai yassin-berriai deleted the litellm_fix/tf-refactor branch June 6, 2026 19:57
hbjydev pushed a commit to hbjydev/phoebe that referenced this pull request Jun 14, 2026
…9.0) (#93)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/berriai/litellm](https://images.chainguard.dev/directory/image/wolfi-base/overview) ([source](https://github.com/BerriAI/litellm)) | minor | `v1.88.1` → `v1.89.0` |

---

### Release Notes

<details>
<summary>BerriAI/litellm (ghcr.io/berriai/litellm)</summary>

### [`v1.89.0`](https://github.com/BerriAI/litellm/releases/tag/v1.89.0)

[Compare Source](https://github.com/BerriAI/litellm/compare/v1.89.0...v1.89.0)

##### Verify Docker Image Signature

All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).

**Verify using the pinned commit hash (recommended):**

A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
  ghcr.io/berriai/litellm:v1.89.0
```

**Verify using the release tag (convenience):**

Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/v1.89.0/cosign.pub \
  ghcr.io/berriai/litellm:v1.89.0
```

Expected output:

```
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - The signatures were verified against the specified public key
```

***

##### What's Changed

- test(responses): bump deprecated gemini-3-pro-preview to gemini-3.1-pro-preview by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29433](https://github.com/BerriAI/litellm/pull/29433)
- fix: map mistral/ministral-8b-latest in model price map by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29453](https://github.com/BerriAI/litellm/pull/29453)
- fix(datadog): split oversized batches on 413 instead of re-queueing forever by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29444](https://github.com/BerriAI/litellm/pull/29444)
- feat(otel): allowlist team\_metadata sub-keys promoted to baggage by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29442](https://github.com/BerriAI/litellm/pull/29442)
- fix: stop use\_chat\_completions\_api flag from leaking into provider request body by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29447](https://github.com/BerriAI/litellm/pull/29447)
- fix(anthropic, fireworks): inline legacy $ref defs in tool schemas by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;28646](https://github.com/BerriAI/litellm/pull/28646)
- fix(proxy): omit OpenAI \[DONE] on google-genai streamGenerateContent by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29426](https://github.com/BerriAI/litellm/pull/29426)
- ci(release): create stable/X.Y.x line branch on X.Y.0 tags by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29457](https://github.com/BerriAI/litellm/pull/29457)
- fix(vector-stores): support engines URL for Vertex AI Search by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;27885](https://github.com/BerriAI/litellm/pull/27885)
- fix(ui): render caller-supplied filter options in caller order by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29462](https://github.com/BerriAI/litellm/pull/29462)
- fix(batches): skip unnecessary batch input file reads by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29114](https://github.com/BerriAI/litellm/pull/29114)
- docs(agents): clarify when to create new test files by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29472](https://github.com/BerriAI/litellm/pull/29472)
- Litellm OSS Staging by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29161](https://github.com/BerriAI/litellm/pull/29161)
- fix(mcp): clear allowed\_tools and tool overrides on MCP server edit by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29411](https://github.com/BerriAI/litellm/pull/29411)
- Litellm OSS Staging 010626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29422](https://github.com/BerriAI/litellm/pull/29422)
- fix(ci): make CircleCI rerun-failed-tests collect tests when 2+ test files fail by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29475](https://github.com/BerriAI/litellm/pull/29475)
- feat(a2a): watsonx Orchestrate agent provider by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29410](https://github.com/BerriAI/litellm/pull/29410)
- fix(azure\_ai): strip tool-level extra fields on 400 and retry by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29479](https://github.com/BerriAI/litellm/pull/29479)
- fix(docs): remove fixed dimensions from README hero image by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29496](https://github.com/BerriAI/litellm/pull/29496)
- Litellm oss staging by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29492](https://github.com/BerriAI/litellm/pull/29492)
- fix: small CLAUDE.md nits by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29504](https://github.com/BerriAI/litellm/pull/29504)
- Add MCP semantic conventions to otelv2 by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29468](https://github.com/BerriAI/litellm/pull/29468)
- fix(passthrough): emit otel guardrail span when a guardrail blocks by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29470](https://github.com/BerriAI/litellm/pull/29470)
- fix(proxy): strip NUL bytes from spend log payloads to prevent PostgreSQL 22P05 by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29515](https://github.com/BerriAI/litellm/pull/29515)
- \[internal copy of [#&#8203;28008](https://github.com/BerriAI/litellm/issues/28008)] Support MCP OAuth passthrough and issuer-scoped JWT auth by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28356](https://github.com/BerriAI/litellm/pull/28356)
- feat(vector-stores): forward per-request params to Vertex AI Search by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29459](https://github.com/BerriAI/litellm/pull/29459)
- feat(proxy): add per-MCP-server RPM rate limiting for keys and teams by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29482](https://github.com/BerriAI/litellm/pull/29482)
- fix(tests): drop module-level test calls that break local\_testing collection by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29520](https://github.com/BerriAI/litellm/pull/29520)
- feat(agents): add LangFlow agent provider with A2A session bridging by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28963](https://github.com/BerriAI/litellm/pull/28963)
- fix(ui/agents): make A2A skill tags enterable and validated by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29512](https://github.com/BerriAI/litellm/pull/29512)
- \[internal copy of [#&#8203;29232](https://github.com/BerriAI/litellm/issues/29232)] feat: route future Claude models to Anthropic provider via pattern matching by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29239](https://github.com/BerriAI/litellm/pull/29239)
- fix(tests): drop import-time completion call in test\_register\_model by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29521](https://github.com/BerriAI/litellm/pull/29521)
- test: stabilize batch VCR coverage and stop live upload/network leaks by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29477](https://github.com/BerriAI/litellm/pull/29477)
- \[internal copy of [#&#8203;29003](https://github.com/BerriAI/litellm/issues/29003)] fix(vertex\_ai): use user-supplied api\_base as is for Model Garden OpenAI-compat path by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29530](https://github.com/BerriAI/litellm/pull/29530)
- feat(proxy): native /health/drain preStop hook for graceful shutdown by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29439](https://github.com/BerriAI/litellm/pull/29439)
- fix(auth): preserve 401 status for expired JWTs in OTel traces by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29510](https://github.com/BerriAI/litellm/pull/29510)
- fix(otel): capture 401 error details in management endpoint spans by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29535](https://github.com/BerriAI/litellm/pull/29535)
- test(proxy/utils): pin bottom-of-file helper behavior by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29509](https://github.com/BerriAI/litellm/pull/29509)
- test(proxy/utils): pin PrismaClient and spend-update behavior by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29488](https://github.com/BerriAI/litellm/pull/29488)
- test(proxy/utils): pin ProxyLogging behavior by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29485](https://github.com/BerriAI/litellm/pull/29485)
- fix: missing span for guardrail passthrough by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29552](https://github.com/BerriAI/litellm/pull/29552)
- fix(auth): let internal users view search tools by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29542](https://github.com/BerriAI/litellm/pull/29542)
- fix: missing mcp otel attributes by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29554](https://github.com/BerriAI/litellm/pull/29554)
- fix(proxy): resolve managed video model ids for auth by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29545](https://github.com/BerriAI/litellm/pull/29545)
- fix(key\_generate): allow team members to create keys on org-scoped teams by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29310](https://github.com/BerriAI/litellm/pull/29310)
- test(pass-through): move Gemini pass-through tests to gemini-3.1-flash-lite by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29595](https://github.com/BerriAI/litellm/pull/29595)
- Litellm oss staging 030626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29578](https://github.com/BerriAI/litellm/pull/29578)
- Fix : a2a bugs 030626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29566](https://github.com/BerriAI/litellm/pull/29566)
- \[internal copy of [#&#8203;29533](https://github.com/BerriAI/litellm/issues/29533)] fix(anthropic/adapter): emit thinking block for reasoning\_content-only streaming chunks by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29600](https://github.com/BerriAI/litellm/pull/29600)
- ci: reproduce default-Windows wheel install to guard MAX\_PATH by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29597](https://github.com/BerriAI/litellm/pull/29597)
- fix(vertex): strip output\_config.effort for Vertex Claude models that reject it (Haiku 4.5) by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29585](https://github.com/BerriAI/litellm/pull/29585)
- Litellm websocket improvements by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29563](https://github.com/BerriAI/litellm/pull/29563)
- feat(arize/phoenix): OpenInference rendering parity — tool\_calls, cost, passthrough I/O, session/user, multimodal, cache tokens by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;28800](https://github.com/BerriAI/litellm/pull/28800)
- \[internal copy of [#&#8203;29550](https://github.com/BerriAI/litellm/issues/29550)] fix: passthrough endpoints duplicate logs by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29598](https://github.com/BerriAI/litellm/pull/29598)
- fix(ci): keep coverage rename green when a parallel node runs no tests by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29608](https://github.com/BerriAI/litellm/pull/29608)
- test(vcr): close out the remaining VCR live-call leaks by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29603](https://github.com/BerriAI/litellm/pull/29603)
- fix(key\_generate): exempt UI/CLI session tokens from the budget ceiling for team keys by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29612](https://github.com/BerriAI/litellm/pull/29612)
- fix(realtime): allow null transcripts in stream logging payloads by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29625](https://github.com/BerriAI/litellm/pull/29625)
- build(ui): migrate eslint to flat config + bump eslint-config-next to 16 by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29626](https://github.com/BerriAI/litellm/pull/29626)
- fix(key\_generate): scope session-token team-key budget exemption to caller-supplied team\_id by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29641](https://github.com/BerriAI/litellm/pull/29641)
- fix(proxy): disable proxy buffering on streaming SSE responses by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29557](https://github.com/BerriAI/litellm/pull/29557)
- fix(mcp): gate /public/mcp\_hub strictly on litellm.public\_mcp\_servers by [@&#8203;michelligabriele](https://github.com/michelligabriele) in [#&#8203;27764](https://github.com/BerriAI/litellm/pull/27764)
- ci(ui): frontend-lint job enforcing prettier + eslint on changed files by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29633](https://github.com/BerriAI/litellm/pull/29633)
- fix(gemini): googleSearch + server-side tools and googleMaps JSON schema by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29582](https://github.com/BerriAI/litellm/pull/29582)
- fix(proxy): passthrough 404 when SERVER\_ROOT\_PATH is set by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29658](https://github.com/BerriAI/litellm/pull/29658)
- fix(gemini-realtime): use GA event names for Pipecat 1.3.x compatibility by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29662](https://github.com/BerriAI/litellm/pull/29662)
- Litellm oss staging 040626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29671](https://github.com/BerriAI/litellm/pull/29671)
- style(ui): prettier formatting pass over the dashboard by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29622](https://github.com/BerriAI/litellm/pull/29622)
- chore: ignore prettier dashboard reformat in git blame by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29695](https://github.com/BerriAI/litellm/pull/29695)
- fix(helm): Enable Backend Deployment to mount Gateway config.yaml by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29605](https://github.com/BerriAI/litellm/pull/29605)
- \[internal copy of [#&#8203;29277](https://github.com/BerriAI/litellm/issues/29277)] fix(proxy): add default=None to LiteLLM\_TeamMembership.litellm\_budget\_table by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29684](https://github.com/BerriAI/litellm/pull/29684)
- test: make custom\_tokenizer proxy tests hermetic by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29643](https://github.com/BerriAI/litellm/pull/29643)
- test(proxy): stop running real-DB tests in GitHub Actions unit jobs by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29700](https://github.com/BerriAI/litellm/pull/29700)
- chore(ui): remove the bare-fetch lint rule by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29712](https://github.com/BerriAI/litellm/pull/29712)
- Litellm jwt mapping virtualkeys by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;28510](https://github.com/BerriAI/litellm/pull/28510)
- refactor(ui): shared HTTP client + location-pinned fetch() lint rule by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29723](https://github.com/BerriAI/litellm/pull/29723)
- fix(proxy): stop team BYOK model name corruption on model edit by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29731](https://github.com/BerriAI/litellm/pull/29731)
- \[internal copy of [#&#8203;29511](https://github.com/BerriAI/litellm/issues/29511)] feat(guardrails): add sensitive data routing to on-premise models by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29531](https://github.com/BerriAI/litellm/pull/29531)
- fix(proxy/hooks): populate llm\_provider on internal rate-limit errors by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;27707](https://github.com/BerriAI/litellm/pull/27707)
- fix(vertex/anthropic): handle namespace tools and strip client\_metadata for codex compatibility by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29489](https://github.com/BerriAI/litellm/pull/29489)
- Support OAuth M2M for Databricks Apps A2A agents by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29586](https://github.com/BerriAI/litellm/pull/29586)
- fix: small CLAUDE.md nit by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29749](https://github.com/BerriAI/litellm/pull/29749)
- fix(anthropic): route Claude Opus 4.8 through adaptive thinking by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29702](https://github.com/BerriAI/litellm/pull/29702)
- fix(proxy): persist oauth2\_flow on MCP server registration by [@&#8203;michelligabriele](https://github.com/michelligabriele) in [#&#8203;29690](https://github.com/BerriAI/litellm/pull/29690)
- \[internal copy of [#&#8203;27491](https://github.com/BerriAI/litellm/issues/27491)] fix(realtime): Fix Realtime Audio Token Cost Tracking by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29722](https://github.com/BerriAI/litellm/pull/29722)
- fix(galileo): use ingest traces API and standard logging payload by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29651](https://github.com/BerriAI/litellm/pull/29651)
- fix(auth): expand all-team-models sentinel in can\_key\_call\_model for batch validation by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29746](https://github.com/BerriAI/litellm/pull/29746)
- test(vcr): stop refreshing cassette TTL on read so cassettes lapse after 24h by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29784](https://github.com/BerriAI/litellm/pull/29784)
- test(ci): record/replay OpenAI image gen so the spend E2E isn't outage-bound by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29787](https://github.com/BerriAI/litellm/pull/29787)
- fix(ui): route MCP playground auth by oauth2 mode instead of token\_url by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29714](https://github.com/BerriAI/litellm/pull/29714)
- refactor(ui): centralize proxy base URL resolution into tested resolver by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29793](https://github.com/BerriAI/litellm/pull/29793)
- Litellm oss staging 050626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29774](https://github.com/BerriAI/litellm/pull/29774)
- test(google): add google-genai SDK proxy integration tests by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29781](https://github.com/BerriAI/litellm/pull/29781)
- fix(jwt): use resolved DB user\_id for spend on legacy email match by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29217](https://github.com/BerriAI/litellm/pull/29217)
- feat(ui): generate dashboard API types from the proxy OpenAPI spec by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29816](https://github.com/BerriAI/litellm/pull/29816)
- fix(proxy): drop deleted team BYOK model name from team.models by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29820](https://github.com/BerriAI/litellm/pull/29820)
- feat(mcp): per-server env vars with global + per-user scopes by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28917](https://github.com/BerriAI/litellm/pull/28917)
- refactor(ui): route behavior-preserving networking calls through apiClient by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29806](https://github.com/BerriAI/litellm/pull/29806)
- fix(mcp): persist Tools-tab MCP OAuth token to DB by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29809](https://github.com/BerriAI/litellm/pull/29809)
- fix(ui): require new expiration when regenerating an expired key by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29838](https://github.com/BerriAI/litellm/pull/29838)
- refactor(ui): route query-building networking calls through apiClient by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29815](https://github.com/BerriAI/litellm/pull/29815)
- Make the image-gen record/replay proxy report cache mode and per-request HIT/MISS by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29802](https://github.com/BerriAI/litellm/pull/29802)
- feat(proxy): hot-reload .env in dev when running with --reload by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29783](https://github.com/BerriAI/litellm/pull/29783)
- fix(ui): stop MCP playground tool calls from sending twice by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29821](https://github.com/BerriAI/litellm/pull/29821)
- feat(fal\_ai): add Nano Banana / Gemini 2.5 Flash Image generation support by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29798](https://github.com/BerriAI/litellm/pull/29798)
- Title: Fix managed batch cancel credential resolution by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29734](https://github.com/BerriAI/litellm/pull/29734)
- Title: fix(proxy): resolve vector store file list credentials from team deployments by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29739](https://github.com/BerriAI/litellm/pull/29739)
- refactor: convert AWS and GCP Terraform stacks into reusable modules … by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28103](https://github.com/BerriAI/litellm/pull/28103)
- chore(ui): build ui for release by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29853](https://github.com/BerriAI/litellm/pull/29853)
- fix(terraform/gcp): prompt for image\_registry in DeployStack one-click by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29852](https://github.com/BerriAI/litellm/pull/29852)
- fix(terraform/gcp): abandon SQL user on destroy by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29855](https://github.com/BerriAI/litellm/pull/29855)
- Extend the record/replay proxy to chat, embeddings, moderations, rerank, and Anthropic by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29847](https://github.com/BerriAI/litellm/pull/29847)
- chore(deps): bump deps by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29860](https://github.com/BerriAI/litellm/pull/29860)
- chore(ci): promote internal staging to main by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29861](https://github.com/BerriAI/litellm/pull/29861)
- fix: 400 on Anthropic context overflow; seed identity on failed auth by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29848](https://github.com/BerriAI/litellm/pull/29848)
- chore(ci): promote internal staging to main by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29862](https://github.com/BerriAI/litellm/pull/29862)
- chore(release): patch v1.89.0-rc.1 with [#&#8203;30064](https://github.com/BerriAI/litellm/issues/30064) (Claude Fable 5) for v1.89.0-rc.2 by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;30143](https://github.com/BerriAI/litellm/pull/30143)

**Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.0...v1.89.0>

### [`v1.89.0`](https://github.com/BerriAI/litellm/releases/tag/v1.89.0)

[Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.2...v1.89.0)

##### Verify Docker Image Signature

All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).

**Verify using the pinned commit hash (recommended):**

A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
  ghcr.io/berriai/litellm:v1.89.0
```

**Verify using the release tag (convenience):**

Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/v1.89.0/cosign.pub \
  ghcr.io/berriai/litellm:v1.89.0
```

Expected output:

```
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - The signatures were verified against the specified public key
```

***

##### What's Changed

- test(responses): bump deprecated gemini-3-pro-preview to gemini-3.1-pro-preview by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29433](https://github.com/BerriAI/litellm/pull/29433)
- fix: map mistral/ministral-8b-latest in model price map by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29453](https://github.com/BerriAI/litellm/pull/29453)
- fix(datadog): split oversized batches on 413 instead of re-queueing forever by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29444](https://github.com/BerriAI/litellm/pull/29444)
- feat(otel): allowlist team\_metadata sub-keys promoted to baggage by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29442](https://github.com/BerriAI/litellm/pull/29442)
- fix: stop use\_chat\_completions\_api flag from leaking into provider request body by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29447](https://github.com/BerriAI/litellm/pull/29447)
- fix(anthropic, fireworks): inline legacy $ref defs in tool schemas by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;28646](https://github.com/BerriAI/litellm/pull/28646)
- fix(proxy): omit OpenAI \[DONE] on google-genai streamGenerateContent by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29426](https://github.com/BerriAI/litellm/pull/29426)
- ci(release): create stable/X.Y.x line branch on X.Y.0 tags by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29457](https://github.com/BerriAI/litellm/pull/29457)
- fix(vector-stores): support engines URL for Vertex AI Search by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;27885](https://github.com/BerriAI/litellm/pull/27885)
- fix(ui): render caller-supplied filter options in caller order by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29462](https://github.com/BerriAI/litellm/pull/29462)
- fix(batches): skip unnecessary batch input file reads by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29114](https://github.com/BerriAI/litellm/pull/29114)
- docs(agents): clarify when to create new test files by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29472](https://github.com/BerriAI/litellm/pull/29472)
- Litellm OSS Staging by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29161](https://github.com/BerriAI/litellm/pull/29161)
- fix(mcp): clear allowed\_tools and tool overrides on MCP server edit by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29411](https://github.com/BerriAI/litellm/pull/29411)
- Litellm OSS Staging 010626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29422](https://github.com/BerriAI/litellm/pull/29422)
- fix(ci): make CircleCI rerun-failed-tests collect tests when 2+ test files fail by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29475](https://github.com/BerriAI/litellm/pull/29475)
- feat(a2a): watsonx Orchestrate agent provider by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29410](https://github.com/BerriAI/litellm/pull/29410)
- fix(azure\_ai): strip tool-level extra fields on 400 and retry by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29479](https://github.com/BerriAI/litellm/pull/29479)
- fix(docs): remove fixed dimensions from README hero image by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29496](https://github.com/BerriAI/litellm/pull/29496)
- Litellm oss staging by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29492](https://github.com/BerriAI/litellm/pull/29492)
- fix: small CLAUDE.md nits by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29504](https://github.com/BerriAI/litellm/pull/29504)
- Add MCP semantic conventions to otelv2 by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29468](https://github.com/BerriAI/litellm/pull/29468)
- fix(passthrough): emit otel guardrail span when a guardrail blocks by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29470](https://github.com/BerriAI/litellm/pull/29470)
- fix(proxy): strip NUL bytes from spend log payloads to prevent PostgreSQL 22P05 by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29515](https://github.com/BerriAI/litellm/pull/29515)
- \[internal copy of [#&#8203;28008](https://github.com/BerriAI/litellm/issues/28008)] Support MCP OAuth passthrough and issuer-scoped JWT auth by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28356](https://github.com/BerriAI/litellm/pull/28356)
- feat(vector-stores): forward per-request params to Vertex AI Search by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29459](https://github.com/BerriAI/litellm/pull/29459)
- feat(proxy): add per-MCP-server RPM rate limiting for keys and teams by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29482](https://github.com/BerriAI/litellm/pull/29482)
- fix(tests): drop module-level test calls that break local\_testing collection by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29520](https://github.com/BerriAI/litellm/pull/29520)
- feat(agents): add LangFlow agent provider with A2A session bridging by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28963](https://github.com/BerriAI/litellm/pull/28963)
- fix(ui/agents): make A2A skill tags enterable and validated by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29512](https://github.com/BerriAI/litellm/pull/29512)
- \[internal copy of [#&#8203;29232](https://github.com/BerriAI/litellm/issues/29232)] feat: route future Claude models to Anthropic provider via pattern matching by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29239](https://github.com/BerriAI/litellm/pull/29239)
- fix(tests): drop import-time completion call in test\_register\_model by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29521](https://github.com/BerriAI/litellm/pull/29521)
- test: stabilize batch VCR coverage and stop live upload/network leaks by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29477](https://github.com/BerriAI/litellm/pull/29477)
- \[internal copy of [#&#8203;29003](https://github.com/BerriAI/litellm/issues/29003)] fix(vertex\_ai): use user-supplied api\_base as is for Model Garden OpenAI-compat path by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29530](https://github.com/BerriAI/litellm/pull/29530)
- feat(proxy): native /health/drain preStop hook for graceful shutdown by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29439](https://github.com/BerriAI/litellm/pull/29439)
- fix(auth): preserve 401 status for expired JWTs in OTel traces by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29510](https://github.com/BerriAI/litellm/pull/29510)
- fix(otel): capture 401 error details in management endpoint spans by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29535](https://github.com/BerriAI/litellm/pull/29535)
- test(proxy/utils): pin bottom-of-file helper behavior by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29509](https://github.com/BerriAI/litellm/pull/29509)
- test(proxy/utils): pin PrismaClient and spend-update behavior by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29488](https://github.com/BerriAI/litellm/pull/29488)
- test(proxy/utils): pin ProxyLogging behavior by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29485](https://github.com/BerriAI/litellm/pull/29485)
- fix: missing span for guardrail passthrough by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29552](https://github.com/BerriAI/litellm/pull/29552)
- fix(auth): let internal users view search tools by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29542](https://github.com/BerriAI/litellm/pull/29542)
- fix: missing mcp otel attributes by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29554](https://github.com/BerriAI/litellm/pull/29554)
- fix(proxy): resolve managed video model ids for auth by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29545](https://github.com/BerriAI/litellm/pull/29545)
- fix(key\_generate): allow team members to create keys on org-scoped teams by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29310](https://github.com/BerriAI/litellm/pull/29310)
- test(pass-through): move Gemini pass-through tests to gemini-3.1-flash-lite by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29595](https://github.com/BerriAI/litellm/pull/29595)
- Litellm oss staging 030626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29578](https://github.com/BerriAI/litellm/pull/29578)
- Fix : a2a bugs 030626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29566](https://github.com/BerriAI/litellm/pull/29566)
- \[internal copy of [#&#8203;29533](https://github.com/BerriAI/litellm/issues/29533)] fix(anthropic/adapter): emit thinking block for reasoning\_content-only streaming chunks by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29600](https://github.com/BerriAI/litellm/pull/29600)
- ci: reproduce default-Windows wheel install to guard MAX\_PATH by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29597](https://github.com/BerriAI/litellm/pull/29597)
- fix(vertex): strip output\_config.effort for Vertex Claude models that reject it (Haiku 4.5) by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29585](https://github.com/BerriAI/litellm/pull/29585)
- Litellm websocket improvements by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29563](https://github.com/BerriAI/litellm/pull/29563)
- feat(arize/phoenix): OpenInference rendering parity — tool\_calls, cost, passthrough I/O, session/user, multimodal, cache tokens by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;28800](https://github.com/BerriAI/litellm/pull/28800)
- \[internal copy of [#&#8203;29550](https://github.com/BerriAI/litellm/issues/29550)] fix: passthrough endpoints duplicate logs by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29598](https://github.com/BerriAI/litellm/pull/29598)
- fix(ci): keep coverage rename green when a parallel node runs no tests by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29608](https://github.com/BerriAI/litellm/pull/29608)
- test(vcr): close out the remaining VCR live-call leaks by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29603](https://github.com/BerriAI/litellm/pull/29603)
- fix(key\_generate): exempt UI/CLI session tokens from the budget ceiling for team keys by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29612](https://github.com/BerriAI/litellm/pull/29612)
- fix(realtime): allow null transcripts in stream logging payloads by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29625](https://github.com/BerriAI/litellm/pull/29625)
- build(ui): migrate eslint to flat config + bump eslint-config-next to 16 by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29626](https://github.com/BerriAI/litellm/pull/29626)
- fix(key\_generate): scope session-token team-key budget exemption to caller-supplied team\_id by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29641](https://github.com/BerriAI/litellm/pull/29641)
- fix(proxy): disable proxy buffering on streaming SSE responses by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29557](https://github.com/BerriAI/litellm/pull/29557)
- fix(mcp): gate /public/mcp\_hub strictly on litellm.public\_mcp\_servers by [@&#8203;michelligabriele](https://github.com/michelligabriele) in [#&#8203;27764](https://github.com/BerriAI/litellm/pull/27764)
- ci(ui): frontend-lint job enforcing prettier + eslint on changed files by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29633](https://github.com/BerriAI/litellm/pull/29633)
- fix(gemini): googleSearch + server-side tools and googleMaps JSON schema by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29582](https://github.com/BerriAI/litellm/pull/29582)
- fix(proxy): passthrough 404 when SERVER\_ROOT\_PATH is set by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29658](https://github.com/BerriAI/litellm/pull/29658)
- fix(gemini-realtime): use GA event names for Pipecat 1.3.x compatibility by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29662](https://github.com/BerriAI/litellm/pull/29662)
- Litellm oss staging 040626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29671](https://github.com/BerriAI/litellm/pull/29671)
- style(ui): prettier formatting pass over the dashboard by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29622](https://github.com/BerriAI/litellm/pull/29622)
- chore: ignore prettier dashboard reformat in git blame by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29695](https://github.com/BerriAI/litellm/pull/29695)
- fix(helm): Enable Backend Deployment to mount Gateway config.yaml by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29605](https://github.com/BerriAI/litellm/pull/29605)
- \[internal copy of [#&#8203;29277](https://github.com/BerriAI/litellm/issues/29277)] fix(proxy): add default=None to LiteLLM\_TeamMembership.litellm\_budget\_table by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29684](https://github.com/BerriAI/litellm/pull/29684)
- test: make custom\_tokenizer proxy tests hermetic by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29643](https://github.com/BerriAI/litellm/pull/29643)
- test(proxy): stop running real-DB tests in GitHub Actions unit jobs by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29700](https://github.com/BerriAI/litellm/pull/29700)
- chore(ui): remove the bare-fetch lint rule by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29712](https://github.com/BerriAI/litellm/pull/29712)
- Litellm jwt mapping virtualkeys by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;28510](https://github.com/BerriAI/litellm/pull/28510)
- refactor(ui): shared HTTP client + location-pinned fetch() lint rule by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29723](https://github.com/BerriAI/litellm/pull/29723)
- fix(proxy): stop team BYOK model name corruption on model edit by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29731](https://github.com/BerriAI/litellm/pull/29731)
- \[internal copy of [#&#8203;29511](https://github.com/BerriAI/litellm/issues/29511)] feat(guardrails): add sensitive data routing to on-premise models by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29531](https://github.com/BerriAI/litellm/pull/29531)
- fix(proxy/hooks): populate llm\_provider on internal rate-limit errors by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;27707](https://github.com/BerriAI/litellm/pull/27707)
- fix(vertex/anthropic): handle namespace tools and strip client\_metadata for codex compatibility by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29489](https://github.com/BerriAI/litellm/pull/29489)
- Support OAuth M2M for Databricks Apps A2A agents by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29586](https://github.com/BerriAI/litellm/pull/29586)
- fix: small CLAUDE.md nit by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29749](https://github.com/BerriAI/litellm/pull/29749)
- fix(anthropic): route Claude Opus 4.8 through adaptive thinking by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29702](https://github.com/BerriAI/litellm/pull/29702)
- fix(proxy): persist oauth2\_flow on MCP server registration by [@&#8203;michelligabriele](https://github.com/michelligabriele) in [#&#8203;29690](https://github.com/BerriAI/litellm/pull/29690)
- \[internal copy of [#&#8203;27491](https://github.com/BerriAI/litellm/issues/27491)] fix(realtime): Fix Realtime Audio Token Cost Tracking by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29722](https://github.com/BerriAI/litellm/pull/29722)
- fix(galileo): use ingest traces API and standard logging payload by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29651](https://github.com/BerriAI/litellm/pull/29651)
- fix(auth): expand all-team-models sentinel in can\_key\_call\_model for batch validation by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29746](https://github.com/BerriAI/litellm/pull/29746)
- test(vcr): stop refreshing cassette TTL on read so cassettes lapse after 24h by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29784](https://github.com/BerriAI/litellm/pull/29784)
- test(ci): record/replay OpenAI image gen so the spend E2E isn't outage-bound by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29787](https://github.com/BerriAI/litellm/pull/29787)
- fix(ui): route MCP playground auth by oauth2 mode instead of token\_url by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29714](https://github.com/BerriAI/litellm/pull/29714)
- refactor(ui): centralize proxy base URL resolution into tested resolver by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29793](https://github.com/BerriAI/litellm/pull/29793)
- Litellm oss staging 050626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29774](https://github.com/BerriAI/litellm/pull/29774)
- test(google): add google-genai SDK proxy integration tests by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29781](https://github.com/BerriAI/litellm/pull/29781)
- fix(jwt): use resolved DB user\_id for spend on legacy email match by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29217](https://github.com/BerriAI/litellm/pull/29217)
- feat(ui): generate dashboard API types from the proxy OpenAPI spec by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29816](https://github.com/BerriAI/litellm/pull/29816)
- fix(proxy): drop deleted team BYOK model name from team.models by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29820](https://github.com/BerriAI/litellm/pull/29820)
- feat(mcp): per-server env vars with global + per-user scopes by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28917](https://github.com/BerriAI/litellm/pull/28917)
- refactor(ui): route behavior-preserving networking calls through apiClient by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29806](https://github.com/BerriAI/litellm/pull/29806)
- fix(mcp): persist Tools-tab MCP OAuth token to DB by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29809](https://github.com/BerriAI/litellm/pull/29809)
- fix(ui): require new expiration when regenerating an expired key by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29838](https://github.com/BerriAI/litellm/pull/29838)
- refactor(ui): route query-building networking calls through apiClient by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29815](https://github.com/BerriAI/litellm/pull/29815)
- Make the image-gen record/replay proxy report cache mode and per-request HIT/MISS by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29802](https://github.com/BerriAI/litellm/pull/29802)
- feat(proxy): hot-reload .env in dev when running with --reload by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29783](https://github.com/BerriAI/litellm/pull/29783)
- fix(ui): stop MCP playground tool calls from sending twice by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29821](https://github.com/BerriAI/litellm/pull/29821)
- feat(fal\_ai): add Nano Banana / Gemini 2.5 Flash Image generation support by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29798](https://github.com/BerriAI/litellm/pull/29798)
- Title: Fix managed batch cancel credential resolution by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29734](https://github.com/BerriAI/litellm/pull/29734)
- Title: fix(proxy): resolve vector store file list credentials from team deployments by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29739](https://github.com/BerriAI/litellm/pull/29739)
- refactor: convert AWS and GCP Terraform stacks into reusable modules … by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28103](https://github.com/BerriAI/litellm/pull/28103)
- chore(ui): build ui for release by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29853](https://github.com/BerriAI/litellm/pull/29853)
- fix(terraform/gcp): prompt for image\_registry in DeployStack one-click by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29852](https://github.com/BerriAI/litellm/pull/29852)
- fix(terraform/gcp): abandon SQL user on destroy by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29855](https://github.com/BerriAI/litellm/pull/29855)
- Extend the record/replay proxy to chat, embeddings, moderations, rerank, and Anthropic by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29847](https://github.com/BerriAI/litellm/pull/29847)
- chore(deps): bump deps by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29860](https://github.com/BerriAI/litellm/pull/29860)
- chore(ci): promote internal staging to main by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29861](https://github.com/BerriAI/litellm/pull/29861)
- fix: 400 on Anthropic context overflow; seed identity on failed auth by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29848](https://github.com/BerriAI/litellm/pull/29848)
- chore(ci): promote internal staging to main by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29862](https://github.com/BerriAI/litellm/pull/29862)
- chore(release): patch v1.89.0-rc.1 with [#&#8203;30064](https://github.com/BerriAI/litellm/issues/30064) (Claude Fable 5) for v1.89.0-rc.2 by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;30143](https://github.com/BerriAI/litellm/pull/30143)

**Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.0...v1.89.0>

### [`v1.88.2`](https://github.com/BerriAI/litellm/releases/tag/v1.88.2)

[Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.2...v1.88.2)

##### Verify Docker Image Signature

All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).

**Verify using the pinned commit hash (recommended):**

A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
  ghcr.io/berriai/litellm:v1.88.2
```

**Verify using the release tag (convenience):**

Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/v1.88.2/cosign.pub \
  ghcr.io/berriai/litellm:v1.88.2
```

Expected output:

```
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - The signatures were verified against the specified public key
```

***

##### What's Changed

- chore(release): backport Fable 5, batch-file auth, CrowdStrike AIDR, Mantle Responses SigV4, and NetApp streaming-cost fix to stable/1.88.x and cut 1.88.2 by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;30144](https://github.com/BerriAI/litellm/pull/30144)
- chore(release): backport DB-resilience, passthrough, model-info, budget, and deps fixes to stable/1.88.x by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;30408](https://github.com/BerriAI/litellm/pull/30408)

**Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.1...v1.88.2>

### [`v1.88.2`](https://github.com/BerriAI/litellm/releases/tag/v1.88.2)

[Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.1...v1.88.2)

##### Verify Docker Image Signature

All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).

**Verify using the pinned commit hash (recommended):**

A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
  ghcr.io/berriai/litellm:v1.88.2
```

**Verify using the release tag (convenience):**

Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/v1.88.2/cosign.pub \
  ghcr.io/berriai/litellm:v1.88.2
```

Expected output:

```
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - The signatures were verified against the specified public key
```

***

##### What's Changed

- chore(release): backport Fable 5, batch-file auth, CrowdStrike AIDR, Mantle Responses SigV4, and NetApp streaming-cost fix to stable/1.88.x and cut 1.88.2 by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;30144](https://github.com/BerriAI/litellm/pull/30144)
- chore(release): backport DB-resilience, passthrough, model-info, budget, and deps fixes to stable/1.88.x by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;30408](https://github.com/BerriAI/litellm/pull/30408)

**Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.1...v1.88.2>

</details>

---

### Configuration

📅 **Schedule**: (in timezone Europe/London)

- Branch creation
  - At any time (no schedule defined)
- Automerge
  - At any time (no schedule defined)

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Mend Renovate](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4yMTkuMCIsInVwZGF0ZWRJblZlciI6IjQzLjIxOS4wIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJyZW5vdmF0ZS9jb250YWluZXIiLCJ0eXBlL21pbm9yIl19-->

Reviewed-on: https://forgejo.hayden.moe/hayden/phoebe/pulls/93
blake-hamm added a commit to blake-hamm/bhamm-lab that referenced this pull request Jun 16, 2026
…to v1.89.0 (#200)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [https://github.com/BerriAI/litellm.git](https://github.com/BerriAI/litellm) | minor | `v1.85.1` → `v1.89.0` |

---

> ⚠️ **Warning**
>
> Some dependencies could not be looked up. Check the [Dependency Dashboard](issues/155) for more information.

---

### Release Notes

<details>
<summary>BerriAI/litellm (https://github.com/BerriAI/litellm.git)</summary>

### [`v1.89.0`](https://github.com/BerriAI/litellm/releases/tag/v1.89.0)

[Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.2...v1.89.0)

#### Verify Docker Image Signature

All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).

**Verify using the pinned commit hash (recommended):**

A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
  ghcr.io/berriai/litellm:v1.89.0
```

**Verify using the release tag (convenience):**

Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/v1.89.0/cosign.pub \
  ghcr.io/berriai/litellm:v1.89.0
```

Expected output:

```
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - The signatures were verified against the specified public key
```

***

#### What's Changed

- test(responses): bump deprecated gemini-3-pro-preview to gemini-3.1-pro-preview by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29433](https://github.com/BerriAI/litellm/pull/29433)
- fix: map mistral/ministral-8b-latest in model price map by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29453](https://github.com/BerriAI/litellm/pull/29453)
- fix(datadog): split oversized batches on 413 instead of re-queueing forever by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29444](https://github.com/BerriAI/litellm/pull/29444)
- feat(otel): allowlist team\_metadata sub-keys promoted to baggage by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29442](https://github.com/BerriAI/litellm/pull/29442)
- fix: stop use\_chat\_completions\_api flag from leaking into provider request body by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29447](https://github.com/BerriAI/litellm/pull/29447)
- fix(anthropic, fireworks): inline legacy $ref defs in tool schemas by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;28646](https://github.com/BerriAI/litellm/pull/28646)
- fix(proxy): omit OpenAI \[DONE] on google-genai streamGenerateContent by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29426](https://github.com/BerriAI/litellm/pull/29426)
- ci(release): create stable/X.Y.x line branch on X.Y.0 tags by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29457](https://github.com/BerriAI/litellm/pull/29457)
- fix(vector-stores): support engines URL for Vertex AI Search by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;27885](https://github.com/BerriAI/litellm/pull/27885)
- fix(ui): render caller-supplied filter options in caller order by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29462](https://github.com/BerriAI/litellm/pull/29462)
- fix(batches): skip unnecessary batch input file reads by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29114](https://github.com/BerriAI/litellm/pull/29114)
- docs(agents): clarify when to create new test files by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29472](https://github.com/BerriAI/litellm/pull/29472)
- Litellm OSS Staging by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29161](https://github.com/BerriAI/litellm/pull/29161)
- fix(mcp): clear allowed\_tools and tool overrides on MCP server edit by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29411](https://github.com/BerriAI/litellm/pull/29411)
- Litellm OSS Staging 010626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29422](https://github.com/BerriAI/litellm/pull/29422)
- fix(ci): make CircleCI rerun-failed-tests collect tests when 2+ test files fail by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29475](https://github.com/BerriAI/litellm/pull/29475)
- feat(a2a): watsonx Orchestrate agent provider by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29410](https://github.com/BerriAI/litellm/pull/29410)
- fix(azure\_ai): strip tool-level extra fields on 400 and retry by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29479](https://github.com/BerriAI/litellm/pull/29479)
- fix(docs): remove fixed dimensions from README hero image by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29496](https://github.com/BerriAI/litellm/pull/29496)
- Litellm oss staging by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29492](https://github.com/BerriAI/litellm/pull/29492)
- fix: small CLAUDE.md nits by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29504](https://github.com/BerriAI/litellm/pull/29504)
- Add MCP semantic conventions to otelv2 by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29468](https://github.com/BerriAI/litellm/pull/29468)
- fix(passthrough): emit otel guardrail span when a guardrail blocks by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29470](https://github.com/BerriAI/litellm/pull/29470)
- fix(proxy): strip NUL bytes from spend log payloads to prevent PostgreSQL 22P05 by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29515](https://github.com/BerriAI/litellm/pull/29515)
- \[internal copy of [#&#8203;28008](https://github.com/BerriAI/litellm/issues/28008)] Support MCP OAuth passthrough and issuer-scoped JWT auth by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28356](https://github.com/BerriAI/litellm/pull/28356)
- feat(vector-stores): forward per-request params to Vertex AI Search by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29459](https://github.com/BerriAI/litellm/pull/29459)
- feat(proxy): add per-MCP-server RPM rate limiting for keys and teams by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29482](https://github.com/BerriAI/litellm/pull/29482)
- fix(tests): drop module-level test calls that break local\_testing collection by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29520](https://github.com/BerriAI/litellm/pull/29520)
- feat(agents): add LangFlow agent provider with A2A session bridging by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28963](https://github.com/BerriAI/litellm/pull/28963)
- fix(ui/agents): make A2A skill tags enterable and validated by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29512](https://github.com/BerriAI/litellm/pull/29512)
- \[internal copy of [#&#8203;29232](https://github.com/BerriAI/litellm/issues/29232)] feat: route future Claude models to Anthropic provider via pattern matching by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29239](https://github.com/BerriAI/litellm/pull/29239)
- fix(tests): drop import-time completion call in test\_register\_model by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29521](https://github.com/BerriAI/litellm/pull/29521)
- test: stabilize batch VCR coverage and stop live upload/network leaks by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29477](https://github.com/BerriAI/litellm/pull/29477)
- \[internal copy of [#&#8203;29003](https://github.com/BerriAI/litellm/issues/29003)] fix(vertex\_ai): use user-supplied api\_base as is for Model Garden OpenAI-compat path by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29530](https://github.com/BerriAI/litellm/pull/29530)
- feat(proxy): native /health/drain preStop hook for graceful shutdown by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29439](https://github.com/BerriAI/litellm/pull/29439)
- fix(auth): preserve 401 status for expired JWTs in OTel traces by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29510](https://github.com/BerriAI/litellm/pull/29510)
- fix(otel): capture 401 error details in management endpoint spans by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29535](https://github.com/BerriAI/litellm/pull/29535)
- test(proxy/utils): pin bottom-of-file helper behavior by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29509](https://github.com/BerriAI/litellm/pull/29509)
- test(proxy/utils): pin PrismaClient and spend-update behavior by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29488](https://github.com/BerriAI/litellm/pull/29488)
- test(proxy/utils): pin ProxyLogging behavior by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29485](https://github.com/BerriAI/litellm/pull/29485)
- fix: missing span for guardrail passthrough by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29552](https://github.com/BerriAI/litellm/pull/29552)
- fix(auth): let internal users view search tools by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29542](https://github.com/BerriAI/litellm/pull/29542)
- fix: missing mcp otel attributes by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29554](https://github.com/BerriAI/litellm/pull/29554)
- fix(proxy): resolve managed video model ids for auth by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29545](https://github.com/BerriAI/litellm/pull/29545)
- fix(key\_generate): allow team members to create keys on org-scoped teams by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29310](https://github.com/BerriAI/litellm/pull/29310)
- test(pass-through): move Gemini pass-through tests to gemini-3.1-flash-lite by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29595](https://github.com/BerriAI/litellm/pull/29595)
- Litellm oss staging 030626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29578](https://github.com/BerriAI/litellm/pull/29578)
- Fix : a2a bugs 030626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29566](https://github.com/BerriAI/litellm/pull/29566)
- \[internal copy of [#&#8203;29533](https://github.com/BerriAI/litellm/issues/29533)] fix(anthropic/adapter): emit thinking block for reasoning\_content-only streaming chunks by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29600](https://github.com/BerriAI/litellm/pull/29600)
- ci: reproduce default-Windows wheel install to guard MAX\_PATH by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29597](https://github.com/BerriAI/litellm/pull/29597)
- fix(vertex): strip output\_config.effort for Vertex Claude models that reject it (Haiku 4.5) by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29585](https://github.com/BerriAI/litellm/pull/29585)
- Litellm websocket improvements by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29563](https://github.com/BerriAI/litellm/pull/29563)
- feat(arize/phoenix): OpenInference rendering parity — tool\_calls, cost, passthrough I/O, session/user, multimodal, cache tokens by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;28800](https://github.com/BerriAI/litellm/pull/28800)
- \[internal copy of [#&#8203;29550](https://github.com/BerriAI/litellm/issues/29550)] fix: passthrough endpoints duplicate logs by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29598](https://github.com/BerriAI/litellm/pull/29598)
- fix(ci): keep coverage rename green when a parallel node runs no tests by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29608](https://github.com/BerriAI/litellm/pull/29608)
- test(vcr): close out the remaining VCR live-call leaks by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29603](https://github.com/BerriAI/litellm/pull/29603)
- fix(key\_generate): exempt UI/CLI session tokens from the budget ceiling for team keys by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29612](https://github.com/BerriAI/litellm/pull/29612)
- fix(realtime): allow null transcripts in stream logging payloads by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29625](https://github.com/BerriAI/litellm/pull/29625)
- build(ui): migrate eslint to flat config + bump eslint-config-next to 16 by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29626](https://github.com/BerriAI/litellm/pull/29626)
- fix(key\_generate): scope session-token team-key budget exemption to caller-supplied team\_id by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29641](https://github.com/BerriAI/litellm/pull/29641)
- fix(proxy): disable proxy buffering on streaming SSE responses by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29557](https://github.com/BerriAI/litellm/pull/29557)
- fix(mcp): gate /public/mcp\_hub strictly on litellm.public\_mcp\_servers by [@&#8203;michelligabriele](https://github.com/michelligabriele) in [#&#8203;27764](https://github.com/BerriAI/litellm/pull/27764)
- ci(ui): frontend-lint job enforcing prettier + eslint on changed files by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29633](https://github.com/BerriAI/litellm/pull/29633)
- fix(gemini): googleSearch + server-side tools and googleMaps JSON schema by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29582](https://github.com/BerriAI/litellm/pull/29582)
- fix(proxy): passthrough 404 when SERVER\_ROOT\_PATH is set by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29658](https://github.com/BerriAI/litellm/pull/29658)
- fix(gemini-realtime): use GA event names for Pipecat 1.3.x compatibility by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29662](https://github.com/BerriAI/litellm/pull/29662)
- Litellm oss staging 040626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29671](https://github.com/BerriAI/litellm/pull/29671)
- style(ui): prettier formatting pass over the dashboard by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29622](https://github.com/BerriAI/litellm/pull/29622)
- chore: ignore prettier dashboard reformat in git blame by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29695](https://github.com/BerriAI/litellm/pull/29695)
- fix(helm): Enable Backend Deployment to mount Gateway config.yaml by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29605](https://github.com/BerriAI/litellm/pull/29605)
- \[internal copy of [#&#8203;29277](https://github.com/BerriAI/litellm/issues/29277)] fix(proxy): add default=None to LiteLLM\_TeamMembership.litellm\_budget\_table by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29684](https://github.com/BerriAI/litellm/pull/29684)
- test: make custom\_tokenizer proxy tests hermetic by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29643](https://github.com/BerriAI/litellm/pull/29643)
- test(proxy): stop running real-DB tests in GitHub Actions unit jobs by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29700](https://github.com/BerriAI/litellm/pull/29700)
- chore(ui): remove the bare-fetch lint rule by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29712](https://github.com/BerriAI/litellm/pull/29712)
- Litellm jwt mapping virtualkeys by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;28510](https://github.com/BerriAI/litellm/pull/28510)
- refactor(ui): shared HTTP client + location-pinned fetch() lint rule by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29723](https://github.com/BerriAI/litellm/pull/29723)
- fix(proxy): stop team BYOK model name corruption on model edit by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29731](https://github.com/BerriAI/litellm/pull/29731)
- \[internal copy of [#&#8203;29511](https://github.com/BerriAI/litellm/issues/29511)] feat(guardrails): add sensitive data routing to on-premise models by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29531](https://github.com/BerriAI/litellm/pull/29531)
- fix(proxy/hooks): populate llm\_provider on internal rate-limit errors by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;27707](https://github.com/BerriAI/litellm/pull/27707)
- fix(vertex/anthropic): handle namespace tools and strip client\_metadata for codex compatibility by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29489](https://github.com/BerriAI/litellm/pull/29489)
- Support OAuth M2M for Databricks Apps A2A agents by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29586](https://github.com/BerriAI/litellm/pull/29586)
- fix: small CLAUDE.md nit by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29749](https://github.com/BerriAI/litellm/pull/29749)
- fix(anthropic): route Claude Opus 4.8 through adaptive thinking by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29702](https://github.com/BerriAI/litellm/pull/29702)
- fix(proxy): persist oauth2\_flow on MCP server registration by [@&#8203;michelligabriele](https://github.com/michelligabriele) in [#&#8203;29690](https://github.com/BerriAI/litellm/pull/29690)
- \[internal copy of [#&#8203;27491](https://github.com/BerriAI/litellm/issues/27491)] fix(realtime): Fix Realtime Audio Token Cost Tracking by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29722](https://github.com/BerriAI/litellm/pull/29722)
- fix(galileo): use ingest traces API and standard logging payload by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29651](https://github.com/BerriAI/litellm/pull/29651)
- fix(auth): expand all-team-models sentinel in can\_key\_call\_model for batch validation by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29746](https://github.com/BerriAI/litellm/pull/29746)
- test(vcr): stop refreshing cassette TTL on read so cassettes lapse after 24h by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29784](https://github.com/BerriAI/litellm/pull/29784)
- test(ci): record/replay OpenAI image gen so the spend E2E isn't outage-bound by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29787](https://github.com/BerriAI/litellm/pull/29787)
- fix(ui): route MCP playground auth by oauth2 mode instead of token\_url by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29714](https://github.com/BerriAI/litellm/pull/29714)
- refactor(ui): centralize proxy base URL resolution into tested resolver by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29793](https://github.com/BerriAI/litellm/pull/29793)
- Litellm oss staging 050626 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29774](https://github.com/BerriAI/litellm/pull/29774)
- test(google): add google-genai SDK proxy integration tests by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29781](https://github.com/BerriAI/litellm/pull/29781)
- fix(jwt): use resolved DB user\_id for spend on legacy email match by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29217](https://github.com/BerriAI/litellm/pull/29217)
- feat(ui): generate dashboard API types from the proxy OpenAPI spec by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29816](https://github.com/BerriAI/litellm/pull/29816)
- fix(proxy): drop deleted team BYOK model name from team.models by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29820](https://github.com/BerriAI/litellm/pull/29820)
- feat(mcp): per-server env vars with global + per-user scopes by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28917](https://github.com/BerriAI/litellm/pull/28917)
- refactor(ui): route behavior-preserving networking calls through apiClient by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29806](https://github.com/BerriAI/litellm/pull/29806)
- fix(mcp): persist Tools-tab MCP OAuth token to DB by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29809](https://github.com/BerriAI/litellm/pull/29809)
- fix(ui): require new expiration when regenerating an expired key by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;29838](https://github.com/BerriAI/litellm/pull/29838)
- refactor(ui): route query-building networking calls through apiClient by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29815](https://github.com/BerriAI/litellm/pull/29815)
- Make the image-gen record/replay proxy report cache mode and per-request HIT/MISS by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29802](https://github.com/BerriAI/litellm/pull/29802)
- feat(proxy): hot-reload .env in dev when running with --reload by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29783](https://github.com/BerriAI/litellm/pull/29783)
- fix(ui): stop MCP playground tool calls from sending twice by [@&#8203;tin-berri](https://github.com/tin-berri) in [#&#8203;29821](https://github.com/BerriAI/litellm/pull/29821)
- feat(fal\_ai): add Nano Banana / Gemini 2.5 Flash Image generation support by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29798](https://github.com/BerriAI/litellm/pull/29798)
- Title: Fix managed batch cancel credential resolution by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29734](https://github.com/BerriAI/litellm/pull/29734)
- Title: fix(proxy): resolve vector store file list credentials from team deployments by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29739](https://github.com/BerriAI/litellm/pull/29739)
- refactor: convert AWS and GCP Terraform stacks into reusable modules … by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28103](https://github.com/BerriAI/litellm/pull/28103)
- chore(ui): build ui for release by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29853](https://github.com/BerriAI/litellm/pull/29853)
- fix(terraform/gcp): prompt for image\_registry in DeployStack one-click by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29852](https://github.com/BerriAI/litellm/pull/29852)
- fix(terraform/gcp): abandon SQL user on destroy by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29855](https://github.com/BerriAI/litellm/pull/29855)
- Extend the record/replay proxy to chat, embeddings, moderations, rerank, and Anthropic by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29847](https://github.com/BerriAI/litellm/pull/29847)
- chore(deps): bump deps by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29860](https://github.com/BerriAI/litellm/pull/29860)
- chore(ci): promote internal staging to main by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29861](https://github.com/BerriAI/litellm/pull/29861)
- fix: 400 on Anthropic context overflow; seed identity on failed auth by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29848](https://github.com/BerriAI/litellm/pull/29848)
- chore(ci): promote internal staging to main by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29862](https://github.com/BerriAI/litellm/pull/29862)
- chore(release): patch v1.89.0-rc.1 with [#&#8203;30064](https://github.com/BerriAI/litellm/issues/30064) (Claude Fable 5) for v1.89.0-rc.2 by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;30143](https://github.com/BerriAI/litellm/pull/30143)

**Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.0...v1.89.0>

### [`v1.88.2`](https://github.com/BerriAI/litellm/releases/tag/v1.88.2)

[Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.1...v1.88.2)

#### Verify Docker Image Signature

All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).

**Verify using the pinned commit hash (recommended):**

A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
  ghcr.io/berriai/litellm:v1.88.2
```

**Verify using the release tag (convenience):**

Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/v1.88.2/cosign.pub \
  ghcr.io/berriai/litellm:v1.88.2
```

Expected output:

```
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - The signatures were verified against the specified public key
```

***

#### What's Changed

- chore(release): backport Fable 5, batch-file auth, CrowdStrike AIDR, Mantle Responses SigV4, and NetApp streaming-cost fix to stable/1.88.x and cut 1.88.2 by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;30144](https://github.com/BerriAI/litellm/pull/30144)
- chore(release): backport DB-resilience, passthrough, model-info, budget, and deps fixes to stable/1.88.x by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;30408](https://github.com/BerriAI/litellm/pull/30408)

**Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.1...v1.88.2>

### [`v1.88.1`](https://github.com/BerriAI/litellm/releases/tag/v1.88.1)

[Compare Source](https://github.com/BerriAI/litellm/compare/v1.88.0...v1.88.1)

#### Verify Docker Image Signature

All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).

**Verify using the pinned commit hash (recommended):**

A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
  ghcr.io/berriai/litellm:v1.88.1
```

**Verify using the release tag (convenience):**

Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/v1.88.1/cosign.pub \
  ghcr.io/berriai/litellm:v1.88.1
```

Expected output:

```
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - The signatures were verified against the specified public key
```

***

#### What's Changed

- build(deps): bump pyjwt to 2.13.0 and ws override to 8.20.1 (1.88.x) by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29987](https://github.com/BerriAI/litellm/pull/29987)
- chore(release): bump version to 1.88.1 by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29989](https://github.com/BerriAI/litellm/pull/29989)

**Full Changelog**: <https://github.com/BerriAI/litellm/compare/v1.88.0...v1.88.1>

### [`v1.88.0`](https://github.com/BerriAI/litellm/releases/tag/v1.88.0)

[Compare Source](https://github.com/BerriAI/litellm/compare/v1.87.3...v1.88.0)

#### Verify Docker Image Signature

All LiteLLM Docker images are signed with [cosign](https://docs.sigstore.dev/cosign/overview/). Every release is signed with the same key introduced in [commit `0112e53`](https://github.com/BerriAI/litellm/commit/0112e53046018d726492c814b3644b7d376029d0).

**Verify using the pinned commit hash (recommended):**

A commit hash is cryptographically immutable, so this is the strongest way to ensure you are using the original signing key:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/0112e53046018d726492c814b3644b7d376029d0/cosign.pub \
  ghcr.io/berriai/litellm:v1.88.0
```

**Verify using the release tag (convenience):**

Tags are protected in this repository and resolve to the same key. This option is easier to read but relies on tag protection rules:

```bash
cosign verify \
  --key https://raw.githubusercontent.com/BerriAI/litellm/v1.88.0/cosign.pub \
  ghcr.io/berriai/litellm:v1.88.0
```

Expected output:

```
The following checks were performed on each of these signatures:
  - The cosign claims were validated
  - The signatures were verified against the specified public key
```

***

#### What's Changed

- fix(proxy): gate team allowed\_passthrough\_routes to proxy admins by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28097](https://github.com/BerriAI/litellm/pull/28097)
- fix(tests): stabilize image-edit VCR cassettes to stop live gpt-image-1 spend by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28110](https://github.com/BerriAI/litellm/pull/28110)
- fix(bedrock/cohere): send embedding\_types as JSON array, not string by [@&#8203;ishaan-berri](https://github.com/ishaan-berri) in [#&#8203;28172](https://github.com/BerriAI/litellm/pull/28172)
- fix(tests): migrate realtime + rerank tests off shut-down upstream models by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28191](https://github.com/BerriAI/litellm/pull/28191)
- fix(caching): replay openai/responses bridge cache hits as chat streams by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28158](https://github.com/BerriAI/litellm/pull/28158)
- Litellm oss staging by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28161](https://github.com/BerriAI/litellm/pull/28161)
- feat(prometheus): add user\_email and user\_alias to user budget metrics by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28155](https://github.com/BerriAI/litellm/pull/28155)
- test(callbacks): harden flaky proxy callback-leak detector by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28195](https://github.com/BerriAI/litellm/pull/28195)
- fix(bedrock): sanitize batch metadata to prevent Pydantic ValidationError by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28202](https://github.com/BerriAI/litellm/pull/28202)
- fix(deepseek): use native /anthropic/v1/messages endpoint and sanitize tools by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28200](https://github.com/BerriAI/litellm/pull/28200)
- feat(ui): add Interactions API endpoint to playground with SSE streaming by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28156](https://github.com/BerriAI/litellm/pull/28156)
- fix(proxy): decode bytes and pass-through SSE for Google-native streamGenerateContent ([#&#8203;27444](https://github.com/BerriAI/litellm/issues/27444)) by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28213](https://github.com/BerriAI/litellm/pull/28213)
- refactor(bedrock/sagemaker): switch to lazy loading for response stre… by [@&#8203;harish-berri](https://github.com/harish-berri) in [#&#8203;28189](https://github.com/BerriAI/litellm/pull/28189)
- \[Refactor] UI - Spend Logs: consolidate filter state and extract components by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;25847](https://github.com/BerriAI/litellm/pull/25847)
- fix(tests): replace shut-down gpt-4o-audio-preview with gpt-audio-1.5 by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28281](https://github.com/BerriAI/litellm/pull/28281)
- chore(ci): bump versions by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28287](https://github.com/BerriAI/litellm/pull/28287)
- feat: propagate team\_id and team\_alias to all child OTEL spans by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28273](https://github.com/BerriAI/litellm/pull/28273)
- Day 0 support : Gemini 3.5 Flash by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28268](https://github.com/BerriAI/litellm/pull/28268)
- Gemini managed agents support by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28270](https://github.com/BerriAI/litellm/pull/28270)
- chore(ci): promote internal staging to main by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28292](https://github.com/BerriAI/litellm/pull/28292)
- feat(gemini): add gemini-3.1-flash-lite model cost map by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28320](https://github.com/BerriAI/litellm/pull/28320)
- fix(spend\_counter): seed Redis counter via SET NX to prevent cross-pod double-seed by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;27854](https://github.com/BerriAI/litellm/pull/27854)
- fix(proxy): normalize batch file IDs before ManagedObjectTable write by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28339](https://github.com/BerriAI/litellm/pull/28339)
- fix(router): use forwarded model\_id for native Azure container IDs by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;27921](https://github.com/BerriAI/litellm/pull/27921)
- fix(ui): restore log filter loading indicator by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28282](https://github.com/BerriAI/litellm/pull/28282)
- test(e2e): migrate runner to uv, add All Proxy Models key test by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28313](https://github.com/BerriAI/litellm/pull/28313)
- feat(ui): team passthrough routes create parity + edit load fix by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28098](https://github.com/BerriAI/litellm/pull/28098)
- fix(mcp): JWT on tools/list and REST tools/call server resolution by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28227](https://github.com/BerriAI/litellm/pull/28227)
- feat(interactions): migrate to Google Interactions API steps schema (May 2026) by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28153](https://github.com/BerriAI/litellm/pull/28153)
- test(ui-e2e): admin key creation with a specific proxy model by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28365](https://github.com/BerriAI/litellm/pull/28365)
- fix(vertex\_ai): omit function\_call id on Vertex Gemini 3.5+ tool turns by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28324](https://github.com/BerriAI/litellm/pull/28324)
- feat(mcp): allow native MCP OAuth support for cursor by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28327](https://github.com/BerriAI/litellm/pull/28327)
- fix(interactions): never drop streamed text deltas; always emit terminal completion by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28394](https://github.com/BerriAI/litellm/pull/28394)
- fix(proxy): expose Prisma idle/connect timeout + extra DB URL params by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28395](https://github.com/BerriAI/litellm/pull/28395)
- Litellm oss staging 1 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28337](https://github.com/BerriAI/litellm/pull/28337)
- fix: serialize guardrail\_response to JSON in OTEL traces by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28362](https://github.com/BerriAI/litellm/pull/28362)
- chore(ci): merge dev branch by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28314](https://github.com/BerriAI/litellm/pull/28314)
- test(realtime): expect session.created as xAI realtime initial event by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28424](https://github.com/BerriAI/litellm/pull/28424)
- feat(tests): behavior-pinning harness + Key Tier-1 matrix by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28321](https://github.com/BerriAI/litellm/pull/28321)
- fix(proxy): hydrate wildcard discovery credentials ([#&#8203;28284](https://github.com/BerriAI/litellm/issues/28284)) - CCI Run by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28419](https://github.com/BerriAI/litellm/pull/28419)
- Litellm oss staging 04 21 2026 2 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;26569](https://github.com/BerriAI/litellm/pull/26569)
- chore(ci): merge dev branch by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28290](https://github.com/BerriAI/litellm/pull/28290)
- fix(vertex\_gemma): strip `context_management` from request body by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28438](https://github.com/BerriAI/litellm/pull/28438)
- fix(logging): recalculate cost after router retry failures by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;28476](https://github.com/BerriAI/litellm/pull/28476)
- fix(otel): emit guardrail span on violation, surface status + categories by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28364](https://github.com/BerriAI/litellm/pull/28364)
- test(proxy): behavior-pinning matrix for team management endpoints by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28441](https://github.com/BerriAI/litellm/pull/28441)
- test(vertex\_ai): tolerate transient 500 in google maps grounding test by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28503](https://github.com/BerriAI/litellm/pull/28503)
- fix(docker): restore npm to non\_root builder image by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28519](https://github.com/BerriAI/litellm/pull/28519)
- chore(ci): bump deps by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28524](https://github.com/BerriAI/litellm/pull/28524)
- build(deps-dev): bump black to 26.3.1 and apply formatting by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28525](https://github.com/BerriAI/litellm/pull/28525)
- chore(deps): bump deps by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28528](https://github.com/BerriAI/litellm/pull/28528)
- test(e2e): forward LITELLM\_LICENSE to UI e2e proxy by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28398](https://github.com/BerriAI/litellm/pull/28398)
- Add granian as a ASGI compliant web server. Provider better throughput stability, by [@&#8203;harish-berri](https://github.com/harish-berri) in [#&#8203;26027](https://github.com/BerriAI/litellm/pull/26027)
- Fix conflicts and UI by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28477](https://github.com/BerriAI/litellm/pull/28477)
- Add error\_description and hint for oauth flows by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28471](https://github.com/BerriAI/litellm/pull/28471)
- feat(mcp): Add tool call and tool list support via UI for Oauth mcps by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28454](https://github.com/BerriAI/litellm/pull/28454)
- feat(proxy): persist allowlisted OIDC claims in CLI SSO poll by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28463](https://github.com/BerriAI/litellm/pull/28463)
- fix(responses): use OpenAI SSEDecoder for Responses API streaming by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28566](https://github.com/BerriAI/litellm/pull/28566)
- Litellm oss staging 2 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28582](https://github.com/BerriAI/litellm/pull/28582)
- \[internal copy of [#&#8203;28269](https://github.com/BerriAI/litellm/issues/28269)] Codex cli jwt team alias by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28621](https://github.com/BerriAI/litellm/pull/28621)
- fix(check\_licenses): read PEP 639 license-expression metadata by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28529](https://github.com/BerriAI/litellm/pull/28529)
- test(proxy): behavior-pinning matrix for tier-2/3 key + team management endpoints by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28620](https://github.com/BerriAI/litellm/pull/28620)
- chore(test): remove dead old Playwright e2e suite by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28632](https://github.com/BerriAI/litellm/pull/28632)
- fix(sagemaker): send native Cohere embed payload to Cohere SageMaker endpoints by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;28613](https://github.com/BerriAI/litellm/pull/28613)
- style: apply black formatting to fix lint CI (LIT-3274) ([#&#8203;28639](https://github.com/BerriAI/litellm/issues/28639)) by [@&#8203;krrish-berri-2](https://github.com/krrish-berri-2) in [#&#8203;28641](https://github.com/BerriAI/litellm/pull/28641)
- fix(bedrock): decouple STS region from Bedrock aws\_region\_name by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;28245](https://github.com/BerriAI/litellm/pull/28245)
- test(streaming): tolerate Vertex 429 wrapped in MidStreamFallbackError by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28669](https://github.com/BerriAI/litellm/pull/28669)
- feat(guardrails): add Microsoft Purview DLP guardrail by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;24966](https://github.com/BerriAI/litellm/pull/24966)
- fix(mcp): forward upstream initialize instructions on cold gateway init by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;28231](https://github.com/BerriAI/litellm/pull/28231)
- chore(ci): promote internal staging to main by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28680](https://github.com/BerriAI/litellm/pull/28680)
- CI: copy of [#&#8203;25177](https://github.com/BerriAI/litellm/issues/25177) (OCI GenAI: embeddings, streaming/reasoning fixes, model catalog) by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28223](https://github.com/BerriAI/litellm/pull/28223)
- Encrypt callback\_vars in key/team metadata in DB by [@&#8203;Michael-RZ-Berri](https://github.com/Michael-RZ-Berri) in [#&#8203;27141](https://github.com/BerriAI/litellm/pull/27141)
- perf: reduce per-request and per-chunk overhead across Anthropic streaming hot paths by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28289](https://github.com/BerriAI/litellm/pull/28289)
- feat(azure): add Speech STT config support by [@&#8203;ishaan-berri](https://github.com/ishaan-berri) in [#&#8203;27482](https://github.com/BerriAI/litellm/pull/27482)
- test(proxy): phase-4 payload behavior pinning for tier-2/3 key + team management endpoints by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28681](https://github.com/BerriAI/litellm/pull/28681)
- feat(prometheus): emit per-token-type detail metrics (LIT-3220) ([#&#8203;28372](https://github.com/BerriAI/litellm/issues/28372)) by [@&#8203;ishaan-berri](https://github.com/ishaan-berri) in [#&#8203;28378](https://github.com/BerriAI/litellm/pull/28378)
- fix(otel): stamp http.response.status\_code on all error responses by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28405](https://github.com/BerriAI/litellm/pull/28405)
- chore(ui): build ui by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28707](https://github.com/BerriAI/litellm/pull/28707)
- fix(helm): drop main- prefix from default image tag by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28710](https://github.com/BerriAI/litellm/pull/28710)
- test(model\_prices): allow audio\_transcription\_config in schema by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28708](https://github.com/BerriAI/litellm/pull/28708)
- chore(ci): promote internal staging to main by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28709](https://github.com/BerriAI/litellm/pull/28709)
- fix(team): refresh team cache on team\_model\_add/delete (LIT-3244) by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28683](https://github.com/BerriAI/litellm/pull/28683)
- fix(ui/add-model): stop vertex\_ai-anthropic\_models from leaking into Anthropic dropdown by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28723](https://github.com/BerriAI/litellm/pull/28723)
- Fix spend logs v2 route permissions by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28705](https://github.com/BerriAI/litellm/pull/28705)
- fix(proxy): Bedrock Knowledge Base pass-through: preserve SigV4 headers and signed request body by [@&#8203;milan-berri](https://github.com/milan-berri) in [#&#8203;27526](https://github.com/BerriAI/litellm/pull/27526)
- chore(tests): migrate Bedrock CI to AWS account [`9412775`](https://github.com/BerriAI/litellm/commit/941277531214) by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28728](https://github.com/BerriAI/litellm/pull/28728)
- fix(otel): export SERVER span on management-endpoint success without http\_request by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28794](https://github.com/BerriAI/litellm/pull/28794)
- chore(ci): merge dev branch by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28801](https://github.com/BerriAI/litellm/pull/28801)
- chore(ci): merge dev branch by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28657](https://github.com/BerriAI/litellm/pull/28657)
- fix(ui): show 2-decimal precision for max\_budget on key overview by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28809](https://github.com/BerriAI/litellm/pull/28809)
- feat(proxy): allow `llm_api_routes` virtual keys to list MCP servers by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28442](https://github.com/BerriAI/litellm/pull/28442)
- chore(ci): merge dev branch by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28807](https://github.com/BerriAI/litellm/pull/28807)
- fix(team): keep team\_alias cache in sync on \_cache\_team\_object writes by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28737](https://github.com/BerriAI/litellm/pull/28737)
- chore(ci): merge dev branch by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28822](https://github.com/BerriAI/litellm/pull/28822)
- ci: daily oss-agent-shin canonical branch by [@&#8203;ishaan-berri](https://github.com/ishaan-berri) in [#&#8203;28829](https://github.com/BerriAI/litellm/pull/28829)
- test(proxy): add harness for proxy\_server.py behavior-pinning by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28827](https://github.com/BerriAI/litellm/pull/28827)
- feat(openai): apply regional-processing cost uplift for EU/US data residency by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28626](https://github.com/BerriAI/litellm/pull/28626)
- chore(admin-ui): regenerate static export with trailingSlash: true by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28112](https://github.com/BerriAI/litellm/pull/28112)
- fix(azure): preserve AD token refresh in v1 OpenAI client path by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28627](https://github.com/BerriAI/litellm/pull/28627)
- fix(ui): route API Reference back to query-param page by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28726](https://github.com/BerriAI/litellm/pull/28726)
- fix(model-edit): allow clearing custom pricing on wildcard models by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28719](https://github.com/BerriAI/litellm/pull/28719)
- fix(tests/vcr): make Redis cassette cache replay deterministically (zero VCR misses on consecutive runs) by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28826](https://github.com/BerriAI/litellm/pull/28826)
- fix(proxy): strip LiteLLM policy tracking from OpenAI batch metadata by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;28425](https://github.com/BerriAI/litellm/pull/28425)
- Litellm OpenAI double prefix bug by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;28661](https://github.com/BerriAI/litellm/pull/28661)
- Litellm oss staging 250526 by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28770](https://github.com/BerriAI/litellm/pull/28770)
- fix(bedrock): align toolUse/toolSpec names and allow hyphens by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28874](https://github.com/BerriAI/litellm/pull/28874)
- fix(realtime): send TEXT frames and valid guardrail session.update by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28848](https://github.com/BerriAI/litellm/pull/28848)
- fix(mcp): extend key access-group union to MCP servers by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28890](https://github.com/BerriAI/litellm/pull/28890)
- fix(galileo): support hosted v2 spans API and string output extraction by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28771](https://github.com/BerriAI/litellm/pull/28771)
- fix(proxy): exclude proxy\_server\_request from its own body snapshot by [@&#8203;michelligabriele](https://github.com/michelligabriele) in [#&#8203;28618](https://github.com/BerriAI/litellm/pull/28618)
- \[Feat] Add tool calling support for gemini and vertex ai live api by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;26590](https://github.com/BerriAI/litellm/pull/26590)
- refactor(ui): remove dead App Router scaffolding in (dashboard)/\* by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28891](https://github.com/BerriAI/litellm/pull/28891)
- fix(docker): use system Node in componentized builders + retry apk add by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28888](https://github.com/BerriAI/litellm/pull/28888)
- docs(agents): require consent before writing new third-party names by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;28908](https://github.com/BerriAI/litellm/pull/28908)
- refactor(ui): extract auth state into AuthContext by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28910](https://github.com/BerriAI/litellm/pull/28910)
- fix(mcp): resolve team.access\_group\_ids → MCP servers by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28997](https://github.com/BerriAI/litellm/pull/28997)
- test(ui): e2e cover team model edit + admin identity in navbar by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;28652](https://github.com/BerriAI/litellm/pull/28652)
- test(e2e): cover add-fallback flow in Router Settings by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29069](https://github.com/BerriAI/litellm/pull/29069)
- test(e2e): cover Team-BYOK add-model flow as proxy admin by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29068](https://github.com/BerriAI/litellm/pull/29068)
- fix(containers): record ownership for service-account keys + fix Prisma Json serialization by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28990](https://github.com/BerriAI/litellm/pull/28990)
- test(e2e): cover add-MCP-server flow via discovery → custom form by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29070](https://github.com/BerriAI/litellm/pull/29070)
- test(e2e): cover AI Hub make-public flow and public model\_hub\_table by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29071](https://github.com/BerriAI/litellm/pull/29071)
- \[internal copy of [#&#8203;28877](https://github.com/BerriAI/litellm/issues/28877)] feat: add support for claude code goal mode for bedrock opus output config by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;28898](https://github.com/BerriAI/litellm/pull/28898)
- feat(guardrails): wire apply\_guardrail into proxy logging callbacks by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28970](https://github.com/BerriAI/litellm/pull/28970)
- chore(ci): merge dev brach by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29192](https://github.com/BerriAI/litellm/pull/29192)
- perf(streaming): cut per-chunk overhead \~30% on Anthropic + Bedrock hot path by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28720](https://github.com/BerriAI/litellm/pull/28720)
- fix(proxy): enforce tag budgets for key-level tags by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29108](https://github.com/BerriAI/litellm/pull/29108)
- fix(vertex-ai): use DB credentials in video handlers + implement Veo video edit by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29098](https://github.com/BerriAI/litellm/pull/29098)
- fix(datadog): drain cost-management queue + opt-in FinOps tag allowlist by [@&#8203;michelligabriele](https://github.com/michelligabriele) in [#&#8203;28487](https://github.com/BerriAI/litellm/pull/28487)
- feat(helm): split per-component ServiceAccounts for gateway, backend, and UI by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28712](https://github.com/BerriAI/litellm/pull/28712)
- chore(ci): bump deps ([#&#8203;29208](https://github.com/BerriAI/litellm/issues/29208)) by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29226](https://github.com/BerriAI/litellm/pull/29226)
- fix(tests/vcr): mint Google OAuth tokens live to prevent stale-token replay by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29229](https://github.com/BerriAI/litellm/pull/29229)
- chore(cookbook): bump Go directive to 1.26.3 in gollem example by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29234](https://github.com/BerriAI/litellm/pull/29234)
- chore(ci): bump version by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29242](https://github.com/BerriAI/litellm/pull/29242)
- feat(anthropic): add Claude Opus 4.8 and prune reasoning-effort flags by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29238](https://github.com/BerriAI/litellm/pull/29238)
- chore(ci): promote internal staging to main by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29243](https://github.com/BerriAI/litellm/pull/29243)
- fix(ci): restore real Bedrock batch S3 bucket/role in oai\_misc\_config by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29245](https://github.com/BerriAI/litellm/pull/29245)
- fix(guardrails): persist disable\_global\_guardrails on keys by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29233](https://github.com/BerriAI/litellm/pull/29233)
- test(e2e): cover Team Admin view + member + key flows by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29072](https://github.com/BerriAI/litellm/pull/29072)
- docs: hand-written CLAUDE.md; remove AGENTS.md, point GEMINI.md at it by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29252](https://github.com/BerriAI/litellm/pull/29252)
- fix(teams): expose keys\_count on /v2/team/list and wire UI Resources badge by [@&#8203;michelligabriele](https://github.com/michelligabriele) in [#&#8203;28502](https://github.com/BerriAI/litellm/pull/28502)
- fix(anthropic): stop injecting unsupported output\_config.effort=xhigh for Claude Code on Sonnet/Opus 4.6 by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29304](https://github.com/BerriAI/litellm/pull/29304)
- test(e2e): cover Internal Viewer nav, key, and team-info gating by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29075](https://github.com/BerriAI/litellm/pull/29075)
- test(e2e): cover Internal User key modal, team info, key page by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29074](https://github.com/BerriAI/litellm/pull/29074)
- test(e2e): cover navbar Logout flow as proxy admin by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29076](https://github.com/BerriAI/litellm/pull/29076)
- fix(mcp): resolve key.access\_group\_ids → MCP servers (ungated) by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29195](https://github.com/BerriAI/litellm/pull/29195)
- fix(router): enforce deployment budgets for dynamically added models by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29273](https://github.com/BerriAI/litellm/pull/29273)
- fix(proxy): map stripped batch body.model to proxy alias for auth by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29264](https://github.com/BerriAI/litellm/pull/29264)
- feat(mcp): support stateless and stateful clients via session-id routing by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;26857](https://github.com/BerriAI/litellm/pull/26857)
- fix(bedrock): support tool search results + chat annotations by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29120](https://github.com/BerriAI/litellm/pull/29120)
- fix(mcp): ignore stale ids on key save by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29128](https://github.com/BerriAI/litellm/pull/29128)
- feat(a2a): well-known agent-card discovery + LangGraph Platform mode by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28860](https://github.com/BerriAI/litellm/pull/28860)
- fix(proxy): link passthrough success spans to the SERVER root OTEL span by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29315](https://github.com/BerriAI/litellm/pull/29315)
- \[internal copy of [#&#8203;29089](https://github.com/BerriAI/litellm/issues/29089)] fix: duplicate claude code traces by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29311](https://github.com/BerriAI/litellm/pull/29311)
- feat(otel): typed semconv-aligned OpenTelemetry instrumentation by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;28909](https://github.com/BerriAI/litellm/pull/28909)
- tests(proxy\_server): surface current behavior in tests by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29309](https://github.com/BerriAI/litellm/pull/29309)
- test(e2e): cover Internal User create-key flow when in no teams by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29083](https://github.com/BerriAI/litellm/pull/29083)
- test(e2e): assert internal-user navbar identity is scoped to that user by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29077](https://github.com/BerriAI/litellm/pull/29077)
- feat(otel): add team\_metadata, http.route, and model names to inference spans by [@&#8203;yassin-berriai](https://github.com/yassin-berriai) in [#&#8203;29319](https://github.com/BerriAI/litellm/pull/29319)
- feat(context\_management): compact\_20260112 polyfill for non-Anthropic providers by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;28868](https://github.com/BerriAI/litellm/pull/28868)
- feat(enterprise): add RESEND\_FROM\_EMAIL for self-hosted Resend sends by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;28830](https://github.com/BerriAI/litellm/pull/28830)
- Revert Bedrock CI back to the reactivated AWS account ([`8886022`](https://github.com/BerriAI/litellm/commit/888602223428)) by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29326](https://github.com/BerriAI/litellm/pull/29326)
- fix(mcp): preserve source\_url in GET /v1/mcp/server list responses by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29249](https://github.com/BerriAI/litellm/pull/29249)
- fix(mcp): preserve omitted fields on PUT /v1/mcp/server partial updates by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29253](https://github.com/BerriAI/litellm/pull/29253)
- fix(ci): make litellm\_internal\_staging green (logging test + Bedrock Opus 4.7 self-heal) by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29344](https://github.com/BerriAI/litellm/pull/29344)
- refactor(proxy/auth): normalize Bearer prefix in safe-hash helper by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29343](https://github.com/BerriAI/litellm/pull/29343)
- test(reasoning-effort-grid): cover Claude Opus 4.8 across provider routes by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29327](https://github.com/BerriAI/litellm/pull/29327)
- fix(guardrails): return HTTP 400 for litellm content filter blocks by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;28418](https://github.com/BerriAI/litellm/pull/28418)
- fix(proxy): restrict vector store index create/delete to proxy admins by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29202](https://github.com/BerriAI/litellm/pull/29202)
- feat(pass\_through): extend passthrough\_managed\_object\_ids to Azure by [@&#8203;Sameerlite](https://github.com/Sameerlite) in [#&#8203;29160](https://github.com/BerriAI/litellm/pull/29160)
- fix(proxy): enforce allowed\_passthrough\_routes for auth=true pass-thr… by [@&#8203;shivamrawat1](https://github.com/shivamrawat1) in [#&#8203;29256](https://github.com/BerriAI/litellm/pull/29256)
- feat(mcp/auth): additive key access-group grants + opt-in member assignment by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29313](https://github.com/BerriAI/litellm/pull/29313)
- fix(reset\_budget): write only {spend, budget\_reset\_at} and stop pre-zeroing counter by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29358](https://github.com/BerriAI/litellm/pull/29358)
- test(e2e): cover PROXY\_LOGOUT\_URL redirect on Logout by [@&#8203;ryan-crabbe-berri](https://github.com/ryan-crabbe-berri) in [#&#8203;29080](https://github.com/BerriAI/litellm/pull/29080)
- fix(ui): break logout redirect loop across dev and proxy origins by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29360](https://github.com/BerriAI/litellm/pull/29360)
- fix(openai-moderation): wire streaming flags through to unified dispatcher by [@&#8203;michelligabriele](https://github.com/michelligabriele) in [#&#8203;27324](https://github.com/BerriAI/litellm/pull/27324)
- chore(ci): build ui by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29366](https://github.com/BerriAI/litellm/pull/29366)
- fix(v3 limiter): cap no-max\_tokens TPM floor at smallest configured limit by [@&#8203;michelligabriele](https://github.com/michelligabriele) in [#&#8203;28805](https://github.com/BerriAI/litellm/pull/28805)
- fix(e2e): tolerate trailing slash in SERVER\_ROOT\_PATH login redirect by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29369](https://github.com/BerriAI/litellm/pull/29369)
- chore(deps): bump deps by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29373](https://github.com/BerriAI/litellm/pull/29373)
- chore(ci): promote internal staging to main by [@&#8203;yuneng-berri](https://github.com/yuneng-berri) in [#&#8203;29372](https://github.com/BerriAI/litellm/pull/29372)
- chore(release): patch v1.88.0-rc.1 with four staged fixes by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29632](https://github.com/BerriAI/litellm/pull/29632)
- chore(release): patch v1.88.0-rc.1 with [#&#8203;29612](https://github.com/BerriAI/litellm/issues/29612) (session-token budget-ceiling exemption) by [@&#8203;mateo-berri](https://github.com/mateo-berri) in [#&#8203;29637](https://github.com/BerriAI/litellm/pull/29637)
- fix(key\_generate): harden GHSA-q775 …
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants