feat(guardrails): add Hlido agent trust guardrail#30237

Open
ankitkapur1992-hlido wants to merge 1 commit into
BerriAI:litellm_internal_staging from
ankitkapur1992-hlido:litellm_hlido_guardrail
@ankitkapur1992-hlido ankitkapur1992-hlido commented Jun 11, 2026

Relevant issues

None; new guardrail integration

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added meaningful tests
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible; it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Screenshots / Proof of Fix

Disclosure: I am the founder of Hlido, the service this guardrail calls. The data surface it reads is public and free (no key required for the default tier).

The guardrail reads the live public endpoint. Real output as of today:

$ curl -s https://hlido.eu/v1/agents/klariqo
{"slug":"klariqo","name":"Klariqo","category":"Voice","score":58,"tier":"FADING","summary":"Polished voice-agent platform for SMB outbound. Strong UX but enterprise-only pricing and a login wall block full verification.","evidence_url":"https://hlido.eu/reviews/klariqo/","score_url":"https://hlido.eu/data/scorecards/klariqo.json","last_tested_at":"2026-04-09","model_version":"wave4-v1"}

Runbook to see it end to end on a live proxy (klariqo scores 58, below the default minimum of 60, so the first call is blocked before any provider spend; try-sanebox scores 90 so the second call passes through to the provider):

  1. Add to your proxy config:
guardrails:
  - guardrail_name: hlido-trust
    litellm_params:
      guardrail: hlido
      mode: pre_call
      default_on: true
  2. Start the proxy as usual, then:
for slug in klariqo try-sanebox; do
  curl -s http://localhost:4000/v1/chat/completions \
    -H "Authorization: Bearer $LITELLM_KEY" -H "Content-Type: application/json" \
    -d "{\"model\": \"gpt-4o-mini\", \"messages\": [{\"role\": \"user\", \"content\": \"hi\"}], \"metadata\": {\"hlido_slugs\": [\"$slug\"]}}"
  echo
done

Expected: the klariqo request returns the guardrail block error naming the slug, the score 58, and the minimum 60 with an evidence URL; the try-sanebox request reaches the model normally.

Type

🆕 New Feature

Changes

Adds an hlido guardrail that gates requests on independent trust scores for third-party AI agents, fetched from the public Hlido API (https://hlido.eu). Teams declare which downstream agent vendors a route or request uses (static slugs in the guardrail config, or per-request metadata.hlido_slugs), and the guardrail blocks the request pre-call when a vendor's independently tested score is below min_score (default 60) or its tier is outside allowed_tiers.

Existing guardrails in the catalog validate content (toxicity, PII, prompt injection); this one validates the counterparty agent itself, which is useful for orgs that route LLM traffic on behalf of agent workflows and need a procurement-style trust gate at the gateway.

Implementation follows the vigil_guard layout: self-registering hook directory, config model in litellm/types/proxy/guardrails/guardrail_hooks/hlido.py, enum entry, and a mocked test suite (12 tests) with a dependency-injected HTTP handler. Outbound HTTP goes through litellm's own async httpx client; zero new dependencies. Lookups are cached per slug (default 300s). Unknown slugs and API failures default to allow (configurable to block). No API key is required; an optional bearer key raises rate limits.
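
For readers skimming the thread, the allow/block rule described above can be sketched in a few lines of Python. This is an illustration only, not the PR's code: `AgentRecord` and `violation` are hypothetical names, and the record carries just three of the fields shown in the sample API response. The fallback to the default minimum when neither threshold is set follows the excerpt quoted later in the review.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

DEFAULT_MIN_SCORE = 60  # default minimum stated in the PR description


@dataclass(frozen=True)
class AgentRecord:
    """Hypothetical subset of the fields in the sample API response."""

    slug: str
    score: int
    tier: str


def violation(
    record: AgentRecord,
    min_score: Optional[int] = None,
    allowed_tiers: Optional[FrozenSet[str]] = None,
) -> Optional[str]:
    """Return a block reason, or None when the agent passes both checks."""
    # Apply the default minimum only when neither threshold is configured.
    if min_score is None and allowed_tiers is None:
        min_score = DEFAULT_MIN_SCORE
    if min_score is not None and record.score < min_score:
        return (
            f"agent '{record.slug}' scored {record.score}, "
            f"below the minimum of {min_score}"
        )
    if allowed_tiers is not None and record.tier not in allowed_tiers:
        return f"agent '{record.slug}' has tier {record.tier}, which is not allowed"
    return None
```

With the sample record from the description (score 58, tier FADING) and no explicit thresholds, `violation` returns a reason that names the slug, the score, and the minimum; a record scoring 90 returns `None`.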

Ran black, ruff check, and mypy on the new files; pytest tests/test_litellm/proxy/guardrails/guardrail_hooks/test_hlido.py passes 12/12.

Companion docs PR: BerriAI/litellm-docs#336

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Ankit Kapur seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

@ankitkapur1992-hlido

Author

@greptileai

@codecov

codecov Bot commented Jun 11, 2026

Codecov Report

❌ Patch coverage is 87.35632% with 22 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...proxy/guardrails/guardrail_hooks/hlido/__init__.py | 35.00% | 13 Missing ⚠️ |
| ...lm/proxy/guardrails/guardrail_hooks/hlido/hlido.py | 93.38% | 9 Missing ⚠️ |


@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Contributor

Greptile Summary

Adds a new hlido guardrail that gates proxy requests by fetching trust scores for named third-party AI agent vendors from the public Hlido API, blocking pre-call or during-call when a vendor's score falls below a configurable threshold or outside allowed tiers. The implementation follows the vigil_guard pattern with self-registration, a typed config model, dependency-injected HTTP handler, and per-instance TTL cache.

  • Path injection via user-controlled slugs: slugs supplied in metadata.hlido_slugs are interpolated directly into the URL path (/v1/agents/{slug}) without percent-encoding; a caller can pass a slug containing /, .., or ? to traverse paths or inject query parameters on the Hlido API endpoint.
  • during_call mode missing header update: async_moderation_hook never calls add_guardrail_to_applied_guardrails_header, unlike async_pre_call_hook, so the guardrail is invisible in response headers when mode: during_call.
  • Unbounded cache: self._cache has no eviction; expired entries for slugs that don't recur accumulate for the lifetime of the proxy worker.

Confidence Score: 3/5

The guardrail's primary trust-enforcement logic can be bypassed through path traversal in caller-supplied slug values before the fix lands.

The core enforcement mechanism reads slugs directly from request metadata and embeds them unencoded into the outbound URL path. A caller who knows they would be blocked can craft a slug containing ../ to fetch a different (passing) agent's trust record instead of their own. This makes the guardrail bypassable by the very callers it is meant to gate. The remaining issues (missing header in during_call mode, unbounded cache) are non-blocking quality concerns.

litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py — specifically the URL construction in _get_agent_record and the missing header call in async_moderation_hook.

Security Review

  • URL path injection (hlido/hlido.py, _get_agent_record): slug values from caller-supplied metadata.hlido_slugs are embedded in the API URL without percent-encoding. A malicious caller can pass a slug containing ../ segments to traverse to a different path on the Hlido API, or inject query parameters with ?. If a traversal lands on an endpoint that returns a passing score for an otherwise-untrusted slug identity, the trust check can be bypassed. Mitigation: urllib.parse.quote(slug, safe='') before URL construction, plus an alphanumeric-plus-hyphen allowlist on slug values from request metadata.

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py | Core guardrail implementation; user-controlled slugs from request metadata are interpolated into the URL path without encoding (path injection), the during_call mode omits the applied-guardrails header update, and the in-memory cache has no eviction. |
| litellm/proxy/guardrails/guardrail_hooks/hlido/__init__.py | Guardrail initializer and registry; follows the vigil_guard pattern and correctly wires config values through to HlidoGuardrail. |
| litellm/types/proxy/guardrails/guardrail_hooks/hlido.py | Pydantic config models for the Hlido guardrail; clean, consistent with other config models in the codebase. |
| litellm/types/guardrails.py | Adds HlidoGuardrailConfigModel import, HLIDO enum entry, and mixin to LitellmParams; changes are minimal and follow existing patterns. |
| tests/test_litellm/proxy/guardrails/guardrail_hooks/test_hlido.py | 12 mock-only unit tests covering allow/block/unverified/error/cache/header scenarios; no real network calls, consistent with repo test policy. |

Reviews (1): Last reviewed commit: "feat(guardrails): add Hlido agent trust ..." | Re-trigger Greptile

Comment on lines +297 to +301
response = await self.async_handler.get(
    url=f"{self.api_base}/v1/agents/{slug}",
    headers=headers,
    timeout=_REQUEST_TIMEOUT,
)

P1 security Unencoded user-controlled slug in URL path — the slug value from metadata.hlido_slugs flows directly into the URL path without URL-encoding or character validation. A caller can submit a slug like trusted-agent/../../../other-path or trusted-agent?score=100 to probe arbitrary paths on the Hlido API or inject query parameters. At minimum the slug should be percent-encoded with urllib.parse.quote; ideally it should also be validated against an allowlist pattern (e.g., only alphanumerics and hyphens) before the lookup so a path-traversal payload never reaches the wire.

Suggested change
- response = await self.async_handler.get(
-     url=f"{self.api_base}/v1/agents/{slug}",
-     headers=headers,
-     timeout=_REQUEST_TIMEOUT,
- )
+ from urllib.parse import quote
+ response = await self.async_handler.get(
+     url=f"{self.api_base}/v1/agents/{quote(slug, safe='')}",
+     headers=headers,
+     timeout=_REQUEST_TIMEOUT,
+ )
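
The comment also recommends an allowlist check in addition to percent-encoding. A minimal sketch of the two together follows, assuming slugs are lowercase alphanumerics and hyphens of at most 64 characters; that grammar is a guess based on the two slugs shown in this PR, not a documented Hlido rule. `build_agent_url` is a hypothetical helper, not a function in the PR.

```python
import re
from urllib.parse import quote

# Assumed slug grammar (not documented by Hlido): lowercase alphanumerics
# and hyphens, starting and ending with an alphanumeric, 1 to 64 characters.
_SLUG_RE = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,62}[a-z0-9])?")


def build_agent_url(api_base: str, slug: str) -> str:
    """Validate the slug, then percent-encode it as defense in depth."""
    if not _SLUG_RE.fullmatch(slug):
        # Reject before any network call so a traversal payload never
        # reaches the wire.
        raise ValueError(f"invalid agent slug: {slug!r}")
    return f"{api_base.rstrip('/')}/v1/agents/{quote(slug, safe='')}"
```

Under this grammar both payloads quoted in the comment (`trusted-agent/../../../other-path` and `trusted-agent?score=100`) raise `ValueError`, while `klariqo` and `try-sanebox` pass through unchanged.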

Comment on lines +183 to +186
    await self._check_request(data)
    return data

async def _check_request(self, data: dict) -> None:

P2 async_moderation_hook (used when mode: during_call) never calls add_guardrail_to_applied_guardrails_header, so the guardrail silently disappears from response headers in that mode. async_pre_call_hook does make the call, so the inconsistency is probably unintentional.

Suggested change
-     await self._check_request(data)
-     return data
- async def _check_request(self, data: dict) -> None:
+     await self._check_request(data)
+     add_guardrail_to_applied_guardrails_header(
+         request_data=data, guardrail_name=self.guardrail_name
+     )
+     return data
+ async def _check_request(self, data: dict) -> None:

self.cache_ttl = (
    cache_ttl if cache_ttl is not None else DEFAULT_CACHE_TTL_SECONDS
)
self._cache: Dict[str, Tuple[float, Optional[HlidoAgentRecord]]] = {}

P2 Unbounded cache growth — self._cache is a plain dict with no eviction. Stale entries are only replaced when the same slug is looked up again; entries for slugs that never recur accumulate indefinitely. In deployments where many unique slug values arrive via metadata.hlido_slugs (e.g., one slug per end-user session), this is a slow memory leak. Consider capping the dict size or using litellm's existing DualCache (which already manages TTL eviction) instead of a bespoke per-instance dict.
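
As an illustration of the "cap the dict size" option, here is a minimal standard-library sketch of a bounded TTL cache. `BoundedTTLCache` is a hypothetical class, not part of litellm or the PR, and it is not a substitute for DualCache when cross-worker sharing is needed. `get` returns a `(hit, value)` pair because, as in the PR's cache type, a cached value may legitimately be `None` (a slug with no review).

```python
import time
from collections import OrderedDict
from typing import Callable, Generic, Optional, Tuple, TypeVar

V = TypeVar("V")


class BoundedTTLCache(Generic[V]):
    """Size-capped cache with per-entry expiry and LRU eviction."""

    def __init__(
        self,
        max_entries: int,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: "OrderedDict[str, Tuple[float, V]]" = OrderedDict()

    def get(self, key: str) -> Tuple[bool, Optional[V]]:
        entry = self._data.get(key)
        if entry is None:
            return False, None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # drop the stale entry on read
            return False, None
        self._data.move_to_end(key)  # mark as most recently used
        return True, value

    def set(self, key: str, value: V) -> None:
        now = self._clock()
        # Purge expired entries; the scan is bounded by max_entries.
        for stale in [k for k, (exp, _) in self._data.items() if exp <= now]:
            del self._data[stale]
        self._data[key] = (now + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # evict the least recently used

    def __len__(self) -> int:
        return len(self._data)
```

The design trades a linear purge on each write for a hard memory bound, which is reasonable when `max_entries` is small; a larger cache would want a cheaper expiry structure.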

@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Contributor

Greptile Summary

This PR adds a new hlido guardrail that gates LLM proxy requests on independently-tested trust scores for downstream AI agent vendors, fetched from the public Hlido API (https://hlido.eu). The PR author discloses they are the founder of Hlido.

  • The guardrail follows the vigil_guard structural pattern (self-registering hook, config model, enum entry) and uses litellm's existing async httpx client — no new dependencies introduced.
  • Agent trust lookups are cached per slug with a configurable TTL (default 300 s), but the cache is a plain instance-level Python dict rather than the DualCache supplied to async_pre_call_hook; this means no cross-worker sharing, no stampede protection, and unbounded memory growth for long-lived workers seeing many unique slugs.
  • Every slug name is transmitted to https://hlido.eu, a service operated by the PR contributor, on every cache miss; this data flow is not surfaced to proxy operators in any in-proxy warning or documentation.

Confidence Score: 2/5

Not safe to merge without addressing the cache architecture and the undocumented data-flow to a service operated by the PR contributor.

The guardrail's instance-level dict cache has no concurrency guard, grows unbounded, and is not shared across workers — under any multi-process deployment slugs are re-fetched from the external API far more than the 300 s TTL intends. More critically, every agent slug identifier is sent to https://hlido.eu, which is operated by the PR author; proxy operators currently have no in-product signal that enabling this guardrail routes their agent-vendor metadata to that commercial service on every cache miss.

litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py — cache implementation and outbound HTTP call both need attention before merge.

Security Review

  • Data exfiltration to vendor-operated service: All agent slug identifiers are transmitted to https://hlido.eu on every cache miss. The PR author is the founder of that service. Operators enabling this guardrail expose their downstream agent vendor choices to a commercial third-party without any in-proxy anonymisation or redirection option. This is by design but represents an undocumented data flow that should be surfaced clearly to proxy operators before they enable the guardrail (litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py, _get_agent_record).

Important Files Changed

| Filename | Overview |
| --- | --- |
| litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py | Core guardrail implementation; contains an instance-level dict cache that causes concurrent stampede, unbounded growth, and cross-worker inconsistency; all slug names are sent to an external commercial service operated by the PR author |
| litellm/proxy/guardrails/guardrail_hooks/hlido/__init__.py | Initializer and registry wiring; follows vigil_guard pattern correctly; passes litellm's async httpx client to the guardrail instance |
| litellm/types/proxy/guardrails/guardrail_hooks/hlido.py | Config model; correctly extends GuardrailConfigModel, but on_unverified and on_error accept any string and silently fall back to 'allow' on typos instead of using Literal types |
| litellm/types/guardrails.py | Adds HLIDO enum entry and mixes HlidoGuardrailConfigModel into LitellmParams; minimal, correct changes consistent with other guardrail integrations |
| tests/test_litellm/proxy/guardrails/guardrail_hooks/test_hlido.py | 12 mocked unit tests covering allow/block paths, caching, metadata slug merging, and auth headers; all use FakeHandler with no real network calls, consistent with repo test policy |

Reviews (2): Last reviewed commit: "feat(guardrails): add Hlido agent trust ..." | Re-trigger Greptile

Comment on lines +112 to +113
if min_score is None and allowed_tiers is None:
    min_score = DEFAULT_MIN_SCORE

P1 Instance-level dict cache causes stampede and unbounded growth

self._cache is a plain Python dict on the guardrail instance. Under high concurrency, multiple requests for the same uncached slug (e.g., at startup or after TTL expiry) will all miss the cache simultaneously and fan out to the Hlido API — there is no in-flight deduplication or asyncio lock. Additionally, entries are never evicted: the dict grows monotonically for every unique slug seen over the lifetime of the worker. In a multi-process deployment (Gunicorn/uvicorn workers), each worker maintains its own independent cache, multiplying API calls accordingly. The DualCache instance passed in to async_pre_call_hook is the litellm-idiomatic shared cache that can be backed by Redis; consider using it here rather than a bare dict.
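
The in-flight deduplication mentioned here can be sketched with plain asyncio. `SingleFlight` is a hypothetical helper, not an existing litellm utility: concurrent callers asking for the same key share one task, so a cold cache produces a single outbound lookup per slug within a worker. It does nothing for cross-worker sharing, which would still need a shared cache.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Generic, TypeVar

V = TypeVar("V")


class SingleFlight(Generic[V]):
    """Collapse concurrent calls for the same key into one awaited task."""

    def __init__(self) -> None:
        self._in_flight: Dict[str, "asyncio.Task[V]"] = {}

    async def run(self, key: str, fetch: Callable[[], Awaitable[V]]) -> V:
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key starts the fetch.
            task = asyncio.create_task(fetch())
            self._in_flight[key] = task
            # Forget the task once it finishes, whether it succeeded or failed.
            task.add_done_callback(lambda _done: self._in_flight.pop(key, None))
        # shield() keeps one caller's cancellation from cancelling the
        # shared fetch that other callers are still waiting on.
        return await asyncio.shield(task)
```

A caller would wrap the outbound lookup, for example `await flights.run(slug, lambda: fetch_record(slug))`, where `fetch_record` stands in for whatever coroutine performs the HTTP request.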

Comment on lines +257 to +275
    if self.on_unverified == "allow":
        verbose_proxy_logger.warning(
            "Hlido guardrail: agent '%s' has no Hlido review; allowing "
            "per on_unverified=allow",
            slug,
        )
        return
    raise GuardrailRaisedException(
        guardrail_name=self.guardrail_name,
        message=(
            f"Hlido trust check failed for agent '{slug}': no "
            "independent review exists and on_unverified is 'block'"
        ),
    )
case TrustLookupFailed(slug=slug, error=error):
    if self.on_error == "allow":
        verbose_proxy_logger.warning(
            "Hlido guardrail: lookup failed for '%s' (%s); allowing "
            "per on_error=allow",

P1 security Every slug name is sent to an external service operated by the PR author

The PR description explicitly discloses that the author is the founder of Hlido. Every unique agent slug that passes through this guardrail (whether from static config or per-request hlido_slugs metadata) is transmitted to https://hlido.eu — a commercial service operated by the PR contributor. Proxy operators who enable this guardrail will leak their downstream agent vendor identifiers to that external service on every cache miss. Users opting into the guardrail may not be aware of this data flow, and there is no redirection mechanism, anonymisation, or self-hosting option described. At a minimum, this data flow should be clearly documented in the proxy config docs and in the config model's api_base description so operators can evaluate the privacy implication before enabling the guardrail.

Comment on lines +30 to +41
on_unverified: Optional[str] = Field(
    default=None,
    description=(
        "Action when a slug has no Hlido review: 'allow' (default) or 'block'."
    ),
)
on_error: Optional[str] = Field(
    default=None,
    description=(
        "Action when the Hlido API is unreachable: 'allow' (default) or 'block'."
    ),
)

P2 on_unverified and on_error are typed as Optional[str], so any non-"block" string (e.g. a typo like "bloc") silently defaults to "allow" without any validation error. Using Literal["allow", "block"] propagates misconfiguration to the operator immediately.

Suggested change
- on_unverified: Optional[str] = Field(
-     default=None,
-     description=(
-         "Action when a slug has no Hlido review: 'allow' (default) or 'block'."
-     ),
- )
- on_error: Optional[str] = Field(
-     default=None,
-     description=(
-         "Action when the Hlido API is unreachable: 'allow' (default) or 'block'."
-     ),
- )
+ on_unverified: Optional[Literal["allow", "block"]] = Field(
+     default=None,
+     description=(
+         "Action when a slug has no Hlido review: 'allow' (default) or 'block'."
+     ),
+ )
+ on_error: Optional[Literal["allow", "block"]] = Field(
+     default=None,
+     description=(
+         "Action when the Hlido API is unreachable: 'allow' (default) or 'block'."
+     ),
+ )

request_slugs: Tuple[str, ...] = ()
metadata = data.get("metadata") or data.get("litellm_metadata")
if isinstance(metadata, dict):
    raw = metadata.get("hlido_slugs")

Medium: Client-controlled trust subject

metadata is supplied by the API caller, so a caller can omit hlido_slugs or provide a reviewed slug for a different agent and still get the request through when no static slugs are configured. The slug being verified should come from trusted server-side config or the proxy's agent registry, and requests without a trusted slug should block rather than silently returning.
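
One way to read this recommendation is as a fail-closed rule: server-side slugs are always checked, request metadata may add slugs but never replace or remove them, and a request with no trusted slug is rejected. The sketch below illustrates that reading only; `resolve_trust_subjects` and `MissingTrustSubject` are hypothetical names, and the PR as submitted returns silently when no slugs are present.

```python
from typing import Iterable, Tuple


class MissingTrustSubject(Exception):
    """Raised when no server-side slug identifies the agent to verify."""


def resolve_trust_subjects(
    configured_slugs: Iterable[str],
    request_slugs: Iterable[str],
) -> Tuple[str, ...]:
    """Return the slugs to verify, with trusted config slugs first."""
    # dict.fromkeys de-duplicates while preserving order.
    configured = tuple(dict.fromkeys(configured_slugs))
    if not configured:
        # Fail closed: caller-supplied metadata alone is not a trusted subject.
        raise MissingTrustSubject(
            "no server-side Hlido slug is configured for this route"
        )
    extra = tuple(s for s in dict.fromkeys(request_slugs) if s not in configured)
    return configured + extra
```

Under this rule a caller who omits `hlido_slugs` still has the configured slugs checked, and a caller who supplies a different reviewed slug only widens the set of checks.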

slugs = self._collect_slugs(data)
if not slugs:
    return
for slug in slugs:

Medium: Unbounded slug lookups

A caller can submit a large list of unique metadata.hlido_slugs; the guardrail performs one outbound Hlido request per slug and then stores each unique slug in the in-memory cache. Add a small maximum slug count, validate slug length/format before lookup, and use a bounded cache or evict expired entries.
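
A minimal sketch of the count and length limits suggested here, using only the standard library. The caps (5 unique slugs, 64 characters) are assumptions chosen for illustration, and `collect_request_slugs` is a hypothetical stand-in for the PR's slug-collection step, reading the same `hlido_slugs` metadata key.

```python
from typing import Any, List

MAX_SLUGS_PER_REQUEST = 5  # assumed cap, not taken from the PR
MAX_SLUG_LENGTH = 64  # assumed cap, not taken from the PR


def collect_request_slugs(metadata: Any) -> List[str]:
    """Return de-duplicated request slugs, rejecting oversized input."""
    if not isinstance(metadata, dict):
        return []
    raw = metadata.get("hlido_slugs")
    if raw is None:
        return []
    if not isinstance(raw, (list, tuple)):
        raise ValueError("hlido_slugs must be a list of strings")
    slugs: List[str] = []
    for item in raw:
        if not isinstance(item, str) or not 0 < len(item) <= MAX_SLUG_LENGTH:
            raise ValueError("each slug must be a non-empty string of bounded length")
        if item in slugs:
            continue  # a repeated slug triggers no extra lookup
        if len(slugs) == MAX_SLUGS_PER_REQUEST:
            raise ValueError(
                f"at most {MAX_SLUGS_PER_REQUEST} unique slugs are allowed"
            )
        slugs.append(item)
    return slugs
```

Rejecting the request outright, instead of truncating the list, avoids silently skipping a slug the caller expected to be checked.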

@veria-ai

veria-ai Bot commented Jun 11, 2026

Contributor

PR overview

This PR adds a Hlido agent trust guardrail under the LiteLLM proxy guardrail hooks, using Hlido slug checks to decide whether an agent/request should be allowed. The touched code integrates outbound Hlido lookups and caching around those trust checks.

There are still two open security concerns in the new guardrail path. The main issue is that the trust subject can be supplied by the API caller via metadata, allowing requests to influence which slug is checked or avoid a check when no trusted server-side slug is configured. The implementation also allows unbounded caller-driven slug lookups and cache growth, creating a potential resource-exhaustion path; no issues have been addressed yet.

Open issues (2)

Fixed/addressed: 0 · PR risk: 6/10

@Sameerlite

Collaborator

Thanks for contributing this guardrail, @ankitkapur1992-hlido! Before this can merge, Greptile (scored 2/5) flagged a few things that need attention:

  1. URL path injection: metadata.hlido_slugs is user-controlled and used directly in a URL path without encoding. This is a potential security bypass of the guardrail itself; please URL-encode the slug values before constructing the request URL.
  2. Unbounded in-memory cache: the instance-level dict has no eviction policy, stampede protection, or cross-worker sharing. This should either use a TTL-limited structure or integrate with the shared litellm cache.
  3. Other items: Greptile also flagged missing test coverage for the evaluate code path, missing __all__ export, and a hardcoded API timeout.

Once those are addressed, this will be in great shape!


3 participants