feat(guardrails): add Hlido agent trust guardrail by ankitkapur1992-hlido · Pull Request #30237 · BerriAI/litellm

ankitkapur1992-hlido · 2026-06-11T20:52:40Z

Relevant issues

None; new guardrail integration

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have added meaningful tests
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible; it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Screenshots / Proof of Fix

Disclosure: I am the founder of Hlido, the service this guardrail calls. The data surface it reads is public and free (no key required for the default tier)

The guardrail reads the live public endpoint. Real output as of today:

$ curl -s https://hlido.eu/v1/agents/klariqo
{"slug":"klariqo","name":"Klariqo","category":"Voice","score":58,"tier":"FADING","summary":"Polished voice-agent platform for SMB outbound. Strong UX but enterprise-only pricing and a login wall block full verification.","evidence_url":"https://hlido.eu/reviews/klariqo/","score_url":"https://hlido.eu/data/scorecards/klariqo.json","last_tested_at":"2026-04-09","model_version":"wave4-v1"}

Runbook to see it end to end on a live proxy (klariqo scores 58, below the default minimum of 60, so the first call is blocked before any provider spend; try-sanebox scores 90 so the second call passes through to the provider):

Add to your proxy config:

guardrails:
  - guardrail_name: hlido-trust
    litellm_params:
      guardrail: hlido
      mode: pre_call
      default_on: true

Start the proxy as usual, then:

for slug in klariqo try-sanebox; do
  curl -s http://localhost:4000/v1/chat/completions \
    -H "Authorization: Bearer $LITELLM_KEY" -H "Content-Type: application/json" \
    -d "{\"model\": \"gpt-4o-mini\", \"messages\": [{\"role\": \"user\", \"content\": \"hi\"}], \"metadata\": {\"hlido_slugs\": [\"$slug\"]}}"
  echo
done

Expected: the klariqo request returns the guardrail block error naming the slug, the score 58, and the minimum 60 with an evidence URL; the try-sanebox request reaches the model normally

Type

🆕 New Feature

Changes

Adds an hlido guardrail that gates requests on independent trust scores for third-party AI agents, fetched from the public Hlido API (https://hlido.eu). Teams declare which downstream agent vendors a route or request uses (static slugs in the guardrail config, or per-request metadata.hlido_slugs), and the guardrail blocks the request pre-call when a vendor's independently tested score is below min_score (default 60) or its tier is outside allowed_tiers

Existing guardrails in the catalog validate content (toxicity, PII, prompt injection); this one validates the counterparty agent itself, which is useful for orgs that route LLM traffic on behalf of agent workflows and need a procurement-style trust gate at the gateway

Implementation follows the vigil_guard layout: self-registering hook directory, config model in litellm/types/proxy/guardrails/guardrail_hooks/hlido.py, enum entry, and a mocked test suite (12 tests) with a dependency-injected HTTP handler. Outbound HTTP goes through litellm's own async httpx client; zero new dependencies. Lookups are cached per slug (default 300s). Unknown slugs and API failures default to allow (configurable to block). No API key is required; an optional bearer key raises rate limits
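
For readers skimming the thread, the gating rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: `AgentRecord` and `evaluate` are hypothetical names, and only `score`, `tier`, `min_score`, `allowed_tiers`, `on_unverified`, and the default of 60 come from the description.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

DEFAULT_MIN_SCORE = 60  # default minimum stated in the PR description


@dataclass(frozen=True)
class AgentRecord:
    """Hypothetical subset of the fields returned by /v1/agents/{slug}."""

    slug: str
    score: int
    tier: str


def evaluate(
    record: Optional[AgentRecord],
    min_score: Optional[int] = None,
    allowed_tiers: Optional[Sequence[str]] = None,
    on_unverified: str = "allow",
) -> bool:
    """Return True to let the request through, False to block it."""
    # The default threshold applies only when neither gate is configured.
    if min_score is None and allowed_tiers is None:
        min_score = DEFAULT_MIN_SCORE
    if record is None:
        # Unknown slug: allowed by default, blocked when configured to block.
        return on_unverified == "allow"
    if min_score is not None and record.score < min_score:
        return False
    if allowed_tiers is not None and record.tier not in allowed_tiers:
        return False
    return True
```

With the record shown earlier (klariqo, score 58), `evaluate` returns False under the default threshold, which matches the blocked first call in the runbook.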

Ran black, ruff check, and mypy on the new files; pytest tests/test_litellm/proxy/guardrails/guardrail_hooks/test_hlido.py passes 12/12

Companion docs PR: BerriAI/litellm-docs#336

CLAassistant · 2026-06-11T20:52:46Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Ankit Kapur seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

ankitkapur1992-hlido · 2026-06-11T20:52:54Z

@greptileai

codecov · 2026-06-11T20:55:53Z

Codecov Report

❌ Patch coverage is 87.35632% with 22 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...proxy/guardrails/guardrail_hooks/hlido/__init__.py	35.00%	13 Missing ⚠️
...lm/proxy/guardrails/guardrail_hooks/hlido/hlido.py	93.38%	9 Missing ⚠️


greptile-apps · 2026-06-11T20:57:01Z

Greptile Summary

Adds a new hlido guardrail that gates proxy requests by fetching trust scores for named third-party AI agent vendors from the public Hlido API, blocking pre-call or during-call when a vendor's score falls below a configurable threshold or outside allowed tiers. The implementation follows the vigil_guard pattern with self-registration, a typed config model, dependency-injected HTTP handler, and per-instance TTL cache.

Path injection via user-controlled slugs: slugs supplied in metadata.hlido_slugs are interpolated directly into the URL path (/v1/agents/{slug}) without percent-encoding; a caller can pass a slug containing /, .., or ? to traverse paths or inject query parameters on the Hlido API endpoint.
during_call mode missing header update: async_moderation_hook never calls add_guardrail_to_applied_guardrails_header, unlike async_pre_call_hook, so the guardrail is invisible in response headers when mode: during_call.
Unbounded cache: self._cache has no eviction; expired entries for slugs that don't recur accumulate for the lifetime of the proxy worker.

Confidence Score: 3/5

The guardrail's primary trust-enforcement logic can be bypassed through path traversal in caller-supplied slug values before the fix lands.

The core enforcement mechanism reads slugs directly from request metadata and embeds them unencoded into the outbound URL path. A caller who knows they would be blocked can craft a slug containing ../ to fetch a different (passing) agent's trust record instead of their own. This makes the guardrail bypassable by the very callers it is meant to gate. The remaining issues (missing header in during_call mode, unbounded cache) are non-blocking quality concerns.

litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py — specifically the URL construction in _get_agent_record and the missing header call in async_moderation_hook.

Security Review

URL path injection (hlido/hlido.py, _get_agent_record): slug values from caller-supplied metadata.hlido_slugs are embedded in the API URL without percent-encoding. A malicious caller can pass a slug containing ../ segments to traverse to a different path on the Hlido API, or inject query parameters with ?. If a traversal lands on an endpoint that returns a passing score for an otherwise-untrusted slug identity, the trust check can be bypassed. Mitigation: urllib.parse.quote(slug, safe='') before URL construction, plus an alphanumeric-plus-hyphen allowlist on slug values from request metadata.

Important Files Changed

Filename	Overview
litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py	Core guardrail implementation; user-controlled slugs from request metadata are interpolated into the URL path without encoding (path injection), the during_call mode omits the applied-guardrails header update, and the in-memory cache has no eviction.
litellm/proxy/guardrails/guardrail_hooks/hlido/__init__.py	Guardrail initializer and registry; follows the vigil_guard pattern and correctly wires config values through to HlidoGuardrail.
litellm/types/proxy/guardrails/guardrail_hooks/hlido.py	Pydantic config models for the Hlido guardrail; clean, consistent with other config models in the codebase.
litellm/types/guardrails.py	Adds HlidoGuardrailConfigModel import, HLIDO enum entry, and mixin to LitellmParams; changes are minimal and follow existing patterns.
tests/test_litellm/proxy/guardrails/guardrail_hooks/test_hlido.py	12 mock-only unit tests covering allow/block/unverified/error/cache/header scenarios; no real network calls, consistent with repo test policy.

Reviews (1): Last reviewed commit: "feat(guardrails): add Hlido agent trust ..."

greptile-apps · 2026-06-11T20:57:05Z

+        response = await self.async_handler.get(
+            url=f"{self.api_base}/v1/agents/{slug}",
+            headers=headers,
+            timeout=_REQUEST_TIMEOUT,
+        )


Unencoded user-controlled slug in URL path — the slug value from metadata.hlido_slugs flows directly into the URL path without URL-encoding or character validation. A caller can submit a slug like trusted-agent/../../../other-path or trusted-agent?score=100 to probe arbitrary paths on the Hlido API or inject query parameters. At minimum the slug should be percent-encoded with urllib.parse.quote; ideally it should also be validated against an allowlist pattern (e.g., only alphanumerics and hyphens) before the lookup so a path-traversal payload never reaches the wire.

Suggested change

-        response = await self.async_handler.get(
-            url=f"{self.api_base}/v1/agents/{slug}",
-            headers=headers,
-            timeout=_REQUEST_TIMEOUT,
-        )
+        from urllib.parse import quote
+
+        response = await self.async_handler.get(
+            url=f"{self.api_base}/v1/agents/{quote(slug, safe='')}",
+            headers=headers,
+            timeout=_REQUEST_TIMEOUT,
+        )
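
The two mitigations named in this comment (percent-encoding and an allowlist) could be combined along these lines. This is a sketch only: `is_valid_slug`, `agent_url`, the exact pattern, and the 64-character cap are assumptions made for illustration, not part of the PR or of the Hlido API contract.

```python
import re
from urllib.parse import quote

# Assumed slug shape: alphanumerics separated by single hyphens,
# e.g. "klariqo" or "try-sanebox". The length cap is an assumption.
_SLUG_RE = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")
_MAX_SLUG_LEN = 64


def is_valid_slug(slug: str) -> bool:
    """Accept only slugs that match the allowlist pattern and length cap."""
    return len(slug) <= _MAX_SLUG_LEN and _SLUG_RE.fullmatch(slug) is not None


def agent_url(api_base: str, slug: str) -> str:
    """Build the lookup URL, rejecting slugs that fail the allowlist."""
    if not is_valid_slug(slug):
        raise ValueError(f"invalid agent slug: {slug!r}")
    # quote(..., safe="") also encodes "/", so a slug could not add path
    # segments or a query string even if validation were loosened later.
    return f"{api_base.rstrip('/')}/v1/agents/{quote(slug, safe='')}"
```

Encoding alone changes `trusted-agent?score=100` into `trusted-agent%3Fscore%3D100`; the allowlist goes further and rejects such a value before any request is sent.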

greptile-apps · 2026-06-11T20:57:06Z

+        await self._check_request(data)
+        return data
+
+    async def _check_request(self, data: dict) -> None:


async_moderation_hook (used when mode: during_call) never calls add_guardrail_to_applied_guardrails_header, so the guardrail silently disappears from response headers in that mode. async_pre_call_hook does make the call, so the inconsistency is probably unintentional.

Suggested change

-        await self._check_request(data)
-        return data
-
-    async def _check_request(self, data: dict) -> None:
+        await self._check_request(data)
+        add_guardrail_to_applied_guardrails_header(
+            request_data=data, guardrail_name=self.guardrail_name
+        )
+        return data
+
+    async def _check_request(self, data: dict) -> None:

greptile-apps · 2026-06-11T20:57:07Z

+        self.cache_ttl = (
+            cache_ttl if cache_ttl is not None else DEFAULT_CACHE_TTL_SECONDS
+        )
+        self._cache: Dict[str, Tuple[float, Optional[HlidoAgentRecord]]] = {}


Unbounded cache growth — self._cache is a plain dict with no eviction. Stale entries are only replaced when the same slug is looked up again; entries for slugs that never recur accumulate indefinitely. In deployments where many unique slug values arrive via metadata.hlido_slugs (e.g., one slug per end-user session), this is a slow memory leak. Consider capping the dict size or using litellm's existing DualCache (which already manages TTL eviction) instead of a bespoke per-instance dict.
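
One way to cap the dict, if switching to DualCache is not desired, is a small TTL cache with a hard size limit. The sketch below is illustrative: `BoundedTTLCache` and its parameters are hypothetical names, and the eviction policy (sweep expired entries on write, then drop the oldest insertions) is one option among several.

```python
import time
from collections import OrderedDict
from typing import Callable, Generic, Optional, Tuple, TypeVar

V = TypeVar("V")


class BoundedTTLCache(Generic[V]):
    """Hypothetical per-instance cache with a TTL and a hard size cap."""

    def __init__(
        self,
        ttl: float,
        max_entries: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._max_entries = max_entries
        self._clock = clock
        self._data: "OrderedDict[str, Tuple[float, V]]" = OrderedDict()

    def get(self, key: str) -> Tuple[bool, Optional[V]]:
        """Return (hit, value); a cached None counts as a hit."""
        entry = self._data.get(key)
        if entry is None:
            return False, None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # drop the stale entry on read
            return False, None
        return True, value

    def set(self, key: str, value: V) -> None:
        now = self._clock()
        # Sweep expired entries so slugs that never recur do not linger.
        for stale in [k for k, (exp, _) in self._data.items() if exp <= now]:
            del self._data[stale]
        self._data.pop(key, None)
        self._data[key] = (now + self._ttl, value)
        # Enforce the cap by evicting the oldest insertions first.
        while len(self._data) > self._max_entries:
            self._data.popitem(last=False)

    def __len__(self) -> int:
        return len(self._data)
```

Returning a `(hit, value)` pair keeps negative caching possible, since the existing cache type stores `Optional` records and a bare `None` return could not distinguish "not cached" from "cached as unknown".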

greptile-apps · 2026-06-11T20:57:24Z

Greptile Summary

This PR adds a new hlido guardrail that gates LLM proxy requests on independently-tested trust scores for downstream AI agent vendors, fetched from the public Hlido API (https://hlido.eu). The PR author discloses they are the founder of Hlido.

The guardrail follows the vigil_guard structural pattern (self-registering hook, config model, enum entry) and uses litellm's existing async httpx client — no new dependencies introduced.
Agent trust lookups are cached per slug with a configurable TTL (default 300 s), but the cache is a plain instance-level Python dict rather than the DualCache supplied to async_pre_call_hook; this means no cross-worker sharing, no stampede protection, and unbounded memory growth for long-lived workers seeing many unique slugs.
Every slug name is transmitted to https://hlido.eu, a service operated by the PR contributor, on every cache miss; this data flow is not surfaced to proxy operators in any in-proxy warning or documentation.

Confidence Score: 2/5

Not safe to merge without addressing the cache architecture and the undocumented data-flow to a service operated by the PR contributor.

The guardrail's instance-level dict cache has no concurrency guard, grows unbounded, and is not shared across workers — under any multi-process deployment slugs are re-fetched from the external API far more than the 300 s TTL intends. More critically, every agent slug identifier is sent to https://hlido.eu, which is operated by the PR author; proxy operators currently have no in-product signal that enabling this guardrail routes their agent-vendor metadata to that commercial service on every cache miss.

litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py — cache implementation and outbound HTTP call both need attention before merge.

Security Review

Data exfiltration to vendor-operated service: All agent slug identifiers are transmitted to https://hlido.eu on every cache miss. The PR author is the founder of that service. Operators enabling this guardrail expose their downstream agent vendor choices to a commercial third-party without any in-proxy anonymisation or redirection option. This is by design but represents an undocumented data flow that should be surfaced clearly to proxy operators before they enable the guardrail (litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py, _get_agent_record).

Important Files Changed

Filename	Overview
litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py	Core guardrail implementation — contains an instance-level dict cache that causes concurrent stampede, unbounded growth, and cross-worker inconsistency; all slug names are sent to an external commercial service operated by the PR author
litellm/proxy/guardrails/guardrail_hooks/hlido/__init__.py	Initializer and registry wiring — follows vigil_guard pattern correctly; passes litellm's async httpx client to the guardrail instance
litellm/types/proxy/guardrails/guardrail_hooks/hlido.py	Config model — correctly extends GuardrailConfigModel, but on_unverified and on_error accept any string and silently fall back to 'allow' on typos instead of using Literal types
litellm/types/guardrails.py	Adds HLIDO enum entry and mixes HlidoGuardrailConfigModel into LitellmParams — minimal, correct changes consistent with other guardrail integrations
tests/test_litellm/proxy/guardrails/guardrail_hooks/test_hlido.py	12 mocked unit tests covering allow/block paths, caching, metadata slug merging, and auth headers — all use FakeHandler with no real network calls, consistent with repo test policy

Reviews (2): Last reviewed commit: "feat(guardrails): add Hlido agent trust ..."

greptile-apps · 2026-06-11T20:57:27Z

+        if min_score is None and allowed_tiers is None:
+            min_score = DEFAULT_MIN_SCORE


Instance-level dict cache causes stampede and unbounded growth

self._cache is a plain Python dict on the guardrail instance. Under high concurrency, multiple requests for the same uncached slug (e.g., at startup or after TTL expiry) will all miss the cache simultaneously and fan out to the Hlido API — there is no in-flight deduplication or asyncio lock. Additionally, entries are never evicted: the dict grows monotonically for every unique slug seen over the lifetime of the worker. In a multi-process deployment (Gunicorn/uvicorn workers), each worker maintains its own independent cache, multiplying API calls accordingly. The DualCache instance passed in to async_pre_call_hook is the litellm-idiomatic shared cache that can be backed by Redis; consider using it here rather than a bare dict.
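
The in-flight deduplication mentioned here is often implemented as a "single flight" helper: concurrent callers for the same key share one pending lookup. The sketch below shows the idea with plain asyncio; `SingleFlight` and `do` are hypothetical names, and it addresses only the stampede within one worker, not cross-worker sharing.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class SingleFlight(Generic[T]):
    """Hypothetical helper: share one in-flight lookup per key."""

    def __init__(self) -> None:
        self._inflight: Dict[str, "asyncio.Future[T]"] = {}

    async def do(self, key: str, fetch: Callable[[], Awaitable[T]]) -> T:
        task = self._inflight.get(key)
        if task is None:
            # First caller for this key starts the lookup.
            task = asyncio.ensure_future(fetch())
            self._inflight[key] = task
            # Forget the key once the lookup settles, success or failure.
            task.add_done_callback(lambda _t: self._inflight.pop(key, None))
        # shield() keeps one cancelled caller from cancelling the shared
        # lookup that other callers are still waiting on.
        return await asyncio.shield(task)
```

A guardrail could wrap its cache-miss path in `do(slug, ...)` so that a burst of requests for the same uncached slug results in a single outbound call.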

greptile-apps · 2026-06-11T20:57:28Z

+                if self.on_unverified == "allow":
+                    verbose_proxy_logger.warning(
+                        "Hlido guardrail: agent '%s' has no Hlido review; allowing "
+                        "per on_unverified=allow",
+                        slug,
+                    )
+                    return
+                raise GuardrailRaisedException(
+                    guardrail_name=self.guardrail_name,
+                    message=(
+                        f"Hlido trust check failed for agent '{slug}': no "
+                        "independent review exists and on_unverified is 'block'"
+                    ),
+                )
+            case TrustLookupFailed(slug=slug, error=error):
+                if self.on_error == "allow":
+                    verbose_proxy_logger.warning(
+                        "Hlido guardrail: lookup failed for '%s' (%s); allowing "
+                        "per on_error=allow",


Every slug name is sent to an external service operated by the PR author

The PR description explicitly discloses that the author is the founder of Hlido. Every unique agent slug that passes through this guardrail (whether from static config or per-request hlido_slugs metadata) is transmitted to https://hlido.eu — a commercial service operated by the PR contributor. Proxy operators who enable this guardrail will leak their downstream agent vendor identifiers to that external service on every cache miss. Users opting into the guardrail may not be aware of this data flow, and there is no redirection mechanism, anonymisation, or self-hosting option described. At a minimum, this data flow should be clearly documented in the proxy config docs and in the config model's api_base description so operators can evaluate the privacy implication before enabling the guardrail.

greptile-apps · 2026-06-11T20:57:29Z

+    on_unverified: Optional[str] = Field(
+        default=None,
+        description=(
+            "Action when a slug has no Hlido review: 'allow' (default) or 'block'."
+        ),
+    )
+    on_error: Optional[str] = Field(
+        default=None,
+        description=(
+            "Action when the Hlido API is unreachable: 'allow' (default) or 'block'."
+        ),
+    )


on_unverified and on_error are typed as Optional[str], so any non-"block" string (e.g. a typo like "bloc") silently defaults to "allow" without any validation error. Using Literal["allow", "block"] propagates misconfiguration to the operator immediately.

Suggested change

-    on_unverified: Optional[str] = Field(
-        default=None,
-        description=(
-            "Action when a slug has no Hlido review: 'allow' (default) or 'block'."
-        ),
-    )
-    on_error: Optional[str] = Field(
-        default=None,
-        description=(
-            "Action when the Hlido API is unreachable: 'allow' (default) or 'block'."
-        ),
-    )
+    on_unverified: Optional[Literal["allow", "block"]] = Field(
+        default=None,
+        description=(
+            "Action when a slug has no Hlido review: 'allow' (default) or 'block'."
+        ),
+    )
+    on_error: Optional[Literal["allow", "block"]] = Field(
+        default=None,
+        description=(
+            "Action when the Hlido API is unreachable: 'allow' (default) or 'block'."
+        ),
+    )
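
To make the failure mode concrete without pulling in pydantic, the same check can be written with the standard library alone. This is only an illustration of what a `Literal` annotation buys; `resolve_action` is a hypothetical helper, and in the PR pydantic would perform the equivalent validation when the config model is loaded.

```python
from typing import Literal, Optional, get_args

Action = Literal["allow", "block"]


def resolve_action(value: Optional[str], default: Action = "allow") -> Action:
    """Return a validated action, rejecting anything but 'allow' or 'block'."""
    if value is None:
        return default
    if value not in get_args(Action):
        # A typo such as "bloc" fails loudly instead of behaving like "allow".
        raise ValueError(f"expected one of {get_args(Action)}, got {value!r}")
    return value  # type: ignore[return-value]
```

With a plain string comparison against "allow", a value like "bloc" would quietly take the permissive branch; here it raises at configuration time.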

veria-ai · 2026-06-11T21:08:31Z

+        request_slugs: Tuple[str, ...] = ()
+        metadata = data.get("metadata") or data.get("litellm_metadata")
+        if isinstance(metadata, dict):
+            raw = metadata.get("hlido_slugs")


Medium: Client-controlled trust subject

metadata is supplied by the API caller, so a caller can omit hlido_slugs or provide a reviewed slug for a different agent and still get the request through when no static slugs are configured. The slug being verified should come from trusted server-side config or the proxy's agent registry, and requests without a trusted slug should block rather than silently returning.
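
A fail-closed variant of slug resolution might look like the sketch below. It assumes a hypothetical operator-maintained mapping from route name to slugs (`route_slugs`); the PR does not define such a mapping, and how it would be wired into the proxy's agent registry is left open.

```python
from typing import Mapping, Sequence, Tuple


class MissingTrustSubject(Exception):
    """Raised when no server-side slug is configured for a route."""


def trusted_slugs(
    route: str,
    route_slugs: Mapping[str, Sequence[str]],
) -> Tuple[str, ...]:
    """Resolve the slugs to verify from operator config only.

    Caller-supplied metadata is deliberately not consulted, so a request
    cannot choose which agent is checked or skip the check entirely.
    """
    slugs = tuple(route_slugs.get(route, ()))
    if not slugs:
        # Fail closed: an unmapped route is blocked rather than waved through.
        raise MissingTrustSubject(
            f"no trusted agent slug configured for {route!r}"
        )
    return slugs
```

The guardrail would translate `MissingTrustSubject` into a block, which is the behavior this comment asks for when no trusted slug is available.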

veria-ai · 2026-06-11T21:08:31Z

+        slugs = self._collect_slugs(data)
+        if not slugs:
+            return
+        for slug in slugs:


Medium: Unbounded slug lookups

A caller can submit a large list of unique metadata.hlido_slugs; the guardrail performs one outbound Hlido request per slug and then stores each unique slug in the in-memory cache. Add a small maximum slug count, validate slug length/format before lookup, and use a bounded cache or evict expired entries.
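
The count and format limits suggested here could be enforced before any lookup is issued, for example as below. The limits (8 slugs, 64 characters), the pattern, and the function name are assumptions chosen for illustration; the right values depend on how operators actually use `metadata.hlido_slugs`.

```python
import re
from typing import Any, List, Tuple

MAX_SLUGS_PER_REQUEST = 8  # assumed cap on unique slugs per request
MAX_SLUG_LENGTH = 64       # assumed cap on slug length
_SLUG_RE = re.compile(r"[A-Za-z0-9-]+")


def normalize_request_slugs(raw: Any) -> Tuple[str, ...]:
    """Validate metadata.hlido_slugs before any outbound lookup is made."""
    if raw is None:
        return ()
    if not isinstance(raw, (list, tuple)):
        raise ValueError("hlido_slugs must be a list of strings")
    seen: List[str] = []
    for item in raw:
        if (
            not isinstance(item, str)
            or len(item) > MAX_SLUG_LENGTH
            or _SLUG_RE.fullmatch(item) is None
        ):
            raise ValueError(f"invalid agent slug: {item!r}")
        if item not in seen:
            seen.append(item)  # de-duplicate while keeping order
        if len(seen) > MAX_SLUGS_PER_REQUEST:
            raise ValueError(
                f"at most {MAX_SLUGS_PER_REQUEST} slugs are allowed per request"
            )
    return tuple(seen)
```

Rejecting an oversized or malformed list up front bounds both the number of outbound requests and the number of new cache keys a single caller can create.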

veria-ai · 2026-06-11T21:08:38Z

PR overview

This PR adds a Hlido agent trust guardrail under the LiteLLM proxy guardrail hooks, using Hlido slug checks to decide whether an agent/request should be allowed. The touched code integrates outbound Hlido lookups and caching around those trust checks.

There are still two open security concerns in the new guardrail path. The main issue is that the trust subject can be supplied by the API caller via metadata, allowing requests to influence which slug is checked or avoid a check when no trusted server-side slug is configured. The implementation also allows unbounded caller-driven slug lookups and cache growth, creating a potential resource-exhaustion path; no issues have been addressed yet.

Open issues (2)

Medium: Client-controlled trust subject — litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py:198
Medium: Unbounded slug lookups — litellm/proxy/guardrails/guardrail_hooks/hlido/hlido.py:190

Fixed/addressed: 0 · PR risk: 6/10

Sameerlite · 2026-06-12T03:33:34Z

Thanks for contributing this guardrail, @ankitkapur1992-hlido! Before this can merge, Greptile (scored 2/5) flagged a few things that need attention:

URL path injection — metadata.hlido_slugs is user-controlled and used directly in a URL path without encoding. This is a potential security bypass of the guardrail itself — please URL-encode the slug values before constructing the request URL.
Unbounded in-memory cache — the instance-level dict has no eviction policy, stampede protection, or cross-worker sharing. This should either use a TTL-limited structure or integrate with the shared litellm cache.
Other items — Greptile also flagged missing test coverage for the evaluate code path, missing __all__ export, and a hardcoded API timeout.

Once those are addressed, this will be in great shape!

feat(guardrails): add Hlido agent trust guardrail

52b9441

ankitkapur1992-hlido mentioned this pull request Jun 11, 2026

docs(guardrails): add Hlido agent trust guardrail page BerriAI/litellm-docs#336

Open
