Skip to content

q7766206/AgentGuard

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

2 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

πŸ›‘οΈ AgentGuard

Open-source security middleware for AI agents

Audit every action. Enforce safety rules. Stop agents before they go wrong.

PyPI License Python


AgentGuard is a lightweight, framework-agnostic security layer that sits between your AI agent and the real world. It records every LLM call, tool execution, and decision as an immutable audit trail β€” and enforces configurable safety rules that can block, throttle, or kill an agent before it causes damage.

Zero external dependencies. Works with LangChain, CrewAI, or any custom framework.

Why AgentGuard?

AI agents are increasingly autonomous β€” they browse the web, execute code, manage files, and call APIs. But nobody is watching what they do. A single hallucination can lead to rm -rf /, a runaway loop can burn $500 in API costs, and cascading errors can corrupt your data.

AgentGuard solves this by providing:

  • Audit Trail β€” Every action recorded as immutable, queryable, exportable logs
  • Rule Engine β€” 5 built-in rules that catch the most common agent failures
  • Shield β€” 4 active defense modules against external attacks (prompt injection, data leakage, exfiltration, behavioral hijacking)
  • Real-time Enforcement β€” Block dangerous operations before they execute
  • Framework Adapters β€” Drop-in integration with LangChain 1.0, CrewAI, and more

Quickstart

pip install agentguard
from agentguard import AgentGuard, AgentGuardBlock
from agentguard.rules import LoopDetection, TokenBudget, SensitiveOp

guard = AgentGuard(rules=[
    LoopDetection(max_repeats=5),     # Stop infinite loops
    TokenBudget(max_tokens=100_000),  # Prevent cost overruns
    SensitiveOp(),                    # Block dangerous commands
])

# In your agent loop:
guard.before_tool_call("web_search", {"query": "AI safety"})
guard.after_tool_call("web_search", result="Found 10 papers", duration_ms=350)

# Dangerous operations are blocked automatically:
try:
    guard.before_tool_call("shell", {"command": "rm -rf /"})
except AgentGuardBlock as e:
    print(f"Blocked: {e}")
    # β†’ [AgentGuard BLOCK] | Rule: SensitiveOp | Tool: shell | ...

Built-in Rules

Rule What it catches Default action
LoopDetection Same tool called N times in M seconds BLOCK
TokenBudget Total tokens exceed threshold KILL
ErrorCascade N errors in M seconds (circuit breaker) KILL
SensitiveOp rm -rf, DROP TABLE, sudo, .env access, etc. BLOCK
TimeoutGuard Single operation exceeds time limit LOG

All rules are configurable:

LoopDetection(max_repeats=3, window_seconds=30)
TokenBudget(max_tokens=50_000, max_tokens_per_call=5000)
ErrorCascade(max_errors=3, window_seconds=30)
SensitiveOp(extra_patterns=[r"my_secret"], blocked_tools=["dangerous_tool"])
TimeoutGuard(max_duration_ms=10_000)

Shield β€” Active Defense

AgentGuard Shield protects agents from external attacks, not just self-inflicted errors.

from agentguard import AgentGuard
from agentguard.rules import SensitiveOp
from agentguard.shield import (
    PromptInjectionDetector,
    DataLeakageDetector,
    BehaviorAnomalyDetector,
    ExfilDetector,
)

guard = AgentGuard(rules=[
    # Self-protection (built-in rules)
    SensitiveOp(),

    # External attack protection (Shield)
    PromptInjectionDetector(sensitivity="medium"),
    DataLeakageDetector(),
    BehaviorAnomalyDetector(baseline_window=20),
    ExfilDetector(trusted_domains={"api.openai.com", "google.com"}),
])

Shield Modules

Module What it catches Detection method
PromptInjectionDetector "Ignore previous instructions", role hijacking, jailbreaks Multi-layer: regex (4 languages) + heuristic scoring + canary tokens
DataLeakageDetector API keys, passwords, PII, private keys, connection strings 25+ regex patterns for OpenAI/AWS/GCP/Stripe/GitHub/etc.
BehaviorAnomalyDetector Agent behavioral shifts after reading external content Rolling baseline comparison, new tool detection, frequency spike
ExfilDetector `base64 curl`, reverse shells, DNS exfil, untrusted domains

Prompt Injection Detection (Multi-language)

# Detects injection in English, Chinese, Spanish, French
detector = PromptInjectionDetector(
    sensitivity="high",           # low/medium/high
    canary_token="MY_SECRET_123", # Optional: detect if agent leaks this
    scan_tool_output=True,        # Scan tool results for indirect injection
)

Data Leakage Prevention

detector = DataLeakageDetector(
    scan_categories=["api_keys", "passwords", "pii"],  # Choose what to scan
    allowlist={"sk-test-not-real"},                     # Known safe strings
)

Custom Rules

Create your own rules in ~20 lines:

from agentguard.rules import BaseRule
from agentguard.types import AuditRecord, RuleContext, RuleViolation, RuleAction, RuleSeverity

class CostBudget(BaseRule):
    """Kill agent when estimated API cost exceeds threshold."""

    def __init__(self, max_cost_usd: float = 5.0):
        super().__init__(name="CostBudget", severity=RuleSeverity.HIGH, action=RuleAction.KILL)
        self._max_cost = max_cost_usd

    def evaluate(self, record: AuditRecord, context: RuleContext):
        cost = (context.total_tokens / 1000) * 0.01  # $0.01/1k tokens
        if cost >= self._max_cost:
            return RuleViolation(
                rule_name=self.name, severity=self.severity, action=self.action,
                message=f"Cost ${cost:.2f} exceeds ${self._max_cost:.2f}",
            )
        return None

guard = AgentGuard(rules=[CostBudget(max_cost_usd=1.0)])

LangChain 1.0 Integration

from agentguard import AgentGuard
from agentguard.rules import LoopDetection, TokenBudget, SensitiveOp
from agentguard.adapters.langchain import make_guard_middleware
from langchain.agents import create_agent
from langchain_openai import ChatOpenAI

guard = AgentGuard(rules=[LoopDetection(), TokenBudget(), SensitiveOp()])

agent = create_agent(
    model=ChatOpenAI(model="gpt-4o"),
    tools=[...],
    middleware=[*make_guard_middleware(guard)],  # ← One line
)

CrewAI Integration

from agentguard import AgentGuard
from agentguard.rules import SensitiveOp
from agentguard.adapters.crewai import make_crewai_callbacks

guard = AgentGuard(rules=[SensitiveOp()])
step_cb, task_cb = make_crewai_callbacks(guard)

agent = Agent(role="researcher", step_callback=step_cb, ...)
task = Task(description="...", callback=task_cb, ...)

Audit Trail

Every action is recorded and queryable:

# Query recent errors
errors = guard.get_records(level=LogLevel.ERROR, limit=10)

# Export full audit trail as JSON
print(guard.export_json())

# Persist to disk (JSONL format, compatible with jq/ELK/Splunk)
guard = AgentGuard(persist=True, persist_path="./audit.jsonl")

# Get session stats
stats = guard.get_stats()
# β†’ {"session_id": "abc123", "audit": {"total_records": 42}, "rules": {"total_violations": 2}}

Performance

AgentGuard is designed to add negligible overhead to your agent:

Scenario Latency per record
Memory-only ~13ΞΌs
With disk persistence ~100ΞΌs
With 1 subscriber ~15ΞΌs
10 threads concurrent ~16ΞΌs

For comparison, a single LLM API call takes 500ms–5s. AgentGuard's overhead is <0.01%.

Architecture

Your Agent
    β”‚
    β”œβ”€β”€ before_tool_call() ──→ AuditTrail ──→ RuleEngine ──→ Block/Kill/Log
    β”‚                              β”‚               β”‚
    β”‚                         MemoryStorage    β”Œβ”€β”€ Built-in Rules ──┐
    β”‚                         JSONLStorage     β”‚  LoopDetection     β”‚
    β”‚                                          β”‚  TokenBudget       β”‚
    β”œβ”€β”€ after_tool_call()                      β”‚  ErrorCascade      β”‚
    β”œβ”€β”€ before_llm_call()                      β”‚  SensitiveOp       β”‚
    β”œβ”€β”€ after_llm_call()                       β”‚  TimeoutGuard      β”‚
    β”‚                                          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    β”‚                                          β”Œβ”€β”€ Shield Modules ──┐
    β”‚                                          β”‚  PromptInjection   β”‚
    β”‚                                          β”‚  DataLeakage       β”‚
    β”‚                                          β”‚  BehaviorAnomaly   β”‚
    β”‚                                          β”‚  ExfilDetector     β”‚
    β”‚                                          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    └── get_stats() / export_json()

Project Structure

agentguard/
β”œβ”€β”€ __init__.py          # Public API exports
β”œβ”€β”€ core.py              # AgentGuard main class
β”œβ”€β”€ audit.py             # AuditTrail (recording, querying, export)
β”œβ”€β”€ types.py             # All types, enums, data structures
β”œβ”€β”€ exceptions.py        # AgentGuardBlock, AgentGuardKill, etc.
β”œβ”€β”€ rules/
β”‚   β”œβ”€β”€ base.py          # BaseRule abstract class
β”‚   β”œβ”€β”€ builtin.py       # 5 built-in rules
β”‚   └── engine.py        # RuleEngine (evaluation, context tracking)
β”œβ”€β”€ shield/
β”‚   β”œβ”€β”€ injection.py     # Prompt injection detector (4 languages)
β”‚   β”œβ”€β”€ leakage.py       # Data leakage detector (25+ patterns)
β”‚   β”œβ”€β”€ anomaly.py       # Behavior anomaly detector
β”‚   └── exfil.py         # Exfiltration detector
β”œβ”€β”€ storage/
β”‚   β”œβ”€β”€ base.py          # BaseStorage abstract class
β”‚   β”œβ”€β”€ memory.py        # In-memory ring buffer
β”‚   └── jsonl.py         # JSONL file persistence
└── adapters/
    β”œβ”€β”€ langchain.py     # LangChain 1.0 middleware adapter
    └── crewai.py        # CrewAI callback adapter

Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

git clone https://github.com/agentguard/agentguard.git
cd agentguard
pip install -e ".[dev]"
pytest

License

Apache 2.0 β€” use it in production, fork it, build on it.

About

πŸ›‘οΈ Open-source security middleware for AI agents. Audit trail, rule engine, prompt injection detection, data leakage prevention. Works with LangChain, CrewAI. Zero dependencies. 168 tests.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages