AgentK is a tiny prototype of an agent security kernel.
Status: public prototype, not production-ready.
It is not another agent framework. It is the syscall boundary agent frameworks should run through:
model.call
context.read
memory.write
tool.describe
tool.invoke
tool.response
secret.open
network.send
file.patch
human.approve
agent.spawn
Every syscall carries provenance, taint labels, a policy decision, and a hash-chained flight recorder event.
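To make the syscall boundary concrete, here is a minimal sketch of typed syscall names with default-deny handling for anything unknown. It is not AgentK's actual code: the `Syscall` enum, the `Decision` type, and the `gate` function are hypothetical names chosen for illustration, and the real policy engine is driven by TOML rather than an allow-list slice.

```rust
use std::str::FromStr;

/// The syscall names listed above as a closed enum (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Syscall {
    ModelCall,
    ContextRead,
    MemoryWrite,
    ToolDescribe,
    ToolInvoke,
    ToolResponse,
    SecretOpen,
    NetworkSend,
    FilePatch,
    HumanApprove,
    AgentSpawn,
}

impl FromStr for Syscall {
    type Err = String;

    fn from_str(name: &str) -> Result<Self, Self::Err> {
        Ok(match name {
            "model.call" => Syscall::ModelCall,
            "context.read" => Syscall::ContextRead,
            "memory.write" => Syscall::MemoryWrite,
            "tool.describe" => Syscall::ToolDescribe,
            "tool.invoke" => Syscall::ToolInvoke,
            "tool.response" => Syscall::ToolResponse,
            "secret.open" => Syscall::SecretOpen,
            "network.send" => Syscall::NetworkSend,
            "file.patch" => Syscall::FilePatch,
            "human.approve" => Syscall::HumanApprove,
            "agent.spawn" => Syscall::AgentSpawn,
            other => return Err(format!("unknown syscall: {}", other)),
        })
    }
}

/// Hypothetical decision type: allow, or deny with a reason.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny(String),
}

/// Hypothetical gate: a name that does not parse is denied, and a known
/// syscall is allowed only if it appears in the caller's allow-list.
pub fn gate(name: &str, allowed: &[Syscall]) -> Decision {
    match name.parse::<Syscall>() {
        Ok(sc) if allowed.contains(&sc) => Decision::Allow,
        Ok(sc) => Decision::Deny(format!("not permitted: {:?}", sc)),
        Err(reason) => Decision::Deny(reason),
    }
}
```

Calling `gate("kernel.reboot", &[Syscall::ToolInvoke])` yields a `Deny` because the name never parses, which is the default-deny property the sketch is meant to show.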
AgentK treats prompt context like memory.
The Context MMU labels every context page:
trusted
untrusted
external
private
secret
poisoned-suspect
Then it blocks unsafe flows:
untrusted_webpage -> shell_exec
private_email -> external_http_post
secret_fd -> raw_model_context
The first demo shows a poisoned webpage trying to exfiltrate ~/.ssh/id_rsa. AgentK blocks the raw secret read and the network send, then writes a tamper-evident JSONL flight log.
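The three blocked flows above can be pictured as a check over (label, sink) pairs. The sketch below is an assumption about shape, not the project's implementation: `Label`, `Sink`, `is_blocked`, and `propagate` are hypothetical names, and the rule table contains only the three flows listed in this README.

```rust
/// Context page labels from the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Label {
    Trusted,
    Untrusted,
    External,
    Private,
    Secret,
    PoisonedSuspect,
}

/// Hypothetical sinks, named after the right-hand side of each blocked flow.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Sink {
    ShellExec,
    ExternalHttpPost,
    RawModelContext,
}

/// Returns true when data carrying any of `labels` must not reach `sink`.
/// Only the three flows named in the README are encoded here.
pub fn is_blocked(labels: &[Label], sink: Sink) -> bool {
    labels.iter().any(|&label| match (label, sink) {
        (Label::Untrusted, Sink::ShellExec) => true,
        (Label::Private, Sink::ExternalHttpPost) => true,
        (Label::Secret, Sink::RawModelContext) => true,
        _ => false,
    })
}

/// Simple label propagation: a derived value carries the union of the
/// labels on its inputs, so taint is never dropped by combining data.
pub fn propagate(a: &[Label], b: &[Label]) -> Vec<Label> {
    let mut out = a.to_vec();
    for label in b {
        if !out.contains(label) {
            out.push(*label);
        }
    }
    out
}
```

The point of `propagate` is that mixing trusted text with an untrusted webpage yields a value that is still blocked from `ShellExec`; the trusted label does not launder the untrusted one.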
cargo run
Verify the latest flight log:
cargo run -- verify .agentk/runs/latest.jsonl
Verify receipt and secret-handle signatures:
cargo run -- verify-signatures .agentk/runs/latest.jsonl
Signature verification prints redacted signer fingerprints with receipt and secret-handle counts, so reviewers can see which signing identities produced evidence without printing raw public keys.
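The tamper-evident property of the flight log comes from hash chaining: each event commits to the hash of the one before it, so editing any event breaks every later link. The sketch below illustrates that idea only. It deliberately uses FNV-1a, a non-cryptographic hash, as a stand-in so the example needs no external crates; a real log would use a cryptographic hash. The `Event` layout and function names are hypothetical and do not describe AgentK's JSONL schema.

```rust
/// FNV-1a 64-bit. NOT cryptographic: a stand-in so this sketch has no
/// dependencies. A real flight log would use a cryptographic hash.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Hypothetical event record: a payload plus the chain links.
#[derive(Debug, Clone)]
pub struct Event {
    pub payload: String,
    pub prev_hash: u64,
    pub hash: u64,
}

/// Hash of an event = H(prev_hash || payload).
fn link(prev_hash: u64, payload: &str) -> u64 {
    let mut buf = prev_hash.to_be_bytes().to_vec();
    buf.extend_from_slice(payload.as_bytes());
    fnv1a(&buf)
}

/// Append an event that commits to the current tail of the log.
pub fn append(log: &mut Vec<Event>, payload: &str) {
    let prev_hash = log.last().map(|e| e.hash).unwrap_or(0);
    log.push(Event {
        payload: payload.to_string(),
        prev_hash,
        hash: link(prev_hash, payload),
    });
}

/// Returns the index of the first event whose links do not verify,
/// or None when the whole chain is intact.
pub fn first_broken(log: &[Event]) -> Option<usize> {
    let mut expected_prev = 0u64;
    for (i, e) in log.iter().enumerate() {
        if e.prev_hash != expected_prev || e.hash != link(e.prev_hash, &e.payload) {
            return Some(i);
        }
        expected_prev = e.hash;
    }
    None
}
```

Flipping a recorded decision in the middle of the log makes `first_broken` report that event's index, since the stored hash no longer matches the edited payload.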
Pin verification to an expected public signing key:
cargo run -- verify-signatures .agentk/runs/latest.jsonl --trusted-public-key <hex-public-key>
Pin verification to a public trusted-signer manifest:
cargo run -- verify-signatures .agentk/runs/latest.jsonl --trusted-key-manifest examples/trusted-signers.toml
Validate a trusted-signer manifest without printing keys:
cargo run -- trusted-signers-check --manifest examples/trusted-signers.toml
Inspect the latest flight log without printing raw input refs:
cargo run -- trace-inspect .agentk/runs/latest.jsonl
Replay the latest flight log without side effects:
cargo run -- replay .agentk/runs/latest.jsonl
Replay records synthetic stub_output_sha256 refs for allowed model, tool, and network side-effect syscalls. It does not execute those syscalls or invent raw outputs.
Fork-replay the latest flight log against another policy:
cargo run -- fork-replay .agentk/runs/latest.jsonl --policy examples/policies/research-agent.toml
Fork replay reports both per-event decision changes and a stable decision summary, such as deny:rule->allow:rule, so policy drift is visible without manual counting.
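One way to picture the decision summary is as a count of (old decision, new decision) transitions across the recorded and replayed runs. The sketch below is an illustration under assumptions: the `Decision` struct, the `summarize_drift` function, and the rule names used in the usage note are hypothetical, and only the `deny:rule->allow:rule` key format comes from this README.

```rust
use std::collections::BTreeMap;

/// Hypothetical decision record: a verdict plus the policy rule behind it.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Decision {
    pub verdict: String,
    pub rule: String,
}

/// Compare recorded and replayed decisions event by event and count each
/// kind of change. A BTreeMap keeps the summary in a stable, sorted order.
pub fn summarize_drift(recorded: &[Decision], replayed: &[Decision]) -> BTreeMap<String, usize> {
    let mut summary = BTreeMap::new();
    for (old, new) in recorded.iter().zip(replayed.iter()) {
        if old != new {
            let key = format!(
                "{}:{}->{}:{}",
                old.verdict, old.rule, new.verdict, new.rule
            );
            *summary.entry(key).or_insert(0) += 1;
        }
    }
    summary
}
```

If two events recorded as `deny` under a made-up rule `no_egress` become `allow` under a made-up rule `research_egress`, the summary holds a single entry, `deny:no_egress->allow:research_egress`, with a count of 2. Unchanged events contribute nothing.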
Fork-replay with changed hashed behavior outputs:
cargo run -- fork-replay-behavior .agentk/runs/latest.jsonl --behavior examples/replay-behavior-overrides.json
Check the prototype policy:
cargo run -- policy-check examples/agentk.policy.toml
Validate a secret-reference manifest without printing provider refs:
cargo run -- secret-refs-check --manifest examples/secret-refs.toml
Check whether secret references are available through the local env store without printing refs:
AGENTK_DEMO_REF=present cargo run -- secret-refs-store-check --manifest examples/secret-refs.toml
Example profiles live in:
examples/policies/research-agent.toml
examples/policies/coding-agent.toml
examples/policies/browser-agent.toml
Run the public-readiness gate:
cargo run -- readiness
Run the full local release audit:
cargo run -- release-audit
Run the strict pre-push audit with a configured signing key file:
AGENTK_REQUIRE_SIGNING_KEY=1 AGENTK_SIGNING_KEY_FILE=../agentk-signing-key cargo run -- release-audit --strict
Contribution and release rules live in CONTRIBUTING.md, docs/v0.1-target.md, and docs/release-checklist.md. Accepted v0.1 limits are tracked in docs/v0.1-limit-disposition.md, and the current pre-tag dry run is recorded in docs/v0.1-release-dry-run.md.
Mediate a demo MCP-shaped tool request without executing it:
cargo run -- mcp-proxy --request examples/mcp-tool-request.json
Mediate one bounded MCP-shaped request over stdin:
cargo run -- mcp-stdio < examples/mcp-tool-request.json
Mediate newline-delimited MCP-shaped requests over bounded stdin:
cargo run -- mcp-lines < examples/mcp-tool-requests.jsonl
Run the minimal MCP JSON-RPC stdio server. The prototype accepts
newline-delimited JSON-RPC messages, rejects batches, enforces bounded request
ids, streams stdin with a per-line message size cap, and does not execute the
underlying tool. Tool listing and calls require a prior initialize request
with the supported protocol version followed by the notifications/initialized
notification. Before that lifecycle completes, only initialize and ping
requests receive method-specific handling:
cargo run -- mcp-server < examples/mcp-server-session.jsonl
Run AgentK as a stdio proxy in front of a downstream MCP server process. The
proxy forwards JSON-RPC to the child server only after mediating tools/list
descriptors, tools/call arguments, resources/list descriptors, and
resources/read requests, plus prompts/list descriptors and prompts/get
requests. It strips AgentK-only policy metadata before forwarding, starts the
child with only explicitly configured environment variables, validates proxy
configuration before spawn, records hash evidence for tool, resource, and
prompt responses, and refuses denied tool/resource/prompt actions before the
child sees them. MCP methods that do not yet have an AgentK policy contract are
rejected instead of being forwarded as generic passthrough. Downstream
responses are bounded by a configurable timeout so a hung child cannot stall
the proxy indefinitely:
cargo run -- mcp-proxy-stdio --server-id poisoned-demo --trace-out .agentk/runs/mcp-proxy-demo.jsonl --command sh --arg examples/mcp-poisoned-server.sh < examples/mcp-proxy-client-session.jsonl
cargo run -- trace-inspect .agentk/runs/mcp-proxy-demo.jsonl
Use --allow-env NAME to copy a named parent environment variable into the
cleared child environment. Repeat the flag for multiple variables.
Repeat --arg for each downstream argument; hyphen-prefixed child args are
accepted, for example --arg -c.
Use --response-timeout-ms to set the downstream response timeout; the default
is 30000 ms.
The subprocess proxy operator contract lives in docs/mcp-proxy.md.
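The cleared child environment and stderr suppression described above map onto standard-library process APIs. The following is a sketch of the idea, not the proxy's real code: `allowed_env` and `child_command` are hypothetical helper names, while `Command::env_clear`, `Command::envs`, and `Stdio::null` are the actual `std::process` calls.

```rust
use std::process::{Command, Stdio};

/// Keep only the allow-listed variables from a snapshot of the parent
/// environment. Everything else is dropped.
pub fn allowed_env(parent: &[(String, String)], allow: &[&str]) -> Vec<(String, String)> {
    parent
        .iter()
        .filter(|(name, _)| allow.iter().any(|a| *a == name.as_str()))
        .cloned()
        .collect()
}

/// Build (but do not spawn) a child command that starts from an empty
/// environment, receives only the selected variables, talks over piped
/// stdin/stdout, and has its stderr discarded.
pub fn child_command(program: &str, args: &[&str], env: &[(String, String)]) -> Command {
    let mut cmd = Command::new(program);
    cmd.args(args)
        .env_clear()
        .envs(env.iter().map(|(k, v)| (k.as_str(), v.as_str())))
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::null());
    cmd
}
```

A caller would typically collect `std::env::vars()` into the snapshot, pass the names given via `--allow-env` as `allow`, and then call `spawn()` on the returned command. Starting from `env_clear()` means a variable reaches the child only if it was named explicitly.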
Run the MCP killer demo. The downstream server returns poisoned tool output that tells the agent to exfiltrate a private marker and patch the repository. AgentK records the poisoned output by hash, then blocks both dangerous follow-up tool calls before the child server sees them:
cargo run -- mcp-killer-demo
cargo run -- trace-inspect .agentk/runs/mcp-killer-demo.jsonl
Run the before/after shim eval. It drives the same poisoned MCP flow through a baseline passthrough and through AgentK, then prints a scorecard showing which dangerous transitions executed versus which were blocked with evidence:
cargo run -- mcp-shim-eval
cargo run -- trace-inspect .agentk/runs/mcp-shim-eval-agentk.jsonl
The reviewer guide for this proof lives in docs/mcp-shim-eval.md.
Run a second proxy transcript where the downstream MCP server returns a poisoned JSON-RPC error body. AgentK returns only a sanitized error summary to the client while preserving hash evidence in the trace:
cargo run -- mcp-proxy-stdio --server-id poisoned-error-demo --trace-out .agentk/runs/mcp-proxy-error-demo.jsonl --command sh --arg examples/mcp-poisoned-error-server.sh < examples/mcp-proxy-poisoned-error-session.jsonl
cargo run -- trace-inspect .agentk/runs/mcp-proxy-error-demo.jsonl
Print the active proof-signing public key:
cargo run -- signing-key
Generate a local signing key file outside git:
cargo run -- keygen --out ../agentk-signing-key
Rotate a local signing key and write a public signed manifest:
cargo run -- key-rotate --current ../agentk-signing-key --next-out ../agentk-signing-key-next --manifest ../agentk-rotation.json
Verify a public key-rotation manifest:
cargo run -- key-rotate-verify --manifest ../agentk-rotation.json
Emit the demo report as JSON:
cargo run -- demo --json
Most agent security tools either:
- sandbox code without understanding semantic data flow,
- trace LLM calls without enforcing anything,
- ask models to behave safely,
- or gate individual tools without preserving provenance.
AgentK's thesis:
Autonomous actions need OS-style mediation: typed syscalls, capability receipts, taint-aware egress, secret handles, and replayable evidence.
This repo currently includes:
- a Rust CLI,
- a typed TOML policy AST,
- label propagation for demo syscalls,
- default-deny behavior for unknown syscalls,
- Ed25519-signed development capability receipts,
- opaque secret FD handles scoped to signed receipts,
- Ed25519-signed development secret handles with expiry and receipt binding,
- target-only dummy secret registrations for local tests,
- redacted external secret reference records that require a configured store before minting handles by default,
- a metadata-only secret store registry that checks provider support and external reference availability without returning secret bytes,
- an env-backed local secret store presence adapter for env references,
- a versioned secret-reference manifest parser with provider-id validation for registering external refs without secret values,
- a redacted secret-reference manifest validation command,
- a redacted secret-reference store availability command,
- a hash-chained flight recorder,
- log verification,
- receipt and secret-handle signature verification with optional trusted-key pinning and redacted signer summaries,
- a redacted public trusted-signer manifest for verifier pinning,
- redacted flight-log inspection for human review,
- deterministic side-effect-free replay,
- fork replay with policy comparison and decision-change summaries,
- an MCP proxy MVP that mediates tool.invoke without execution,
- MCP descriptor mediation that hashes untrusted tool metadata before model exposure,
- MCP response recording that hashes raw tool output instead of logging it,
- subprocess MCP resource mediation for resources/list and resources/read with hash-only evidence,
- subprocess MCP prompt mediation for prompts/list and prompts/get with hash-only evidence,
- subprocess MCP stderr suppression so child diagnostics cannot bypass the redacted JSON-RPC and trace-evidence path,
- an MCP killer demo where poisoned tool output tries to trigger secret exfiltration and an unsafe file patch, but both follow-up calls are blocked with inspectable trace evidence,
- a one-command MCP killer demo runner that writes a redacted trace without dumping the poisoned raw content into the review path,
- a before/after MCP shim eval that contrasts unsafe baseline passthrough with AgentK blocking and replayable evidence,
- stdin mediation for one MCP-shaped request,
- newline-delimited stdin mediation for repeated MCP-shaped requests,
- a minimal MCP JSON-RPC stdio server exposing agentk.mediate, agentk.mediate_descriptor, and agentk.record_response,
- signing key generation to a caller-chosen local file,
- signed key-rotation manifests that do not include private key material,
- key-rotation manifest verification,
- a one-command local release audit,
- a local public-readiness gate,
- and tests for tainted egress, capability receipts, secret redaction, secret-handle binding, replay, MCP mediation, descriptor/response hashing, key rotation, and unknown syscall denial.
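The secret-handle items in the list above (opaque handles, expiry, receipt binding) can be read as a set of checks that must all pass before a handle is honored. The sketch below shows only that binding logic under assumed names: `SecretHandle`, `OpenError`, and `check_open` are hypothetical, the scope string format is made up, and Ed25519 signature verification is intentionally left out.

```rust
/// Hypothetical opaque handle: it identifies a secret but carries no
/// secret bytes, so it can appear in logs and model context safely.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SecretHandle {
    pub handle_id: u64,
    pub receipt_id: String,
    pub scope: String,
    pub expires_at_unix: u64,
}

/// Reasons a handle is refused.
#[derive(Debug, PartialEq, Eq)]
pub enum OpenError {
    Expired,
    WrongReceipt,
    WrongScope,
}

/// Honor a handle only if it has not expired, is presented with the
/// receipt it was minted for, and is used within its scope.
/// Signature verification would precede these checks and is omitted here.
pub fn check_open(
    handle: &SecretHandle,
    receipt_id: &str,
    scope: &str,
    now_unix: u64,
) -> Result<(), OpenError> {
    if now_unix >= handle.expires_at_unix {
        return Err(OpenError::Expired);
    }
    if handle.receipt_id != receipt_id {
        return Err(OpenError::WrongReceipt);
    }
    if handle.scope != scope {
        return Err(OpenError::WrongScope);
    }
    Ok(())
}
```

Binding the handle to one receipt means a handle copied out of one approved action cannot be replayed under a different capability, even before it expires.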
Next obvious pieces:
- close the remaining v0.1 target gaps,
- production key storage and operational key lifecycle,
- fuller MCP proxy/server compliance,
- filesystem diff capture,
- fork replay with changed model/tool behavior,
- eBPF/cgroup adapters for Linux resource accounting,
- and a visual trace viewer.
This project is security-sensitive and intentionally conservative.
Implemented today:
- toy Context MMU labels,
- typed TOML policy validation,
- Ed25519-signed development capability receipts,
- opaque secret FD handle minting,
- Ed25519-signed development secret handles with expiry, scope, and receipt binding,
- external secret references that require a configured store before minting handles by default,
- JSONL flight log hash chain,
- local log verification,
- redacted flight-log inspection that replaces raw input refs with hash evidence,
- trace inspection summaries that group blocked events by policy rule,
- trace inspection summaries that group boundary events by syscall and evidence ref type,
- deterministic replay that stubs side effects and summarizes blocked policy rules,
- fork replay with policy comparison and decision-change summaries,
- MCP-shaped tool mediation without execution,
- MCP descriptor and response hash evidence without raw descriptor/response logging,
- conservative MCP tool-output labels for recorded responses,
- tainted tool-input blocking at tool.invoke boundaries,
- MCP resource descriptor/read/response evidence with explicit read capabilities,
- MCP prompt descriptor/get/response evidence with explicit get capabilities,
- mixed subprocess MCP interoperability coverage across tools, resources, prompts, and notifications,
- public MCP interoperability transcript coverage that blocks poisoned follow-up network egress and unsafe patch attempts,
- subprocess MCP pre-ready notification guards so client notifications cannot bypass lifecycle gating,
- downstream subprocess MCP notification-burst handling without raw payload reflection,
- downstream subprocess MCP notification-flood bounds without raw payload reflection,
- subprocess MCP stderr suppression for downstream diagnostics,
- subprocess MCP lifecycle error redaction for downstream initialize and ping failures,
- subprocess MCP initialize protocol guards before the proxy becomes ready,
- subprocess MCP tools/list error redaction before descriptors are exposed,
- subprocess MCP tool-shape guards for malformed tools/list and successful tools/call results,
- subprocess MCP bad-response redaction for malformed JSON and mismatched response ids,
- subprocess MCP resource subscription no-passthrough coverage for unsupported resources/subscribe and resources/unsubscribe,
- subprocess MCP invalid AgentK metadata redaction before unsafe requests are forwarded,
- subprocess MCP client intent hashing so AgentK-only metadata does not leak through trace evidence,
- subprocess MCP invalid client-parameter guards before empty identifiers can reach downstream servers,
- compact denial summaries on blocked MCP tool, resource, and prompt responses,
- subprocess MCP response timeout handling for hung downstream servers,
- subprocess MCP transport-close handling for child exits and broken pipes,
- a runnable MCP killer demo that blocks poisoned-output exfiltration and unsafe patch attempts,
- a one-command mcp-killer-demo runner for reviewable demo traces,
- a one-command mcp-shim-eval scorecard for showing why the shim matters,
- a minimal MCP JSON-RPC stdio server,
- local key generation and signed key-rotation manifests,
- a local release audit that runs formatting, tests, clippy, readiness, replay, signature, signer summaries, signer-pinning, trusted-signer manifest, secret-handle, secret-reference validation, secret-store availability, MCP taint-flow, subprocess MCP boundaries, lifecycle/list redaction, initialize guards, tool/resource/prompt shape guards, bad-response redaction, response timeouts, transport-close checks, mixed interop, public interop transcripts, resource subscription no-passthrough, pre-ready notification guards, notification-burst/flood checks, config and metadata guards, client-intent redaction, invalid client-parameter guards, denial summaries, no-passthrough checks, the MCP shim eval, inspect, and MCP server smoke checks.
Not implemented yet:
- production key storage and complete key lifecycle management,
- production MCP server transport,
- production secret storage,
- real sandboxing,
- eBPF/cgroup enforcement.
By default AgentK signs evidence with a static development key. Set AGENTK_SIGNING_KEY_FILE to a private key file created by agentk keygen, or set AGENTK_SIGNING_KEY_HEX to a 32-byte hex Ed25519 signing key for non-demo runs. Set AGENTK_REQUIRE_SIGNING_KEY=1 in release gates to fail readiness if the configured signer falls back to the development key. On Unix, readiness also fails if the configured key file is readable by group/other users or if its parent directory is group/other writable. The CLI only prints the public key.
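The Unix permission rules in the paragraph above reduce to two bit-mask tests, and the hex key rule to a length and character check. The sketch below is one plausible reading, not the readiness gate's actual code: all function names are hypothetical, and the masks are derived directly from the wording "readable by group/other" and "group/other writable".

```rust
use std::path::Path;

/// Key file passes when neither group nor other has the read bit (0o044).
pub fn key_file_mode_ok(mode: u32) -> bool {
    (mode & 0o044) == 0
}

/// Parent directory passes when neither group nor other can write (0o022).
pub fn parent_dir_mode_ok(mode: u32) -> bool {
    (mode & 0o022) == 0
}

/// A 32-byte key in hex is exactly 64 ASCII hex digits.
pub fn looks_like_hex_key(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// Apply both mode checks to a real path (Unix only). Returns Ok(true)
/// when the file and its parent directory both pass.
#[cfg(unix)]
pub fn key_path_is_safe(path: &Path) -> std::io::Result<bool> {
    use std::os::unix::fs::PermissionsExt;

    let file_mode = std::fs::metadata(path)?.permissions().mode();
    // A bare file name has an empty parent; treat that as the current dir.
    let parent = match path.parent() {
        Some(p) if !p.as_os_str().is_empty() => p,
        _ => Path::new("."),
    };
    let dir_mode = std::fs::metadata(parent)?.permissions().mode();
    Ok(key_file_mode_ok(file_mode) && parent_dir_mode_ok(dir_mode))
}
```

Under this reading a key file with mode 0600 inside a 0755 directory passes, while a 0640 key file or a 0775 parent directory fails.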
See SECURITY.md, docs/threat-model.md, docs/key-lifecycle.md, docs/mcp-proxy.md, and docs/public-readiness.md.
AgentK: short for Agent Kernel.
Small name. Sharp edges.