CAK

CAK (Causal Agent Kernel) is a typed semantic control layer for AI-agent behavior.

CAK is not a general-purpose programming language, a prompt DSL, or another orchestration framework. It is a three-layer system:

CAK Spec — a human-facing declarative surface format for agent behavior artifacts.
CAK IR — the canonical typed Agent Learning IR.
CAK Runtime / Governance / Learning Plane — execution, verification, replay, promotion, audit, portability, cost control, and unlearning.

Product thesis

CAK makes agent behavior a governed software artifact.

Modern agents can act, but teams struggle to govern what agents learn, how they change, why they act, how much they cost, and whether behavior remains portable across vendors and runtimes.

CAK treats traces, effects, skills, memory, policies, provider bindings, approvals, and patches as replayable, auditable, portable software artifacts.

v0.1 wedge

The first usable CAK slice is intentionally narrow:

Tool-using agent
-> structured action proposal
-> effect and capability check
-> policy decision
-> tool gateway execution
-> trace
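
The pipeline above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in src/cak/: the names `Proposal`, `Gateway`, and `handle`, the effect-class strings, and the three decision strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Proposal:
    """A structured action proposal (hypothetical shape)."""
    tool: str
    args: dict[str, Any]
    effect: str  # hypothetical effect class, e.g. "compensable" or "irreversible"


@dataclass
class Gateway:
    tools: dict[str, Callable[..., Any]]
    capabilities: set[str]             # effect classes this agent may use
    policy: Callable[[Proposal], str]  # returns "allow", "require_approval", or "block"
    trace: list[dict[str, Any]] = field(default_factory=list)

    def handle(self, p: Proposal) -> Any:
        # 1. Effect and capability check: unknown effect classes are blocked.
        if p.effect not in self.capabilities:
            decision = "block"
        else:
            # 2. Policy decision.
            decision = self.policy(p)
        # 3. Tool gateway execution happens only on "allow".
        result = self.tools[p.tool](**p.args) if decision == "allow" else None
        # 4. Every proposal is traced, including the ones that did not run.
        self.trace.append({"tool": p.tool, "args": p.args, "effect": p.effect,
                           "decision": decision, "result": result})
        return result


def policy(p: Proposal) -> str:
    # Illustrative rule: large refunds need a human approval.
    return "require_approval" if p.args.get("amount", 0) > 100 else "allow"


gw = Gateway(tools={"refund": lambda amount: f"refunded {amount}"},
             capabilities={"compensable"}, policy=policy)
gw.handle(Proposal("refund", {"amount": 20}, "compensable"))
gw.handle(Proposal("refund", {"amount": 500}, "compensable"))
gw.handle(Proposal("refund", {"amount": 5}, "irreversible"))
print([e["decision"] for e in gw.trace])
```

The point of the sketch is the ordering: the tool is never called before both checks have passed, and the trace records denied proposals as well as executed ones.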

v0.1 should prove tool-boundary governance before attempting automatic learning, full replay, multi-agent coordination, or provider portability.

The preferred v0.1 form factor is an MCP gateway/proxy for SaaS and operations agents, because it can govern tool calls without requiring agent rewrites.

Core loop

Observation
→ Trace
→ Evidence
→ Effect
→ Memory / Skill / Patch
→ Verification
→ Replay / Shadow Eval
→ Promotion
→ Runtime Use
→ Audit / Incident / Unlearning

What CAK addresses

unsafe external actions;
opaque agent behavior;
repeated failures;
memory pollution;
skill debt;
vendor lock-in;
unpredictable frontier-model cost;
data routing and retention;
multi-agent failure modes;
supply-chain and repo-context attacks;
debugging, replay, audit, rollback, and unlearning.

Repository map

docs/       Project thesis, pain map, architecture, runtime, governance, evals
evidence/   Source ledger for pain claims and market assumptions
schemas/    Draft CAK IR, TaskCapsule, EffectSpec, SkillSpec, PolicySpec schemas
examples/   Example CAK specs, provider/profile artifacts, v0.1 demo
src/cak/    v0.1 runtime skeleton: specs, verifier, trace, replay, MCP gateway
tests/      Verifier, trace/replay, and gateway end-to-end tests

v0.1 runtime skeleton

The first executable slice (scope in docs/13, positioning in docs/17) consists of typed specs, an embeddable pre-execution verifier, a JSONL trace recorder, semantic replay, and an MCP stdio gateway that owns upstream credentials.
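
To make the trace and replay components concrete, here is a rough sketch of a JSONL recorder and a replay that re-computes each decision and reports divergences. It assumes a hypothetical event shape with `args` and `decision` keys; the functions `record` and `replay` are illustrative and are not the API in src/cak/.

```python
import io
import json
from typing import Any, Callable, TextIO


def record(stream: TextIO, event: dict[str, Any]) -> None:
    """Append one event as a single JSON line."""
    stream.write(json.dumps(event, sort_keys=True) + "\n")


def replay(stream: TextIO, decide: Callable[[dict[str, Any]], str]) -> list[int]:
    """Re-run `decide` over recorded args; return line indexes that diverge."""
    mismatches = []
    for index, line in enumerate(stream):
        if not line.strip():
            continue  # tolerate blank lines in the trace file
        event = json.loads(line)
        if decide(event["args"]) != event["decision"]:
            mismatches.append(index)
    return mismatches


def decide_v1(args: dict[str, Any]) -> str:
    return "require_approval" if args["amount"] > 100 else "allow"


def decide_v2(args: dict[str, Any]) -> str:
    # A stricter candidate policy, evaluated against the old trace.
    return "require_approval" if args["amount"] > 10 else "allow"


buf = io.StringIO()  # stands in for a trace file on disk
for amount in (20, 500):
    record(buf, {"args": {"amount": amount},
                 "decision": decide_v1({"amount": amount})})

buf.seek(0)
same = replay(buf, decide_v1)
buf.seek(0)
changed = replay(buf, decide_v2)
print(same, changed)
```

Replaying with the original policy yields no divergences, while the stricter candidate diverges on the first recorded event. That comparison is the decision checkpoint idea in miniature.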

ruff check src tests && mypy src && pytest
PYTHONPATH=src python3 examples/v0_1/demo.py

The demo shows the wedge:

auto-allow as Effect<compensable> with postcondition checks;
require_approval with a scoped single-use approval token (python3 -m cak.approve, see docs/18);
block as a typed replayable denial;
a verified compensation chain on compensable effects (compensation_prepared → compensation_executed, see docs/19);
replay over the recorded trace with decision and postcondition checkpoints.
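
As an illustration of what "scoped" and "single-use" can mean for an approval token, the sketch below binds a token to one tool call with an HMAC and refuses a second redemption. It is an assumption-laden example, not the scheme in docs/18 or the behavior of cak.approve: the class `ApprovalTokens`, its methods, and the token format are all hypothetical.

```python
import hashlib
import hmac
import json
import secrets
from typing import Any


class ApprovalTokens:
    """Hypothetical issuer/verifier for scoped, single-use approval tokens."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._used: set[str] = set()  # nonces that were already redeemed

    def _mac(self, nonce: str, tool: str, args: dict[str, Any]) -> str:
        # The scope is the exact tool name plus canonicalized arguments.
        scope = json.dumps({"tool": tool, "args": args}, sort_keys=True)
        message = f"{nonce}|{scope}".encode()
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def issue(self, tool: str, args: dict[str, Any]) -> str:
        nonce = secrets.token_hex(8)
        return f"{nonce}.{self._mac(nonce, tool, args)}"

    def redeem(self, token: str, tool: str, args: dict[str, Any]) -> bool:
        nonce, _, mac = token.partition(".")
        expected = self._mac(nonce, tool, args)
        # Constant-time comparison; a different tool or args changes the MAC.
        if not hmac.compare_digest(mac.encode(), expected.encode()):
            return False
        if nonce in self._used:
            return False  # single-use: a valid token cannot be replayed
        self._used.add(nonce)
        return True


tokens = ApprovalTokens(key=secrets.token_bytes(32))
token = tokens.issue("refund", {"amount": 500})
wrong_scope = tokens.redeem(token, "refund", {"amount": 999})
first_use = tokens.redeem(token, "refund", {"amount": 500})
second_use = tokens.redeem(token, "refund", {"amount": 500})
print(wrong_scope, first_use, second_use)
```

In this sketch a failed redemption against the wrong scope does not consume the token, so the approved call can still go through exactly once.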

Policy predicates use CEL (ratified in docs/11; see CEL Policy Predicates). PolicySpec.expr is a CEL boolean expression over args. It can express cross-field, membership, and absence checks that the interim surface cannot; see examples/v0_1/cel_policies_example.json. CAK keeps the policy envelope (action scope, enforcement tiers, strictest-wins); CEL only answers whether a single policy's condition holds. cel-python is required only for configs that use expr; the interim when list still runs without it.
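
The split between envelope and predicate can be sketched as follows. To keep the example dependency-free, each condition is a plain Python callable standing in for a CEL expression, with a plausible CEL equivalent shown in a comment. The `Policy` class, the `decide` function, the tier ordering, and the default of "allow" when nothing matches are assumptions for illustration, not the PolicySpec schema.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical tier order, from least to most strict.
TIERS = ["allow", "require_approval", "block"]


@dataclass
class Policy:
    action: str                                  # action scope: the tool it applies to
    tier: str                                    # enforcement tier if the condition holds
    condition: Callable[[dict[str, Any]], bool]  # stand-in for a CEL boolean over args


def decide(policies: list[Policy], action: str, args: dict[str, Any]) -> str:
    """Envelope logic: filter by scope, evaluate conditions, strictest tier wins."""
    matched = [p.tier for p in policies
               if p.action == action and p.condition(args)]
    # Assumption: with no matching policy the call is allowed.
    return max(matched, key=TIERS.index, default="allow")


policies = [
    # Absence check; in CEL roughly: !has(args.reason)
    Policy("refund", "require_approval", lambda a: "reason" not in a),
    # Cross-field check; in CEL roughly: args.amount > args.limit
    Policy("refund", "block", lambda a: a["amount"] > a["limit"]),
    # Membership check; in CEL roughly: !(args.currency in ["USD", "EUR"])
    Policy("refund", "block", lambda a: a.get("currency", "USD") not in ("USD", "EUR")),
]

print(decide(policies, "refund", {"amount": 5, "limit": 10, "reason": "duplicate"}))
print(decide(policies, "refund", {"amount": 5, "limit": 10}))
print(decide(policies, "refund", {"amount": 50, "limit": 10}))
```

The third call matches both the approval policy and a block policy, and the stricter tier wins. Each condition only returns true or false; scope matching and tier resolution stay outside the predicate.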

To put the gateway in front of any MCP server for a real agent (for example Claude Code), point the client at the proxy command; see examples/v0_1/mcp_config_example.json. The agent needs zero code changes, and upstream credentials live in the gateway environment.

Naming

The public project name is CAK.

Earlier working names such as CAK-L, TraceLang, and TLIR are now treated as legacy aliases:

TraceLang → CAK Spec
TLIR → CAK IR
CAK-L → CAK

First repo goal

Document the full solution, then cut it down to a testable v0.1:

v0.1: Agent proposal -> Effect/Capability -> Policy -> Tool Gateway -> Trace
vision: CAK Spec -> CAK IR -> Agent VM -> Evidence/Scope -> Verifier -> Artifact Registry -> Replay/Eval -> Provider/Cost/Governance

Start with:

Design notes beyond v0.1:

R&D mode

CAK now tracks open research questions in docs/rd/. Major architecture changes should link to an RDR or an implementation spike. The current priority is agent-native skill/procedural-memory abstraction, including ContractSpec, active skills, verifiers, skill pollution, and possible future SkillPack protocol boundaries.

Start with:
