heurema/cak

CAK

CAK (Causal Agent Kernel) is a typed semantic control layer for AI-agent behavior.

CAK is not a general-purpose programming language, a prompt DSL, or another orchestration framework. It is a three-layer system:

  1. CAK Spec — a human-facing declarative surface format for agent behavior artifacts.
  2. CAK IR — the canonical typed Agent Learning IR.
  3. CAK Runtime / Governance / Learning Plane — execution, verification, replay, promotion, audit, portability, cost control, and unlearning.

Product thesis

CAK makes agent behavior a governed software artifact.

Modern agents can act, but teams struggle to govern what agents learn, how they change, why they act, how much they cost, and whether behavior remains portable across vendors and runtimes.

CAK treats traces, effects, skills, memory, policies, provider bindings, approvals, and patches as replayable, auditable, portable software artifacts.

v0.1 wedge

The first usable CAK slice is intentionally narrow:

Tool-using agent
-> structured action proposal
-> effect and capability check
-> policy decision
-> tool gateway execution
-> trace

v0.1 should prove tool-boundary governance before attempting automatic learning, full replay, multi-agent coordination, or provider portability.
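
The wedge above can be sketched in a few lines of Python. This is an illustrative sketch, not the code in src/cak/: the names ActionProposal, Gateway, and demo_policy, the effect labels, and the trace record shape are all assumptions made for the example. It shows only the ordering of the steps: capability check first, then the policy decision, then execution, with a trace record written in every case.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass(frozen=True)
class ActionProposal:
    """Structured action proposal emitted by the agent (hypothetical shape)."""
    tool: str
    args: dict[str, Any]
    effect: str  # hypothetical labels, e.g. "compensable" or "irreversible"


@dataclass
class Gateway:
    """Hypothetical tool gateway: capability check, policy decision, execution, trace."""
    capabilities: dict[str, set[str]]          # tool name -> effects it may cause
    policy: Callable[[ActionProposal], str]    # "allow" | "require_approval" | "block"
    tools: dict[str, Callable[..., Any]]       # tool name -> callable that performs it
    trace: list[dict[str, Any]] = field(default_factory=list)

    def handle(self, proposal: ActionProposal) -> tuple[str, Any]:
        # 1. Effect and capability check: unknown tools or undeclared effects are blocked.
        if proposal.effect not in self.capabilities.get(proposal.tool, set()):
            decision = "block"
        else:
            # 2. Policy decision.
            decision = self.policy(proposal)
        # 3. Tool gateway execution, only on an explicit allow.
        result = self.tools[proposal.tool](**proposal.args) if decision == "allow" else None
        # 4. Trace: every proposal is recorded, including denials.
        self.trace.append({"tool": proposal.tool, "args": proposal.args,
                           "effect": proposal.effect, "decision": decision})
        return decision, result


def demo_policy(proposal: ActionProposal) -> str:
    # Assumed rule for the sketch: irreversible effects need a human approval.
    return "require_approval" if proposal.effect == "irreversible" else "allow"


gateway = Gateway(
    capabilities={"send_email": {"compensable"}, "delete_repo": {"irreversible"}},
    policy=demo_policy,
    tools={
        "send_email": lambda to, body: f"queued mail to {to}",
        "delete_repo": lambda name: f"deleted {name}",
    },
)

decision, result = gateway.handle(
    ActionProposal("send_email", {"to": "a@example.com", "body": "hi"}, "compensable")
)
```

Because the agent only ever hands a proposal to the gateway, the same flow can sit behind a proxy without changing the agent itself.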

The preferred v0.1 form factor is an MCP gateway/proxy for SaaS and operations agents, because it can govern tool calls without requiring agent rewrites.

Core loop

Observation
→ Trace
→ Evidence
→ Effect
→ Memory / Skill / Patch
→ Verification
→ Replay / Shadow Eval
→ Promotion
→ Runtime Use
→ Audit / Incident / Unlearning

What CAK addresses

  • unsafe external actions;
  • opaque agent behavior;
  • repeated failures;
  • memory pollution;
  • skill debt;
  • vendor lock-in;
  • unpredictable frontier-model cost;
  • data routing and retention;
  • multi-agent failure modes;
  • supply-chain and repo-context attacks;
  • debugging, replay, audit, rollback, and unlearning.

Repository map

docs/       Project thesis, pain map, architecture, runtime, governance, evals
evidence/   Source ledger for pain claims and market assumptions
schemas/    Draft CAK IR, TaskCapsule, EffectSpec, SkillSpec, PolicySpec schemas
examples/   Example CAK specs, provider/profile artifacts, v0.1 demo
src/cak/    v0.1 runtime skeleton: specs, verifier, trace, replay, MCP gateway
tests/      Verifier, trace/replay, and gateway end-to-end tests

v0.1 runtime skeleton

The first executable slice (docs/13 scope, docs/17 positioning): typed specs, an embeddable pre-execution verifier, a JSONL trace recorder, semantic replay, and an MCP stdio gateway that owns upstream credentials.

ruff check src tests && mypy src && pytest
PYTHONPATH=src python3 examples/v0_1/demo.py
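
To make the trace and replay pieces concrete, here is a minimal sketch of a JSONL trace with a decision-checkpoint replay. It is not the recorder in src/cak/: the functions record and replay and the event fields (tool, args, decision) are hypothetical, and the replay here only compares recorded decisions against a policy function, which is a much narrower check than whatever the project means by semantic replay.

```python
import io
import json
from typing import Any, Callable, TextIO


def record(stream: TextIO, event: dict[str, Any]) -> None:
    """Append one trace event as a single JSON line."""
    stream.write(json.dumps(event, sort_keys=True) + "\n")


def replay(stream: TextIO, decide: Callable[[str, dict[str, Any]], str]) -> list[str]:
    """Re-evaluate each recorded proposal and report decisions that no longer match."""
    mismatches = []
    for line_no, line in enumerate(stream, start=1):
        if not line.strip():
            continue  # tolerate blank lines in the trace file
        event = json.loads(line)
        replayed = decide(event["tool"], event["args"])
        if replayed != event["decision"]:
            mismatches.append(
                f"line {line_no}: recorded {event['decision']!r}, replayed {replayed!r}"
            )
    return mismatches


def original_policy(tool: str, args: dict[str, Any]) -> str:
    return "block" if tool == "delete_repo" else "allow"


# Record two decisions into an in-memory trace, then replay them.
trace_file = io.StringIO()
for tool, args in [("send_email", {"to": "a@example.com"}), ("delete_repo", {"name": "x"})]:
    record(trace_file, {"tool": tool, "args": args, "decision": original_policy(tool, args)})

trace_file.seek(0)
unchanged = replay(trace_file, original_policy)

# A candidate policy that would now allow everything diverges on the second event.
trace_file.seek(0)
diverged = replay(trace_file, lambda tool, args: "allow")
```

Replaying a recorded trace against a candidate policy before promoting it is one way to see which past decisions would change.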

The demo shows the wedge: auto-allow as Effect<compensable> with postcondition checks, require_approval with a scoped single-use approval token (python3 -m cak.approve, see docs/18), block as a typed replayable denial, a verified compensation chain on compensable effects (compensation_prepared → compensation_executed, see docs/19), and replay over the recorded trace with decision and postcondition checkpoints.
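
The idea of a scoped single-use approval token can be illustrated as follows. This sketch does not describe what python3 -m cak.approve or docs/18 actually do; the ApprovalTokens class, its token format, and the in-memory spent set are assumptions made for the example. The token is an HMAC over a random nonce and a digest of the exact tool call, so it approves one specific call and can be redeemed once.

```python
import hashlib
import hmac
import json
import secrets
from typing import Any


class ApprovalTokens:
    """Hypothetical issuer/verifier for scoped, single-use approval tokens."""

    def __init__(self, key: bytes) -> None:
        self._key = key
        self._spent: set[str] = set()  # nonces that have already been redeemed

    def _mac(self, nonce: str, tool: str, args: dict[str, Any]) -> str:
        # Scope = the exact tool name and arguments, serialized deterministically.
        scope = json.dumps({"tool": tool, "args": args}, sort_keys=True).encode()
        scope_digest = hashlib.sha256(scope).hexdigest()
        message = f"{nonce}:{scope_digest}".encode()
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def issue(self, tool: str, args: dict[str, Any]) -> str:
        nonce = secrets.token_hex(16)
        return f"{nonce}.{self._mac(nonce, tool, args)}"

    def redeem(self, token: str, tool: str, args: dict[str, Any]) -> bool:
        nonce, _, mac = token.partition(".")
        # Reject tokens that were issued for a different tool call.
        if not hmac.compare_digest(mac, self._mac(nonce, tool, args)):
            return False
        # Reject tokens that were already used.
        if nonce in self._spent:
            return False
        self._spent.add(nonce)
        return True


approvals = ApprovalTokens(key=secrets.token_bytes(32))
call = ("delete_repo", {"name": "sandbox"})
token = approvals.issue(*call)

wrong_scope = approvals.redeem(token, "delete_repo", {"name": "production"})
first_use = approvals.redeem(token, *call)
second_use = approvals.redeem(token, *call)
```

In this sketch a wrong-scope attempt does not consume the token; a real implementation would also need expiry and durable storage for spent nonces.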

Policy predicates use CEL (ratified in docs/11; see CEL Policy Predicates). PolicySpec.expr is a CEL boolean expression over args; it can express cross-field, membership, and absence checks that the interim surface cannot (see examples/v0_1/cel_policies_example.json). CAK keeps the policy envelope (action scope, enforcement tiers, strictest-wins); CEL only answers whether a single policy's condition holds. cel-python is required only for configs that use expr; the interim when list still runs without it.
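
The split between envelope and predicate can be sketched as below. To stay dependency-free, plain Python callables stand in for CEL expressions; the Policy class, the tier ordering, and the default of "allow" when no policy matches are assumptions for illustration and are not taken from the CAK schemas. The predicate only answers whether one policy's condition holds; the envelope selects policies by action scope and returns the strictest matching tier.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Assumed ordering from least to most strict; tier names borrowed from the demo description.
TIERS = ("allow", "require_approval", "block")


@dataclass(frozen=True)
class Policy:
    action: str                                   # action scope: the tool this policy covers
    tier: str                                     # enforcement tier applied when it matches
    condition: Callable[[dict[str, Any]], bool]   # stand-in for a CEL boolean over args


def decide(policies: list[Policy], action: str, args: dict[str, Any]) -> str:
    """Strictest-wins: evaluate every in-scope policy and keep the strictest tier."""
    matched = [p.tier for p in policies if p.action == action and p.condition(args)]
    return max(matched, key=TIERS.index, default="allow")


policies = [
    # Cross-field check: refund larger than the order total.
    Policy("refund", "require_approval", lambda a: a["amount"] > a["order_total"]),
    # Membership check: currency outside an allowed set.
    Policy("refund", "block", lambda a: a["currency"] not in {"USD", "EUR"}),
    # Absence check: no ticket reference supplied.
    Policy("refund", "require_approval", lambda a: "ticket_id" not in a),
]

ok = decide(policies, "refund",
            {"amount": 5, "order_total": 10, "currency": "USD", "ticket_id": "T1"})
oversized = decide(policies, "refund",
                   {"amount": 20, "order_total": 10, "currency": "USD", "ticket_id": "T1"})
bad_currency = decide(policies, "refund",
                      {"amount": 20, "order_total": 10, "currency": "XYZ"})
```

Keeping tier resolution outside the predicate language means each expression stays a pure boolean, and adding a policy can only make the outcome stricter, never looser.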

To put the gateway in front of any MCP server for a real agent (for example Claude Code), point the client at the proxy command (see examples/v0_1/mcp_config_example.json). The agent needs zero code changes; upstream credentials live in the gateway environment.

Naming

The public project name is CAK.

Earlier working names such as CAK-L, TraceLang, and TLIR are now treated as legacy aliases:

  • TraceLang → CAK Spec
  • TLIR → CAK IR
  • CAK-L → CAK

First repo goal

Document the full solution, then cut it down to a testable v0.1:

v0.1: Agent proposal -> Effect/Capability -> Policy -> Tool Gateway -> Trace
vision: CAK Spec -> CAK IR -> Agent VM -> Evidence/Scope -> Verifier -> Artifact Registry -> Replay/Eval -> Provider/Cost/Governance

R&D mode

CAK now tracks open research questions in docs/rd/. Major architecture changes should link to an RDR or an implementation spike. The current priority is agent-native skill/procedural-memory abstraction, including ContractSpec, active skills, verifiers, skill pollution, and possible future SkillPack protocol boundaries.

About

CAK: typed semantic control layer for AI-agent behavior, learning, replay, governance, portability, and cost control.
