A governance-aware runtime substrate for persistent capability evolution on top of LangGraph / DeepAgents.
This repository provides a research reference implementation accompanying the paper:
Governed Evolution of Agent Runtimes through Executable Operational Cognition
A Systems Vision Paper
The project explores how agent-generated artifacts can evolve from transient outputs into persistent operational capabilities under explicit governance constraints.
Rather than treating generated code as disposable execution artifacts, the runtime models prompts, workflows, evaluators, routing rules, executable skills, and mutations as inspectable operational entities participating directly in future runtime behavior.
The implementation is intentionally built on top of the LangGraph / DeepAgents ecosystem and should be interpreted as a governance-aware runtime kernel rather than a standalone orchestration framework.
The implementation follows the conceptual hierarchy introduced in the paper.
A generated operational entity such as:
- executable skill
- evaluator
- workflow
- routing rule
- prompt mutation
- policy
- executable module
An artifact that successfully passes validation, governance, and persistence stages, becoming reusable runtime behavior.
Emergent runtime behavior produced through the interaction between:
- persistent capabilities
- traces
- evaluation loops
- governance policies
- runtime memory
- multi-agent specialization
The runtime operationalizes a governed lifecycle:
- generate
- execute
- evaluate
- persist
- mutate
- govern
- promote
- govern
- mutate
- persist
- evaluate
- execute
Not every generated artifact becomes part of the runtime substrate.
Governance and validation determine whether operational behavior is promoted into persistent capability memory.
A recommended way to explore the repository is:
- Read the Runtime Evolution Model section.
- Inspect the multi-agent governance decomposition.
- Review the agent-facing tools and lifecycle transitions.
- Explore generated skills and mutation proposals under
runtime_v10/. - Inspect trace registries, reviews, and runtime memory artifacts.
- Follow how governance constraints affect promotion decisions.
- Review the inline
Implements:comments connecting the implementation to the paper.
The repository is intentionally designed to function as both:
- a conceptual systems artifact,
- and an executable runtime reference substrate.
A runtime repeatedly observes manual CSV normalization and schema validation steps across multiple traces.
The runtime evolution agent may then:
- inspect operational traces,
- infer a reusable capability pattern,
- generate a new DeepAgents skill,
- validate the generated artifact,
- record a capability review,
- propose lifecycle promotion,
- persist the capability into future runtime executions.
If repeated failures or bottlenecks are detected in prompts, workflows, or evaluation logic, the runtime may additionally propose governed harness mutations subject to explicit validation and promotion constraints.
The important property is that runtime evolution remains bounded, inspectable, and governance-aware rather than unrestricted self-modification.
Runtime evolution is decomposed across specialized operational roles:
| Agent | Responsibility |
|---|---|
runtime-evolution-agent |
capability generation and abstraction |
skill-validator-agent |
validation and lifecycle review |
harness-governor-agent |
mutation governance and policy control |
task-worker-agent |
operational execution |
reflection-agent |
introspection and strategic improvement |
This decomposition avoids monolithic self-modifying loops and instead promotes bounded, inspectable runtime evolution.
This repository intentionally builds on top of the LangGraph / DeepAgents ecosystem.
- orchestration
- state management
- subagents
- tool execution
- middleware pipelines
- memory integration
- governed capability evolution
- lifecycle-aware persistence
- mutation governance
- runtime introspection
- operational cognition substrate
- persistent operational memory
- capability promotion boundaries
Key concepts from the paper map directly to runtime components:
| Paper Concept | Implementation |
|---|---|
| HarnessMutation | propose_harness_mutation / promote_harness_mutation |
| Capability lifecycle | promote_skill_stage |
| Validation gates | validate_generated_skill |
| Runtime traces | trace_events.jsonl |
| Capability reviews | CapabilityReview |
| Runtime introspection | inspect_harness_state |
| Operational memory | generated skills + runtime memory notes |
| Multi-agent governance | specialized subagents |
| Runtime substrate composition | compose_system_prompt() |
The implementation intentionally exposes explicit paper-to-code traceability through inline comments such as:
# Implements: governed mutation proposal stage from the paper.This allows the repository to operate as an executable companion to the systems vision proposed in the paper.
The project explicitly avoids unrestricted autonomous self-modification.
The runtime separates:
- persistence
- lifecycle state
- validation boundaries
- auditability
- mutation gates
- filesystem boundaries
from:
- pattern detection
- abstraction
- capability synthesis
- evaluator generation
- reuse decisions
- workflow evolution
- governance reasoning
This separation is central to the architecture:
Python is not the intelligence.
Python is the governed operational substrate.
The paper introduces the concept of a Knowledge-Grounded Runtime Graph.
In this reference implementation, the graph is approximated through the interaction between:
- trace registries
- capability registries
- mutation registries
- lifecycle reviews
- runtime memory notes
Together these artifacts create an inspectable operational lineage layer connecting runtime evolution, governance, and persistent capabilities.
This project should be interpreted as:
- research infrastructure,
- architectural exploration,
- executable reference implementation.
It is not currently intended as:
- a production autonomous runtime,
- an unrestricted self-modifying system,
- a secure deployment platform,
- a complete governance engine.
This repository exposes mechanisms for generated artifact persistence and runtime mutation proposals.
Do not expose these capabilities to untrusted environments without additional controls such as:
- sandboxing
- policy engines
- approval workflows
- network restrictions
- dependency allowlists
- CI validation
- audit logging
- rollback systems
If you use this repository or the associated ideas in academic work, please cite:
@misc{garraldabarrio2026governedevolutionagentruntimes,
title={Governed Evolution of Agent Runtimes through Executable Operational Cognition},
author={Mariano Garralda-Barrio},
year={2026},
eprint={2605.27328},
archivePrefix={arXiv},
primaryClass={cs.SE},
url={https://arxiv.org/abs/2605.27328},
}This repository is released under the Academic Research License v1.0.
The project is provided exclusively for:
- academic research,
- educational use,
- non-commercial experimentation,
- evaluation and reproducibility purposes.
Commercial usage, production deployment, managed services, sublicensing, resale, or integration into commercial systems is prohibited without explicit prior written permission from the author.
See the LICENSE file for the complete license terms.