aganthos · bordeauxred · Mar 31, 2026 · Mar 31, 2026 · gemini-code-assist · Mar 31, 2026
diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,22 @@
+---
+name: Bug Report
+about: Report something that isn't working
+labels: bug
+---
+
+**Describe the bug**
+A clear description of what's going wrong.
+
+**To reproduce**
+Steps to reproduce the behavior:
+1. ...
+2. ...
+
+**Expected behavior**
+What you expected to happen.
+
+**Environment**
+- Python version:
+- ClawLoop version:
+- OS:
+- LLM provider (if relevant):
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,14 @@
+---
+name: Feature Request
+about: Suggest an idea or improvement
+labels: enhancement
+---
+
+**Use case**
+What are you trying to accomplish?
+
+**Proposed solution**
+How do you think this could work?
+
+**Alternatives considered**
+Any other approaches you've thought about.
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
@@ -0,0 +1,8 @@
+## Summary
+
+What changed and why.
+
+## Test plan
+
+- [ ] `pytest tests/ -x` passes
+- [ ] Tested manually (if applicable)
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
@@ -0,0 +1,48 @@
+name: Docs
+
+on:
+  push:
+    branches: [main]
+    paths: [docs/**, mkdocs.yml]
+  pull_request:
+    paths: [docs/**, mkdocs.yml]
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: pages
+  cancel-in-progress: true
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+      - run: pip install mkdocs-material
+      - run: mkdocs build --strict
+
+  deploy:
+    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
+    needs: build
+    runs-on: ubuntu-latest
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+      - run: pip install mkdocs-material
+      - run: mkdocs build --strict
+      - uses: actions/upload-pages-artifact@v3
+        with:
+          path: site/
+      - id: deployment
+        uses: actions/deploy-pages@v4
diff --git a/.gitignore b/.gitignore
@@ -24,3 +24,7 @@ build/
 dist/
 examples/openclaw_runner/node_modules/
 examples/openclaw_runner/package-lock.json
+
+# Runtime artifacts
+playbook.json
+runs/
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -9,15 +9,106 @@ git clone https://github.com/aganthos/clawloop.git
 cd clawloop
 python -m venv .venv && source .venv/bin/activate
 pip install -e ".[dev]"
-python -m pytest tests/
+pytest tests/ -x
 ```
 
-## Guidelines
+## Architecture Overview
+
+ClawLoop has three learning layers that all follow the same protocol:
+
+```
+clawloop/
+  core/         # Types (Episode, Datum, StateID), protocols (Layer, Evolver),
+                #   and the learning loop itself
+  layers/       # The three learning layers: Harness, Router, Weights
+  envs/         # Built-in task environments (math, harbor) — simple, self-contained
+  adapters/     # Connectors for external benchmarks (CAR-bench, CRMArena, OpenClaw)
+                #   that require process orchestration or network calls
+  evolvers/     # Harness optimization backends (LocalEvolver ships by default)
+  backends/     # Weight training backends (SkyRL integration for GRPO/PPO/SFT)
+  extractors/   # Compute reward signals from raw episode traces
+  exporters/    # Send data out: OpenTelemetry spans, SkyRL training format,
+                #   router tuning tuples
+  callbacks/    # Hook into litellm call lifecycle to capture traces
+  utils/        # Small helpers (async bridge)
+```
+
+**Key types:** `Episode`, `EpisodeSummary`, `Datum`, `AgentState`, `StateID`
+
+**Layer Protocol:** Every layer implements `forward_backward()` (accumulate
+updates without mutation) and `optim_step()` (apply atomically, rollback on
+failure). See `clawloop/core/layer.py`.
+
+**Learning loop:** `clawloop/core/loop.py` — collects episodes, distributes
+them as `Datum` objects, runs forward_backward then optim_step on each layer.
+
+## Adding a New Environment
+
+1. Create an adapter in `clawloop/adapters/` implementing `EnvAdapter`
+2. Your `run_episode()` must return an `Episode` with messages, steps, and
+   an `EpisodeSummary` containing reward signals
+3. Register it in `clawloop/train.py` via `ENV_BUILDERS`
+
+Existing adapters to learn from:
+
+- `clawloop/envs/math.py` — minimal example (~80 lines)
+- `clawloop/envs/harbor.py` — sandboxed agent tasks via Docker
+- `clawloop/adapters/car.py` — CAR-bench integration with external process orchestration
+- `clawloop/adapters/entropic.py` — CRMArena A2A benchmark
+
+See [Adding Environments](https://aganthos.github.io/clawloop/adding-environments/)
+for a full walkthrough.
+
+## Testing
+
+```bash
+# Run all tests
+pytest tests/ -x
+
+# Run a specific test file
+pytest tests/test_agent.py -x
+
+# Run a specific test
+pytest tests/test_agent.py::TestClawLoopAgent::test_learn_basic -x
+
+# Run with verbose output
+pytest tests/ -x -v --timeout=30
+```
+
+Tests use `MockLLMClient` from `clawloop/llm.py` — no API keys needed. The
+`tests/conftest.py` has a boundary guard that prevents tests from importing
+private modules.
+
+## Code Style
 
-- Run `pytest tests/ -x` before submitting a PR
 - Follow existing code patterns
-- One commit per logical change: `feat:`, `fix:`, or `chore:` prefix
+- Use type hints on all public functions and methods
+- Add docstrings to public classes and functions
+- Use `from __future__ import annotations` for forward references
+- Use `Protocol` for interfaces, `@dataclass` for value types
+- No linter is enforced yet — just keep it consistent with surrounding code
+
+## Commits
+
+One commit per logical change with a prefix:
+
+- `feat:` new functionality
+- `fix:` bug fix
+- `chore:` maintenance, docs, CI
+
+## Pull Requests
+
+- Run `pytest tests/ -x` before submitting
+- Keep PRs focused — one concern per PR
+- Describe what changed and why in the PR description
+
+## Issues
+
+- **Bug reports:** include steps to reproduce, expected vs actual behavior,
+  and your Python version
+- **Feature requests:** describe the use case, not just the solution
 
 ## License
 
-By contributing, you agree that your contributions will be licensed under the BSL 1.1 license.
+By contributing, you agree that your contributions will be licensed under
+the [BSL 1.1](LICENSE) license.
diff --git a/README.md b/README.md
@@ -136,8 +136,8 @@ ClawLoop uses [litellm](https://docs.litellm.ai/) — any provider works:
 
 ```json
 {"model": "anthropic/claude-haiku-4-5-20251001"}
-{"model": "openai/gpt-4o-mini"}
-{"model": "gemini/gemini-2.0-flash-lite"}
+{"model": "openai/gpt-5-nano"}
+{"model": "gemini/gemini-3.1-flash-lite"}
 ```
 
 Set the provider's API key as an environment variable (`ANTHROPIC_API_KEY`,
@@ -200,6 +200,21 @@ and an `EpisodeSummary` containing reward signals. See `clawloop/envs/math.py`
 
 </details>
 
+## Enterprise
+
+ClawLoop Enterprise adds premium learning backends and managed
+infrastructure on top of the community edition.
+
+- **Premium evolution backends** — broader search over prompts, playbooks,
+  and agent configurations than the community `LocalEvolver`
+- **Persistent playbooks** — versioned storage with rollback so learned
+  strategies survive restarts
+- **Managed training infrastructure** — hosted compute for weight training
+  without self-hosting GPUs
+- **Logging & lineage** — episode archive with provenance tracking
+
+Contact [info@aganthos.com](mailto:info@aganthos.com) to learn more.
+
 ## License
 
 ClawLoop is licensed under the [Business Source License 1.1](LICENSE) with

diff --git a/clawloop/adapters/car.py b/clawloop/adapters/car.py
@@ -35,8 +35,6 @@
 class CARAdapter(EnvAdapter):
     """Adapter for CAR-bench. Runs agentbeats-run per learning iteration."""
 
-    CAR_BENCH_TESTED_COMMIT = "TBD"
-
     def setup(self, config: dict[str, Any]) -> None:
         self._model = config.get("model", "anthropic/claude-haiku-4-5-20251001")
         self._car_bench_path = Path(

diff --git a/clawloop/adapters/tau2.py b/clawloop/adapters/tau2.py
diff --git a/clawloop/cli.py b/clawloop/cli.py
@@ -64,7 +64,6 @@ def _build_parser() -> argparse.ArgumentParser:
 ADAPTER_REGISTRY: dict[str, tuple[str, str]] = {
     "entropic": ("clawloop.adapters.entropic", "EntropicAdapter"),
     "car": ("clawloop.adapters.car", "CARAdapter"),
-    "tau2": ("clawloop.adapters.tau2", "Tau2Adapter"),
 }
 
 
@@ -222,11 +221,6 @@ def cmd_eval(args: argparse.Namespace) -> None:
         "data_setup": None,
         "uv_sync_cmd": ["uv", "sync"],
     },
-    # "tau2": {
-    #     "bench_dir": "benchmarks/tau-bench",
-    #     "data_setup": None,
-    #     "uv_sync_cmd": ["uv", "sync"],
-    # },
 }
 
 

diff --git a/clawloop/core/episode.py b/clawloop/core/episode.py
@@ -221,7 +221,7 @@ class Episode:
     id: str
     state_id: str  # hash of layers used
     task_id: str
-    bench: str  # "entropic" | "car" | "tau2" | ...
+    bench: str  # "entropic" | "car" | ...
     messages: list[Message]
     step_boundaries: list[int]  # indices into messages where each agent turn starts
     steps: list[StepMeta]