Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 22 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
---
name: Bug Report
about: Report something that isn't working
labels: bug
---

**Describe the bug**
A clear description of what's going wrong.

**To reproduce**
Steps to reproduce the behavior:
1. ...
2. ...

**Expected behavior**
What you expected to happen.

**Environment**
- Python version:
- ClawLoop version:
- OS:
- LLM provider (if relevant):
14 changes: 14 additions & 0 deletions .github/ISSUE_TEMPLATE/feature_request.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
---
name: Feature Request
about: Suggest an idea or improvement
labels: enhancement
---

**Use case**
What are you trying to accomplish?

**Proposed solution**
How do you think this could work?

**Alternatives considered**
Any other approaches you've thought about.
8 changes: 8 additions & 0 deletions .github/pull_request_template.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
## Summary

What changed and why.

## Test plan

- [ ] `pytest tests/ -x` passes
- [ ] Tested manually (if applicable)
48 changes: 48 additions & 0 deletions .github/workflows/docs.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
name: Docs

on:
push:
branches: [main]
paths: [docs/**, mkdocs.yml]
pull_request:
paths: [docs/**, mkdocs.yml]

permissions:
contents: read
pages: write
id-token: write

concurrency:
group: pages
cancel-in-progress: true

jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.12"
- run: pip install mkdocs-material
- run: mkdocs build --strict

deploy:
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
needs: build
runs-on: ubuntu-latest
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.12"
- run: pip install mkdocs-material
- run: mkdocs build --strict
- uses: actions/upload-pages-artifact@v3
with:
path: site/
- id: deployment
uses: actions/deploy-pages@v4
4 changes: 4 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -24,3 +24,7 @@ build/
dist/
examples/openclaw_runner/node_modules/
examples/openclaw_runner/package-lock.json

# Runtime artifacts
playbook.json
Comment on lines +28 to +29
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since this PR focuses on pre-launch cleanup and explicitly adds playbook.json as a runtime artifact to .gitignore, it is highly recommended to also include the runs/ directory. The ExperimentLog and EvolutionLog classes default to creating this directory for storing execution traces and logs (e.g., ./runs/<bench>/<timestamp>). Ignoring it prevents local run data from being accidentally committed to the repository.

# Runtime artifacts
playbook.json
runs/

runs/
101 changes: 96 additions & 5 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,15 +9,106 @@ git clone https://github.com/aganthos/clawloop.git
cd clawloop
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
python -m pytest tests/
pytest tests/ -x
```

## Guidelines
## Architecture Overview

ClawLoop has three learning layers that all follow the same protocol:

```
clawloop/
core/ # Types (Episode, Datum, StateID), protocols (Layer, Evolver),
# and the learning loop itself
layers/ # The three learning layers: Harness, Router, Weights
envs/ # Built-in task environments (math, harbor) — simple, self-contained
adapters/ # Connectors for external benchmarks (CAR-bench, CRMArena, OpenClaw)
# that require process orchestration or network calls
evolvers/ # Harness optimization backends (LocalEvolver ships by default)
backends/ # Weight training backends (SkyRL integration for GRPO/PPO/SFT)
extractors/ # Compute reward signals from raw episode traces
exporters/ # Send data out: OpenTelemetry spans, SkyRL training format,
# router tuning tuples
callbacks/ # Hook into litellm call lifecycle to capture traces
utils/ # Small helpers (async bridge)
```

**Key types:** `Episode`, `EpisodeSummary`, `Datum`, `AgentState`, `StateID`

**Layer Protocol:** Every layer implements `forward_backward()` (accumulate
updates without mutation) and `optim_step()` (apply atomically, rollback on
failure). See `clawloop/core/layer.py`.

**Learning loop:** `clawloop/core/loop.py` — collects episodes, distributes
them as `Datum` objects, runs forward_backward then optim_step on each layer.

## Adding a New Environment

1. Create an adapter in `clawloop/adapters/` implementing `EnvAdapter`
2. Your `run_episode()` must return an `Episode` with messages, steps, and
an `EpisodeSummary` containing reward signals
3. Register it in `clawloop/train.py` via `ENV_BUILDERS`

Existing adapters to learn from:

- `clawloop/envs/math.py` — minimal example (~80 lines)
- `clawloop/envs/harbor.py` — sandboxed agent tasks via Docker
- `clawloop/adapters/car.py` — CAR-bench integration with external process orchestration
- `clawloop/adapters/entropic.py` — CRMArena A2A benchmark

See [Adding Environments](https://aganthos.github.io/clawloop/adding-environments/)
for a full walkthrough.

## Testing

```bash
# Run all tests
pytest tests/ -x

# Run a specific test file
pytest tests/test_agent.py -x

# Run a specific test
pytest tests/test_agent.py::TestClawLoopAgent::test_learn_basic -x

# Run with verbose output
pytest tests/ -x -v --timeout=30
```

Tests use `MockLLMClient` from `clawloop/llm.py` — no API keys needed. The
`tests/conftest.py` has a boundary guard that prevents tests from importing
private modules.

## Code Style

- Run `pytest tests/ -x` before submitting a PR
- Follow existing code patterns
- One commit per logical change: `feat:`, `fix:`, or `chore:` prefix
- Use type hints on all public functions and methods
- Add docstrings to public classes and functions
- Use `from __future__ import annotations` for forward references
- Use `Protocol` for interfaces, `@dataclass` for value types
- No linter is enforced yet — just keep it consistent with surrounding code

## Commits

One commit per logical change with a prefix:

- `feat:` new functionality
- `fix:` bug fix
- `chore:` maintenance, docs, CI

## Pull Requests

- Run `pytest tests/ -x` before submitting
- Keep PRs focused — one concern per PR
- Describe what changed and why in the PR description

## Issues

- **Bug reports:** include steps to reproduce, expected vs actual behavior,
and your Python version
- **Feature requests:** describe the use case, not just the solution

## License

By contributing, you agree that your contributions will be licensed under the BSL 1.1 license.
By contributing, you agree that your contributions will be licensed under
the [BSL 1.1](LICENSE) license.
19 changes: 17 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -136,8 +136,8 @@ ClawLoop uses [litellm](https://docs.litellm.ai/) — any provider works:

```json
{"model": "anthropic/claude-haiku-4-5-20251001"}
{"model": "openai/gpt-4o-mini"}
{"model": "gemini/gemini-2.0-flash-lite"}
{"model": "openai/gpt-5-nano"}
{"model": "gemini/gemini-3.1-flash-lite"}
```

Set the provider's API key as an environment variable (`ANTHROPIC_API_KEY`,
Expand Down Expand Up @@ -200,6 +200,21 @@ and an `EpisodeSummary` containing reward signals. See `clawloop/envs/math.py`

</details>

## Enterprise

ClawLoop Enterprise adds premium learning backends and managed
infrastructure on top of the community edition.

- **Premium evolution backends** — broader search over prompts, playbooks,
and agent configurations than the community `LocalEvolver`
- **Persistent playbooks** — versioned storage with rollback so learned
strategies survive restarts
- **Managed training infrastructure** — hosted compute for weight training
without self-hosting GPUs
- **Logging & lineage** — episode archive with provenance tracking

Contact [info@aganthos.com](mailto:info@aganthos.com) to learn more.

## License

ClawLoop is licensed under the [Business Source License 1.1](LICENSE) with
Expand Down
2 changes: 0 additions & 2 deletions clawloop/adapters/car.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,8 +35,6 @@
class CARAdapter(EnvAdapter):
"""Adapter for CAR-bench. Runs agentbeats-run per learning iteration."""

CAR_BENCH_TESTED_COMMIT = "TBD"

def setup(self, config: dict[str, Any]) -> None:
self._model = config.get("model", "anthropic/claude-haiku-4-5-20251001")
self._car_bench_path = Path(
Expand Down
40 changes: 0 additions & 40 deletions clawloop/adapters/tau2.py

This file was deleted.

6 changes: 0 additions & 6 deletions clawloop/cli.py
Original file line number Diff line number Diff line change
Expand Up @@ -64,7 +64,6 @@ def _build_parser() -> argparse.ArgumentParser:
ADAPTER_REGISTRY: dict[str, tuple[str, str]] = {
"entropic": ("clawloop.adapters.entropic", "EntropicAdapter"),
"car": ("clawloop.adapters.car", "CARAdapter"),
"tau2": ("clawloop.adapters.tau2", "Tau2Adapter"),
}


Expand Down Expand Up @@ -222,11 +221,6 @@ def cmd_eval(args: argparse.Namespace) -> None:
"data_setup": None,
"uv_sync_cmd": ["uv", "sync"],
},
# "tau2": {
# "bench_dir": "benchmarks/tau-bench",
# "data_setup": None,
# "uv_sync_cmd": ["uv", "sync"],
# },
}


Expand Down
2 changes: 1 addition & 1 deletion clawloop/core/episode.py
Original file line number Diff line number Diff line change
Expand Up @@ -221,7 +221,7 @@ class Episode:
id: str
state_id: str # hash of layers used
task_id: str
bench: str # "entropic" | "car" | "tau2" | ...
bench: str # "entropic" | "car" | ...
messages: list[Message]
step_boundaries: list[int] # indices into messages where each agent turn starts
steps: list[StepMeta]
Expand Down
Loading
Loading