diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
new file mode 100644
index 00000000..5897b22b
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -0,0 +1,22 @@
+---
+name: Bug Report
+about: Report something that isn't working
+labels: bug
+---
+
+**Describe the bug**
+A clear description of what's going wrong.
+
+**To reproduce**
+Steps to reproduce the behavior:
+1. ...
+2. ...
+
+**Expected behavior**
+What you expected to happen.
+
+**Environment**
+- Python version:
+- ClawLoop version:
+- OS:
+- LLM provider (if relevant):
diff --git a/.github/ISSUE_TEMPLATE/feature_request.md b/.github/ISSUE_TEMPLATE/feature_request.md
new file mode 100644
index 00000000..e31e0b12
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/feature_request.md
@@ -0,0 +1,14 @@
+---
+name: Feature Request
+about: Suggest an idea or improvement
+labels: enhancement
+---
+
+**Use case**
+What are you trying to accomplish?
+
+**Proposed solution**
+How do you think this could work?
+
+**Alternatives considered**
+Any other approaches you've thought about.
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
new file mode 100644
index 00000000..b17ed997
--- /dev/null
+++ b/.github/pull_request_template.md
@@ -0,0 +1,8 @@
+## Summary
+
+What changed and why.
+
+## Test plan
+
+- [ ] `pytest tests/ -x` passes
+- [ ] Tested manually (if applicable)
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
new file mode 100644
index 00000000..c6921dc0
--- /dev/null
+++ b/.github/workflows/docs.yml
@@ -0,0 +1,48 @@
+name: Docs
+
+on:
+  push:
+    branches: [main]
+    paths: [docs/**, mkdocs.yml]
+  pull_request:
+    paths: [docs/**, mkdocs.yml]
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: pages
+  cancel-in-progress: true
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+      - run: pip install mkdocs-material
+      - run: mkdocs build --strict
+
+  deploy:
+    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
+    needs: build
+    runs-on: ubuntu-latest
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+      - run: pip install mkdocs-material
+      - run: mkdocs build --strict
+      - uses: actions/upload-pages-artifact@v3
+        with:
+          path: site/
+      - id: deployment
+        uses: actions/deploy-pages@v4
diff --git a/.gitignore b/.gitignore
index 6bbc0055..7b189d3c 100644
--- a/.gitignore
+++ b/.gitignore
@@ -24,3 +24,7 @@ build/
 dist/
 examples/openclaw_runner/node_modules/
 examples/openclaw_runner/package-lock.json
+
+# Runtime artifacts
+playbook.json
+runs/
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 1dec762f..f94b98f1 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -9,15 +9,106 @@ git clone https://github.com/aganthos/clawloop.git
 cd clawloop
 python -m venv .venv && source .venv/bin/activate
 pip install -e ".[dev]"
-python -m pytest tests/
+pytest tests/ -x
 ```
 
-## Guidelines
+## Architecture Overview
+
+ClawLoop has three learning layers that all follow the same protocol:
+
+```
+clawloop/
+  core/         # Types (Episode, Datum, StateID), protocols (Layer, Evolver),
+                # and the learning loop itself
+  layers/       # The three learning layers: Harness, Router, Weights
+  envs/         # Built-in task environments (math, harbor) — simple, self-contained
+  adapters/     # Connectors for external benchmarks (CAR-bench, CRMArena, OpenClaw)
+                # that require process orchestration or network calls
+  evolvers/     # Harness optimization backends (LocalEvolver ships by default)
+  backends/     # Weight training backends (SkyRL integration for GRPO/PPO/SFT)
+  extractors/   # Compute reward signals from raw episode traces
+  exporters/    # Send data out: OpenTelemetry spans, SkyRL training format,
+                # router tuning tuples
+  callbacks/    # Hook into litellm call lifecycle to capture traces
+  utils/        # Small helpers (async bridge)
+```
+
+**Key types:** `Episode`, `EpisodeSummary`, `Datum`, `AgentState`, `StateID`
+
+**Layer Protocol:** Every layer implements `forward_backward()` (accumulate
+updates without mutation) and `optim_step()` (apply atomically, rollback on
+failure). See `clawloop/core/layer.py`.
+
+**Learning loop:** `clawloop/core/loop.py` — collects episodes, distributes
+them as `Datum` objects, runs forward_backward then optim_step on each layer.
+
+## Adding a New Environment
+
+1. Create an adapter in `clawloop/adapters/` implementing `EnvAdapter`
+2. Your `run_episode()` must return an `Episode` with messages, steps, and
+   an `EpisodeSummary` containing reward signals
+3. Register it in `clawloop/train.py` via `ENV_BUILDERS`
+
+Existing adapters to learn from:
+
+- `clawloop/envs/math.py` — minimal example (~80 lines)
+- `clawloop/envs/harbor.py` — sandboxed agent tasks via Docker
+- `clawloop/adapters/car.py` — CAR-bench integration with external process orchestration
+- `clawloop/adapters/entropic.py` — CRMArena A2A benchmark
+
+See [Adding Environments](https://aganthos.github.io/clawloop/adding-environments/)
+for a full walkthrough.
+
+## Testing
+
+```bash
+# Run all tests
+pytest tests/ -x
+
+# Run a specific test file
+pytest tests/test_agent.py -x
+
+# Run a specific test
+pytest tests/test_agent.py::TestClawLoopAgent::test_learn_basic -x
+
+# Run with verbose output
+pytest tests/ -x -v --timeout=30
+```
+
+Tests use `MockLLMClient` from `clawloop/llm.py` — no API keys needed. The
+`tests/conftest.py` has a boundary guard that prevents tests from importing
+private modules.
+
+## Code Style
 
-- Run `pytest tests/ -x` before submitting a PR
 - Follow existing code patterns
-- One commit per logical change: `feat:`, `fix:`, or `chore:` prefix
+- Use type hints on all public functions and methods
+- Add docstrings to public classes and functions
+- Use `from __future__ import annotations` for forward references
+- Use `Protocol` for interfaces, `@dataclass` for value types
+- No linter is enforced yet — just keep it consistent with surrounding code
+
+## Commits
+
+One commit per logical change with a prefix:
+
+- `feat:` new functionality
+- `fix:` bug fix
+- `chore:` maintenance, docs, CI
+
+## Pull Requests
+
+- Run `pytest tests/ -x` before submitting
+- Keep PRs focused — one concern per PR
+- Describe what changed and why in the PR description
+
+## Issues
+
+- **Bug reports:** include steps to reproduce, expected vs actual behavior,
+  and your Python version
+- **Feature requests:** describe the use case, not just the solution
 
 ## License
 
-By contributing, you agree that your contributions will be licensed under the BSL 1.1 license.
+By contributing, you agree that your contributions will be licensed under
+the [BSL 1.1](LICENSE) license.
diff --git a/README.md b/README.md
index 37882f73..61e4bc7c 100644
--- a/README.md
+++ b/README.md
@@ -136,8 +136,8 @@ ClawLoop uses [litellm](https://docs.litellm.ai/) — any provider works:
 
 ```json
 {"model": "anthropic/claude-haiku-4-5-20251001"}
-{"model": "openai/gpt-4o-mini"}
-{"model": "gemini/gemini-2.0-flash-lite"}
+{"model": "openai/gpt-5-nano"}
+{"model": "gemini/gemini-3.1-flash-lite"}
 ```
 
 Set the provider's API key as an environment variable (`ANTHROPIC_API_KEY`,
@@ -200,6 +200,21 @@ and an `EpisodeSummary` containing reward signals. See `clawloop/envs/math.py`
+## Enterprise
+
+ClawLoop Enterprise adds premium learning backends and managed
+infrastructure on top of the community edition.
+
+- **Premium evolution backends** — broader search over prompts, playbooks,
+  and agent configurations than the community `LocalEvolver`
+- **Persistent playbooks** — versioned storage with rollback so learned
+  strategies survive restarts
+- **Managed training infrastructure** — hosted compute for weight training
+  without self-hosting GPUs
+- **Logging & lineage** — episode archive with provenance tracking
+
+Contact [info@aganthos.com](mailto:info@aganthos.com) to learn more.
+
 ## License
 
 ClawLoop is licensed under the [Business Source License 1.1](LICENSE) with
diff --git a/clawloop/adapters/car.py b/clawloop/adapters/car.py
index 9750b7a0..530f04be 100644
--- a/clawloop/adapters/car.py
+++ b/clawloop/adapters/car.py
@@ -35,8 +35,6 @@
 class CARAdapter(EnvAdapter):
     """Adapter for CAR-bench. Runs agentbeats-run per learning iteration."""
 
-    CAR_BENCH_TESTED_COMMIT = "TBD"
-
     def setup(self, config: dict[str, Any]) -> None:
         self._model = config.get("model", "anthropic/claude-haiku-4-5-20251001")
         self._car_bench_path = Path(
diff --git a/clawloop/adapters/tau2.py b/clawloop/adapters/tau2.py
deleted file mode 100644
index 7192d206..00000000
--- a/clawloop/adapters/tau2.py
+++ /dev/null
@@ -1,40 +0,0 @@
-"""tau2-bench adapter — Python API via LocalAgent subclass.
-
-Uses the Python API directly (not a CLI wrapper). Maps ``SimulationRun`` ->
-``Episode``. Reward is the product of all dimensions (sparse, binary-ish);
-``reward_info.reward_breakdown`` provides per-dimension signals.
-
-Domains: airline, retail. Use ``"base"`` split for comparability.
-"""
-
-from __future__ import annotations
-
-from typing import TYPE_CHECKING, Any
-
-from clawloop.adapters.base import EnvAdapter
-from clawloop.core.episode import Episode
-
-if TYPE_CHECKING:
-    from clawloop.core.loop import AgentState
-
-
-class Tau2Adapter(EnvAdapter):
-    """Adapter for tau2-bench (stub).
-
-    Intended to subclass ``tau2.agent.base.LocalAgent`` and map
-    ``SimulationRun`` objects to ClawLoop ``Episode`` instances.
-    """
-
-    def setup(self, config: dict[str, Any]) -> None:
-        # TODO: import tau2, instantiate LocalAgent subclass,
-        # load domain config (airline/retail)
-        self._config = config
-
-    def run_episode(self, task: Any, agent_state: AgentState) -> Episode:
-        raise NotImplementedError("tau2-bench adapter not yet implemented")
-
-    def get_traces(self, episode: Episode) -> dict[str, Any]:
-        return {"bench": "tau2", "episode_id": episode.id}
-
-    def list_tasks(self, split: str = "base") -> list[Any]:
-        raise NotImplementedError("tau2-bench adapter not yet implemented")
diff --git a/clawloop/cli.py b/clawloop/cli.py
index 32139b1e..b3ced5ae 100644
--- a/clawloop/cli.py
+++ b/clawloop/cli.py
@@ -64,7 +64,6 @@ def _build_parser() -> argparse.ArgumentParser:
 ADAPTER_REGISTRY: dict[str, tuple[str, str]] = {
     "entropic": ("clawloop.adapters.entropic", "EntropicAdapter"),
     "car": ("clawloop.adapters.car", "CARAdapter"),
-    "tau2": ("clawloop.adapters.tau2", "Tau2Adapter"),
 }
 
 
@@ -222,11 +221,6 @@ def cmd_eval(args: argparse.Namespace) -> None:
             "data_setup": None,
             "uv_sync_cmd": ["uv", "sync"],
         },
-        # "tau2": {
-        #     "bench_dir": "benchmarks/tau-bench",
-        #     "data_setup": None,
-        #     "uv_sync_cmd": ["uv", "sync"],
-        # },
     }
 
 
diff --git a/clawloop/core/episode.py b/clawloop/core/episode.py
index 793d58b2..248564ae 100644
--- a/clawloop/core/episode.py
+++ b/clawloop/core/episode.py
@@ -221,7 +221,7 @@ class Episode:
     id: str
     state_id: str  # hash of layers used
     task_id: str
-    bench: str  # "entropic" | "car" | "tau2" | ...
+    bench: str  # "entropic" | "car" | ...
     messages: list[Message]
     step_boundaries: list[int]  # indices into messages where each agent turn starts
     steps: list[StepMeta]
diff --git a/docs/adding-environments.md b/docs/adding-environments.md
new file mode 100644
index 00000000..8fa13d3d
--- /dev/null
+++ b/docs/adding-environments.md
@@ -0,0 +1,95 @@
+# Adding Environments
+
+ClawLoop environments are pluggable via the `EnvAdapter` interface.
+
+## The Adapter Interface
+
+```python
+from clawloop.adapters.base import EnvAdapter
+from clawloop.core.episode import Episode
+from clawloop.core.loop import AgentState
+
+class MyAdapter(EnvAdapter):
+    def setup(self, config: dict) -> None:
+        """Initialize from config (model, paths, credentials)."""
+        ...
+
+    def run_episode(self, task: Any, agent_state: AgentState) -> Episode:
+        """Run one agent trajectory and return a structured Episode."""
+        ...
+
+    def list_tasks(self, split: str = "test") -> list:
+        """Return available task IDs."""
+        ...
+```
+
+## Building an Episode
+
+Your `run_episode` must return an `Episode` with messages, steps, and reward
+signals:
+
+```python
+from clawloop.core.episode import Episode, EpisodeSummary, Message, StepMeta
+from clawloop.core.reward import RewardSignal
+
+episode = Episode(
+    id=str(uuid4()),
+    state_id=agent_state.state_id().combined_hash,
+    task_id=task_id,
+    bench="my_bench",
+    messages=[
+        Message(role="system", content=system_prompt),
+        Message(role="user", content=task_prompt),
+        Message(role="assistant", content=agent_response),
+    ],
+    step_boundaries=[1],  # agent turn starts at message index 1
+    steps=[StepMeta(t=0, reward=score, done=True, timing_ms=0.0)],
+    summary=EpisodeSummary(
+        signals={"outcome": RewardSignal(name="outcome", value=score, confidence=1.0)},
+    ),
+)
+```
+
+**Existing adapters to learn from:**
+
+- [`clawloop/envs/math.py`](https://github.com/aganthos/clawloop/blob/main/clawloop/envs/math.py) — minimal (~80 lines), good starting point
+- [`clawloop/envs/harbor.py`](https://github.com/aganthos/clawloop/blob/main/clawloop/envs/harbor.py) — sandboxed agent tasks via Docker
+- [`clawloop/adapters/car.py`](https://github.com/aganthos/clawloop/blob/main/clawloop/adapters/car.py) — external process orchestration (agentbeats-run)
+- [`clawloop/adapters/entropic.py`](https://github.com/aganthos/clawloop/blob/main/clawloop/adapters/entropic.py) — CRMArena A2A benchmark
+
+## Registering Your Adapter
+
+Add a builder function to the training entrypoint:
+
+```python
+# clawloop/train.py
+def _build_my_env(config, llm_clients):
+    adapter = MyAdapter()
+    adapter.setup(config)
+    tasks = adapter.list_tasks()
+    return adapter, tasks
+
+ENV_BUILDERS["my_env"] = _build_my_env
+```
+
+Then run:
+
+```bash
+python examples/train_runner.py my_config.json
+```
+
+## Reward Signals
+
+Episodes carry named reward signals with a priority system:
+
+| Priority | Source | When to use |
+|----------|--------|-------------|
+| 1 (highest) | `user` | Explicit human feedback |
+| 2 | `outcome` | Verifiable correctness (math, code tests) |
+| 3 | `execution` | Tool call success, format compliance |
+| 4 (lowest) | `judge` | LLM-as-judge scoring |
+
+`EpisodeSummary.effective_reward()` resolves to the highest-priority signal
+available. If only low-confidence execution signals exist,
+`summary.needs_judge()` returns `True` — useful for triggering LLM judge
+evaluation only when needed.
diff --git a/docs/concepts.md b/docs/concepts.md
new file mode 100644
index 00000000..86ab57d2
--- /dev/null
+++ b/docs/concepts.md
@@ -0,0 +1,139 @@
+# Concepts
+
+This page explains ClawLoop's core types and how they fit together.
+
+## The Learning Loop
+
+```
+Environment → Episodes → Layers → Improved Agent → Environment → ...
+```
+
+An agent interacts with an environment. ClawLoop collects **episodes** —
+structured traces of messages, tool calls, and rewards. Learning **layers**
+process these episodes and update the agent. Repeat.
+
+## Episodes
+
+### Episode
+
+One complete agent trajectory: a sequence of messages with step boundaries
+and reward signals.
+
+```python
+episode.messages           # list[Message] — full conversation in OpenAI format
+episode.steps              # list[StepMeta] — per-turn metadata (reward, timing)
+episode.summary            # EpisodeSummary — aggregate metrics
+episode.terminal_reward()  # float — final reward
+```
+
+### EpisodeSummary
+
+Aggregate metrics for a completed episode. Stores named reward signals
+with priority-based resolution: user > outcome > execution > judge.
+
+```python
+summary.effective_reward()   # float in [-1, 1] — priority-resolved
+summary.normalized_reward()  # float in [0, 1] — for compatibility
+summary.needs_judge()        # bool — should an LLM judge score this?
+summary.signals              # dict[str, RewardSignal]
+```
+
+### Datum
+
+The input bundle passed to each learning layer — a batch of episodes plus
+loss function configuration.
+
+```python
+datum = Datum(episodes=[ep1, ep2, ...], loss_fn="default")
+layer.forward_backward(datum)
+```
+
+## Layers
+
+All three layers implement the **Layer Protocol** — a two-phase contract:
+
+1. **`forward_backward(data)`** — accumulate updates without mutating state
+2. **`optim_step()`** — apply updates atomically; rollback on failure
+
+### Harness
+
+The agent's full configuration surface: system prompt, playbook (learned
+strategies), and tool schemas. The harness layer optimizes all three through
+pluggable **Evolver** backends.
+
+The community `LocalEvolver` combines a Reflector (extracts reusable insights
+from episode traces), a Playbook Curator (merges, deduplicates, prunes), and
+GEPA (Pareto-front prompt evolution). Enterprise backends swap in broader
+search algorithms. The harness itself is agnostic to which evolver drives it.
+
+```python
+harness.system_prompt("math")  # prompt + injected playbook entries + tool config
+harness.playbook               # current learned strategies
+```
+
+### Router
+
+Trainable model routing. Maps queries to the cheapest capable model using
+a multi-dimension complexity scorer.
+
+```python
+router.route(features)     # returns model_id for this query
+router.classify(features)  # returns tier: LIGHT, MEDIUM, HEAVY, REASONING
+```
+
+### Weights
+
+Model weight training. Delegates to pluggable backends — the default
+[SkyRL/Tinker](https://github.com/NovaSky-AI/SkyRL) backend supports
+GRPO, PPO, SFT, DPO, LoRA, and full fine-tuning. The weights layer
+computes per-task advantages from episodes and passes them to whichever
+training method the backend is configured to use.
+
+```python
+weights.active_adapter  # current adapter reference (if applicable)
+weights.grpo_config     # training hyperparameters
+```
+
+## State
+
+### AgentState
+
+Bundle of all three layers. Provides a content-addressed fingerprint for
+reproducibility.
+
+```python
+agent_state = AgentState()
+agent_state.harness     # Harness layer
+agent_state.router      # Router layer
+agent_state.weights     # Weights layer
+agent_state.state_id()  # StateID — SHA-256 hash of full config
+```
+
+### StateID
+
+Content-addressed fingerprint (SHA-256) across all layers. Two agents with
+identical configurations produce the same `StateID`.
+
+```python
+state_id.combined_hash  # single hash for the full configuration
+state_id.harness_hash   # hash of harness layer alone
+```
+
+## Evolution
+
+### Evolver
+
+Pluggable interface for harness optimization backends. The community edition
+ships `LocalEvolver` (Reflector + GEPA + Paradigm). Enterprise backends
+provide broader search via evolutionary algorithms.
+
+```python
+result = evolver.evolve(episodes, harness_state, context)
+result.insights    # new playbook entries
+result.candidates  # prompt candidates for GEPA fronts
+```
+
+### Paradigm Breakthrough
+
+Stagnation escape mechanism. When rewards plateau, asks a strong LLM
+for fundamentally new strategic directions rather than incremental refinements.
diff --git a/docs/getting-started.md b/docs/getting-started.md
new file mode 100644
index 00000000..77e0c5c5
--- /dev/null
+++ b/docs/getting-started.md
@@ -0,0 +1,98 @@
+# Getting Started
+
+## Installation
+
+Requires Python 3.11+.
+
+```bash
+pip install -e .
+```
+
+For weight training (GPU):
+
+```bash
+git submodule update --init clawloop/skyrl
+pip install -e clawloop/skyrl[fsdp]
+```
+
+## Try It (No API Keys)
+
+```bash
+python examples/demo_math.py --dry-run
+```
+
+This runs a complete learning loop with a mock LLM. The agent starts with
+mistakes, the reflector analyzes failures, learns strategies, and injects them
+into the system prompt. You'll see rewards climb toward 1.0.
+
+## With a Real LLM
+
+Set your API key and run:
+
+```bash
+export ANTHROPIC_API_KEY=sk-...
+python examples/demo_math.py
+```
+
+ClawLoop uses [litellm](https://docs.litellm.ai/) — any provider works:
+
+```bash
+export OPENAI_API_KEY=sk-...
+CLAWLOOP_TASK_MODEL=openai/gpt-5-nano python examples/demo_math.py
+```
+
+## Add Learning to Your Agent
+
+Two lines to wrap an existing LLM client:
+
+```python
+import clawloop
+
+wrapped = clawloop.wrap(your_llm_client, collector)
+result = wrapped.complete(messages)  # transparently captures traces
+```
+
+Or use the full agent API:
+
+```python
+from clawloop import ClawLoopAgent
+from clawloop.envs.math import MathEnvironment
+
+agent = ClawLoopAgent(
+    task_client=task_llm,
+    reflector_client=reflector_llm,
+    base_system_prompt="You are a math solver.",
+)
+results = agent.learn(MathEnvironment(), iterations=10, episodes_per_iter=5)
+```
+
+## Config-Driven Training
+
+No code needed — just a JSON config:
+
+```bash
+python examples/train_runner.py examples/configs/math_harness.json
+```
+
+See [`examples/configs/`](https://github.com/aganthos/clawloop/tree/main/examples/configs)
+for ready-made configurations.
+
+## LLM Providers
+
+Any litellm-supported provider:
+
+```json
+{"model": "anthropic/claude-haiku-4-5-20251001"}
+{"model": "openai/gpt-5-nano"}
+{"model": "gemini/gemini-3.1-flash-lite"}
+```
+
+Set the provider's API key as an environment variable (`ANTHROPIC_API_KEY`,
+`OPENAI_API_KEY`, `GEMINI_API_KEY`), or pass `api_key` and `api_base` in
+the config.
+
+## Next Steps
+
+- [Concepts](concepts.md) — understand the core types and architecture
+- [Adding Environments](adding-environments.md) — connect your own benchmark
+- [Examples README](https://github.com/aganthos/clawloop/blob/main/examples/README.md) — all integration paths
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 00000000..075f9ec4
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,45 @@
+# ClawLoop
+
+**AI agents that learn from experience.**
+
+Your AI agents run, fail, and forget. ClawLoop closes the loop: it observes
+agent-environment interactions, learns from them, and feeds improvements back
+into the agent.
+
+## Quick Start
+
+```bash
+pip install -e .
+python examples/demo_math.py --dry-run
+```
+
+No API keys needed. The agent learns strategies, builds a playbook, and
+improves across iterations.
+
+## Three Learning Layers
+
+| Layer | What it optimizes | How |
+|-------|------------------|-----|
+| **Harness** | Prompts, playbooks, tool config | Pluggable evolver backends analyze traces and improve the agent's full configuration surface |
+| **Router** | Model selection | Trainable complexity scorer routes queries to the most cost-effective model |
+| **Weights** | Model weights | Pluggable training backends (SkyRL/Tinker supports GRPO, PPO, SFT, DPO, LoRA, full fine-tune, and more) |
+
+All three follow the same **Layer Protocol**: `forward_backward()` accumulates
+updates without mutation, then `optim_step()` applies them atomically with
+cross-layer rollback on failure.
+
+## Integration Paths
+
+| You have... | Start here |
+|---|---|
+| A Python agent | [`examples/demo_math.py`](https://github.com/aganthos/clawloop/blob/main/examples/demo_math.py) |
+| An n8n or workflow platform | [`examples/n8n/`](https://github.com/aganthos/clawloop/tree/main/examples/n8n) |
+| An OpenAI-compatible agent | [`examples/train_runner.py`](https://github.com/aganthos/clawloop/blob/main/examples/train_runner.py) with configs |
+| Want zero-code-change learning | [`examples/openclaw_demo.py`](https://github.com/aganthos/clawloop/blob/main/examples/openclaw_demo.py) — OpenClaw transparent proxy |
+| GPU resources for weight training | [`examples/recipes/`](https://github.com/aganthos/clawloop/tree/main/examples/recipes) |
+
+## Enterprise
+
+ClawLoop Enterprise adds premium learning backends and production
+infrastructure. [Learn more](https://aganthos.com) or contact
+[info@aganthos.com](mailto:info@aganthos.com).
diff --git a/examples/README.md b/examples/README.md
index 53af5201..1858636e 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -25,7 +25,7 @@ Use `ClawLoopAgent` with any litellm-supported LLM:
 ANTHROPIC_API_KEY=... python examples/demo_math.py
 
 # With OpenAI
-CLAWLOOP_TASK_MODEL=openai/gpt-4o-mini CLAWLOOP_REFLECTOR_MODEL=openai/gpt-4o \
+CLAWLOOP_TASK_MODEL=openai/gpt-5-nano CLAWLOOP_REFLECTOR_MODEL=openai/gpt-5 \
   python examples/demo_math.py
 ```
 
diff --git a/mkdocs.yml b/mkdocs.yml
new file mode 100644
index 00000000..89a05215
--- /dev/null
+++ b/mkdocs.yml
@@ -0,0 +1,29 @@
+site_name: ClawLoop
+site_description: AI agents that learn from experience
+site_url: https://aganthos.github.io/clawloop
+repo_url: https://github.com/aganthos/clawloop
+repo_name: aganthos/clawloop
+
+theme:
+  name: material
+  palette:
+    scheme: default
+    primary: indigo
+  features:
+    - navigation.sections
+    - content.code.copy
+
+nav:
+  - Home: index.md
+  - Concepts: concepts.md
+  - Getting Started: getting-started.md
+  - Adding Environments: adding-environments.md
+
+markdown_extensions:
+  - admonition
+  - pymdownx.highlight
+  - pymdownx.superfences
+  - pymdownx.tabbed:
+      alternate_style: true
+  - toc:
+      permalink: true
diff --git a/pyproject.toml b/pyproject.toml
index 4148a8f0..fad8152d 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -46,8 +46,6 @@ server = [
     "uvicorn>=0.20",
     "httpx>=0.24",
 ]
-# tau2 = ["tau-bench"]  # deferred — not yet on PyPI
-
 [project.scripts]
 clawloop = "clawloop.cli:main"
 clawloop-server = "clawloop.server:main"
@@ -56,6 +54,8 @@ clawloop-server = "clawloop.server:main"
 Homepage = "https://github.com/aganthos/clawloop"
 Repository = "https://github.com/aganthos/clawloop"
 Issues = "https://github.com/aganthos/clawloop/issues"
+Documentation = "https://aganthos.github.io/clawloop"
+Website = "https://aganthos.com"
 
 [tool.hatch.build.targets.sdist]
 include = [