136 changes: 136 additions & 0 deletions docs/getting-started.md
@@ -0,0 +1,136 @@
# Getting Started

Goal: clone the repo and run a governed agent session end to end. About 30 minutes if you already have the prerequisites installed.

## What you need
- Node.js 18 or newer
- PostgreSQL 15 or newer running locally or on a host you can write to
- Git

## 1. Clone and install (2 minutes)

```
git clone https://github.com/rayyagari2-create/agentic-workforce-framework
cd agentic-workforce-framework
cp .env.example .env
npm install
```

`npm install` resolves the workspace packages under `services/`, `examples/` and `packages/`. There is no build step.

## 2. Configure the database (3 minutes)

`.env.example` ships with five variables:

```
DATABASE_URL=postgres://postgres:password@localhost:5432/awf
AUDIT_SERVICE_URL=http://localhost:3001
AWF_WORKSPACE_ID=workspace-default
AWF_TENANT_ID=awf-demo
AWF_DIVISION_ID=division-default
```

If your local Postgres uses trust auth for your shell user (the Homebrew default on macOS), change the first line to:

```
DATABASE_URL=postgres://localhost:5432/awf
```

Otherwise leave the user and password in and make them match your Postgres setup.
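If you are unsure what a given `DATABASE_URL` actually points at, Node's built-in `URL` class can break it apart. This is a minimal sketch; `describeDatabaseUrl` is a hypothetical helper for illustration and is not part of the repo.

```javascript
// Sketch: inspect a Postgres connection string with Node's built-in URL class.
// describeDatabaseUrl is a hypothetical helper, not something the repo ships.
function describeDatabaseUrl(raw) {
  const url = new URL(raw);
  return {
    user: url.username ? decodeURIComponent(url.username) : null, // null: no user in the URL
    hasPassword: url.password !== '',
    host: url.hostname,
    port: url.port || '5432', // assume the Postgres default when the URL omits a port
    database: url.pathname.replace(/^\//, ''),
  };
}

// The shipped default, with explicit credentials.
console.log(describeDatabaseUrl('postgres://postgres:password@localhost:5432/awf'));
// The trust-auth variant, with no user or password.
console.log(describeDatabaseUrl('postgres://localhost:5432/awf'));
```

The second call reports `user: null` and `hasPassword: false`, which is the shape you want when Postgres authenticates your shell user without a password.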

Create the database:

```
createdb awf
```

Run the schema migrations:

```
npm run demo:setup
```

You should see 8 migrations printed, each prefixed `ok`, ending with `Setup complete. Run: npm run demo`. The setup script applies migrations 001 through 008. Migrations 009 and 010 are present in `database/migrations/` but are not part of the public demo path.

The demo wipes the work queue at the start of every run but never touches `audit.events`, which is append-only by schema. If the hash chain ends up inconsistent across runs (for example after a crashed demo), step 9 of the demo in section 3 will report a chain break. The fix is to drop the database and re-run setup:

```
dropdb awf && createdb awf && npm run demo:setup
```
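For intuition on what a chain break means, here is a minimal sketch of a hash chain and its verification. The event shape, the genesis value and the use of SHA-256 over `JSON.stringify` are assumptions made for illustration; the real `audit.events` layout and hashing scheme may differ.

```javascript
const { createHash } = require('node:crypto');

// Assumed scheme: each row's hash covers the previous row's hash plus its own payload.
const GENESIS = '0'.repeat(64);

function hashEvent(prevHash, payload) {
  return createHash('sha256')
    .update(prevHash)
    .update(JSON.stringify(payload))
    .digest('hex');
}

// Append a row that links back to the current tail of the chain.
function appendEvent(chain, payload) {
  const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS;
  chain.push({ payload, prevHash, hash: hashEvent(prevHash, payload) });
  return chain;
}

// Returns the index of the first broken link, or -1 if the whole chain verifies.
function findChainBreak(chain) {
  let expectedPrev = GENESIS;
  for (let i = 0; i < chain.length; i++) {
    const row = chain[i];
    const linkOk = row.prevHash === expectedPrev;
    const hashOk = row.hash === hashEvent(row.prevHash, row.payload);
    if (!linkOk || !hashOk) return i;
    expectedPrev = row.hash;
  }
  return -1;
}

const chain = [];
appendEvent(chain, { type: 'task.claimed', item: 1 });
appendEvent(chain, { type: 'task.approved', item: 1 });
appendEvent(chain, { type: 'task.completed', item: 1 });
console.log(findChainBreak(chain)); // -1

// Simulate a lost row: the following row no longer links to its predecessor.
chain.splice(1, 1);
console.log(findChainBreak(chain)); // 1
```

Because every row depends on the one before it, a single missing or altered row invalidates everything after it. That is why the recovery path is a fresh database and not a targeted repair.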

## 3. Run the governance demo (5 minutes)

```
npm run demo
```

This is a single Node process that walks the nine Sprint 0 steps against your database. It spawns its own audit service on `127.0.0.1:8787` and sends it `SIGTERM` on exit.

What it does:

1. Loads 5 sample work items from `examples/awf-demo/sample-backlog.json` into `public.work_queue_items`.
2. Classifies them with the label-based rule table in `services/governance/src/classifier.js`. Each item gets a `task_class` and a `risk_level` of `LOW`, `MEDIUM` or `HIGH`. The classifier is single-pass label matching; the richer multi-dimensional Task Risk Profile lives in the control plane and is not part of this demo.
3. Claims the highest-priority eligible item using `SELECT FOR UPDATE SKIP LOCKED`.
4. Because the claimed item is HIGH risk, the approval gate writes an `approval_requests` row and the demo auto-approves it.
5. Routes the task class to a role and pins an agent instance.
6. Runs the `SimulatedRuntimeAdapter`. Every artifact is prefixed `[PREVIEW]`. The real runtime adapters are private.
7. Produces a QA verdict. Simulated runs always collapse to `pass_with_notes`.
8. Computes D1 through D4. D1 (correctness) and D2 (observability) are candidate scorers; D3 (policy) and D4 (recurrence) are deterministic. Against the simulated artifacts a fresh run scores `total=68 -> trust_level=RESTRICTED`.
9. Shells out to `awf audit verify`, which recomputes the hash chain over `audit.events` and prints a per-runtime breakdown. The last line of step 9 reads `VERIFIED`.
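Step 2 can be pictured as a lookup over a small rule table. The sketch below is not the code in `services/governance/src/classifier.js`. The labels, the default rule and every task class other than `payment_integration` are invented for illustration, and "first matching rule wins" is an assumption.

```javascript
// Hypothetical rule table. Order matters: the first rule whose label is present wins.
const RULES = [
  { label: 'payments', taskClass: 'payment_integration', riskLevel: 'HIGH' },
  { label: 'schema', taskClass: 'schema_change', riskLevel: 'MEDIUM' },
  { label: 'docs', taskClass: 'documentation', riskLevel: 'LOW' },
];

// Assumed fallback for items that carry no recognized label.
const DEFAULT_RULE = { taskClass: 'unclassified', riskLevel: 'MEDIUM' };

// Single-pass label matching: walk the rule table once and stop at the first hit.
function classify(item) {
  const labels = new Set(item.labels || []);
  const rule = RULES.find((r) => labels.has(r.label)) || DEFAULT_RULE;
  return { ...item, task_class: rule.taskClass, risk_level: rule.riskLevel };
}

const backlog = [
  { id: 1, title: 'Add card refunds', labels: ['payments', 'docs'] },
  { id: 2, title: 'Fix README typo', labels: ['docs'] },
  { id: 3, title: 'Untriaged idea', labels: [] },
];

for (const item of backlog.map(classify)) {
  console.log(item.id, item.task_class, item.risk_level);
}
```

An item with several labels takes the class of the earliest matching rule, so placing the riskiest rules first keeps a mixed-label item from being under-classified.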

What is real and what is not: the scorer code paths and the audit chain are real. The runtime evidence the scorers consume is produced by the simulated adapter, so the numbers reflect that adapter and not a live agent.

## 4. Run the authorization demo (2 minutes)

```
npm run authorize:blocked
npm run authorize:authorized
npm run authorize:supervised
```

Each command runs `tools/authorize-task/authorize-task.js` against a different task class, risk lane and runtime, and prints a decision: `BLOCKED`, `AUTHORIZED` or `SUPERVISED + controls`.

Without a `--db` flag the tool renders example data and says so: `Source: example data (set AWF_DATABASE_URL for live data)`. To read the trust profile your demo run actually wrote:

```
node tools/authorize-task/authorize-task.js \
  --task-class payment_integration \
  --runtime claude_code \
  --workspace awf-demo \
  --db postgres://localhost:5432/awf
```

The tier you see should match what step 8 of the demo wrote (RESTRICTED for the simulated `payment_integration` run on a fresh DB).
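The fallback between example data and live data can be sketched with `parseArgs` from `node:util` (available from Node 18.3). This is a hypothetical re-creation, not the code in `authorize-task.js`. `resolveSource` is an invented name, and letting `--db` take precedence over `AWF_DATABASE_URL` is an assumption.

```javascript
const { parseArgs } = require('node:util');

// Hypothetical helper: decide where the trust profile should be read from.
// Assumed precedence: --db flag first, then AWF_DATABASE_URL, then example data.
function resolveSource(argv, env = {}) {
  const { values } = parseArgs({
    args: argv,
    options: {
      'task-class': { type: 'string' },
      runtime: { type: 'string' },
      workspace: { type: 'string' },
      db: { type: 'string' },
    },
  });
  const connectionString = values.db || env.AWF_DATABASE_URL || null;
  return {
    source: connectionString ? 'live' : 'example',
    connectionString,
    taskClass: values['task-class'],
    runtime: values.runtime,
    workspace: values.workspace,
  };
}

// No --db flag and no environment variable: example data.
console.log(resolveSource(['--task-class', 'payment_integration', '--runtime', 'claude_code'], {}));

// With --db: live data from the given database.
console.log(resolveSource(
  ['--task-class', 'payment_integration', '--db', 'postgres://localhost:5432/awf'],
  {},
));
```

Passing the environment in as a parameter, and not reading `process.env` inside the function, keeps the decision easy to test.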

## 5. Govern your first real agent session (20 minutes)

The hooks in `hooks/` implement the Claude Code hook protocol. Cursor has no equivalent PreToolUse interception layer today, so this walkthrough uses Claude Code. The sanitized example settings file is `hooks/claude-code-settings.example.json` and the long-form explanation is `hooks/claude-code-settings-README.md`. Read both before merging anything into your real settings.

The example wires three hook points on the `Agent` matcher: `PreToolUse`, `SubagentStart` and `PostToolUse`.

Install the hooks into the repo you want to govern:

```
cd /path/to/your/repo
mkdir -p .claude/hooks/pre-tool-use .claude/hooks/sub-agent-start .claude/hooks/post-tool-use

cp /path/to/agentic-workforce-framework/hooks/pre-tool-use/check-agent-spawn.example.js \
  .claude/hooks/pre-tool-use/check-agent-spawn.js
cp /path/to/agentic-workforce-framework/hooks/sub-agent-start/check-subagent-start.example.js \
  .claude/hooks/sub-agent-start/check-subagent-start.js
cp /path/to/agentic-workforce-framework/hooks/post-tool-use/check-agent-spawn-result.example.js \
  .claude/hooks/post-tool-use/check-agent-spawn-result.js
```

Then merge the contents of `hooks/claude-code-settings.example.json` into your `.claude/settings.json`.

Start a Claude Code session in that repo and ask it to spawn an Agent. The PreToolUse hook fires before the spawn, validates the sidecar manifest against the `ALLOWED_AGENT_ROLES` roster, and either allows the call (`exit 0`) or hard-blocks it (`exit 2`).
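The allow-or-block decision reduces to a roster lookup that maps to an exit code. The sketch below shows only that core. The manifest shape (a single `role` field), the roster contents and the `decide` helper are assumptions for illustration; the shipped `check-agent-spawn.example.js` defines its own.

```javascript
// Hypothetical roster; the real hook defines its own ALLOWED_AGENT_ROLES.
const ALLOWED_AGENT_ROLES = new Set(['implementer', 'reviewer']);

// Map a (hypothetical) sidecar manifest to a hook outcome.
// exitCode 0 lets the tool call proceed; exitCode 2 hard-blocks it.
function decide(manifest, allowedRoles = ALLOWED_AGENT_ROLES) {
  if (!manifest || typeof manifest.role !== 'string') {
    return { exitCode: 2, message: 'blocked: manifest is missing a role' };
  }
  if (!allowedRoles.has(manifest.role)) {
    return { exitCode: 2, message: `blocked: role "${manifest.role}" is not on the roster` };
  }
  return { exitCode: 0, message: `allowed: ${manifest.role}` };
}

// In a real hook script the result would drive the process exit, for example:
//   const { exitCode, message } = decide(manifest);
//   if (exitCode !== 0) console.error(message);
//   process.exit(exitCode);
console.log(decide({ role: 'reviewer' }));
console.log(decide({ role: 'deployer' }));
console.log(decide(null));
```

Failing closed on a missing or malformed manifest matters here: if the hook cannot tell what is being spawned, blocking is the safer default.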

These hooks ship as templates. Paths like `.agent-workspace/bulletin.md` and `.agent-workspace/locks/` in the scripts are placeholders. Point them at real files in your project before relying on the enforcement.

## 6. What to read next

- `docs/task-risk-profile.md` for how the risk classifier works and where it is going
- `docs/d1-d4-scoring.md` for what each dimension measures and the candidate vs deterministic split
- `docs/execution-substrates.md` for where AWF sits relative to Cursor, Claude Code and Codex
- `schemas/v1/` for the JSON schemas of the governance artifacts the demo writes