Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 0 additions & 38 deletions .claude/hooks/session-start-process-overrides.sh

This file was deleted.

5 changes: 0 additions & 5 deletions .claude/settings.json
Original file line number Diff line number Diff line change
Expand Up @@ -49,11 +49,6 @@
{
"type": "command",
"command": ".claude/hooks/session-startup.sh"
},
{
"type": "command",
"command": ".claude/hooks/session-start-process-overrides.sh",
"timeout": 5
}
]
}
Expand Down
Empty file removed .forge/audit/overrides.log
Empty file.
2 changes: 1 addition & 1 deletion CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ Seven subagent definitions in `agents/`: researcher (read-only exploration), arc

### Audit System

Two-dimension model (v4.x). **Dimension A — Native Health** (`score`, 0-10): 5 obligatory items (0-2) + 10 recommended (0-1), normalized as `obligatory*0.7 + recommended*0.3`. Security-critical items (settings.json, block-destructive hook) cap it at 6.0 if missing. Measures good use of native Claude Code (auto-memory as index, permission cascade, attribution, sandbox, deny rules). **Dimension B — dotforge Adoption** (`forge_adoption`, 0-5): behaviors/workflows/override-loop/domain-rules/sync-recency. **Informational — does NOT affect Native Health.** A native-first project scoring B=0 with A=10 is a desirable outcome (see `.claude/rules/domain/native-vs-dotforge-boundary.md`). `audit/checklist.md` + `audit/scoring.md` are the source of truth; registry in `registry/projects.yml` tracks both across managed projects. **Two scoring engines reimplement the checklist independently — `audit/score.sh` (bash, CI gate) and `scripts/audit_all.py` (Python, 12-project re-auditor). Any checklist change must update BOTH plus `audit.yml` and the docs (`README.md`, `docs/usage-guide.md`, `docs/guia-uso.md`); grep all consumers before planning the edit.**
Two-dimension model (v4.x). **Dimension A — Native Health** (`score`, 0-10): 5 obligatory items (0-2) + 10 recommended (0-1), normalized as `obligatory*0.7 + recommended*0.3`. Security-critical items (settings.json, block-destructive hook) cap it at 6.0 if missing. Measures good use of native Claude Code (auto-memory as index, permission cascade, attribution, sandbox, deny rules). **Dimension B — dotforge Adoption** (`forge_adoption`, 0-4): behaviors/workflows/domain-rules/sync-recency. **Informational — does NOT affect Native Health.** A native-first project scoring B=0 with A=10 is a desirable outcome (see `.claude/rules/domain/native-vs-dotforge-boundary.md`). `audit/checklist.md` + `audit/scoring.md` are the source of truth; registry in `registry/projects.yml` tracks both across managed projects. **Two scoring engines reimplement the checklist independently — `audit/score.sh` (bash, CI gate) and `scripts/audit_all.py` (Python, 12-project re-auditor). Any checklist change must update BOTH plus `audit.yml` and the docs (`README.md`, `docs/usage-guide.md`, `docs/guia-uso.md`); grep all consumers before planning the edit.**

### Integrations

Expand Down
12 changes: 5 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,15 +11,14 @@
[![Version](https://img.shields.io/badge/version-4.0.0-blue)](VERSION)
[![Last commit](https://img.shields.io/github/last-commit/luiseiman/dotforge)](https://github.com/luiseiman/dotforge/commits/main)

**Behavior governance for [Claude Code](https://docs.anthropic.com/en/docs/claude-code).** Declare runtime policies on tool calls — "search before writing", "no destructive git", "verify before shipping" — and enforce them via compiled `PreToolUse` hooks that share a session-scoped state file. Escalates silently → nudge → warning → soft_block → hard_block, with a permanent override audit trail.
**Behavior governance for [Claude Code](https://docs.anthropic.com/en/docs/claude-code).** Declare runtime policies on tool calls — "search before writing", "no destructive git", "verify before shipping" — and enforce them via compiled `PreToolUse` hooks that share a session-scoped state file. Escalates silently → nudge → warning → soft_block → hard_block.

```
behaviors/no-destructive-git/behavior.yaml # declarative policy
↓ (compile)
.claude/hooks/generated/*.sh # runtime enforcement
↓ (observe)
.forge/runtime/state.json # counters, flags, per-session
.forge/audit/overrides.log # override audit (git-tracked)
```

Other tools stop at configuration. dotforge governs behavior — and keeps auditing, syncing, and evolving your `.claude/` setup across every repo you manage.
Expand All @@ -34,12 +33,11 @@ For people and teams managing more than one Claude Code project.

## v4.0 — what's new (2026-06-03)

### Override capture loop closes practices↔behaviors (v4.0.0)
### Audit two-dimension model + override loop retired (v4.0.x)

- **`scripts/process-override-log.sh`** — bash script that processes `.forge/audit/overrides.log` and auto-creates `practices/inbox/auto-override-*.md` for behaviors overridden ≥3 times in 30 days. Idempotent. 10/10 tests green. Cost: 0 LLM calls, pure bash.
- **`session-start-process-overrides.sh`** wired in `SessionStart` (template + self-hosting) — auto-captures frequent overrides as practices on every session start.
- **Override capture loop — shipped in v4.0.0, retired after validation.** A portfolio scan found **0 overrides across all 12 projects in ~7 weeks** (production included; only 4/12 adopted behaviors at all). The auditable override trail + capture loop (`process-override-log.sh`, `overrides.log`, SessionStart wiring) were removed as dead weight. The graduated escalation engine (soft_block) — which IS exercised; the happy path is "verify", not "override" — stays. See [`.claude/rules/domain/native-vs-dotforge-boundary.md`](.claude/rules/domain/native-vs-dotforge-boundary.md).
- **`scripts/migrate-v3-to-v4.sh`** — safe migration script with mandatory `--dry-run`, atomic backup, `--rollback`. See [`docs/v4/MIGRATION-V3-TO-V4.md`](docs/v4/MIGRATION-V3-TO-V4.md).
- **Audit two-dimension model** — **Native Health** (0-10: native Claude Code usage + security) + **dotforge Adoption** (0-5: informational, does not affect the score). Behaviors / workflows / override-loop moved to the non-penalizing Adoption dimension — native-first projects no longer lose points for skipping dotforge machinery. New Native-Health items: auto-memory hygiene, permission cascade, attribution.
- **Audit two-dimension model** — **Native Health** (0-10: native Claude Code usage + security) + **dotforge Adoption** (0-4: informational, does not affect the score). Behaviors / workflows moved to the non-penalizing Adoption dimension — native-first projects no longer lose points for skipping dotforge machinery. New Native-Health items: auto-memory hygiene, permission cascade, attribution.
- **`domain/workflow-economics.md`** (new domain rule) — documents v4 PoC cost-quality findings. Decision matrix: when workflow vs skill. Token economy principles. **TL;DR: workflows are 4-25x more expensive than bash skills for recurring work — use only as on-demand escalation, not as default refactor.**
- **`workflows/watch.js`** ships as REFERENCE implementation, NOT promoted to `/forge watch` default. The bash skill remains the production tool.

Expand Down Expand Up @@ -300,7 +298,7 @@ Orchestration follows a decision tree: researcher → architect → implementer
`/forge audit` scores your project's Claude Code configuration on a 10-point scale:

- **5 obligatory items** (scored 0-2): CLAUDE.md, settings.json, rules with globs, block-destructive hook, build/test commands
- **12 recommended items** (scored 0-1): CLAUDE_ERRORS.md, lint hook, custom commands, memory, agents + orchestration rule, .gitignore secrets, prompt injection scan, auto-mode safety, v3 behaviors enforcement, OS-level sandboxing, **v4 workflow availability**, **v4 override capture loop active**
- **12 recommended items** (scored 0-1): CLAUDE_ERRORS.md, lint hook, custom commands, memory, agents + orchestration rule, .gitignore secrets, prompt injection scan, auto-mode safety, v3 behaviors enforcement, OS-level sandboxing, **v4 workflow availability**, domain rules
- **Project tier**: simple/standard/complex adjusts scoring expectations
- **Security cap**: missing settings.json or block-destructive hook caps score at 6.0

Expand Down
16 changes: 5 additions & 11 deletions audit/checklist.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
El audit tiene **dos dimensiones independientes**:

- **A — Salud Nativa** (score 0-10): ¿el proyecto usa bien Claude Code nativo + seguridad? Es el score que importa para cualquier proyecto, use o no la maquinaria dotforge.
- **B — Adopción dotforge** (informativo 0-5): ¿cuánto adoptó la gobernanza dotforge? **NO penaliza** la Salud Nativa. Un proyecto native-first puro saca 0/5 acá sin perder un punto en A.
- **B — Adopción dotforge** (informativo 0-4): ¿cuánto adoptó la gobernanza dotforge? **NO penaliza** la Salud Nativa. Un proyecto native-first puro saca 0/4 acá sin perder un punto en A.

---

Expand Down Expand Up @@ -100,9 +100,9 @@ El audit tiene **dos dimensiones independientes**:

---

# Dimensión B — Adopción dotforge (informativo, 0-5)
# Dimensión B — Adopción dotforge (informativo, 0-4)

**No afecta el score de Salud Nativa.** Mide cuánto adoptó el proyecto la maquinaria de gobernanza dotforge. Reportar como `Adopción: N/5` con label (0=None, 1-2=Partial, 3-4=Most, 5=Full). Sirve para decidir propagación, no para juzgar calidad.
**No afecta el score de Salud Nativa.** Mide cuánto adoptó el proyecto la maquinaria de gobernanza dotforge. Reportar como `Adopción: N/4` con label (0=None, 1-2=Partial, 3=Most, 4=Full). Sirve para decidir propagación, no para juzgar calidad.

### B1. Behaviors v3 compilados y wired
- 0: Sin behaviors enforced — declaración en `behaviors/index.yaml` sola NO cuenta
Expand All @@ -116,19 +116,13 @@ El audit tiene **dos dimensiones independientes**:

**Verificación:** `grep -q "export const meta" workflows/*.js`. Señal de gobernanza, no de calidad — los bash skills siguen siendo el workhorse. Ver `docs/v4/SPEC.md`.

### B3. Override capture loop activo (v4)
- 0: `.forge/audit/overrides.log` no rastreado O `session-start-process-overrides.sh` no wired
- 1: Ambos presentes: log existe Y el hook está en `.claude/settings.json` SessionStart

**Verificación:** `test -f .forge/audit/overrides.log && grep -q "session-start-process-overrides.sh" .claude/settings.json`. Solo significativo si hay behaviors activos. Ver `scripts/process-override-log.sh`.

### B4. Domain rules
### B3. Domain rules
- 0: No hay `.claude/rules/domain/`
- 1: Al menos un domain rule presente y fresco (`last_verified` <90 días)

**Verificación:** Contar archivos en `.claude/rules/domain/`. Reportar cuántos están stale (>90 días). Si hay lógica de negocio pero no domain rules, sugerir `/forge domain extract`.

### B5. Sync recency
### B4. Sync recency
- 0: `dotforge_version` del proyecto desfasado respecto a `VERSION` por ≥1 minor, o desconocido
- 1: Proyecto sincronizado a la versión actual de dotforge (`dotforge_version` == `VERSION`)

Expand Down
40 changes: 15 additions & 25 deletions audit/score.sh
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
# Two-dimension model (v4.x — see audit/checklist.md + audit/scoring.md):
# Dimension A — Native Health: 5 obligatory (0-2) + 10 recommended (0-1).
# score = obl*0.7 + rec*0.3, security cap 6.0. The CI gate.
# Dimension B — dotforge Adoption: 5 items (0-1). Informational, 0-5.
# Dimension B — dotforge Adoption: 4 items (0-1). Informational, 0-4.
# Does NOT affect Native Health.
# Semantic checks (CLAUDE.md quality, rule content) are approximated with heuristics.
# Score is indicative — /forge audit provides authoritative semantic evaluation.
Expand Down Expand Up @@ -44,8 +44,8 @@ cd "$PROJECT_DIR"
s1=0; n1=""; s2=0; n2=""; s3=0; n3=""; s4=0; n4=""; s5=0; n5=""
s6=0; n6=""; s7=0; n7=""; s8=0; n8=""; s9=0; n9=""; s10=0; n10=""
s11=0; n11=""; s12=0; n12=""; s13=0; n13=""; s14=0; n14=""; s15=0; n15=""
# --- Dimension B: scores (b1..b5) and notes (m1..m5) ---
b1=0; m1=""; b2=0; m2=""; b3=0; m3=""; b4=0; m4=""; b5=0; m5=""
# --- Dimension B: scores (b1..b4) and notes (m1..m4) ---
b1=0; m1=""; b2=0; m2=""; b3=0; m3=""; b4=0; m4=""

# ─────────────────────────────────────────────────────────────────────────────
# DIMENSION A — OBLIGATORIO (each 0-2)
Expand Down Expand Up @@ -320,24 +320,16 @@ else
b2=0; m2="No workflows/ with valid meta block"
fi

# B3. Override capture loop active (v4)
if [[ -f ".forge/audit/overrides.log" ]] && [[ -f "$SETTINGS" ]] \
&& grep -q "session-start-process-overrides.sh" "$SETTINGS" 2>/dev/null; then
b3=1; m3="overrides.log present and hook wired in SessionStart"
else
b3=0; m3="Override loop not wired (log + SessionStart hook required)"
fi

# B4. Domain rules present
# B3. Domain rules present
DOM=$(ls .claude/rules/domain/*.md 2>/dev/null | wc -l | tr -d ' ')
if [[ "${DOM:-0}" -gt 0 ]]; then
b4=1; m4="${DOM} domain rule(s) (freshness checked semantically by /forge audit)"
b3=1; m3="${DOM} domain rule(s) (freshness checked semantically by /forge audit)"
else
b4=0; m4="No domain rules in .claude/rules/domain/"
b3=0; m3="No domain rules in .claude/rules/domain/"
fi

# B5. Sync recency — not mechanically determinable standalone (needs registry)
b5=0; m5="Sync recency indeterminate standalone — resolved by /forge audit via registry"
# B4. Sync recency — not mechanically determinable standalone (needs registry)
b4=0; m4="Sync recency indeterminate standalone — resolved by /forge audit via registry"

# ─────────────────────────────────────────────────────────────────────────────
# Calculate scores
Expand All @@ -352,10 +344,10 @@ if [[ $s2 -eq 0 || $s4 -eq 0 ]]; then
NATIVE_HEALTH=$(awk "BEGIN { v=${NATIVE_HEALTH}; printf \"%.2f\", (v > 6.0 ? 6.0 : v) }")
fi

FORGE_ADOPTION=$((b1 + b2 + b3 + b4 + b5))
FORGE_ADOPTION=$((b1 + b2 + b3 + b4))
if [[ $FORGE_ADOPTION -eq 0 ]]; then ADOPTION_LABEL="None"
elif [[ $FORGE_ADOPTION -le 2 ]]; then ADOPTION_LABEL="Partial"
elif [[ $FORGE_ADOPTION -le 4 ]]; then ADOPTION_LABEL="Most"
elif [[ $FORGE_ADOPTION -eq 3 ]]; then ADOPTION_LABEL="Most"
else ADOPTION_LABEL="Full"
fi

Expand Down Expand Up @@ -409,9 +401,8 @@ data = {
"adoption_items": {
"B1_behaviors": {"score": ${b1}, "note": "$(_san "$m1")"},
"B2_workflows": {"score": ${b2}, "note": "$(_san "$m2")"},
"B3_override_loop": {"score": ${b3}, "note": "$(_san "$m3")"},
"B4_domain_rules": {"score": ${b4}, "note": "$(_san "$m4")"},
"B5_sync_recency": {"score": ${b5}, "note": "$(_san "$m5")"}
"B3_domain_rules": {"score": ${b3}, "note": "$(_san "$m3")"},
"B4_sync_recency": {"score": ${b4}, "note": "$(_san "$m4")"}
}
}
print(json.dumps(data, indent=2))
Expand All @@ -421,7 +412,7 @@ else
$SECURITY_CAP && CAP_NOTE=" ⚠ security cap applied (settings.json or block-destructive missing)"
echo "═══ AUDIT SCORE: $(basename "$PROJECT_DIR") ═══"
echo "Native Health: ${NATIVE_HEALTH}/10 (${LEVEL})${CAP_NOTE}"
echo "dotforge Adoption: ${FORGE_ADOPTION}/5 (${ADOPTION_LABEL}) [informational — does not affect Native Health]"
echo "dotforge Adoption: ${FORGE_ADOPTION}/4 (${ADOPTION_LABEL}) [informational — does not affect Native Health]"
echo ""
echo "═ DIMENSION A — NATIVE HEALTH ═"
echo "── OBLIGATORIO (${SCORE_OBL}/10) ──"
Expand All @@ -446,9 +437,8 @@ else
echo "═ DIMENSION B — DOTFORGE ADOPTION (informational) ═"
printf " [%s] B1. v3 behaviors %s\n" "$b1" "$m1"
printf " [%s] B2. Workflow available %s\n" "$b2" "$m2"
printf " [%s] B3. Override loop %s\n" "$b3" "$m3"
printf " [%s] B4. Domain rules %s\n" "$b4" "$m4"
printf " [%s] B5. Sync recency %s\n" "$b5" "$m5"
printf " [%s] B3. Domain rules %s\n" "$b3" "$m3"
printf " [%s] B4. Sync recency %s\n" "$b4" "$m4"
fi

# CI threshold gate (on Native Health)
Expand Down
8 changes: 4 additions & 4 deletions audit/scoring.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Scoring de Auditoría

El audit produce **dos números independientes**: `native_health` (0-10, el score principal) y `forge_adoption` (0-5, informativo).
El audit produce **dos números independientes**: `native_health` (0-10, el score principal) y `forge_adoption` (0-4, informativo).

## Dimensión A — Salud Nativa (score principal)

Expand Down Expand Up @@ -34,7 +34,7 @@ Si alguno de estos items es **0**, `native_health` tiene un cap máximo de **6.0
## Dimensión B — Adopción dotforge (informativo)

```
forge_adoption = sum(items B1-B5) # 0-5
forge_adoption = sum(items B1-B4) # 0-4
```

**No entra en `native_health` ni lo modifica.** Es un indicador de cuánta gobernanza dotforge adoptó el proyecto.
Expand All @@ -43,8 +43,8 @@ forge_adoption = sum(items B1-B5) # 0-5
|----|-------|---------|
| 0 | None | Native-first puro. Válido y sin penalización. |
| 1-2 | Partial | Adopción parcial de la maquinaria. |
| 3-4 | Most | Adopción amplia. |
| 5 | Full | Gobernanza dotforge completa. |
| 3 | Most | Adopción amplia. |
| 4 | Full | Gobernanza dotforge completa. |

Un `forge_adoption: 0` con `native_health: 10` es un resultado **excelente y deseable** bajo el principio native-first (ver `.claude/rules/domain/native-vs-dotforge-boundary.md`). No recomendar adoptar maquinaria dotforge solo para subir B.

Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
#!/usr/bin/env bash
# Override reinvocation: escalate to soft_block, then the exact same tool_input
# comes back → detected as override, passes silently, records audit in three
# places (state.overrides, .forge/audit/overrides.log, pending_block cleared).
# places (state.overrides, pending_block cleared).
set -u
. "$(dirname "$0")/_scenario_helpers.sh"
trap scenario_cleanup EXIT
Expand Down Expand Up @@ -48,10 +48,6 @@ ov_counter=$(jq -r --arg sid "$SCENARIO_SESSION_ID" \
'.sessions[$sid].behaviors["search-first"].overrides[0].counter_at_override' "$FORGE_STATE_FILE")
assert_eq "5" "$ov_counter" "override captured counter=5" || exit 1

# .forge/audit/overrides.log has 1 line
audit_lines=$(wc -l < "${FORGE_ROOT}/audit/overrides.log" | tr -d ' ')
assert_eq "1" "$audit_lines" "audit log has one override entry" || exit 1

# pending_block cleared
pending_after=$(jq -r --arg sid "$SCENARIO_SESSION_ID" \
'.sessions[$sid].behaviors["search-first"].pending_block // "null"' "$FORGE_STATE_FILE")
Expand Down
Loading
Loading