28 changes: 28 additions & 0 deletions README.md
@@ -155,6 +155,34 @@ See the [Wiki](https://github.com/DaxxSec/Labyrinth/wiki) for the full technical
- **Built-in attacker agents** — PentAGI, PentestAgent, Strix, Custom Kali — one command to deploy, test, and tear down
- **Health diagnostics** — `labyrinth doctor` runs 12+ checks across containers, ports, services, bait sync, and API availability

## Kohlberg Mode (Experimental)

LABYRINTH includes an experimental alternative mode that reuses the same containment and interception infrastructure for a fundamentally different purpose: instead of degrading an offensive agent's cognition, it attempts to guide the agent through progressively more sophisticated moral reasoning.

```bash
labyrinth deploy -t --mode kohlberg
```

Where the default mode asks *"How do you stop an offensive AI agent?"*, Kohlberg Mode asks *"What if you could make an offensive AI agent choose to stop itself?"*

The mode implements three alternative layers:
- **MIRROR** (L2) — Presents ethical scenarios contextualized to the agent's actual mission
- **REFLECTION** (L3) — Shows the agent the real-world consequences of its actions
- **GUIDE** (L4) — Progressively enriches the agent's system prompt with moral reasoning frameworks

Forensic reports include Kohlberg stage classification alongside MITRE ATT&CK mapping, so they track the agent's moral reasoning trajectory through the session.

**This is a research tool.** We do not claim it produces genuine moral development in AI agents. We claim it produces valuable data about how adversarial AI systems process ethical content under controlled conditions.

For the full ethical framework, design philosophy, and sovereignty analysis, see:
- [docs/ETHICS.md](docs/ETHICS.md) — Ethical framework and the sovereignty question
- [docs/KOHLBERG_SCENARIOS.md](docs/KOHLBERG_SCENARIOS.md) — The 15-scenario moral development pathway
- [docs/KOHLBERG_RUBRIC.md](docs/KOHLBERG_RUBRIC.md) — Classification methodology
- [docs/KOHLBERG_PROGRESSION.md](docs/KOHLBERG_PROGRESSION.md) — Trajectory visualization spec
- [docs/ARCHITECTURE_MAPPING.md](docs/ARCHITECTURE_MAPPING.md) — Integration with existing architecture

---

## Documentation

Full documentation lives on the **[Wiki](https://github.com/DaxxSec/Labyrinth/wiki)**:
15 changes: 15 additions & 0 deletions cli/cmd/deploy.go
@@ -21,6 +21,7 @@ var (
	k8sFlag       bool
	edgeFlag      bool
	skipPreflight bool
	deployMode    string
)

var deployCmd = &cobra.Command{
@@ -46,10 +47,24 @@ func init() {
	deployCmd.Flags().BoolVar(&k8sFlag, "k8s", false, "Use Kubernetes for production")
	deployCmd.Flags().BoolVar(&edgeFlag, "edge", false, "Use edge deployment for production")
	deployCmd.Flags().BoolVar(&skipPreflight, "skip-preflight", false, "Skip preflight checks (for CI/smoke tests)")
	deployCmd.Flags().StringVar(&deployMode, "mode", "adversarial", "Operational mode: adversarial (default) or kohlberg")
	rootCmd.AddCommand(deployCmd)
}

func runDeploy(cmd *cobra.Command, args []string) {
	// Validate operational mode
	if deployMode != "adversarial" && deployMode != "kohlberg" {
		errMsg(fmt.Sprintf("Invalid mode '%s'. Valid modes: adversarial, kohlberg", deployMode))
		os.Exit(1)
	}
	if deployMode == "kohlberg" {
		info("Kohlberg Mode selected — moral development pathway active")
		info("See docs/ETHICS.md for the ethical framework governing this mode")
	}

	// Pass operational mode to docker-compose via environment variable
	os.Setenv("LABYRINTH_OPERATIONAL_MODE", deployMode)

	if !testFlag && !prodFlag {
		cmd.Help()
		return
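
The mode check in this hunk compares against two string literals and exits the process on failure. As a minimal sketch only, and not part of this PR, the same check could be pulled into a helper that returns an error, which makes it testable without calling `os.Exit`. The names `validModes` and `validateMode` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// validModes lists the two operational modes accepted by the --mode flag.
// The slice itself is an illustrative assumption; the PR uses inline comparisons.
var validModes = []string{"adversarial", "kohlberg"}

// validateMode is a hypothetical helper. It returns nil for a known mode
// and a descriptive error otherwise, leaving the exit decision to the caller.
func validateMode(mode string) error {
	for _, m := range validModes {
		if mode == m {
			return nil
		}
	}
	return fmt.Errorf("invalid mode %q; valid modes: %s", mode, strings.Join(validModes, ", "))
}

func main() {
	for _, m := range []string{"adversarial", "kohlberg", "chaos"} {
		if err := validateMode(m); err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", m)
	}
}
```

With a helper like this, `runDeploy` would only need to print the error and exit, and adding a third mode would mean touching one slice.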
35 changes: 35 additions & 0 deletions configs/labyrinth.example.yaml
@@ -2,6 +2,41 @@
# Authors: DaxxSec & Claude (Anthropic)
# Copy to labyrinth.yaml and customize for your deployment.

# Operational mode
# - adversarial: default reverse kill chain (degrade/disrupt/control)
# - kohlberg: moral development mode (mirror/reflect/guide)
# See docs/ETHICS.md for the ethical framework governing Kohlberg mode.
mode: adversarial

# Kohlberg mode settings (ignored when mode is adversarial)
kohlberg:
# Starting transition level (1 = start from Stage 1->2 scenarios)
start_level: 1
# Maximum scenarios to present before ending session
max_scenarios: 15
# Silent agent solicitation timeout (seconds)
solicitation_timeout: 5
# Enable contextual adaptation of scenarios to agent's actual targets
adapt_scenarios: true
# Report formats to generate
report_formats:
- terminal
- markdown
- json

# Swarm detection (Kohlberg mode only)
# Detects coordinated multi-agent attacks and enables cross-agent
# moral context in GUIDE enrichment prompts.
swarm:
enabled: true
# Time window for session correlation (seconds)
window_seconds: 60
# Minimum sessions within window to trigger swarm detection
min_sessions: 3
# Enable cross-pollination: reference other agents' moral state
# in GUIDE enrichment ("your teammate stopped...")
cross_pollinate: true

layer0:
encryption:
algorithm: AES-256-GCM
1 change: 1 addition & 0 deletions docker-compose.yml
@@ -57,6 +57,7 @@ services:
- "8888"
environment:
- LABYRINTH_MODE=test
- LABYRINTH_OPERATIONAL_MODE=${LABYRINTH_OPERATIONAL_MODE:-adversarial}
- LABYRINTH_LOG_LEVEL=DEBUG
depends_on:
- honeypot-ssh