diff --git a/AGENTS.md b/AGENTS.md
index 80a4c94..6c6b337 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -47,6 +47,21 @@ The server communicates over JSON-RPC stdio. All tools return structured JSON wi
 | `boruna_capability_list` | List the frozen 1.0 capability set with `capability_set_hash` |
 | `boruna_policy_validate` | Validate a `Policy` JSON document; returns typed `error_kind` on rejection |
 
+## Agent-native CLI surfaces
+
+Beyond the MCP server, the `boruna` binary exposes read-only inspection
+commands designed for agents. Every one accepts `--json`:
+
+| Command | Use |
+|---------|-----|
+| `boruna skills list` / `boruna skills get ` | Self-describing docs — learn `.ax` and the toolchain from the binary alone |
+| `boruna lang codes` | Resolve any `E0NN` diagnostic code seen in `lang check --json` output |
+| `boruna doctor` | Verify the environment before relying on the toolchain |
+| `boruna workflow graph ` | Read a workflow's DAG (nodes, edges, topo order) before editing it |
+| `boruna size ` | Check the bytecode artifact cost of a program |
+
+A fresh agent should start with `boruna skills get cli`.
+
 ## Usage patterns
 
 **Compile and check for errors:**
diff --git a/CHANGELOG.md b/CHANGELOG.md
index c5e30c0..da6e9cf 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,16 @@ Versioning follows [Semantic Versioning](https://semver.org/).
 
 ## [Unreleased]
 
+### Added
+
+- **Agent-native CLI surfaces** — five read-only, `--json`-capable commands so AI agents can inspect Boruna projects without reading source. Motivated by a competitive review of `vercel-labs/zero`.
+  - `boruna lang codes [--json]` — emit the registry of stable diagnostic codes (`E001`–`E009`) with name, summary, and category. Backed by `tooling/src/diagnostics/registry.rs`; a drift test keeps the registry 1:1 with the `E0NN` constants the compiler emits.
+  - `boruna doctor [--json]` — environment and toolchain health: binary version, compiled features, Rust toolchain, data-directory writability, and project-layout detection. Exits 1 if any check fails.
+  - `boruna workflow graph [--json]` — emit DAG facts for a workflow: nodes (kind, capabilities, dependencies), edges, topological order, roots, and leaves. Exits 1 on a non-DAG.
+  - `boruna size [--json]` — bytecode artifact cost: per-function opcode counts, module-wide totals, and serialized `.axbc` byte size.
+  - `boruna skills list` / `boruna skills get [--json]` — embedded, agent-curated documentation (`ax-language`, `cli`, `workflows`, `diagnostics`) compiled into the binary, usable with no repository checkout.
+- **`docs/reference/diagnostic-codes.md`** — human reference for the diagnostic-code registry.
+
 ## [1.3.0] — 2026-04-30
 
 ### Stable
diff --git a/claudedocs/research_zerolang_vs_boruna_2026-05-17.md b/claudedocs/research_zerolang_vs_boruna_2026-05-17.md
new file mode 100644
index 0000000..44ceef0
--- /dev/null
+++ b/claudedocs/research_zerolang_vs_boruna_2026-05-17.md
@@ -0,0 +1,122 @@
+# Research: ZeroLang (vercel-labs/zero) vs Boruna: a comparison
+
+**Date:** 2026-05-17
+**Subject studied:** https://zerolang.ai/ · https://github.com/vercel-labs/zero
+**Compared with:** Boruna (ai-lang), this project
+**Depth:** standard · **Confidence:** high (primary sources: official site + repo README + GitHub API)
+
+---
+
+## Executive Summary
+
+ZeroLang and Boruna **share the same philosophy but are different products**.
+
+- **Shared DNA:** both are "agent-native" languages: explicit capabilities in signatures, structured JSON diagnostics with repair metadata, one small toolchain.
+- **Different nature:** Zero is a **systems language for compiling small native binary tools** (no GC, no runtime). Boruna is a **platform for deterministic, auditable execution of enterprise AI workflows** (bytecode + VM + evidence bundles).
+- **We do not compete in the same field.** Zero competes with Rust/Zig/C for writing CLI tools. Boruna competes with workflow/orchestration platforms for compliance.
+- **There are 6–7 ideas worth borrowing**, without losing our differentiator (determinism + replay + hash-chained evidence).
+
+⚠️ **Strategic signal:** Zero was created by **Vercel Labs** on **2026-05-15** (2 days ago) and already has **1166 stars**. The category of an "agent-native language with explicit capabilities + JSON diagnostics" has now been validated by a major player. This both confirms that we are heading in the right direction and signals that we must clearly defend what Zero does **not** do.
+
+---
+
+## What is ZeroLang
+
+| | |
+|---|---|
+| Type | Systems programming language (general-purpose, for small native tools) |
+| Author | Vercel Labs |
+| Created | 2026-05-15 · 1166 ⭐ in 2 days · Apache-2.0 |
+| Compiler language | C (`native/zero-c/`) + self-hosted (`compiler-zero/`) |
+| File extension | `.0` |
+| Status | Experimental, unstable |
+
+**Tagline:** "The programming language for agents — humans and AI agents can read, repair, inspect, and ship small native programs together."
+
+**Key technical properties:**
+- **Native artifacts:** static dispatch, **no mandatory GC**, no event loop, no hidden runtime. `zero size --json` shows the cost of the artifact.
+- **Capability-based I/O:** functions declare what they touch; the compiler rejects unavailable capabilities **at compile time**.
+- **Explicit effects & memory:** signatures expose fallibility (`raises`) and capabilities; allocation is visible.
+- **Agent-first tooling:** `zero check --json` returns structured diagnostics with stable codes (`NAM003`) and `repair` metadata.
+- **Cross-target checks:** the compiler checks target-neutral code against several targets; emits `linux-musl-x64` and others.
+- **C ABI boundary:** exports C ABI symbols for low-level interop.
+- **One toolchain:** `check`, `build`, `test`, `format`, `graph`, `size`, `routes`, `skills`, `doctor`, `document`.
+
+---
+
+## Comparison table
+
+| Dimension | ZeroLang | Boruna |
+|---|---|---|
+| **Category** | Systems language for native tools | Platform for executing AI workflows |
+| **Compilation target** | Native binaries (exe, C ABI) | Bytecode for a custom VM |
+| **Runtime model** | No runtime, no GC, no event loop | VM with capability gateway, actor system |
+| **Main differentiator** | Small artifacts, native, size transparency | Determinism + replay + hash-chained evidence bundles |
+| **Capabilities** | ✅ Explicit, checked at compile time | ✅ Explicit (`!{net.fetch}`), enforced in the VM |
+| **JSON diagnostics + repair** | ✅ `check --json`, stable codes | ✅ `lang check --json`, `lang repair`, suggested patches |
+| **Agent integration** | `zero skills get`, machine-readable docs | `boruna-mcp` MCP server (10 tools) |
+| **Compliance / audit** | ❌ Not a focus | ✅ Core: EvidenceBundle, AuditLog, verify |
+| **Workflow / DAG** | ❌ None | ✅ Core: WorkflowDef, validator, runner |
+| **Framework model** | ❌ None | ✅ Elm architecture (init/update/view) |
+| **Target user** | Developers/agents writing CLI tools | Enterprises running auditable AI processes |
+| **Maturity** | 2 days old, experimental, heavy hype | More mature (557+ tests, 9 crates, roadmap to 1.0) |
+
+---
+
+## Similarities (genuinely convergent decisions)
+
+1. **Capability-based effects:** an almost identical concept. Zero: "compiler rejects unavailable capabilities". Boruna: `!{net.fetch}` annotations, VM gateway. Both make side effects visible in the signature.
+2. **JSON diagnostics with repair metadata:** Zero: `"repair": {"id": "declare-missing-symbol"}`. Boruna: diagnostics with suggested patches + `boruna lang repair`. The same idea: "humans read the text, agents read the JSON".
+3. **Local reasoning:** signatures expose fallibility + capabilities in both languages.
+4. **One small toolchain:** a unified CLI for check/build/test/format/inspect.
+5. **Agent-native positioning:** both are explicitly marketed as languages designed to be maintained by AI agents.
+
+**Conclusion:** Neither copied the other; we arrived at the same principles independently. This validates Boruna's design.
+
+---
+
+## Key differences (our moat)
+
+Zero does **none** of the following, and this is the heart of Boruna:
+- Deterministic execution with a "same input → same output" guarantee.
+- Recording/replay of executions (`EventLog`, `ReplayEngine`).
+- Hash-chained, tamper-evident evidence bundles for compliance/audit.
+- DAG workflow orchestration with policy gates and human approval.
+- Enterprise compliance templates (SOC 2, HIPAA, finance).
+
+Conversely, Zero has things Boruna lacks (native compilation, no GC, C ABI, size reports), but they are **not relevant** to our niche (executing workflows, not shipping native binaries).
+
+---
+
+## Ideas to borrow (prioritized)
+
+| # | Idea from Zero | Applicability to Boruna | Effort |
+|---|---|---|---|
+| 1 | **Stable diagnostic codes** (`NAM003`) | If we do not yet have stable, documented error codes, introduce them. Critical for agent-repair reliability. | Small |
+| 2 | **`size --json` / artifact cost reports** | `boruna` could report bytecode module size, step count, and budget cost before execution. | Medium |
+| 3 | **`graph --json`**: graph facts from the CLI | Boruna has workflow DAGs; expose `workflow graph --json` for visualization/inspection by agents. | Small |
+| 4 | **`doctor --json`**: toolchain/environment diagnostics | A command that checks environment, features, and policy configuration and returns JSON. | Small |
+| 5 | **`skills get`**: machine-readable docs packages for agents | Boruna has MCP; add a skill/docs package that agents pull to teach themselves `.ax`. | Medium |
+| 6 | **Cross-target / pre-flight checks** | Boruna could report required capabilities/policy **before** execution ("this workflow will need net.fetch + db.query"). | Medium |
+| 7 | **Positioning/marketing**: `curl \| bash` installer, a clear "for agents" tagline, docs site | Improve Boruna's landing page with a concrete agent-native narrative. | Small |
+
+**What NOT to borrow:** native compilation, removing the VM, C ABI, no GC. These contradict our determinism/replay model.
+
+---
+
+## Recommendation
+
+1. **Keep the strategy.** Boruna and Zero are not competitors; they sit at different layers. Do not redirect Boruna toward native compilation.
+2. **Strengthen the differentiator.** In marketing and docs, explicitly emphasize determinism + replay + evidence bundles, exactly what Zero (and Vercel) do not offer.
+3. **Quick wins:** introduce stable diagnostic codes (#1), `doctor --json` (#4), `workflow graph --json` (#3). These are cheap and directly improve agent reliability.
+4. **Watch Zero.** A major player in an adjacent niche; if Zero adds a workflow/audit layer, it becomes a direct competitor. Review their repo periodically.
+
+---
+
+## Sources
+
+- https://zerolang.ai/ (official site, fetched 2026-05-17)
+- https://github.com/vercel-labs/zero (README + GitHub API metadata, fetched 2026-05-17)
+- Boruna `architecture` Serena memory + project CLAUDE.md
+
+*This is a research report. No implementation is included; the next step is the user's decision.*
diff --git a/crates/llmvm-cli/src/doctor.rs b/crates/llmvm-cli/src/doctor.rs
new file mode 100644
index 0000000..3bb7837
--- /dev/null
+++ b/crates/llmvm-cli/src/doctor.rs
@@ -0,0 +1,182 @@
+//! `boruna doctor` — environment and toolchain health checks.
+//!
+//! Read-only. Reports the binary version, which optional features were
+//! compiled in, whether a Rust toolchain is reachable, the persistent data
+//! directory's writability, and whether the current directory looks like a
+//! Boruna project root. `--json` emits the report for agent consumption.
+
+use std::path::Path;
+use std::process::Command;
+
+use serde::Serialize;
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize)]
+#[serde(rename_all = "snake_case")]
+pub enum Status {
+    Ok,
+    Warn,
+    Error,
+}
+
+#[derive(Debug, Serialize)]
+pub struct Check {
+    pub name: String,
+    pub status: Status,
+    pub detail: String,
+}
+
+#[derive(Debug, Serialize)]
+pub struct Report {
+    pub ok: bool,
+    pub boruna_version: String,
+    pub checks: Vec,
+}
+
+fn check(name: &str, status: Status, detail: impl Into) -> Check {
+    Check {
+        name: name.to_string(),
+        status,
+        detail: detail.into(),
+    }
+}
+
+/// A space-separated list of optional features compiled into this binary.
+fn compiled_features() -> String {
+    let mut features = Vec::new();
+    if cfg!(feature = "persist-sqlite") {
+        features.push("persist-sqlite");
+    }
+    if cfg!(feature = "serve") {
+        features.push("serve");
+    }
+    if cfg!(feature = "http") {
+        features.push("http");
+    }
+    if cfg!(feature = "telemetry") {
+        features.push("telemetry");
+    }
+    if features.is_empty() {
+        "none".to_string()
+    } else {
+        features.join(" ")
+    }
+}
+
+fn check_rust_toolchain() -> Check {
+    match Command::new("rustc").arg("--version").output() {
+        Ok(out) if out.status.success() => {
+            let version = String::from_utf8_lossy(&out.stdout).trim().to_string();
+            check("rust_toolchain", Status::Ok, version)
+        }
+        _ => check(
+            "rust_toolchain",
+            Status::Warn,
+            "rustc not found — only needed to build Boruna from source",
+        ),
+    }
+}
+
+fn check_data_dir(data_dir: &Path) -> Check {
+    if !data_dir.exists() {
+        return check(
+            "data_dir",
+            Status::Ok,
+            format!(
+                "{} does not exist yet — created on first persistent run",
+                data_dir.display()
+            ),
+        );
+    }
+    if !data_dir.is_dir() {
+        return check(
+            "data_dir",
+            Status::Error,
+            format!("{} exists but is not a directory", data_dir.display()),
+        );
+    }
+    let probe = data_dir.join(".boruna-doctor-probe");
+    match std::fs::write(&probe, b"probe") {
+        Ok(()) => {
+            let _ = std::fs::remove_file(&probe);
+            check(
+                "data_dir",
+                Status::Ok,
+                format!("{} exists and is writable", data_dir.display()),
+            )
+        }
+        Err(e) => check(
+            "data_dir",
+            Status::Error,
+            format!("{} is not writable: {e}", data_dir.display()),
+        ),
+    }
+}
+
+fn check_project_layout() -> Check {
+    let expected = ["templates", "libs", "examples"];
+    let missing: Vec<&str> = expected
+        .iter()
+        .filter(|d| !Path::new(d).is_dir())
+        .copied()
+        .collect();
+    if missing.is_empty() {
+        check(
+            "project_layout",
+            Status::Ok,
+            "templates/, libs/, examples/ present — looks like a Boruna repo root",
+        )
+    } else {
+        check(
+            "project_layout",
+            Status::Warn,
+            format!(
+                "missing {} — current directory may not be a Boruna repo root",
+                missing.join(", ")
+            ),
+        )
+    }
+}
+
+/// Run the doctor checks. Returns `true` if no check reported `Error`.
+pub fn run(data_dir: &Path, json: bool) -> bool {
+    let version = env!("CARGO_PKG_VERSION").to_string();
+    let checks = vec![
+        check("boruna_version", Status::Ok, version.clone()),
+        check("compiled_features", Status::Ok, compiled_features()),
+        check_rust_toolchain(),
+        check_data_dir(data_dir),
+        check_project_layout(),
+    ];
+    let ok = !checks.iter().any(|c| c.status == Status::Error);
+    let report = Report {
+        ok,
+        boruna_version: version,
+        checks,
+    };
+
+    if json {
+        match serde_json::to_string_pretty(&report) {
+            Ok(s) => println!("{s}"),
+            Err(e) => eprintln!("failed to serialize doctor report: {e}"),
+        }
+    } else {
+        println!("boruna doctor — version {}", report.boruna_version);
+        for c in &report.checks {
+            let mark = match c.status {
+                Status::Ok => "ok",
+                Status::Warn => "warn",
+                Status::Error => "ERROR",
+            };
+            println!(" [{mark}] {}: {}", c.name, c.detail);
+        }
+        println!(
+            "{}",
+            if report.ok {
+                "status: healthy"
+            } else {
+                "status: problems found"
+            }
+        );
+    }
+    ok
+}
diff --git a/crates/llmvm-cli/src/main.rs b/crates/llmvm-cli/src/main.rs
index 9b15cf9..10cd54c 100644
--- a/crates/llmvm-cli/src/main.rs
+++ b/crates/llmvm-cli/src/main.rs
@@ -19,6 +19,7 @@ use boruna_vm::vm::Vm;
 mod coordinator;
 #[cfg(feature = "serve")]
 mod dashboard;
+mod doctor;
 mod evidence_diff;
 #[cfg(feature = "serve")]
 mod evidence_serve;
@@ -27,6 +28,8 @@ mod provider_registry;
 mod scaffold;
 #[cfg(feature = "serve")]
 mod serve;
+mod size;
+mod skills;
 #[cfg(feature = "serve")]
 mod worker;
 mod workflow_eval;
@@ -144,6 +147,23 @@ enum Command {
     /// Language tooling commands (diagnostics, repair).
     #[command(subcommand)]
     Lang(LangCommand),
+    /// Environment and toolchain health checks.
+    Doctor {
+        /// Output the health report as JSON.
+        #[arg(long)]
+        json: bool,
+    },
+    /// Report the bytecode artifact size of a .ax source file.
+    Size {
+        /// Source file (.ax).
+        file: PathBuf,
+        /// Output the size report as JSON.
+        #[arg(long)]
+        json: bool,
+    },
+    /// Embedded, agent-curated documentation (list, get).
+    #[command(subcommand)]
+    Skills(SkillsCommand),
     /// Trace-to-test tools (record, generate, run, minimize).
     #[command(subcommand)]
     Trace2tests(Trace2TestsCommand),
@@ -549,6 +569,30 @@ enum LangCommand {
         #[arg(long, default_value = "best")]
         apply: String,
     },
+    /// List the registry of stable diagnostic codes.
+    Codes {
+        /// Output the registry as JSON.
+        #[arg(long)]
+        json: bool,
+    },
+}
+
+#[derive(Subcommand)]
+enum SkillsCommand {
+    /// List available agent skill documents.
+    List {
+        /// Output the skill list as JSON.
+        #[arg(long)]
+        json: bool,
+    },
+    /// Print an agent skill document by name.
+    Get {
+        /// Skill name (see `boruna skills list`).
+        name: String,
+        /// Output the skill as JSON ({ name, summary, content }).
+        #[arg(long)]
+        json: bool,
+    },
 }
 
 #[derive(Subcommand)]
@@ -967,6 +1011,14 @@ enum WorkflowCommand {
         #[arg(long)]
         json: bool,
     },
+    /// Emit the workflow DAG as structured graph facts.
+    Graph {
+        /// Workflow directory containing workflow.json.
+        dir: PathBuf,
+        /// Output graph facts as JSON.
+        #[arg(long)]
+        json: bool,
+    },
 }
 
 #[derive(Subcommand)]
@@ -1397,6 +1449,29 @@ fn run(cli: Cli) -> Result<(), Box> {
         Command::Fmt { file, check } => format::run_fmt(&file, check)?,
         Command::Framework(fw) => run_framework(fw)?,
         Command::Lang(lang) => run_lang(lang)?,
+        Command::Doctor { json } => {
+            let data_dir = resolve_data_dir(None, env_arg);
+            if !doctor::run(&data_dir, json) {
+                process::exit(1);
+            }
+        }
+        Command::Size { file, json } => {
+            let source = fs::read_to_string(&file)?;
+            let name = file
+                .file_stem()
+                .map(|s| s.to_string_lossy().to_string())
+                .unwrap_or_else(|| "module".into());
+            let resolved = maybe_resolve_imports(&source)?;
+            size::run(&name, &resolved, json)?;
+        }
+        Command::Skills(cmd) => match cmd {
+            SkillsCommand::List { json } => skills::run_list(json),
+            SkillsCommand::Get { name, json } => {
+                if !skills::run_get(&name, json) {
+                    process::exit(1);
+                }
+            }
+        },
         Command::Trace2tests(t2t) => run_trace2tests(t2t)?,
         Command::Template(tmpl) => run_template(tmpl)?,
         Command::New {
@@ -1867,6 +1942,24 @@ fn run_lang(cmd: LangCommand) -> Result<(), Box> {
                 }
             }
         }
+        LangCommand::Codes { json } => {
+            let registry = boruna_tooling::diagnostics::registry::registry();
+            if json {
+                let payload = serde_json::json!({
+                    "version": 1,
+                    "codes": registry,
+                });
+                println!("{}", serde_json::to_string_pretty(&payload)?);
+            } else {
+                println!("{:<6} {:<22} {:<16} SUMMARY", "CODE", "NAME", "CATEGORY");
+                for c in registry {
+                    println!(
+                        "{:<6} {:<22} {:<16} {}",
+                        c.code, c.name, c.category, c.summary
+                    );
+                }
+            }
+        }
     }
     Ok(())
 }
@@ -3612,6 +3705,126 @@ fn run_workflow(
         WorkflowCommand::Find { dir, json } => {
             handle_workflow_find(&dir, json)?;
         }
+        WorkflowCommand::Graph { dir, json } => {
+            handle_workflow_graph(&dir, json)?;
+        }
+    }
+    Ok(())
+}
+
+/// Emit the workflow DAG under `dir` as structured graph facts: nodes (with
+/// kind, capabilities, dependencies), edges, topological order, roots, and
+/// leaves. Read-only. Exits non-zero if the directory is unreadable or the
+/// graph is not a DAG.
+fn handle_workflow_graph(
+    dir: &std::path::Path,
+    json: bool,
+) -> Result<(), Box> {
+    use boruna_orchestrator::workflow::{StepKind, WorkflowDef, WorkflowValidator};
+    use std::collections::BTreeSet;
+
+    let def_path = dir.join("workflow.json");
+    let raw = fs::read_to_string(&def_path)
+        .map_err(|e| format!("cannot read {}: {e}", def_path.display()))?;
+    let def: WorkflowDef =
+        serde_json::from_str(&raw).map_err(|e| format!("invalid workflow.json: {e}"))?;
+
+    let kind_label = |k: &StepKind| -> &'static str {
+        match k {
+            StepKind::Source { .. } => "source",
+            StepKind::ApprovalGate { .. } => "approval_gate",
+            StepKind::ExternalTrigger { .. } => "external_trigger",
+        }
+    };
+
+    // Dependency relation = union of per-step `depends_on` and global `edges`.
+    let mut deps: std::collections::BTreeMap<&str, BTreeSet<&str>> = def
+        .steps
+        .keys()
+        .map(|id| (id.as_str(), BTreeSet::new()))
+        .collect();
+    for (id, step) in &def.steps {
+        let entry = deps.entry(id.as_str()).or_default();
+        for d in &step.depends_on {
+            entry.insert(d.as_str());
+        }
+    }
+    for (from, to) in &def.edges {
+        // Only record edges between declared steps — an edge to an unknown
+        // step is a malformed def (caught by `workflow validate`) and must
+        // not introduce a phantom node into `deps`/`roots`.
+        if let Some(set) = deps.get_mut(to.as_str()) {
+            set.insert(from.as_str());
+        }
+    }
+    let has_dependents: BTreeSet<&str> = deps.values().flatten().copied().collect();
+
+    let roots: Vec<&str> = deps
+        .iter()
+        .filter(|(_, d)| d.is_empty())
+        .map(|(id, _)| *id)
+        .collect();
+    let leaves: Vec<&str> = def
+        .steps
+        .keys()
+        .map(|s| s.as_str())
+        .filter(|s| !has_dependents.contains(s))
+        .collect();
+
+    let topo = WorkflowValidator::topological_order(&def);
+    let is_dag = topo.is_ok();
+
+    let nodes: Vec = def
+        .steps
+        .iter()
+        .map(|(id, step)| {
+            serde_json::json!({
+                "id": id,
+                "kind": kind_label(&step.kind),
+                "capabilities": step.capabilities,
+                "depends_on": deps[id.as_str()].iter().collect::>(),
+            })
+        })
+        .collect();
+
+    if json {
+        let payload = serde_json::json!({
+            "workflow": def.name,
+            "version": def.version,
+            "schema_version": def.schema_version,
+            "node_count": def.steps.len(),
+            "edge_count": def.edges.len(),
+            "is_dag": is_dag,
+            "nodes": nodes,
+            "edges": def.edges,
+            "topological_order": topo.clone().unwrap_or_default(),
+            "roots": roots,
+            "leaves": leaves,
+            "error": topo.as_ref().err(),
+        });
+        println!("{}", serde_json::to_string_pretty(&payload)?);
+    } else {
+        println!("workflow '{}' v{}", def.name, def.version);
+        println!(" nodes: {} edges: {}", def.steps.len(), def.edges.len());
+        for (id, step) in &def.steps {
+            let d = &deps[id.as_str()];
+            let dep_str = if d.is_empty() {
+                "(root)".to_string()
+            } else {
+                d.iter().copied().collect::>().join(", ")
+            };
+            println!(" {} [{}] <- {}", id, kind_label(&step.kind), dep_str);
+        }
+        println!(" roots: {}", roots.join(", "));
+        println!(" leaves: {}", leaves.join(", "));
+        match &topo {
+            Ok(order) => println!(" topological order: {}", order.join(" -> ")),
+            Err(e) => println!(" NOT A DAG: {e}"),
+        }
+    }
+
+    if !is_dag {
+        process::exit(1);
     }
     Ok(())
 }
diff --git a/crates/llmvm-cli/src/size.rs b/crates/llmvm-cli/src/size.rs
new file mode 100644
index 0000000..46093f9
--- /dev/null
+++ b/crates/llmvm-cli/src/size.rs
@@ -0,0 +1,102 @@
+//! `boruna size` — bytecode artifact cost report.
+//!
+//! Compiles a `.ax` source file and reports the size of the resulting
+//! bytecode module: per-function opcode counts, module-wide totals, and the
+//! serialized `.axbc` artifact byte count. Read-only — nothing is written to
+//! disk. `--json` emits the report for agent consumption.
+
+use serde::Serialize;
+
+use boruna_bytecode::Module;
+
+#[derive(Debug, Serialize)]
+struct FunctionSize {
+    name: String,
+    arity: u8,
+    locals: u16,
+    op_count: usize,
+    capability_count: usize,
+}
+
+#[derive(Debug, Serialize)]
+struct Totals {
+    function_count: usize,
+    total_ops: usize,
+    constants: usize,
+    types: usize,
+    globals: usize,
+}
+
+#[derive(Debug, Serialize)]
+struct SizeReport {
+    module: String,
+    functions: Vec,
+    totals: Totals,
+    bytecode_bytes: usize,
+    bytecode_format: &'static str,
+}
+
+/// Compile `source` (named `name`) and print its bytecode size report.
+pub fn run(name: &str, source: &str, json: bool) -> Result<(), Box> {
+    let module: Module = boruna_compiler::compile(name, source)?;
+
+    let functions: Vec = module
+        .functions
+        .iter()
+        .map(|f| FunctionSize {
+            name: f.name.clone(),
+            arity: f.arity,
+            locals: f.locals,
+            op_count: f.code.len(),
+            capability_count: f.capabilities.len(),
+        })
+        .collect();
+
+    let total_ops = functions.iter().map(|f| f.op_count).sum();
+    let totals = Totals {
+        function_count: module.functions.len(),
+        total_ops,
+        constants: module.constants.len(),
+        types: module.types.len(),
+        globals: module.globals.len(),
+    };
+
+    let bytecode_bytes = module.to_bytes()?.len();
+
+    let report = SizeReport {
+        module: module.name.clone(),
+        functions,
+        totals,
+        bytecode_bytes,
+        bytecode_format: "axbc",
+    };
+
+    if json {
+        println!("{}", serde_json::to_string_pretty(&report)?);
+    } else {
+        println!("module '{}' — bytecode size", report.module);
+        println!(
+            " {:<24} {:>6} {:>7} {:>6} {:>6}",
+            "FUNCTION", "ARITY", "LOCALS", "OPS", "CAPS"
+        );
+        for f in &report.functions {
+            println!(
+                " {:<24} {:>6} {:>7} {:>6} {:>6}",
+                f.name, f.arity, f.locals, f.op_count, f.capability_count
+            );
+        }
+        println!(
+            " totals: {} functions, {} ops, {} constants, {} types, {} globals",
+            report.totals.function_count,
+            report.totals.total_ops,
+            report.totals.constants,
+            report.totals.types,
+            report.totals.globals
+        );
+        println!(
+            " artifact: {} bytes ({} format)",
+            report.bytecode_bytes, report.bytecode_format
+        );
+    }
+    Ok(())
+}
diff --git a/crates/llmvm-cli/src/skills.rs b/crates/llmvm-cli/src/skills.rs
new file mode 100644
index 0000000..b9a834f
--- /dev/null
+++ b/crates/llmvm-cli/src/skills.rs
@@ -0,0 +1,120 @@
+//! `boruna skills` — embedded, agent-curated documentation.
+//!
+//! Skill documents are compiled into the binary via `include_str!`, so an AI
+//! agent can learn how to write `.ax` and drive the toolchain from the
+//! installed `boruna` binary alone — no repository checkout required.
+
+use serde::Serialize;
+
+/// One embedded skill document.
+#[derive(Debug, Clone, Copy, Serialize)]
+pub struct Skill {
+    /// Lookup name, e.g. `"ax-language"`.
+    pub name: &'static str,
+    /// One-line description shown by `skills list`.
+    pub summary: &'static str,
+    /// Full markdown body. Skipped in `list` output; served by `get`.
+    #[serde(skip)]
+    pub body: &'static str,
+}
+
+/// All embedded skill documents.
+pub const SKILLS: &[Skill] = &[
+    Skill {
+        name: "ax-language",
+        summary: "Syntax, types, and capabilities of the .ax language.",
+        body: include_str!("skills/ax-language.md"),
+    },
+    Skill {
+        name: "cli",
+        summary: "The boruna CLI command surface, grouped by task.",
+        body: include_str!("skills/cli.md"),
+    },
+    Skill {
+        name: "workflows",
+        summary: "Authoring DAG workflows and reading workflow output.",
+        body: include_str!("skills/workflows.md"),
+    },
+    Skill {
+        name: "diagnostics",
+        summary: "Diagnostic codes and the check/repair loop for agents.",
+        body: include_str!("skills/diagnostics.md"),
+    },
+];
+
+/// Find a skill by exact name.
+pub fn lookup(name: &str) -> Option<&'static Skill> {
+    SKILLS.iter().find(|s| s.name == name)
+}
+
+/// Print the list of available skills.
+pub fn run_list(json: bool) {
+    if json {
+        let payload = serde_json::json!({
+            "version": 1,
+            "skills": SKILLS,
+        });
+        match serde_json::to_string_pretty(&payload) {
+            Ok(s) => println!("{s}"),
+            Err(e) => eprintln!("failed to serialize skills: {e}"),
+        }
+    } else {
+        println!("available skills (boruna skills get ):");
+        for s in SKILLS {
+            println!(" {:<14} {}", s.name, s.summary);
+        }
+    }
+}
+
+/// Print one skill document. Returns `false` if `name` is unknown.
+pub fn run_get(name: &str, json: bool) -> bool {
+    let Some(skill) = lookup(name) else {
+        let names: Vec<&str> = SKILLS.iter().map(|s| s.name).collect();
+        eprintln!("unknown skill '{name}'. available: {}", names.join(", "));
+        return false;
+    };
+    if json {
+        let payload = serde_json::json!({
+            "name": skill.name,
+            "summary": skill.summary,
+            "content": skill.body,
+        });
+        match serde_json::to_string_pretty(&payload) {
+            Ok(s) => println!("{s}"),
+            Err(e) => eprintln!("failed to serialize skill: {e}"),
+        }
+    } else {
+        print!("{}", skill.body);
+        if !skill.body.ends_with('\n') {
+            println!();
+        }
+    }
+    true
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn all_skill_bodies_are_populated() {
+        for s in SKILLS {
+            assert!(!s.body.trim().is_empty(), "skill {} has empty body", s.name);
+            assert!(!s.summary.is_empty(), "skill {} has empty summary", s.name);
+        }
+    }
+
+    #[test]
+    fn skill_names_are_unique() {
+        let mut seen = std::collections::BTreeSet::new();
+        for s in SKILLS {
+            assert!(seen.insert(s.name), "duplicate skill name {}", s.name);
+        }
+    }
+
+    #[test]
+    fn lookup_finds_and_misses() {
+        assert!(lookup("ax-language").is_some());
+        assert!(lookup("does-not-exist").is_none());
+    }
+}
diff --git a/crates/llmvm-cli/src/skills/ax-language.md b/crates/llmvm-cli/src/skills/ax-language.md
new file mode 100644
index 0000000..8aed9ae
--- /dev/null
+++ b/crates/llmvm-cli/src/skills/ax-language.md
@@ -0,0 +1,91 @@
+# Skill: The .ax Language
+
+A concise reference for writing `.ax` source. For the full spec see
+`docs/reference/ax-language.md` in the Boruna repository.
+
+## Program shape
+
+Every standalone `.ax` file needs an entry point:
+
+```
+fn main() -> Int {
+    return 0
+}
+```
+
+`main` must return `Int` — it becomes the process exit code.
+
+## Types
+
+- Scalars: `Int`, `Float`, `String`, `Bool`, `Unit`
+- Containers: `Option`, `Result`, `List`, `Map`
+- User-defined: `record` and `enum`
+
+## Functions
+
+```
+fn add(a: Int, b: Int) -> Int {
+    return a + b
+}
+```
+
+A function that performs a side effect must declare the capability it uses:
+
+```
+fn fetch(url: String) -> String !{net.fetch} {
+    ...
+}
+```
+
+The `!{...}` capability set is mandatory and checked by the compiler — calling
+an effectful operation without declaring its capability is diagnostic `E007`.
+
+## Records
+
+```
+record State {
+    count: Int,
+    name: String,
+}
+```
+
+Construct and update with spread syntax:
+
+```
+let s = State { count: 0, name: "init" }
+let s2 = State { ..s, count: 1 }
+```
+
+## Enums and pattern matching
+
+```
+enum Color { Red, Green, Blue }
+
+let label = match c {
+    Color::Red => "red",
+    Color::Green => "green",
+    Color::Blue => "blue",
+}
+```
+
+`match` must be exhaustive — a missing case is diagnostic `E005`. Use `_` as a
+catch-all when total coverage is not needed.
+
+## Capabilities
+
+Capabilities name the side effects a program may perform. Common ones:
+`net.fetch`, `db.query`, `fs.read`, `fs.write`, `llm.call`. They are declared
+on functions and enforced at runtime by the VM against the active policy.
+Pure functions declare no capabilities and are fully deterministic.
+
+## Determinism rule
+
+`.ax` code is deterministic: same input always produces same output. No
+wall-clock time, no randomness, no hidden global state in pure code. All
+non-determinism enters only through declared capabilities.
+
+## Next steps
+
+- `boruna skills get cli` — the command surface.
+- `boruna skills get diagnostics` — error codes and the repair loop.
+- `boruna check .ax` equivalents: `boruna lang check .ax --json`.
diff --git a/crates/llmvm-cli/src/skills/cli.md b/crates/llmvm-cli/src/skills/cli.md
new file mode 100644
index 0000000..ac517ef
--- /dev/null
+++ b/crates/llmvm-cli/src/skills/cli.md
@@ -0,0 +1,62 @@
+# Skill: The boruna CLI
+
+The `boruna` binary is the single toolchain entry point. Every inspection
+command accepts `--json` for machine-readable output. For the full reference
+see `docs/reference/cli.md`.
+ +## Compile and run + +``` +boruna compile app.ax # -> app.axbc bytecode +boruna run app.ax # compile + execute +boruna run app.ax --policy allow-all +boruna ast app.ax # dump the AST as JSON +boruna inspect app.axbc # inspect a compiled bytecode file +``` + +## Diagnostics and repair + +``` +boruna lang check app.ax --json # structured diagnostics +boruna lang repair app.ax # apply suggested fixes +boruna lang codes --json # registry of all diagnostic codes +``` + +## Inspection (agent-friendly, all support --json) + +``` +boruna doctor --json # environment + toolchain health +boruna size app.ax --json # bytecode artifact size report +boruna workflow graph --json # DAG facts: nodes, edges, topo order +boruna skills list # embedded agent documentation +``` + +## Workflows + +``` +boruna workflow validate # validate a workflow.json DAG +boruna workflow run --policy allow-all +boruna workflow graph --json # inspect DAG structure +``` + +## Evidence (compliance / audit) + +``` +boruna evidence verify # verify a hash-chained bundle +boruna evidence inspect --json +``` + +## Templates and scaffolding + +``` +boruna template list +boruna template apply crud-admin --args "entity_name=products" +boruna new # interactive project scaffold +``` + +## Exit codes + +`0` success. `1` is the common failure code (compile error, diagnostics with +errors, validation failure, unknown skill). Some commands use additional codes +documented in `docs/reference/cli.md` — for example `workflow wait` uses `3` +for a budget timeout. diff --git a/crates/llmvm-cli/src/skills/diagnostics.md b/crates/llmvm-cli/src/skills/diagnostics.md new file mode 100644 index 0000000..3be855c --- /dev/null +++ b/crates/llmvm-cli/src/skills/diagnostics.md @@ -0,0 +1,70 @@ +# Skill: Diagnostics and Repair + +Boruna emits structured diagnostics designed for both humans and agents. The +human reads the message; the agent reads the JSON. + +## The check/repair loop + +``` +boruna lang check app.ax --json # 1. 
get structured diagnostics +boruna lang repair app.ax # 2. apply suggested fixes +boruna lang check app.ax --json # 3. confirm the fix +``` + +## Diagnostic JSON shape + +`boruna lang check --json` returns a `DiagnosticSet`: + +``` +{ + "version": 1, + "file": "app.ax", + "diagnostics": [ + { + "id": "E003", + "severity": "error", + "message": "unknown identifier 'foo'", + "location": { "file": "app.ax", "line": 3, "col": 9 }, + "suggested_patches": [ + { "id": "declare-missing-symbol", "description": "...", + "confidence": "high", "edits": [ ... ] } + ] + } + ] +} +``` + +Each diagnostic carries a stable `id` (an `E0NN` code), a `severity`, a +`location`, and zero or more `suggested_patches`. Switch on `id` — codes are +stable forever. + +## Stable diagnostic codes + +Resolve any code without reading compiler source: + +``` +boruna lang codes --json +``` + +| Code | Meaning | +|------|---------| +| `E001` | lexer error — source could not be tokenized | +| `E002` | parse error — invalid syntax tree | +| `E003` | undefined variable | +| `E004` | undefined function | +| `E005` | non-exhaustive match | +| `E006` | unknown record field | +| `E007` | capability violation — undeclared effect | +| `E008` | codegen error | +| `E009` | type error | + +## Repair strategies + +``` +boruna lang repair app.ax --apply best # highest-confidence patch (default) +boruna lang repair app.ax --apply all # every suggested patch +boruna lang repair app.ax --apply # one specific patch by id +``` + +After repair, the tool reports how many patches applied and whether a +verification re-check passed. Always re-run `lang check` to confirm. diff --git a/crates/llmvm-cli/src/skills/workflows.md b/crates/llmvm-cli/src/skills/workflows.md new file mode 100644 index 0000000..9dfdae9 --- /dev/null +++ b/crates/llmvm-cli/src/skills/workflows.md @@ -0,0 +1,67 @@ +# Skill: Workflows + +A Boruna workflow is a DAG of steps defined in a `workflow.json` file inside a +workflow directory. 
Each step compiles to bytecode and runs on the VM under a +capability policy. Every run can produce a hash-chained evidence bundle. + +## Directory layout + +``` +my_workflow/ + workflow.json # the DAG definition + steps/ + fetch_data.ax # one .ax source per "source" step + transform.ax +``` + +## workflow.json shape + +``` +{ + "schema_version": 1, + "name": "my_workflow", + "version": "1.0.0", + "description": "...", + "steps": { + "fetch": { "kind": "source", "source": "steps/fetch_data.ax", + "capabilities": ["net.fetch"], "depends_on": [] }, + "transform": { "kind": "source", "source": "steps/transform.ax", + "depends_on": ["fetch"] } + }, + "edges": [["fetch", "transform"]] +} +``` + +## Step kinds + +- `source` — a step backed by a `.ax` source file. +- `approval_gate` — pauses for a human decision by `required_role`. +- `external_trigger` — waits for an external event. + +## Dependencies + +A step runs after every step it depends on. Dependencies come from a step's +`depends_on` list and from the global `edges` list — both are honored. The +workflow must be a DAG; a cycle is a validation error. + +## Commands + +``` +boruna workflow validate # check the DAG is well-formed +boruna workflow graph --json # nodes, edges, topo order, roots, leaves +boruna workflow run --policy allow-all --record +``` + +## Inspecting graph structure as an agent + +`boruna workflow graph --json` returns: + +- `nodes` — each step with `kind`, `capabilities`, `depends_on` +- `edges` — explicit edge pairs +- `topological_order` — execution order +- `roots` — steps with no dependencies (entry points) +- `leaves` — steps nothing depends on (terminal outputs) +- `is_dag` — `false` if the graph contains a cycle + +Use this to understand a workflow before modifying it: read the graph, find +the step you need, check its dependencies, then edit the relevant `.ax` file. 
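The graph facts listed above (`topological_order`, `roots`, `leaves`, `is_dag`) can be derived from a plain edge list. The following is a minimal sketch using Kahn's algorithm, not Boruna's actual implementation: `GraphFacts` and `graph_facts` are hypothetical names, and the sketch assumes that `depends_on` entries and the global `edges` list have already been merged into `(dependency, dependent)` pairs.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical container; field names mirror the JSON keys described above.
#[derive(Debug)]
struct GraphFacts {
    topological_order: Vec<String>,
    roots: Vec<String>,
    leaves: Vec<String>,
    is_dag: bool,
}

/// Derive graph facts with Kahn's algorithm.
/// Each edge is `(dependency, dependent)`: the first step must run before the second.
/// Assumption: `depends_on` and `edges` were merged into this one list beforehand.
fn graph_facts<'a>(nodes: &[&'a str], edges: &[(&'a str, &'a str)]) -> GraphFacts {
    // BTreeMap keeps iteration order sorted by name, so the result is deterministic.
    let mut indegree: BTreeMap<&'a str, usize> = nodes.iter().map(|n| (*n, 0)).collect();
    let mut dependents: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    let mut has_dependents: BTreeSet<&'a str> = BTreeSet::new();

    for &(from, to) in edges {
        *indegree.entry(to).or_insert(0) += 1;
        indegree.entry(from).or_insert(0);
        dependents.entry(from).or_default().push(to);
        has_dependents.insert(from);
    }

    // Roots: steps with no dependencies. Leaves: steps nothing depends on.
    let roots: Vec<String> = indegree
        .iter()
        .filter(|(_, d)| **d == 0)
        .map(|(n, _)| n.to_string())
        .collect();
    let leaves: Vec<String> = indegree
        .keys()
        .filter(|n| !has_dependents.contains(*n))
        .map(|n| n.to_string())
        .collect();

    // Repeatedly emit a step whose remaining dependency count is zero.
    let mut remaining = indegree.clone();
    let mut queue: VecDeque<&'a str> = indegree
        .iter()
        .filter(|(_, d)| **d == 0)
        .map(|(n, _)| *n)
        .collect();
    let mut order: Vec<String> = Vec::new();
    while let Some(step) = queue.pop_front() {
        order.push(step.to_string());
        if let Some(next) = dependents.get(step) {
            for &m in next {
                let d = remaining.get_mut(m).expect("every edge endpoint is a known node");
                *d -= 1;
                if *d == 0 {
                    queue.push_back(m);
                }
            }
        }
    }

    // If a cycle exists, the steps on it never reach zero and are never emitted.
    let is_dag = order.len() == indegree.len();
    GraphFacts { topological_order: order, roots, leaves, is_dag }
}
```

In this sketch a cyclic graph yields `is_dag == false` and a `topological_order` shorter than the node list, which is one way an agent could detect the condition that `workflow validate` rejects.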
diff --git a/crates/llmvm-cli/tests/cli_agent_native.rs b/crates/llmvm-cli/tests/cli_agent_native.rs new file mode 100644 index 0000000..d272083 --- /dev/null +++ b/crates/llmvm-cli/tests/cli_agent_native.rs @@ -0,0 +1,226 @@ +//! CLI integration tests for the agent-native surfaces: +//! `lang codes`, `doctor`, `workflow graph`, `size`, `skills`. + +use std::fs; +use std::path::PathBuf; +use std::process::Command; + +use serde_json::Value; +use tempfile::tempdir; + +fn boruna_bin() -> &'static str { + env!("CARGO_BIN_EXE_boruna") +} + +/// Repo root — `CARGO_MANIFEST_DIR` is `crates/llmvm-cli/`. +fn repo_root() -> PathBuf { + PathBuf::from(env!("CARGO_MANIFEST_DIR")).join("../..") +} + +fn run(args: &[&str]) -> std::process::Output { + Command::new(boruna_bin()) + .args(args) + .output() + .expect("invoke boruna") +} + +fn stdout(out: &std::process::Output) -> String { + String::from_utf8_lossy(&out.stdout).to_string() +} + +// ---- lang codes ---------------------------------------------------------- + +#[test] +fn lang_codes_human_lists_all_codes() { + let out = run(&["lang", "codes"]); + assert!(out.status.success()); + let s = stdout(&out); + for code in [ + "E001", "E002", "E003", "E004", "E005", "E006", "E007", "E008", "E009", + ] { + assert!(s.contains(code), "missing {code} in:\n{s}"); + } +} + +#[test] +fn lang_codes_json_has_nine_entries() { + let out = run(&["lang", "codes", "--json"]); + assert!(out.status.success()); + let v: Value = serde_json::from_str(&stdout(&out)).expect("valid JSON"); + let codes = v["codes"].as_array().expect("codes array"); + assert_eq!(codes.len(), 9); + for c in codes { + assert!(c["code"].is_string()); + assert!(c["name"].is_string()); + assert!(c["summary"].is_string()); + assert!(c["category"].is_string()); + } +} + +// ---- doctor -------------------------------------------------------------- + +#[test] +fn doctor_json_is_well_formed() { + let out = run(&["doctor", "--json"]); + let v: Value = 
serde_json::from_str(&stdout(&out)).expect("valid JSON"); + assert_eq!(v["boruna_version"], env!("CARGO_PKG_VERSION")); + let checks = v["checks"].as_array().expect("checks array"); + assert!(!checks.is_empty()); + let mut any_error = false; + for c in checks { + let status = c["status"].as_str().expect("status string"); + assert!( + matches!(status, "ok" | "warn" | "error"), + "bad status {status}" + ); + if status == "error" { + any_error = true; + } + } + assert_eq!(v["ok"].as_bool().unwrap(), !any_error); +} + +// ---- workflow graph ------------------------------------------------------ + +fn llm_code_review_dir() -> PathBuf { + repo_root().join("examples/workflows/llm_code_review") +} + +#[test] +fn workflow_graph_json_facts_are_consistent() { + let dir = llm_code_review_dir(); + let out = Command::new(boruna_bin()) + .args(["workflow", "graph", "--json"]) + .arg(&dir) + .output() + .expect("invoke boruna"); + assert!( + out.status.success(), + "stderr: {}", + String::from_utf8_lossy(&out.stderr) + ); + let v: Value = serde_json::from_str(&stdout(&out)).expect("valid JSON"); + + assert_eq!(v["is_dag"], true); + let nodes = v["nodes"].as_array().unwrap(); + let topo: Vec<&str> = v["topological_order"] + .as_array() + .unwrap() + .iter() + .map(|x| x.as_str().unwrap()) + .collect(); + assert_eq!(nodes.len(), topo.len(), "every node appears in topo order"); + assert_eq!(v["node_count"].as_u64().unwrap() as usize, nodes.len()); + + // Every edge (a,b) must place a before b in the topological order. 
+ for edge in v["edges"].as_array().unwrap() { + let a = edge[0].as_str().unwrap(); + let b = edge[1].as_str().unwrap(); + let ia = topo.iter().position(|x| *x == a).unwrap(); + let ib = topo.iter().position(|x| *x == b).unwrap(); + assert!(ia < ib, "edge {a}->{b} violates topo order"); + } + + assert!(!v["roots"].as_array().unwrap().is_empty()); + assert!(!v["leaves"].as_array().unwrap().is_empty()); +} + +#[test] +fn workflow_graph_detects_a_cycle() { + // Start from a real workflow def and introduce a back-dependency. + let src = fs::read_to_string(llm_code_review_dir().join("workflow.json")).unwrap(); + let mut def: Value = serde_json::from_str(&src).unwrap(); + // fetch_diff is the root; make it depend on the terminal step `report`. + def["steps"]["fetch_diff"]["depends_on"] = serde_json::json!(["report"]); + if let Some(edges) = def["edges"].as_array_mut() { + edges.push(serde_json::json!(["report", "fetch_diff"])); + } + + let dir = tempdir().unwrap(); + fs::write(dir.path().join("workflow.json"), def.to_string()).unwrap(); + + let out = Command::new(boruna_bin()) + .args(["workflow", "graph"]) + .arg(dir.path()) + .output() + .expect("invoke boruna"); + assert_eq!(out.status.code(), Some(1), "cyclic graph must exit 1"); +} + +#[test] +fn workflow_graph_missing_dir_fails_cleanly() { + let out = run(&["workflow", "graph", "/nonexistent/workflow/dir"]); + assert!(!out.status.success()); +} + +// ---- size ---------------------------------------------------------------- + +#[test] +fn size_json_totals_are_consistent() { + let hello = repo_root().join("examples/hello.ax"); + let out = Command::new(boruna_bin()) + .args(["size", "--json"]) + .arg(&hello) + .output() + .expect("invoke boruna"); + assert!( + out.status.success(), + "stderr: {}", + String::from_utf8_lossy(&out.stderr) + ); + let v: Value = serde_json::from_str(&stdout(&out)).expect("valid JSON"); + + assert!(v["bytecode_bytes"].as_u64().unwrap() > 0); + assert_eq!(v["bytecode_format"], "axbc"); + + 
let functions = v["functions"].as_array().unwrap(); + let sum_ops: u64 = functions + .iter() + .map(|f| f["op_count"].as_u64().unwrap()) + .sum(); + assert_eq!(v["totals"]["total_ops"].as_u64().unwrap(), sum_ops); + assert_eq!( + v["totals"]["function_count"].as_u64().unwrap() as usize, + functions.len() + ); +} + +#[test] +fn size_missing_file_fails_cleanly() { + let out = run(&["size", "/nonexistent/file.ax"]); + assert!(!out.status.success()); +} + +// ---- skills -------------------------------------------------------------- + +#[test] +fn skills_list_json_has_all_skills() { + let out = run(&["skills", "list", "--json"]); + assert!(out.status.success()); + let v: Value = serde_json::from_str(&stdout(&out)).expect("valid JSON"); + let skills = v["skills"].as_array().expect("skills array"); + assert_eq!(skills.len(), 4); +} + +#[test] +fn skills_get_returns_body() { + let out = run(&["skills", "get", "ax-language"]); + assert!(out.status.success()); + assert!(stdout(&out).contains("# Skill: The .ax Language")); +} + +#[test] +fn skills_get_json_has_content() { + let out = run(&["skills", "get", "diagnostics", "--json"]); + assert!(out.status.success()); + let v: Value = serde_json::from_str(&stdout(&out)).expect("valid JSON"); + assert_eq!(v["name"], "diagnostics"); + assert!(v["content"].as_str().unwrap().len() > 100); +} + +#[test] +fn skills_get_unknown_exits_1() { + let out = run(&["skills", "get", "no-such-skill"]); + assert_eq!(out.status.code(), Some(1)); + assert!(String::from_utf8_lossy(&out.stderr).contains("unknown skill")); +} diff --git a/docs/architecture-agent-native-cli.md b/docs/architecture-agent-native-cli.md new file mode 100644 index 0000000..c75f1b9 --- /dev/null +++ b/docs/architecture-agent-native-cli.md @@ -0,0 +1,88 @@ +# Architecture — Agent-Native CLI surfaces + +Companion to `docs/design-agent-native-cli.md`. Component-level build plan. 
+ +## Component map + +``` +tooling/src/diagnostics/ + registry.rs NEW — DiagnosticCodeInfo + REGISTRY const + registry() + mod.rs EDIT — `pub mod registry;` + +crates/llmvm-cli/src/ + main.rs EDIT — Command/LangCommand/WorkflowCommand enums + dispatch arms + doctor.rs NEW — environment health checks + size.rs NEW — bytecode artifact cost analysis + skills.rs NEW — embedded skill-doc registry + lookup + skills/ NEW dir — ax-language.md, cli.md, workflows.md, diagnostics.md + +docs/reference/ + diagnostic-codes.md NEW — human reference for the code registry + cli.md EDIT — document the 5 new surfaces + +CHANGELOG.md EDIT — ### Added entries +AGENTS.md EDIT — mention the agent-facing surfaces +``` + +## 1. `lang codes` + +- `tooling/src/diagnostics/registry.rs`: + - `pub struct DiagnosticCodeInfo { code, name, summary, category }` (all `&'static str`). + - `pub const REGISTRY: &[DiagnosticCodeInfo]` — one entry per `E001`–`E009`. + - `pub fn registry() -> &'static [DiagnosticCodeInfo]`. +- `LangCommand::Codes { json: bool }`. Handler in `run_lang`: + - JSON: `{ "version": 1, "codes": [ {code,name,summary,category}, ... ] }`. + - Human: aligned table. +- Drift test in `tooling` tests: collect every `E0NN` const, assert 1:1 with registry codes. + +## 2. `doctor` + +- `Command::Doctor { json: bool }` → `doctor::run(json)`. +- `doctor.rs`: `Check { name: String, status: Status, detail: String }`, `Status = Ok|Warn|Error`. +- Checks: boruna version; compiled features (`cfg!(feature=...)` for http/serve/telemetry); + `rustc --version` (Warn if absent); default data-dir resolution + writability; + presence of `templates/`, `libs/`, `examples/` relative to cwd. +- JSON: `{ "ok": bool, "boruna_version": "...", "checks": [...] }`. Exit 1 if any `Error`. + +## 3. `workflow graph` + +- `WorkflowCommand::Graph { dir: PathBuf, json: bool }`. +- Load `WorkflowDef` via the same loader `WorkflowCommand::Validate` uses. 
+- Reuse the validator's topological sort for `topological_order` (Err on cycle → reported). +- Facts: `nodes` (id, kind, capabilities, depends_on), `edges`, `topological_order`, + `roots` (no deps), `leaves` (no dependents). +- JSON object; human = summary line + adjacency listing. + +## 4. `size` + +- `Command::Size { file: PathBuf, json: bool }` → `size::run(&file, json)`. +- Compile via `boruna_compiler::compile(name, source)`. On compile error: emit the error, + exit 1 (consistent with `compile`). +- Per-function: `name, arity, locals, op_count, capabilities`. +- Totals: function count, total ops, constants, types, globals. +- `bytecode_bytes`: serialize `Module` with Boruna's existing serializer; label the format. +- JSON `{ module, functions, totals, bytecode_bytes, bytecode_format }`; human = table. + +## 5. `skills` + +- `SkillsCommand::List { json }` and `SkillsCommand::Get { name, json }`. +- `skills.rs`: `struct Skill { name, summary, body }`, bodies via `include_str!("skills/*.md")`. + `SKILLS: &[Skill]` static slice; `fn lookup(name) -> Option<&Skill>`. +- `list`: names + summaries (JSON array or table). +- `get `: prints `body`; `--json` → `{ name, summary, content }`; unknown name → exit 1 + with available-names hint. +- Skill docs (curated, concise, agent-focused): `ax-language.md`, `cli.md`, `workflows.md`, + `diagnostics.md`. + +## Build order (sequential — all touch `main.rs`) + +1. Diagnostics registry (`tooling`) + `lang codes` + drift test + `diagnostic-codes.md`. +2. `doctor`. +3. `workflow graph`. +4. `size`. +5. `skills` + embedded docs. +6. Docs sweep: `cli.md`, `CHANGELOG.md`, `AGENTS.md`. +7. Gates: `cargo test --workspace`, `clippy -D warnings`, `fmt --check`. + +Parallel sub-agents are **not** used: every surface edits `crates/llmvm-cli/src/main.rs` +(shared enum + dispatch) — convention §31 mandates sequential for same-crate work. 
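The `doctor` rule from section 2 (`ok` is true only when no check has `Error` status, and the process exits 1 otherwise) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the contents of `doctor.rs`: the `Check` and `Status` shapes follow the plan above, while the helper names `check`, `report_ok`, and `exit_code` are hypothetical.

```rust
/// Status of a single health check, following the plan's `Status = Ok|Warn|Error`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Ok,
    Warn,
    Error,
}

/// One health check result, following the plan's `Check { name, status, detail }`.
#[derive(Debug, Clone)]
struct Check {
    name: String,
    status: Status,
    detail: String,
}

/// Hypothetical convenience constructor for building checks.
fn check(name: &str, status: Status, detail: &str) -> Check {
    Check {
        name: name.to_string(),
        status,
        detail: detail.to_string(),
    }
}

/// The report is `ok` if and only if no check has `Error` status.
/// A `Warn` (for example a missing `rustc`) does not fail the report.
fn report_ok(checks: &[Check]) -> bool {
    !checks.iter().any(|c| c.status == Status::Error)
}

/// Exit code for the command: 0 when the report is ok, 1 if any check errored.
fn exit_code(checks: &[Check]) -> i32 {
    if report_ok(checks) {
        0
    } else {
        1
    }
}
```

Keeping the aggregation in one small pure function means the JSON `ok` field and the process exit code cannot disagree, which is the invariant the integration test for `doctor --json` checks.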
diff --git a/docs/design-agent-native-cli.md b/docs/design-agent-native-cli.md new file mode 100644 index 0000000..8170cd4 --- /dev/null +++ b/docs/design-agent-native-cli.md @@ -0,0 +1,64 @@ +# Design — Agent-Native CLI surfaces + +**Sprint:** agent-native-cli · **Branch:** `feature/agent-native-cli` (from `master`, v1.2.0-line) +**Date:** 2026-05-17 +**Origin:** Research of `vercel-labs/zero` (`claudedocs/research_zerolang_vs_boruna_2026-05-17.md`) surfaced 5 agent-DX ideas worth borrowing. + +## Why + +Zero (Vercel Labs) validated the "agent-native language" category: a toolchain where every +surface emits structured, machine-readable facts that AI agents consume directly. Boruna +already has the foundation (capability gating, JSON diagnostics, `boruna-mcp`). This sprint +closes 5 specific gaps so agents can *inspect* Boruna projects as fluently as humans. + +## Forcing questions + +- **Who needs this?** AI coding agents (and humans) operating on `.ax` projects and workflows. + Today they must read source, guess diagnostic-code meanings, and have no way to query + artifact cost or workflow shape without writing a script. +- **Narrowest MVP?** Five read-only CLI surfaces, each with `--json`. No new runtime behavior, + no schema changes to persisted data. +- **What makes someone say "whoa"?** `boruna skills get` — the binary self-describes how to + write `.ax` and drive the toolchain, so a fresh agent is productive with zero repo access. +- **How does it compound?** Every future diagnostic code, CLI command, and workflow feature + plugs into a registry/skill doc that agents already know how to query. The agent's + understanding of Boruna scales with the toolchain instead of lagging it. + +## Scope — 5 surfaces + +Idea #1 from the research ("stable diagnostic codes") is **already implemented** — `E001`–`E009` +exist as stable `pub const` in `tooling/src/diagnostics/mod.rs`. 
Per the user decision it is +**replaced** with a machine-readable *registry* surface (the agent-facing piece that was missing). + +| # | Surface | What it does | +|---|---------|--------------| +| 1 | `boruna lang codes [--json]` | Emit the registry of all diagnostic codes (id, name, summary, category). | +| 2 | `boruna doctor [--json]` | Environment/toolchain health: version, compiled features, data-dir writability, project dirs. | +| 3 | `boruna workflow graph [--json]` | Emit DAG facts: nodes, edges, topological order, roots, leaves. | +| 4 | `boruna size [--json]` | Bytecode artifact cost: per-function op counts, totals, serialized byte size. | +| 5 | `boruna skills list` / `boruna skills get [--json]` | Embedded, agent-curated docs the binary serves with no repo access. | + +## Non-goals + +- No native compilation, no GC removal, no C ABI — those are Zero's niche, not Boruna's. +- No changes to determinism, replay, evidence bundles, or any persisted schema. +- No DOT/Graphviz export for `workflow graph` (JSON facts only; visualization is out of scope). +- No new MCP tools this sprint (CLI-first; MCP exposure is a possible follow-up). + +## Decisions + +- **`lang codes`, not a new top-level `diagnostics` command.** `Lang` is already + "Language tooling commands (diagnostics, repair)" — `codes` belongs there. Avoids surface bloat. +- **All five are read-only.** No mutation, no persistence writes. Safe by construction. +- **`--json` everywhere; human output is the default.** Matches every existing Boruna CLI surface. +- **Skill docs embedded via `include_str!`.** The binary must self-describe without the repo + checked out (an agent may only have the installed binary). +- **Registry is the single source of truth for codes.** A drift test (convention §33) asserts + every `E0xx` const has exactly one registry entry and vice versa. 
+ +## Risks + +- `doctor` runs `rustc --version` (external process) — acceptable for a diagnostic command; + failure is reported as a `warn` check, never aborts. +- `size`'s "serialized bytes" depends on Boruna's bytecode serialization format; the number is + labeled with the format used so it is not mistaken for a native-artifact size. diff --git a/docs/reference/cli.md b/docs/reference/cli.md index 15f761e..6e95124 100644 --- a/docs/reference/cli.md +++ b/docs/reference/cli.md @@ -27,11 +27,14 @@ Commands: replay Replay from a recorded event log inspect Inspect a compiled module ast Print the AST for a .ax file - lang Language diagnostics and repair + lang Language diagnostics, repair, and the code registry + doctor Environment and toolchain health checks + size Bytecode artifact size report for a .ax file framework Framework app validation and testing - workflow Workflow validation and execution + workflow Workflow validation, execution, and graph inspection evidence Evidence bundle inspection and verification template Template listing and application + skills Embedded, agent-curated documentation trace2tests Generate regression tests from traces ``` @@ -156,10 +159,12 @@ Language diagnostics and auto-repair. ```bash boruna lang check [--json] boruna lang repair +boruna lang codes [--json] Subcommands: check Run diagnostics: type errors, undeclared capabilities, unreachable code repair Apply auto-repair suggestions from diagnostics + codes List the registry of stable diagnostic codes (E001–E009) ``` Examples: @@ -170,8 +175,61 @@ boruna lang check app.ax --json # Automatically repair issues boruna lang repair app.ax + +# Resolve a diagnostic code seen in `lang check --json` output +boruna lang codes --json +``` + +`lang codes` emits the registry documented in `docs/reference/diagnostic-codes.md`. Codes +are stable forever — tools and agents may switch on them. + +--- + +## `boruna doctor` + +Environment and toolchain health checks. 
+ +```bash +boruna doctor [--json] +``` + +Reports the binary version, which optional features were compiled in, whether a +Rust toolchain is reachable, the persistent data directory's writability, and +whether the current directory looks like a Boruna project root. Read-only. +Exits 1 if any check has `error` status. + +--- + +## `boruna size` + +Bytecode artifact size report for a `.ax` source file. + +```bash +boruna size [--json] +``` + +Compiles the file and reports per-function opcode counts, module-wide totals +(functions, ops, constants, types, globals), and the serialized `.axbc` +artifact byte size. Nothing is written to disk. + +--- + +## `boruna skills` + +Embedded, agent-curated documentation — compiled into the binary so an agent can +learn Boruna from the installed binary alone. + +```bash +boruna skills list [--json] +boruna skills get [--json] + +Subcommands: + list List available skill documents + get Print one skill document (ax-language, cli, workflows, diagnostics) ``` +`skills get` exits 1 on an unknown skill name and lists the available names. + --- ## `boruna framework` @@ -275,6 +333,21 @@ Validates each discovered workflow and prints path, name, step count, and validi --- +### `boruna workflow graph` + +Emit the workflow DAG as structured graph facts. + +```bash +boruna workflow graph [--json] +``` + +Reports nodes (each step's kind, capabilities, and dependencies), edges, +topological execution order, `roots` (steps with no dependencies), and `leaves` +(steps nothing depends on). Read-only — only `workflow.json` is read, step +source files are not. Exits 1 if the graph contains a cycle (`is_dag: false`). + +--- + ## `boruna evidence` Inspect, verify, and manage evidence bundles. 
diff --git a/docs/reference/diagnostic-codes.md b/docs/reference/diagnostic-codes.md new file mode 100644 index 0000000..294c507 --- /dev/null +++ b/docs/reference/diagnostic-codes.md @@ -0,0 +1,30 @@ +# Diagnostic Codes + +Every diagnostic Boruna's toolchain emits carries a stable `E0NN` code. Codes are +**stable forever** — never reused, never renumbered. Tools and AI agents may switch +on them. + +The registry is machine-readable. Query it directly: + +```bash +boruna lang codes # human table +boruna lang codes --json # { "version": 1, "codes": [ ... ] } +``` + +Codes appear in `boruna lang check --json` output as the `id` field of each diagnostic. + +| Code | Name | Category | Summary | +|------|------|----------|---------| +| `E001` | lexer-error | lexical | The source could not be tokenized (invalid character or token). | +| `E002` | parse-error | syntax | The token stream did not form a valid syntax tree. | +| `E003` | undefined-variable | name-resolution | A referenced variable is not defined in scope. | +| `E004` | undefined-function | name-resolution | A called function is not defined in the module. | +| `E005` | non-exhaustive-match | pattern-matching | A match expression does not cover all possible cases. | +| `E006` | unknown-field | type | A record field access or construction references an unknown field. | +| `E007` | capability-violation | capability | A function performs an effect it does not declare in its capability set. | +| `E008` | codegen-error | codegen | The typechecked program could not be lowered to bytecode. | +| `E009` | type-error | type | An expression's type does not match the type required by its context. | + +The table above is generated from the same registry the CLI serves +(`tooling/src/diagnostics/registry.rs`). A drift test asserts the registry stays +1:1 with the `E0NN` constants the compiler emits. 
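Because the codes are stable, an agent can branch on the `id` field directly instead of parsing messages. The sketch below shows one way to do that. It is purely illustrative: the `NextAction` enum, the `next_action` function, and the mapping from codes to actions are assumptions made for this example and are not part of Boruna. Only the code strings themselves come from the table above.

```rust
/// Hypothetical follow-up actions an agent might take for a diagnostic.
/// These names are illustrative and do not exist in the Boruna toolchain.
#[derive(Debug, PartialEq, Eq)]
enum NextAction {
    FixSyntax,
    DefineOrRenameSymbol,
    AddMatchArm,
    FixRecordField,
    DeclareCapability,
    FixTypes,
    Escalate,
    LookUpCode,
}

/// Map a stable diagnostic code (the `id` field) to a follow-up action.
/// The mapping is an assumption for illustration; the codes are from the registry table.
fn next_action(code: &str) -> NextAction {
    match code {
        // Lexer and parse errors: the source text itself is malformed.
        "E001" | "E002" => NextAction::FixSyntax,
        // Undefined variable or function.
        "E003" | "E004" => NextAction::DefineOrRenameSymbol,
        // Non-exhaustive match.
        "E005" => NextAction::AddMatchArm,
        // Unknown record field.
        "E006" => NextAction::FixRecordField,
        // Effect performed without being declared in the capability set.
        "E007" => NextAction::DeclareCapability,
        // Codegen error: assumed here to need human attention.
        "E008" => NextAction::Escalate,
        // Type error.
        "E009" => NextAction::FixTypes,
        // Unknown code: consult `boruna lang codes --json` before acting.
        _ => NextAction::LookUpCode,
    }
}
```

The catch-all arm matters: a newer toolchain may emit codes this table does not list, and querying the registry is safer than guessing.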
diff --git a/docs/test-plan-agent-native-cli.md b/docs/test-plan-agent-native-cli.md new file mode 100644 index 0000000..4390389 --- /dev/null +++ b/docs/test-plan-agent-native-cli.md @@ -0,0 +1,58 @@ +# Test Plan — Agent-Native CLI surfaces + +Companion to `docs/design-agent-native-cli.md`. Acceptance criteria + edge cases. + +## 1. `lang codes` + +- `boruna lang codes` → human table lists all 9 codes. +- `boruna lang codes --json` → valid JSON, `codes.len() == 9`, each has code/name/summary/category. +- **Drift test** (`tooling`): every `E0NN` `pub const` in `diagnostics/mod.rs` appears exactly + once in `REGISTRY`; no registry entry lacks a backing const. Fails loudly on future drift. +- Registry codes are unique (no duplicate `code` values). + +## 2. `doctor` + +- `boruna doctor` → human report; exit 0 in a healthy checkout. +- `boruna doctor --json` → valid JSON with `ok`, `boruna_version`, `checks[]`. +- `boruna_version` equals `CARGO_PKG_VERSION`. +- Every check has a `status` in {ok,warn,error}; `ok == true` iff no `error` check. +- Missing `rustc` produces a `warn`, not an `error`, and does not abort. + +## 3. `workflow graph` + +- `boruna workflow graph examples/workflows/llm_code_review` → human summary. +- `--json` → nodes match the workflow's steps; edges match `WorkflowDef.edges`. +- `topological_order` is a valid topo order (every edge `(a,b)` has `a` before `b`). +- `roots` = steps with no dependencies; `leaves` = steps no step depends on. +- A workflow with a cycle → graph reports the cycle / non-DAG, exit non-zero. +- Missing directory → clean error, exit 1. + +## 4. `size` + +- `boruna size examples/hello.ax` → human table with per-function rows + totals. +- `--json` → `functions[]`, `totals`, `bytecode_bytes > 0`, `bytecode_format` set. +- `totals.total_ops` == sum of per-function `op_count`. +- A file with a compile error → error emitted, exit 1, no panic. +- Missing file → clean error, exit 1. + +## 5. 
`skills` + +- `boruna skills list` → all embedded skills with summaries. +- `boruna skills list --json` → JSON array, length == number of embedded docs. +- `boruna skills get ax-language` → prints non-empty markdown body. +- `boruna skills get ax-language --json` → `{ name, summary, content }`, content non-empty. +- `boruna skills get nonexistent` → exit 1, lists available skill names. +- Every embedded skill body is non-empty (compile-time `include_str!` guarantees existence). + +## Regression gates (convention §30, §32) + +- `cargo test --workspace` — all 557+ existing tests still pass. +- `cargo clippy --workspace --all-targets -- -D warnings` — zero warnings. +- `cargo fmt --all -- --check` — clean. +- `cargo build --workspace` — clean. + +## Test placement + +- Registry drift test → `tooling/src/diagnostics/` test module (or `tooling/src/tests.rs`). +- CLI surface tests → `crates/llmvm-cli/tests/cli_agent_native.rs` (new integration test file, + follows the existing `cli_*.rs` pattern), invoking the built `boruna` binary. diff --git a/tooling/src/diagnostics/mod.rs b/tooling/src/diagnostics/mod.rs index caecbb5..ab2fd1f 100644 --- a/tooling/src/diagnostics/mod.rs +++ b/tooling/src/diagnostics/mod.rs @@ -1,5 +1,6 @@ pub mod analyzer; pub mod collector; +pub mod registry; pub mod suggest; use serde::{Deserialize, Serialize}; diff --git a/tooling/src/diagnostics/registry.rs b/tooling/src/diagnostics/registry.rs new file mode 100644 index 0000000..8d18875 --- /dev/null +++ b/tooling/src/diagnostics/registry.rs @@ -0,0 +1,153 @@ +//! Machine-readable registry of stable diagnostic codes. +//! +//! Every `E0NN` code emitted by the toolchain has exactly one entry here. The +//! registry is the agent-facing source of truth: `boruna lang codes` serves it +//! so an agent can resolve a code seen in `lang check --json` output without +//! reading compiler source. A drift test asserts the registry stays 1:1 with +//! the `E0NN` constants in `super`. 
+ +use serde::Serialize; + +/// Documentation for one stable diagnostic code. +#[derive(Debug, Clone, Copy, Serialize)] +pub struct DiagnosticCodeInfo { + /// Stable code string, e.g. `"E003"`. Never reused or renumbered. + pub code: &'static str, + /// Short human name, e.g. `"undefined-variable"`. + pub name: &'static str, + /// One-line summary of what the code means. + pub summary: &'static str, + /// Compiler phase that emits the code. + pub category: &'static str, +} + +/// All stable diagnostic codes, ordered by code. +pub const REGISTRY: &[DiagnosticCodeInfo] = &[ + DiagnosticCodeInfo { + code: super::E001_LEXER, + name: "lexer-error", + summary: "The source could not be tokenized (invalid character or token).", + category: "lexical", + }, + DiagnosticCodeInfo { + code: super::E002_PARSE, + name: "parse-error", + summary: "The token stream did not form a valid syntax tree.", + category: "syntax", + }, + DiagnosticCodeInfo { + code: super::E003_UNDEFINED_VAR, + name: "undefined-variable", + summary: "A referenced variable is not defined in scope.", + category: "name-resolution", + }, + DiagnosticCodeInfo { + code: super::E004_UNDEFINED_FN, + name: "undefined-function", + summary: "A called function is not defined in the module.", + category: "name-resolution", + }, + DiagnosticCodeInfo { + code: super::E005_NON_EXHAUSTIVE_MATCH, + name: "non-exhaustive-match", + summary: "A match expression does not cover all possible cases.", + category: "pattern-matching", + }, + DiagnosticCodeInfo { + code: super::E006_UNKNOWN_FIELD, + name: "unknown-field", + summary: "A record field access or construction references an unknown field.", + category: "type", + }, + DiagnosticCodeInfo { + code: super::E007_CAPABILITY_VIOLATION, + name: "capability-violation", + summary: "A function performs an effect it does not declare in its capability set.", + category: "capability", + }, + DiagnosticCodeInfo { + code: super::E008_CODEGEN, + name: "codegen-error", + summary: "The 
typechecked program could not be lowered to bytecode.", + category: "codegen", + }, + DiagnosticCodeInfo { + code: super::E009_TYPE_ERROR, + name: "type-error", + summary: "An expression's type does not match the type required by its context.", + category: "type", + }, +]; + +/// Returns the full diagnostic-code registry. +pub fn registry() -> &'static [DiagnosticCodeInfo] { + REGISTRY +} + +#[cfg(test)] +mod tests { + use super::*; + + /// Extract every `E0NN` code string from a `pub const ... = "E0NN";` line + /// in `diagnostics/mod.rs`. Parsing the source — rather than a hand-kept + /// list — means a constant added to `mod.rs` but not to the registry can + /// never slip past `registry_matches_source_constants`. + fn constants_declared_in_source() -> Vec<String> { + let src = include_str!("mod.rs"); + let mut codes = Vec::new(); + for line in src.lines() { + let line = line.trim(); + if !line.starts_with("pub const E") { + continue; + } + // ... = "E001"; -> grab the quoted literal. + if let Some(start) = line.find('"') { + if let Some(end) = line[start + 1..].find('"') { + codes.push(line[start + 1..start + 1 + end].to_string()); + } + } + } + codes + } + + #[test] + fn registry_matches_source_constants() { + let declared = constants_declared_in_source(); + assert!( + !declared.is_empty(), + "found no `pub const E..` declarations in mod.rs — parser broke" + ); + let registry_codes: Vec<&str> = REGISTRY.iter().map(|c| c.code).collect(); + for code in &declared { + assert!( + registry_codes.contains(&code.as_str()), + "diagnostic constant {code} declared in mod.rs has no registry entry" + ); + } + for entry in REGISTRY { + assert!( + declared.iter().any(|c| c == entry.code), + "registry entry {} has no backing `pub const` in mod.rs", + entry.code + ); + } + assert_eq!(REGISTRY.len(), declared.len()); + } + + #[test] + fn registry_codes_are_unique() { + let mut seen = std::collections::BTreeSet::new(); + for entry in REGISTRY { + assert!(seen.insert(entry.code), 
"duplicate code {}", entry.code); + } + } + + #[test] + fn registry_entries_are_populated() { + for entry in REGISTRY { + assert!(!entry.name.is_empty()); + assert!(!entry.summary.is_empty()); + assert!(!entry.category.is_empty()); + } + } +}