diff --git a/site/astro.config.mjs b/site/astro.config.mjs index c647f872..b943be3c 100644 --- a/site/astro.config.mjs +++ b/site/astro.config.mjs @@ -70,6 +70,7 @@ export default defineConfig({ { label: 'Filter IR', slug: 'reference/filter-ir' }, { label: 'ado-script', slug: 'reference/ado-script' }, { label: 'Codemods', slug: 'reference/codemods' }, + { label: 'Audit', slug: 'reference/audit' }, ], }, { diff --git a/site/src/content/docs/reference/audit.mdx b/site/src/content/docs/reference/audit.mdx new file mode 100644 index 00000000..c303b014 --- /dev/null +++ b/site/src/content/docs/reference/audit.mdx @@ -0,0 +1,144 @@ +--- +title: "ado-aw audit" +description: "Audit a completed Azure DevOps agentic pipeline build: download artifacts, run analyzers, and render a structured report." +--- + +import { Steps } from '@astrojs/starlight/components'; + +`ado-aw audit` inspects one completed Azure DevOps build at a time. It downloads the three audit artifact families (agent outputs, detection outputs, safe outputs), runs the built-in analyzers (firewall, MCP gateway, OTel, safe outputs, detection verdict, build timeline, and missing-tool / missing-data / noop extraction), and renders a structured console report or the raw `AuditData` JSON. + +## Usage + +``` +ado-aw audit [options] +``` + +## Accepted input formats + +| Input | Example | +|---|---| +| Numeric build ID | `12345` | +| dev.azure.com URL | `https://dev.azure.com/my-org/My%20Project/_build/results?buildId=12345` | +| dev.azure.com URL with job/step anchors | `...?buildId=12345&j=&t=` (accepted; the build-level audit still runs) | +| Legacy visualstudio.com URL | `https://my-org.visualstudio.com/proj/_build/results?buildId=12345` | +| On-prem Azure DevOps Server URL | `https://onprem.example.com/DefaultCollection/MyProject/_build/results?buildId=12345` | + +URL-encoded project segments are decoded automatically. Both `t=` and `s=` are accepted as step-anchor parameters. + +## Flags + +| Flag | Default | Behavior | +|---|---|---| +| `-o, --output ` | `./logs` | Directory under which `/build-/` is written. | +| `--json` | off | Emit the full `AuditData` as JSON to stdout. Suppresses the trailing `Audit complete` stderr line. | +| `--org ` | auto | ADO organization override for bare build IDs. Full build URLs supply this directly. | +| `--project ` | auto | ADO project override for bare build IDs. Full build URLs supply this directly. | +| `--pat ` | env | Personal Access Token. Also reads `AZURE_DEVOPS_EXT_PAT`. Falls back to the Azure CLI auth chain when omitted. | +| `--artifacts ` | all | Restrict download + analysis to a subset. Valid values: `agent`, `detection`, `safe-outputs` (`safe_outputs` is also accepted). | +| `--no-cache` | off | Force re-processing even if `/build-/run-summary.json` already exists. | + +## Behavior + +- **Input resolution.** Bare IDs use `--org` / `--project` or git-remote auto-detection. Full build URLs contribute host, org, and project — those URL-derived values win over CLI flags. +- **Artifact scope.** Only `agent_outputs*`, `analyzed_outputs*`, and `safe_outputs*` are fetched. All other published build artifacts are ignored. +- **Artifact refresh.** If a local artifact directory already exists, it is renamed aside before re-download and restored if the download fails — no data is lost on a network error. +- **Analyzer failures are soft.** The command records a warning, keeps any successfully-derived sections, and still renders the report. +- **Multiple directories.** When multiple local directories share one recognized prefix, the lexicographically last match wins. + +## Output layout + +``` +/build-/ +├── run-summary.json # Cached AuditData, CLI-version-keyed +├── agent_outputs[_]/ # Agent stage artifacts +│ ├── staging/ +│ │ ├── safe_outputs.ndjson # Agent's safe-output proposals +│ │ ├── aw_info.json # Runtime engine / agent / source metadata +│ │ └── otel.jsonl # Copilot OTel (when emitted) +│ └── logs/ +│ ├── firewall/ # AWF Squid proxy logs +│ ├── mcpg/ # MCP Gateway logs +│ ├── safeoutputs.log # SafeOutputs HTTP server log +│ └── agent-output.txt # Filtered agent stdout +├── analyzed_outputs[_]/ # Detection stage artifacts +│ ├── threat-analysis.json # Aggregate verdict + reasons +│ └── threat-analysis-output.txt +└── safe_outputs[_]/ # SafeOutputs stage artifacts + └── safe-outputs-executed.ndjson # Per-item execution log +``` + +`aw_info.json`, `otel.jsonl`, and `safe_outputs.ndjson` are searched in `staging/` first, then at the artifact top level, so older artifact layouts still audit cleanly. + +## Report shape (`AuditData`) + +Optional sections are omitted from `--json` output when empty. + +| Key | Source | +|---|---| +| `overview` | ADO build metadata + `aw_info.json` (engine, model, agent name, source, target). | +| `task_domain` | Audit heuristics over the run's prompts and outputs. | +| `behavior_fingerprint` | Higher-level heuristics over the run's behavior patterns. | +| `agentic_assessments` | Higher-level assessments emitted by the analyzers. | +| `metrics` | OTel JSONL (`otel.jsonl`) plus audit-time warning/error counts. | +| `key_findings` | Heuristic rules + analyzer findings (e.g. aggregate-gate rejection). | +| `recommendations` | Follow-up actions derived from findings. | +| `performance_metrics` | Derived from `metrics`, runtime duration, tool usage, and firewall counts. | +| `engine_config` | Runtime engine configuration from `aw_info.json`. | +| `safe_output_summary` | Counts of proposed / executed / rejected / not-processed items. | +| `safe_output_execution` | Per-item trace joining proposal + detection + execution. | +| `rejected_safe_outputs` | Rollup of rejections by reason/threat flag. | +| `detection_analysis` | Contents of `threat-analysis.json`. | +| `mcp_server_health` | MCPG logs aggregated per server. | +| `mcp_tool_usage` | MCPG logs aggregated per `(server, tool)`. | +| `mcp_failures` | MCPG `tool_error` / `server_error` events. | +| `jobs` | ADO `/timeline` records filtered to `type: Job`. | +| `firewall_analysis` | AWF Squid proxy logs aggregated by domain. | +| `policy_analysis` | AWF policy artifacts aggregated into allow/deny summaries. | +| `missing_tools` / `missing_data` / `noops` | NDJSON entries from the corresponding SafeOutputs MCP tools. | +| `downloaded_files` | One entry per file under `/build-/`. | +| `errors` / `warnings` | Run-level error/warning aggregates. | +| `tool_usage` | High-level tool-usage rollups derived from telemetry. | +| `created_items` | Successfully executed items with extracted id/url/title. | + +## Rejected safe-output trace + +When `threat-analysis.json` reports any threat flag, the audit treats the entire SafeOutputs batch as rejected by the aggregate gate and records each proposal with: + +- `status: not_processed_due_to_aggregate_gate` +- `applies_to_whole_batch: true` +- `rejection_reason`: the aggregate `reasons[]` from `threat-analysis.json`, joined with `; ` + +One severity-`high` finding is also emitted summarizing the gate decision: which threat flags fired, how many proposals were dropped, and the full aggregate reasons. + +:::note[Per-item verdicts] +`threat-analysis.json` currently emits an aggregate verdict only. Per-item detection verdicts are a planned follow-up. +::: + +## Cache behavior + +`/build-/run-summary.json` is written after each successful run. + +| Scenario | Behavior | +|---|---| +| Cached `ado_aw_version` matches current CLI | Report rendered from cache; download/analysis skipped. | +| Cache missing, unparseable, or from a different version | Cache ignored; build reprocessed from scratch. | +| `--no-cache` passed | Always reprocesses. | + +The cache-hit info line is printed only in console mode (not with `--json`). + +## Permission failures + +- The initial build-metadata fetch is live ADO only. A 401/403 at this step is fatal. +- If artifact listing or download returns 401/403 and at least one recognized artifact family exists locally, the audit continues from local cache and records a warning. +- If artifact listing or download returns 401/403 and no local cache exists, the command emits a structured error pointing at the manual escape hatch: + +```bash +az pipelines runs artifact download --run-id --path +``` + +## Related + +- [CLI Commands](/ado-aw/setup/cli/) — full CLI reference +- [Safe Outputs](/ado-aw/reference/safe-outputs/) — what agent proposals look like +- [Network](/ado-aw/reference/network/) — AWF firewall configuration +- [ado-aw-debug](/ado-aw/reference/ado-aw-debug/) — debug-only front-matter knobs diff --git a/site/src/content/docs/setup/cli.mdx b/site/src/content/docs/setup/cli.mdx index 407ab843..cd9675ea 100644 --- a/site/src/content/docs/setup/cli.mdx +++ b/site/src/content/docs/setup/cli.mdx @@ -209,6 +209,24 @@ Options: - `--org`, `--project`, `--pat` -- same as `enable` - `--dry-run` -- preview the planned queue body without calling the ADO API +### `audit ` + +Audit one completed Azure DevOps agentic pipeline build. Downloads the three audit artifact families (agent outputs, detection outputs, safe outputs), runs the built-in analyzers, and renders a structured console report. + +```bash +ado-aw audit [--json] [--output ] [--artifacts ] [--no-cache] +``` + +Options: + +- `--json` -- emit the full `AuditData` as JSON to stdout instead of the console report +- `-o, --output ` -- local directory for downloaded artifacts and the cached report (default: `./logs`) +- `--artifacts ` -- restrict download to `agent`, `detection`, and/or `safe-outputs` +- `--no-cache` -- re-process even when a cached `run-summary.json` already exists +- `--org`, `--project`, `--pat` -- same as `enable` + +See the [Audit reference](/ado-aw/reference/audit/) for accepted URL formats, report shape, cache behavior, and permission failure handling. + ## Internal / pipeline runtime commands These commands are used by the compiled pipeline itself and are not typically called by users directly.