Decision Classification — ADP Specification v0.1

Overview

Every decision made by an AI agent is classified along three axes: type, risk level, and reversibility. This classification determines governance requirements, approval workflows, and audit depth.


Axis 1: Decision Type

D1 — Operational

Execution of a defined task within a bounded scope. The agent follows explicit instructions or well-defined procedures.

| Property | Value |
|---|---|
| Scope | Single task, bounded parameters |
| Judgment required | Minimal: follows rules/instructions |
| Impact | Limited to task output |
| Examples | API call, database query, data formatting, file processing |
| Governance | Standard logging, no approval required |

D2 — Tactical

Selection among multiple approaches to achieve an objective. The agent exercises judgment within defined boundaries.

| Property | Value |
|---|---|
| Scope | Multi-step, choice between options |
| Judgment required | Moderate: evaluates alternatives |
| Impact | Affects workflow or process outcomes |
| Examples | Task prioritization, vendor selection, resource allocation, routing decisions |
| Governance | Enhanced logging with reasoning, approval varies by autonomy level |

D3 — Strategic

Decision with significant organizational or stakeholder impact. The agent makes choices that affect people, finances, or external relationships.

| Property | Value |
|---|---|
| Scope | Cross-functional, external-facing |
| Judgment required | High: weighs complex tradeoffs |
| Impact | Affects stakeholders, finances, reputation |
| Examples | Client communication, financial commitment, process modification, hiring/firing recommendation |
| Governance | Full audit trail, approval or post-review required, right-to-explanation |

D4 — Self-Modification

The agent modifies its own behavior, parameters, prompts, or decision-making logic. This is the highest-risk decision category regardless of other factors.

| Property | Value |
|---|---|
| Scope | Agent's own configuration/behavior |
| Judgment required | N/A: agent acts on itself |
| Impact | Changes future behavior of the agent |
| Examples | Prompt modification, parameter tuning, tool access change, behavior rule update |
| Governance | Always requires human approval, regardless of autonomy level |

Rationale: Self-modification creates compounding risk. An agent that modifies its own behavior can progressively drift from its intended purpose. No autonomy level grants blanket authorization for self-modification.


Axis 2: Risk Level

R1 — Negligible

| Property | Value |
|---|---|
| Data sensitivity | Non-sensitive, public data only |
| Impact scope | Internal, limited |
| Reversibility | Fully reversible |
| Regulatory exposure | None |
| Examples | Formatting data, querying public APIs, internal calculations |
| Required controls | Standard logging |

R2 — Moderate

| Property | Value |
|---|---|
| Data sensitivity | Low to moderate (may include business data) |
| Impact scope | Internal operations, moderate |
| Reversibility | Mostly reversible |
| Regulatory exposure | Low: general compliance |
| Examples | Processing internal documents, scheduling, inventory management |
| Required controls | Enhanced logging, periodic review |

R3 — Elevated

| Property | Value |
|---|---|
| Data sensitivity | Sensitive (PII, financial data, health data) |
| Impact scope | Affects individuals or critical operations |
| Reversibility | Limited reversibility |
| Regulatory exposure | High: Loi 25, EU AI Act high-risk, HIPAA, SOX |
| Examples | Credit decisions, medical triage, employee evaluation, personal data processing |
| Required controls | Full audit trail, human oversight, right-to-explanation, impact assessment |

R4 — Critical

| Property | Value |
|---|---|
| Data sensitivity | Critical (classified, legal, life-safety) |
| Impact scope | Fundamental rights, critical infrastructure, safety |
| Reversibility | Irreversible or severe consequences |
| Regulatory exposure | Maximum: potential prohibition under EU AI Act |
| Examples | Safety-critical decisions, judicial recommendations, biometric processing |
| Required controls | Real-time monitoring, mandatory human approval, third-party audit, incident response |
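
The "Required controls" rows of the four risk tables can be held in a simple lookup. The following is a minimal sketch, not part of the specification: the names `REQUIRED_CONTROLS` and `required_controls` are hypothetical, and it assumes the controls are listed per level exactly as written above rather than accumulating from lower levels, since the specification does not say whether they are cumulative.

```python
# Hypothetical lookup of required controls per risk level.
# Values are copied from the "Required controls" rows of the R1-R4 tables.
# Assumption: controls are NOT cumulative across levels (the spec is silent on this).
REQUIRED_CONTROLS = {
    "R1": ["standard logging"],
    "R2": ["enhanced logging", "periodic review"],
    "R3": [
        "full audit trail",
        "human oversight",
        "right-to-explanation",
        "impact assessment",
    ],
    "R4": [
        "real-time monitoring",
        "mandatory human approval",
        "third-party audit",
        "incident response",
    ],
}


def required_controls(risk_level: str) -> list[str]:
    """Return a copy of the controls listed for the given risk level."""
    if risk_level not in REQUIRED_CONTROLS:
        raise ValueError(f"unknown risk level: {risk_level!r}")
    return list(REQUIRED_CONTROLS[risk_level])
```

Returning a copy keeps callers from mutating the shared table by accident.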

Axis 3: Reversibility

Total

The action can be completely undone without any residual impact. No data loss, no external notification sent, no commitments made.

Governance implication: Lower approval threshold.

Partial

The action can be partially undone, but some effects persist. May require effort, cost, or time to reverse. External parties may have been notified.

Governance implication: Standard approval threshold. Rollback procedure must be documented.

Irreversible

The action cannot be undone once executed. Funds transferred, data deleted, communications sent to external parties, legal commitments made.

Governance implication: Highest approval threshold. Always triggers escalation for decisions of type D2 or higher.


Combined Classification

Each decision receives a combined classification code: {Type}-{Risk}-{Reversibility}

Examples:

  • D1-R1-total — Operational, negligible risk, fully reversible → Minimal governance
  • D2-R2-partial — Tactical, moderate risk, partially reversible → Enhanced logging + periodic review
  • D3-R3-irreversible — Strategic, elevated risk, irreversible → Full audit + human approval mandatory
  • D4-R2-partial — Self-modification, moderate risk → Human approval mandatory (D4 override)
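
As an illustration of the `{Type}-{Risk}-{Reversibility}` format, here is a minimal sketch that builds and parses a combined code. The `Classification` class and `parse_code` function are hypothetical names introduced for this example; the regular expression mirrors the `classification_code` pattern from the JSON Schema section of this document.

```python
import re
from typing import NamedTuple

# Same pattern as the classification_code field in the JSON Schema.
CODE_PATTERN = re.compile(r"^(D[1-4])-(R[1-4])-(total|partial|irreversible)$")


class Classification(NamedTuple):
    """Hypothetical container for the three classification axes."""
    type: str
    risk_level: str
    reversibility: str

    def code(self) -> str:
        # Join the three axes into the combined classification code.
        return f"{self.type}-{self.risk_level}-{self.reversibility}"


def parse_code(code: str) -> Classification:
    """Split a combined code back into its three axes, or raise ValueError."""
    match = CODE_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"invalid classification code: {code!r}")
    return Classification(*match.groups())
```

For instance, `Classification("D2", "R2", "partial").code()` produces `D2-R2-partial`, and `parse_code` on that string returns the same three values.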

Escalation Rules

Regardless of the authorization matrix, the following conditions always trigger escalation to human oversight:

  1. Risk level R3 or R4 — Any decision at elevated or critical risk
  2. Decision type D4 — Any self-modification
  3. Irreversible + D2 or higher — Irreversible tactical/strategic decisions
  4. Policy violation detected — Agent attempts action outside policy bounds
  5. Anomaly detected — Decision pattern deviates from historical baseline
  6. Cross-agent conflict — Two agents making contradictory decisions
  7. Data breach potential — Decision involves unauthorized data access
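
The seven rules above could be evaluated roughly as follows. This is a sketch under stated assumptions: `DecisionContext` and `escalation_reasons` are hypothetical names, and rules 4 to 7 are modeled as boolean flags supplied by the caller, because the specification does not define how policy violations, anomalies, conflicts, or breach potential are detected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionContext:
    """Hypothetical input: the three axes plus externally computed flags."""
    decision_type: str   # "D1".."D4"
    risk_level: str      # "R1".."R4"
    reversibility: str   # "total", "partial", "irreversible"
    policy_violation: bool = False       # rule 4, detection is out of scope here
    anomaly_detected: bool = False       # rule 5
    cross_agent_conflict: bool = False   # rule 6
    data_breach_potential: bool = False  # rule 7


def escalation_reasons(ctx: DecisionContext) -> list[str]:
    """Return every escalation rule that fires; an empty list means no escalation."""
    reasons = []
    if ctx.risk_level in ("R3", "R4"):
        reasons.append("risk level R3 or R4")
    if ctx.decision_type == "D4":
        reasons.append("decision type D4")
    if ctx.reversibility == "irreversible" and ctx.decision_type in ("D2", "D3", "D4"):
        reasons.append("irreversible and D2 or higher")
    if ctx.policy_violation:
        reasons.append("policy violation detected")
    if ctx.anomaly_detected:
        reasons.append("anomaly detected")
    if ctx.cross_agent_conflict:
        reasons.append("cross-agent conflict")
    if ctx.data_breach_potential:
        reasons.append("data breach potential")
    return reasons
```

Collecting all matching reasons, rather than stopping at the first, is a design choice for this sketch; it would let `escalation_triggered` be derived as `bool(reasons)` and `escalation_reason` record every rule that applied.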

JSON Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "ADP Decision Classification",
  "type": "object",
  "required": ["type", "risk_level", "reversibility"],
  "properties": {
    "type": {
      "type": "string",
      "enum": ["D1", "D2", "D3", "D4"],
      "description": "Decision type"
    },
    "risk_level": {
      "type": "string",
      "enum": ["R1", "R2", "R3", "R4"],
      "description": "Risk level assessment"
    },
    "reversibility": {
      "type": "string",
      "enum": ["total", "partial", "irreversible"],
      "description": "Action reversibility"
    },
    "classification_code": {
      "type": "string",
      "pattern": "^D[1-4]-R[1-4]-(total|partial|irreversible)$",
      "description": "Combined classification code"
    },
    "escalation_triggered": {
      "type": "boolean",
      "description": "Whether escalation rules were triggered"
    },
    "escalation_reason": {
      "type": "string",
      "description": "If escalated, the reason"
    }
  }
}
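
In practice a JSON Schema validator library would normally be used to check documents against this schema. The standard-library-only sketch below is an illustration, not a replacement for one: `check_classification` is a hypothetical helper that mirrors the schema's required fields, enums, and pattern. It also adds one check that the schema as written does not express, namely that `classification_code`, when present, agrees with the three axis fields; that extra check is an assumption about intent.

```python
import re

# Allowed values, copied from the enum constraints in the schema.
ENUMS = {
    "type": {"D1", "D2", "D3", "D4"},
    "risk_level": {"R1", "R2", "R3", "R4"},
    "reversibility": {"total", "partial", "irreversible"},
}
CODE_RE = re.compile(r"D[1-4]-R[1-4]-(total|partial|irreversible)")


def check_classification(doc: dict) -> list[str]:
    """Return a list of problems found in doc; an empty list means it passed."""
    errors = []
    for field, allowed in ENUMS.items():
        if field not in doc:
            errors.append(f"missing required field: {field}")
        elif not isinstance(doc[field], str) or doc[field] not in allowed:
            errors.append(f"invalid value for {field}: {doc[field]!r}")

    code = doc.get("classification_code")
    if code is not None:
        if not isinstance(code, str) or CODE_RE.fullmatch(code) is None:
            errors.append(f"classification_code has invalid format: {code!r}")
        elif not errors:
            # Extra consistency check, assumed rather than required by the schema.
            expected = f"{doc['type']}-{doc['risk_level']}-{doc['reversibility']}"
            if code != expected:
                errors.append(f"classification_code {code!r} does not match {expected!r}")

    if "escalation_triggered" in doc and not isinstance(doc["escalation_triggered"], bool):
        errors.append("escalation_triggered must be a boolean")
    if "escalation_reason" in doc and not isinstance(doc["escalation_reason"], str):
        errors.append("escalation_reason must be a string")
    return errors
```

A document such as `{"type": "D3", "risk_level": "R3", "reversibility": "irreversible", "classification_code": "D3-R3-irreversible", "escalation_triggered": true}` (parsed with `json.loads`) would pass with no errors.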