spec: introduce AgentObservation for agent-originated observability records

## Problem

Every AIP adopter will have an observation phase before acting. A security scanner observes vulnerabilities. A cost optimizer observes resource waste. A debugging agent observes unhealthy pods. Currently the spec has no home for these observations.

Without a standard, every adopter builds a different CRD with a different schema. The ecosystem fragments — no shared dashboards, no cross-agent analysis, no consistent audit trail linking *why* an agent acted to *what* it did.

## What was considered and rejected

**Extending `AuditRecord`** (add `source: agent | control-plane`):
- Breaks the single-writer guarantee: AuditRecord's value to SIEM and compliance auditors is that exactly one principal (the control plane) writes it
- No enforcement mechanism for the `source` field — a compromised agent could write `source: control-plane`
- Volume mismatch — governance AuditRecords are bounded (~5–10 per AgentRequest); observations are unbounded

**`scope: observationOnly` on `AgentRequest`**:
- Violates the core Kubernetes API convention: Kind determines semantics, not a field within an object
- Makes `action` and `target` (currently required) conditional on `scope` — an explicit antipattern in K8s API conventions (conditional required fields)
- Branches the control plane reconciler everywhere (`if scope == observationOnly`) — OpsLock, SafetyPolicy evaluation, phase machine all need special-casing
- Creates a policy bypass surface — `observationOnly` requests skip SafetyPolicy evaluation

## Proposed solution: `AgentObservation` Kind

A new Kind in the same API group (`governance.aip.io/v1alpha1`). Agent-written directly — no controller involved. Immutable after creation.

### Key design decisions

**No controller required.** The Kubernetes precedent is `Lease` (written directly by the holder, no Lease controller) and `v1.Event` (written directly by controllers, no Event controller). `AgentObservation` fits the same pattern: the API server validates the schema, RBAC controls who may write, and the agent writes the object exactly once.
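As an illustration of "RBAC controls who writes", a minimal sketch of a create-only grant follows. The Role and RoleBinding names, and the assumption that the agent runs as a ServiceAccount named `sre-agent-v2`, are hypothetical and not part of the proposal. The plural `agentobservations` matches the `kubectl` query later in this issue.

```yaml
# Sketch only: names below are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: agentobservation-writer
  namespace: production
rules:
  - apiGroups: ["governance.aip.io"]
    resources: ["agentobservations"]
    # create plus read access; update, patch, and delete are deliberately absent
    verbs: ["create", "get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: sre-agent-v2-observation-writer
  namespace: production
subjects:
  - kind: ServiceAccount
    name: sre-agent-v2          # assumed agent identity
    namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: agentobservation-writer
```

Withholding `update`, `patch`, and `delete` at the RBAC layer would complement schema-level immutability, since a validation rule on the object cannot prevent deletion.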

**`metadata.creationTimestamp` is the authoritative timestamp.** The API server sets it on creation and the client cannot supply its own value, so an agent cannot forge it. No controller-set `recordedAt` field is needed.

**Immutable after creation via CEL validation rule:**
```yaml
x-kubernetes-validations:
  - rule: self == oldSelf
    message: "AgentObservation is immutable after creation"
```
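For context, here is a minimal sketch of where that rule could sit in the CRD, assuming it is attached to the `spec` node so that the whole spec is compared on update. The field types and the `required` list are inferred from the example CR in this issue and are assumptions, not the final §9 schema. Rules that reference `oldSelf` are transition rules and are evaluated only on update, so creation is unaffected.

```yaml
# Sketch only: field types are assumptions inferred from the example CR.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: agentobservations.governance.aip.io
spec:
  group: governance.aip.io
  scope: Namespaced
  names:
    kind: AgentObservation
    plural: agentobservations
    singular: agentobservation
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          required: ["spec"]
          properties:
            spec:
              type: object
              required: ["agentIdentity", "eventType", "correlationID"]  # assumed
              # Transition rule: rejects any update that changes spec.
              x-kubernetes-validations:
                - rule: self == oldSelf
                  message: "AgentObservation is immutable after creation"
              properties:
                agentIdentity:
                  type: string
                eventType:
                  type: string
                  enum: ["observation", "diagnosis", "escalation", "signal"]
                correlationID:
                  type: string
                details:
                  type: object
                  x-kubernetes-preserve-unknown-fields: true  # open JSON
```

Two points worth verifying before relying on this placement. A rule on `spec` does not cover `metadata`, so labels stay mutable unless restricted elsewhere. Kubernetes also documents that data preserved through `x-kubernetes-preserve-unknown-fields` is not accessible to CEL expressions, so the rule may not detect changes inside `details` if it is modelled as open JSON.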

**Cross-referencing via `aip.io/correlationID` label — not field updates.** The agent generates a UUID before acting, sets it on the `AgentObservation` and as a label on the subsequent `AgentRequest`. No back-link field on `AgentObservation` required — one label query retrieves the full incident chain.
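A sketch of the matching label on the follow-up `AgentRequest`, assuming the agent reuses the value it set on the `AgentObservation`. The object name is hypothetical, and the `spec` is left out because this proposal does not change it.

```yaml
# Sketch only: metadata fragment of the AgentRequest that follows the observation.
apiVersion: governance.aip.io/v1alpha1
kind: AgentRequest
metadata:
  name: restart-abc123                    # hypothetical name
  namespace: production
  labels:
    aip.io/correlationID: diag-abc123     # same value as on the AgentObservation
# spec omitted: unchanged by this proposal
```

A standard UUID is 36 characters of hex digits and hyphens, which fits within the 63-character limit and allowed character set for Kubernetes label values.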

**`AgentObservation` is NOT visible to SafetyPolicy CEL expressions.** Policies only see `request.spec.reasoningTrace.*` — the agent-attested summary baked into the `AgentRequest`. `traceReference` is opaque to the control plane; it is for auditors and tooling only. Allowing policies to JOIN against agent-authored observations would give agent-authored data governance authority, collapsing the trust boundary.

### Example CR

```yaml
apiVersion: governance.aip.io/v1alpha1
kind: AgentObservation
metadata:
  name: diag-abc123
  namespace: production
  creationTimestamp: "2026-03-26T10:00:01Z"  # authoritative, API server set
  labels:
    aip.io/correlationID: diag-abc123   # illustrative value; a UUID in practice
    aip.io/agentIdentity: sre-agent-v2
    aip.io/eventType:     diagnosis
spec:
  agentIdentity: sre-agent-v2
  eventType:     diagnosis    # observation | diagnosis | escalation | signal
  correlationID: diag-abc123
  details:                    # open JSON — same extensibility model as parameters
    rootCause:  OOMKilled
    confidence: 0.91
    alternativesConsidered:
      - action: restart
        selected: true
      - action: "scale out"
        rejected: "memory leak affects all replicas"
    evidenceSources:
      - type: metrics
        ref:  "prometheus://production/container_oom_events"
```

### Querying the full incident chain

```bash
kubectl get agentobservations,agentrequests,auditrecords \
  -n production \
  -l aip.io/correlationID=diag-abc123 \
  --sort-by=.metadata.creationTimestamp
```

This returns the complete story in creation order: what the agent observed → what it requested → what the control plane decided → what happened.

## Spec changes required

1. New §3.x — `AgentObservation` Kind: schema, immutability contract, `eventType` enum, `correlationID` convention
2. New §A.x — `aip.io/correlationID` label standard: agent-generated UUID, propagated on `AgentObservation`, `AgentRequest` (as label), and `AuditRecord` (as label)
3. §9 JSON Schema — `AgentObservation` schema
4. §A.4 Conformance Checklist — assertions for `AgentObservation` immutability and `correlationID` propagation
5. Clarify that `AgentObservation` details are NOT accessible in SafetyPolicy CEL expressions

## Trust model summary

| Resource | Author | Trust level |
|---|---|---|
| `AgentObservation` | Agent | Informational — agent-attested |
| `AgentRequest` spec | Agent | Informational — agent-attested |
| `AgentRequest` status | Control plane | Authoritative — governance decision |
| `AuditRecord` | Control plane | Authoritative — tamper-evident |

The control plane only touches resources where a governance decision is being made. It has zero involvement with `AgentObservation`. That is the right separation.
