Card Discovery Refinement Alignment [Spec + Impl]#380
Conversation
pdettori
left a comment
There was a problem hiding this comment.
Review Summary
This is a well-structured spec that makes the right architectural call at the right time. The API surface improvements are genuine — better naming, security observability, clearer diagnostics — and the cost is minimal given the absence of external consumers.
The synthesis approach is sound: cherry-picking transport security awareness and workload readiness from the refinement doc while preserving PR #372's event-driven model and FetchSkipped semantics. The timing argument for breaking changes is valid.
A few design questions below (none blocking), mostly around ensuring the spec is explicit on edge cases that will matter during implementation.
Areas reviewed: Data model, condition semantics, port resolution, workload readiness, CRD UX
Commits: 3, all signed-off
CI: All passing (E2E pending)
885f495 to
23c8050
Compare
| AgentCardData: *cardData, | ||
| CardHash: newCardHash, | ||
| Protocol: protocol, | ||
| TransportSecurity: transportSecurity, |
There was a problem hiding this comment.
what's the difference between "Protocol" and "TransportSecurity" ?
There was a problem hiding this comment.
They are orthogonal concepts:
protocol= the application-level agent discovery protocol ("a2a"), identifying which card endpoint convention was usedtransportSecurity= the transport layer authentication used for the fetch ("mtls"or"http")
An A2A card can be fetched over either transport. protocol tells you what was fetched, transportSecurity tells you how securely it was fetched.
Spec, plan, research, data-model, tasks, and REVIEWERS.md for aligning the card discovery implementation with the RHAIENG-4948 refinement document. Covers transport security field, condition model restructure, field renames, and protocol-aware port resolution. Assisted-By: 🤖 Claude Code Signed-off-by: Roland Huß <rhuss@redhat.com>
Assisted-By: 🤖 Claude Code Signed-off-by: Roland Huß <rhuss@redhat.com>
All arguments now stand on live testing findings, Kubernetes conventions, and API quality rather than referencing external documents. Assisted-By: 🤖 Claude Code Signed-off-by: Roland Huß <rhuss@redhat.com>
- transportSecurity: validated enum instead of free-form string - scale-to-zero: explicit edge case (WorkloadNotReady is correct) - Sandbox exemption: added rationale for skipping readiness check Assisted-By: 🤖 Claude Code Signed-off-by: Roland Huß <rhuss@redhat.com>
Generation-based change detection triggers false WorkloadNotReady on scale events. Documented as known limitation with a planned improvement to switch to pod template hash detection. Assisted-By: 🤖 Claude Code Signed-off-by: Roland Huß <rhuss@redhat.com>
Add transport security visibility, unified condition model, accurate field names, and protocol-aware port resolution for AgentRuntime card discovery. Key changes: - Add transportSecurity field (mTLS/plainHTTP/configMap) to CardStatus - Rename CardSynced condition to CardFetched with transport-aware reasons - Rename cardId to cardHash, fetchedAt to lastCardFetchTime - Add kagenti.io/port annotation and a2a port name resolution chain - Add workload readiness check before card fetch attempt - 464 unit tests passing, full spec compliance verified Assisted-By: 🤖 Claude Code Signed-off-by: Roland Huß <rhuss@redhat.com>
Assisted-By: 🤖 Claude Code Signed-off-by: Roland Huß <rhuss@redhat.com>
Rename transportSecurity enum values from mTLS/plainHTTP/configMap to mtls/http, add typed TransportSecurity enum with Go constants, and remove unused configMap value. Assisted-By: 🤖 Claude Code Signed-off-by: Roland Huß <rhuss@redhat.com>
2ba4372 to
1cb6539
Compare
Spec for review
PR #372 introduced card discovery on AgentRuntime. This follow-up aligns the API surface with Kubernetes conventions and the community refinement document before external consumers lock in the current field names and condition types.
Why now
The card discovery API has no consumers yet. Field renames and condition type changes are free today but require migration tomorrow. This is the last window to get the API right before it becomes a compatibility constraint.
Proposed changes (spec only, no code yet)
Transport security visibility (new
transportSecurityfield on CardStatus)A card fetched over plain HTTP within the cluster network has no transport-level identity guarantee. Platform engineers currently cannot tell from the AgentRuntime status whether a card was fetched securely. Adding
transportSecurity("mTLS" or "plainHTTP") makes the security posture observable without digging into operator logs.Condition model cleanup (rename
CardSyncedtoCardFetched)CardSyncedimplies ongoing synchronization, which is misleading for an event-driven, non-polling architectureCardSynced True CardSyncedstutters inkubectl describeoutput (condition type and reason are identical)FetchedvsFetchedInsecure) give immediate security posture visibilityWorkloadNotReadysplit fromServiceNotFoundmaps each reason to one diagnostic actionField name accuracy
cardIdrenamed tocardHash: the field holds a SHA-256 content hash for change detection, not a unique identifierfetchedAtrenamed tolastCardFetchTime: disambiguates "when the controller fetched it" from "when the agent created it"Protocol-aware port resolution
kagenti.io/portannotation override for multi-port Services where auto-detection picks the wrong porta2atakes priority over generichttpsince we're fetching A2A protocol cards specificallyArtifacts
Assisted-By: 🤖 Claude Code