
feat(operator): add Snapshot and SnapshotContent v1alpha1 CRDs #1

Merged
Ronkahn21 merged 1 commit into feat/snapshot-crd from feat/snapshot-crd-a-api
Jun 10, 2026



@Ronkahn21

Owner

Overview:

Component A of the DynamoCheckpoint → Snapshot/SnapshotContent migration: add the two new CRD types. Additive only — zero behavior change (no controller, webhook, or RBAC yet; CRDs are inert until later components).

Internal staging PR into the long-lived base branch feat/snapshot-crd (within this fork).

Details:

  • New nvidia.com/v1alpha1 kinds:
    • Snapshot (namespaced, snap) — minimal "trigger" spec: checkpointID (immutable artifact identity / on-PVC dir name) + source.podRef.name. The node agent reads target-container / storage base path from the source pod's existing annotations & mounts, so they are not duplicated in spec.
    • SnapshotContent (cluster-scoped, snapcontent) — artifact-of-record: snapshotRef back-pointer + self-contained source.snapshotHandle (pvc://<ns>/<claim>/<basePath>/<checkpointID>/versions/<version>).
  • Status is metav1.Conditions (Ready/Failed), matching DynamoCheckpoint/DynamoModel style.
  • Whole spec is immutable via a single CEL rule: !has(oldSelf.spec) || self.spec == oldSelf.spec.
  • Generated: deepcopy, CRD bases, helm chart CRD copies; config/crd/kustomization.yaml updated.
  • Conventions matched to existing single-version CRDs: no +genclient, no +kubebuilder:storageversion, no +kubebuilder:rbac. Single-version ⇒ no conversion (documented).
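
A minimal Go sketch of two mechanisms from the list above: the whole-spec immutability rule and the snapshotHandle layout. This is not the PR's code. `PodRef`, `SnapshotSpec`, `Snapshot`, `specUpdateAllowed`, and `snapshotHandle` are hypothetical stand-ins that use only the standard library, and trimming slashes around `basePath` is an assumption, not something the PR states. The real types live in snapshot_types.go and snapshotcontent_types.go and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// PodRef and SnapshotSpec are simplified stand-ins for the PR's types
// (hypothetical; field names and nesting are illustrative only).
type PodRef struct {
	Name string
}

type SnapshotSpec struct {
	CheckpointID string
	PodRef       PodRef // stands in for spec.source.podRef
}

// Snapshot uses a pointer spec so that "spec absent" can be modelled,
// which is what has(oldSelf.spec) tests in the CEL rule.
//
// With kubebuilder, a rule like the one in the PR is usually attached to
// the root type through an XValidation marker, for example:
//   +kubebuilder:validation:XValidation:rule="!has(oldSelf.spec) || self.spec == oldSelf.spec"
// (shown here as a plain comment; whether the PR places it exactly this way is an assumption).
type Snapshot struct {
	Spec *SnapshotSpec
}

// specUpdateAllowed mirrors the rule
//   !has(oldSelf.spec) || self.spec == oldSelf.spec
// in plain Go: an update passes if the old object had no spec,
// or if the new spec equals the old one field for field.
func specUpdateAllowed(oldObj, newObj Snapshot) bool {
	if oldObj.Spec == nil {
		return true
	}
	return newObj.Spec != nil && *newObj.Spec == *oldObj.Spec
}

// snapshotHandle formats the handle layout quoted in the PR:
//   pvc://<ns>/<claim>/<basePath>/<checkpointID>/versions/<version>
// Trimming leading/trailing slashes from basePath is an assumption.
func snapshotHandle(ns, claim, basePath, checkpointID, version string) string {
	return fmt.Sprintf("pvc://%s/%s/%s/%s/versions/%s",
		ns, claim, strings.Trim(basePath, "/"), checkpointID, version)
}

func main() {
	old := Snapshot{Spec: &SnapshotSpec{CheckpointID: "abc123", PodRef: PodRef{Name: "worker-0"}}}
	same := Snapshot{Spec: &SnapshotSpec{CheckpointID: "abc123", PodRef: PodRef{Name: "worker-0"}}}
	changed := Snapshot{Spec: &SnapshotSpec{CheckpointID: "xyz789", PodRef: PodRef{Name: "worker-0"}}}

	fmt.Println(specUpdateAllowed(old, same))        // true: spec unchanged
	fmt.Println(specUpdateAllowed(old, changed))     // false: checkpointID changed
	fmt.Println(specUpdateAllowed(Snapshot{}, same)) // true: old object had no spec

	fmt.Println(snapshotHandle("team-a", "ckpt-pvc", "/checkpoints", "abc123", "1"))
	// pvc://team-a/ckpt-pvc/checkpoints/abc123/versions/1
}
```

Comparing the whole spec in one rule keeps the immutability guarantee independent of which fields are added later, at the cost of rejecting every spec edit, including ones that might otherwise be harmless.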

Verification (sandbox off): make manifests clean, helm copies in sync, go build ./... + go test ./api/v1alpha1 pass. CodeRabbit: no findings.

Where should the reviewer start?

  • deploy/operator/api/v1alpha1/snapshot_types.go
  • deploy/operator/api/v1alpha1/snapshotcontent_types.go
  • deploy/operator/api/v1alpha1/snapshot_types_test.go

Everything else is generated (zz_generated.deepcopy.go, config/crd/bases/*, helm copies) or a 2-line kustomization.yaml change.

Related Issues

🚫 This PR is NOT linked to an issue:

  • Confirmed — no related issue (internal staging PR; the upstream tracking issue is created when the base branch is PR'd to ai-dynamo/dynamo)

Introduce namespaced Snapshot and cluster-scoped SnapshotContent
(nvidia.com/v1alpha1) as the artifact-of-record for checkpoint capture.
Minimal-trigger Snapshot.spec (checkpointID + source.podRef);
self-contained SnapshotContent.snapshotHandle; conditions-based status;
whole spec immutable via CEL. Types only -- no controller, webhook, or
RBAC yet; inert until later components. Single-version, no conversion.

Signed-off-by: Ron Kahn <rkahn@nvidia.com>
@Ronkahn21 Ronkahn21 marked this pull request as ready for review June 10, 2026 09:53
@Ronkahn21 Ronkahn21 merged commit 3f31ce2 into feat/snapshot-crd Jun 10, 2026
14 of 17 checks passed
Ronkahn21 added a commit that referenced this pull request Jun 10, 2026
@Ronkahn21 Ronkahn21 deleted the feat/snapshot-crd-a-api branch June 10, 2026 10:01
Ronkahn21 added a commit that referenced this pull request Jun 11, 2026
Ronkahn21 added a commit that referenced this pull request Jun 22, 2026