readinessgate: replace live pod Get with a node-scoped informer

## Background

PR #638 introduced the `--pod-readiness-gate` flag (package `pkg/readinessgate`), which defers certificate issuance until specified pod conditions are met. Evaluating a gate requires reading the current state of the pod that owns the volume.

The initial implementation performs a live `client.CoreV1().Pods(ns).Get(...)` call on every gate evaluation:

 https://github.com/cert-manager/csi-driver/blob/main/pkg/readinessgate/readinessgate.go (look for the `TODO` comment)

## Problem

`csi-lib`'s renewal loop fires roughly once per second per managed volume. With the current implementation, that translates to one apiserver call per second for every pending volume on the node.

On a node hosting many pods that are awaiting their gates:
- The driver's client-go rate limit (default 5 QPS, 10 burst) is exhausted quickly.
- Once the client is throttled, gate evaluation slows down, which in turn delays certificate issuance for every pending volume on that node.
- Every one of those calls also adds avoidable load to the apiserver.

As @SgtCoDFish noted in https://github.com/cert-manager/csi-driver/pull/638#discussion_r3225023229, this is acceptable for an opt-in feature today, but could bite users at scale and should be tracked.

## Proposed fix

Replace the live `Get` with a shared pod informer scoped to the local node via a `spec.nodeName` field selector. This:
- Eliminates the per-second apiserver call: readiness gate evaluation becomes a cache lookup.
- Bounds memory to pods scheduled on this node only. The driver runs as a DaemonSet with one driver pod per node, so each instance only ever needs the pods on its own node and a node-scoped informer is the right granularity.
- Sets the informer up only when `--pod-readiness-gate` is provided, so the default deployment is unaffected.

The local node name is already available to the driver (passed via `--node-id` / `NODE_NAME`).

## Acceptance criteria

- [ ] `readinessgate.NewReadyToRequestFunc` reads pods from an informer cache rather than calling the apiserver on each evaluation.
- [ ] Informer uses a `spec.nodeName=<this-node>` field selector.
- [ ] Informer is started only when `--pod-readiness-gate` is set.
- [ ] Unit tests cover the cache-miss path (pod not yet known to the informer).
- [ ] The existing `TODO` comment in `pkg/readinessgate/readinessgate.go` is removed.
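
For the cache-miss criterion, the following dependency-free sketch shows one possible behavior: a pod that the cache does not know about yet is reported as "not ready" rather than as an error, so the next loop iteration simply retries. Everything here is hypothetical (`pod`, `podGetter`, `fakeCache`, `gatesSatisfied`, `errPodNotFound`); the real code would presumably use a client-go pod lister and `apierrors.IsNotFound`, and whether a miss should be "not ready" or an error is a design decision for the implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// errPodNotFound stands in for the NotFound error a lister would return.
var errPodNotFound = errors.New("pod not found in cache")

// pod is a simplified stand-in for corev1.Pod: condition type -> is "True".
type pod struct {
	conditions map[string]bool
}

// podGetter is the minimal cache interface the gate check needs.
type podGetter interface {
	Get(namespace, name string) (*pod, error)
}

// fakeCache is an in-memory podGetter keyed by "namespace/name".
type fakeCache map[string]*pod

func (c fakeCache) Get(namespace, name string) (*pod, error) {
	p, ok := c[namespace+"/"+name]
	if !ok {
		return nil, errPodNotFound
	}
	return p, nil
}

// gatesSatisfied reports whether every listed condition is true on the pod.
// A cache miss yields (false, nil) so the caller retries on the next tick.
func gatesSatisfied(cache podGetter, namespace, name string, gates []string) (bool, error) {
	p, err := cache.Get(namespace, name)
	if errors.Is(err, errPodNotFound) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	for _, gate := range gates {
		if !p.conditions[gate] {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	cache := fakeCache{
		"default/app": {conditions: map[string]bool{"example.com/ready": true}},
	}
	gates := []string{"example.com/ready"}

	ok, err := gatesSatisfied(cache, "default", "app", gates)
	fmt.Println("known pod:", ok, err)

	ok, err = gatesSatisfied(cache, "default", "not-yet-synced", gates)
	fmt.Println("cache miss:", ok, err)
}
```

A unit test for the cache-miss path would then assert that an unknown pod produces `false` with no error, and that the same pod becomes ready once it appears in the cache with its conditions set.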

## Related

- PR #638 (introducing `--pod-readiness-gate`)
- Review thread: https://github.com/cert-manager/csi-driver/pull/638#discussion_r3225023229
