OCPBUGS-66334: remove version check from guard precheck #1520

Open
tjungblu wants to merge 1 commit into openshift:main from tjungblu:OCPBUGS-66334

Conversation

@tjungblu

@tjungblu tjungblu commented Dec 4, 2025

Contributor

This was inadvertently deleting guard pods during upgrades, which caused etcd quorum loss while another component drained a node during a static pod rollout.

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
openshift-ci-robot added the jira/valid-reference (indicates that this PR references a valid Jira ticket of any type) and jira/invalid-bug (indicates that a referenced Jira bug is invalid for the branch this PR is targeting) labels on Dec 4, 2025
@openshift-ci-robot

@tjungblu: This pull request references Jira Issue OCPBUGS-66334, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

This was inadvertently deleting guard pods during upgrades, which caused etcd quorum loss while another component drained a node during a static pod rollout.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Dec 4, 2025

Walkthrough

Removed version alignment and synchronization checks from the guardRolloutPreCheck logic in the operator starter. The pre-check now only validates non-SNO topology using NewIsSingleNodePlatformFn, eliminating operator-version gating and etcd clusteroperator status synchronization waits.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Operator startup pre-check logic: pkg/operator/starter.go | Removed etcd clusteroperator/operator version alignment verification, Status.Versions extraction, expected version comparison, and related synchronization/mismatch error handling from guardRolloutPreCheck. Simplified to only determine non-SNO topology. |
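
For readers without the diff at hand, the change summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in pkg/operator/starter.go: `topologyFn`, `guardPreCheckTopologyOnly`, and `guardPreCheckWithVersionGate` are hypothetical names, and the real guardRolloutPreCheck and the function returned by NewIsSingleNodePlatformFn may have different signatures. The second function only illustrates how a version gate of this general kind could answer "no guards wanted" while versions differ mid-upgrade; it is not a reconstruction of the removed code.

```go
package main

import "fmt"

// topologyFn is a hypothetical stand-in for the function returned by
// NewIsSingleNodePlatformFn. The real signature may differ.
type topologyFn func() (isSNO bool, err error)

// guardPreCheckTopologyOnly sketches the simplified pre-check: guard pods
// are wanted whenever the cluster is not single-node (SNO).
func guardPreCheckTopologyOnly(isSNO topologyFn) (bool, error) {
	sno, err := isSNO()
	if err != nil {
		return false, fmt.Errorf("determining cluster topology: %w", err)
	}
	return !sno, nil
}

// guardPreCheckWithVersionGate is a hypothetical illustration of version
// gating. The reported and expected version strings are assumptions made
// for this sketch only.
func guardPreCheckWithVersionGate(isSNO topologyFn, reported, expected string) (bool, error) {
	if reported != expected {
		// During an upgrade the two versions can differ for a while, so a
		// gate like this reports that guards are not wanted.
		return false, nil
	}
	return guardPreCheckTopologyOnly(isSNO)
}
```

In this sketch, a multi-node topology function makes the topology-only check return true regardless of versions, while the gated variant returns false for a mismatch such as "4.20.0" against "4.21.0".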

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

  • Verify removal impact: Ensure no downstream code depends on the removed version alignment checks or synchronization waits during operator startup.
  • Operator initialization flow: Confirm the simplified pre-check (topology-only) doesn't bypass critical version compatibility validations elsewhere.
  • Error handling: Review whether elimination of not-synced and mismatch error scenarios could mask version inconsistencies in SNO/non-SNO deployments.

📜 Recent review details

📥 Commits

Reviewing files that changed from the base of the PR and between 57c4cb5 and a703d01.

📒 Files selected for processing (1)
  • pkg/operator/starter.go (0 hunks)
💤 Files with no reviewable changes (1)
  • pkg/operator/starter.go

openshift-ci Bot requested review from dusk125 and jubittajohn on December 4, 2025 10:52
openshift-ci Bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on Dec 4, 2025
@dusk125

dusk125 commented Dec 4, 2025

Contributor

/retest-required

@openshift-ci

openshift-ci Bot commented Dec 4, 2025

Contributor

@tjungblu: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@dusk125

dusk125 commented Dec 4, 2025

Contributor

/lgtm

openshift-ci Bot added the lgtm label (indicates that a PR is ready to be merged) on Dec 4, 2025
@openshift-ci

openshift-ci Bot commented Dec 4, 2025

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dusk125, tjungblu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tjungblu

tjungblu commented Dec 5, 2025

Contributor Author

Thanks @dusk125 - I'm going to leave this here until the critical fix label is lifted again

@openshift-bot

Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-ci Bot added the lifecycle/stale label (denotes an issue or PR has remained open with no activity and has become stale) on Mar 5, 2026
@dusk125

dusk125 commented Mar 5, 2026

Contributor

/remove-lifecycle stale

openshift-ci Bot removed the lifecycle/stale label (denotes an issue or PR has remained open with no activity and has become stale) on Mar 5, 2026
@tjungblu

Contributor Author

/verified by ci

openshift-ci-robot added the verified label (signifies that the PR passed pre-merge verification criteria) on May 15, 2026
@openshift-ci-robot

@tjungblu: This PR has been marked as verified by ci.

@tjungblu

Contributor Author

/jira refresh

openshift-ci-robot added the jira/valid-bug label (indicates that a referenced Jira bug is valid for the branch this PR is targeting) and removed the jira/invalid-bug label (indicates that a referenced Jira bug is invalid for the branch this PR is targeting) on May 15, 2026
@openshift-ci-robot

@tjungblu: This pull request references Jira Issue OCPBUGS-66334, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @geliu2016

@openshift-ci

openshift-ci Bot commented May 15, 2026

Contributor

@openshift-ci-robot: GitHub didn't allow me to request PR reviews from the following users: geliu2016.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

@tjungblu

Contributor Author

/cherry-pick release-4.22 release-4.21 release-4.20 release-4.19 release-4.18

@openshift-cherrypick-robot

@tjungblu: once the present PR merges, I will cherry-pick it on top of release-4.22 in a new PR and assign it to you.

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria.

5 participants