OCPBUGS-38401: Mirror MAPI userData to openshift-cluster-api namespace for CAPI support#3918
Conversation
|
Skipping CI for Draft Pull Request. |
|
@jrvaldes: This pull request references Jira Issue OCPBUGS-38401, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Important Review skippedAuto reviews are limited based on label configuration. 🚫 Review skipped — only excluded labels are configured. (2)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository YAML (base), Organization UI (inherited) Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis change extends userData Secret reconciliation to support both the Machine API and Cluster API namespaces within OpenShift. The 🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
|
/approve cancel |
|
/test ? |
|
/test vsphere-e2e-operator |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@controllers/secret_controller.go`:
- Around line 201-211: The reconcile path incorrectly returns after creating the
missing MAPI secret; update the error-handling around k8sapierrors.IsNotFound so
that when IsNotFound triggers you attempt Create via r.client.Create(ctx,
validUserData) and only return if that Create itself errors, but do not
log+return unconditionally afterward—ensure the r.log.Error(...) and return err
only run for non-NotFound retrieval errors (or create failures) so the code
proceeds to the subsequent CAPI mirroring logic in the same Reconcile pass.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 93241706-2ba8-428d-a1d6-81bc7ac40f8a
📒 Files selected for processing (3)
controllers/secret_controller.gocontrollers/secret_controller_test.gopkg/cluster/config.go
|
the aws-e2e-olmv1-install periodic job has TechPreview enabled and should be able to validate this |
|
/payload-job periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install |
|
@jrvaldes: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/dba3c060-2ebb-11f1-9589-3b015ec2a0e4-0 |
|
/test images |
|
Fix proposed in the release repo to address the image promotion on the periodic jobs |
|
/payload-job periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install |
|
@jrvaldes: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/942b5570-2ecc-11f1-9f7b-b2ecc0b4486c-0 |
|
/payload-job periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install |
|
@jrvaldes: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6fdde5e0-2f92-11f1-92c6-b9db8ea98f74-0 |
|
/payload-job periodic-ci-openshift-windows-machine-config-operator-master-vsphere-e2e-operator-fips |
|
@jrvaldes: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/76b944e0-2f92-11f1-8a8c-6a9560b9d752-0 |
|
/payload-with-prs periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install https://github.com/openshift/windows-machine-config-operator#3918 |
|
/payload-job-with-prs periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install https://github.com/openshift/windows-machine-config-operator#3918 |
|
/retest |
|
@jrvaldes I understand periodics cannot be run on this at the moment. Have you been able to test this locally? |
tested with 4.22.0-0.ci-2026-04-08-005530 cluster with FeatureSet "TechPreviewNoUpgrade" enabled, windows-user-data secret oc get secret -A | grep windows-user-data |
|
/retest-required |
| if err != nil && k8sapierrors.IsNotFound(err) { | ||
| if err != nil { | ||
| if !k8sapierrors.IsNotFound(err) { | ||
| r.log.Error(err, "error retrieving the secret", "name", secrets.UserDataSecret) |
There was a problem hiding this comment.
We were not returning error before but creating it if deleted, why is this changing?
There was a problem hiding this comment.
correct, and I tried to keep that behavior, see the negation (!) in the !k8sapierrors.IsNotFound(err)
Given the CAPI secret, reconciliation must occur lower in the execution flow; the logic cannot return earlier after a successful creation, so handle this case and a counterexample.
| return err | ||
| } | ||
| // Secret created successfully - don't requeue | ||
| return nil |
There was a problem hiding this comment.
so will this requeue now?
There was a problem hiding this comment.
with comment // reconciliation successful, no need to requeue
| return r.ensureCAPIUserDataSecret(ctx, validUserData) | ||
| } | ||
|
|
||
| // reconciliation successful, no need to requeue |
There was a problem hiding this comment.
this comment may be redundant
There was a problem hiding this comment.
| ) | ||
|
|
||
| // newTestSecretReconciler returns a SecretReconciler backed by a fake client pre-seeded with initObjs. | ||
| func newTestSecretReconciler(initObjs ...client.Object) *SecretReconciler { |
There was a problem hiding this comment.
we have not been adding unit tests for reconcilers since they are tested as part of e2e tests. Is this a pattern we want to follow?
There was a problem hiding this comment.
right, for this case there is no e2e test yet that covers spining a cluster with a MAPI machineset, so proposed unit test coverage to ensure the namepsace' error handling is valid.
Do you anticipate any issues? I'd be happy to remove. LMK
There was a problem hiding this comment.
no I am okay with this, lets remove it once we can e2e test it.
|
/test nutanix-e2e-operator |
|
/lgtm |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mansikulkarni96 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
When the ClusterAPIMachineManagement feature gate is enabled, OpenShift provisions the openshift-cluster-api namespace and CAPI-based MachineSets expect the windows-user-data bootstrap secret to exist there. Before, WMCO only created this secret in the openshift-machine-api namespace, causing CAPI-provisioned Windows machines to remain stuck in Pending. Mirror the secret to openshift-cluster-api whenever that namespace exists, keeping it in sync on every reconcile. Deleting the copy triggers automatic re-creation via the existing Watches predicate, now updated to match both namespaces. Fixes OCPBUGS-38401
|
/lgtm |
|
@jrvaldes: Overrode contexts on behalf of jrvaldes: ci/prow/azure-e2e-operator DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/override ci/prow/wicd-unit-vsphere |
|
@jrvaldes: Overrode contexts on behalf of jrvaldes: ci/prow/wicd-unit-vsphere DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@jrvaldes: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@jrvaldes: Jira Issue OCPBUGS-38401: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-38401 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
When the ClusterAPIMachineManagement feature gate is enabled, OpenShift provisions the openshift-cluster-api namespace and CAPI-based MachineSets expect the windows-user-data bootstrap secret to exist there. Before, WMCO only created this secret in the openshift-machine-api namespace, causing CAPI-provisioned Windows machines to remain stuck in Pending.
Mirror the secret to openshift-cluster-api whenever that namespace exists, keeping it in sync on every reconcile. Deleting the copy triggers automatic re-creation via the existing Watches predicate, now updated to match both namespaces.
Fixes OCPBUGS-38401
Summary by CodeRabbit
New Features
Tests