
OCPBUGS-38401: Mirror MAPI userData to openshift-cluster-api namespace for CAPI support #3918

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:master from jrvaldes:OCPBUGS-38401
Apr 9, 2026
Conversation

@jrvaldes
Contributor

@jrvaldes jrvaldes commented Apr 2, 2026

When the ClusterAPIMachineManagement feature gate is enabled, OpenShift provisions the openshift-cluster-api namespace, and CAPI-based MachineSets expect the windows-user-data bootstrap secret to exist there. Previously, WMCO created this secret only in the openshift-machine-api namespace, which left CAPI-provisioned Windows machines stuck in Pending.

Mirror the secret to openshift-cluster-api whenever that namespace exists, keeping it in sync on every reconcile. Deleting the copy triggers automatic re-creation via the existing Watches predicate, now updated to match both namespaces.
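
A minimal sketch of the namespace check that such a predicate could perform, written as a plain helper rather than the operator's actual controller-runtime predicate. The function name `isUserDataSecret` and the constant names are hypothetical; only the namespace names and the secret name come from the description above.

```go
package main

import "fmt"

// The values come from the PR description; the constant names are hypothetical.
const (
	machineAPINamespace = "openshift-machine-api"
	clusterAPINamespace = "openshift-cluster-api"
	userDataSecretName  = "windows-user-data"
)

// isUserDataSecret reports whether an object with the given namespace and name
// is one of the userData secrets the reconciler should react to.
// Hypothetical helper: the real predicate in the PR may be shaped differently.
func isUserDataSecret(namespace, name string) bool {
	if name != userDataSecretName {
		return false
	}
	return namespace == machineAPINamespace || namespace == clusterAPINamespace
}

func main() {
	fmt.Println(isUserDataSecret(machineAPINamespace, userDataSecretName)) // true
	fmt.Println(isUserDataSecret(clusterAPINamespace, userDataSecretName)) // true
	fmt.Println(isUserDataSecret("default", userDataSecretName))           // false
}
```

With a filter of this shape, events for the mirrored copy reach the reconciler as well, so deleting the copy can trigger a reconcile that re-creates it.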

Fixes OCPBUGS-38401

Summary by CodeRabbit

  • New Features

    • Expanded secret management to support both Machine API and Cluster API namespaces, with conditional mirroring to the openshift-cluster-api namespace when present.
    • Added a defined Cluster API namespace value used by the reconciler.
  • Tests

    • Added unit tests covering secret detection, cluster-namespace presence, creation, no-op updates, and replacement behavior.

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 2, 2026
@openshift-ci-robot openshift-ci-robot added the jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. label Apr 2, 2026
@openshift-ci
Contributor

openshift-ci bot commented Apr 2, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 2, 2026
@openshift-ci-robot

@jrvaldes: This pull request references Jira Issue OCPBUGS-38401, which is invalid:

  • expected the bug to target the "4.22.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Apr 2, 2026
@coderabbitai

coderabbitai bot commented Apr 2, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (2)
  • do-not-merge/work-in-progress
  • do-not-merge/hold

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 616f962c-bf40-43d0-852b-beeee96788f1

📝 Walkthrough

This change extends userData Secret reconciliation to support both the Machine API and Cluster API namespaces within OpenShift. The SecretReconciler now detects the presence of the openshift-cluster-api namespace and conditionally mirrors userData Secrets into it. Two new helper methods—isClusterAPIEnabled and ensureCAPIUserDataSecret—implement namespace detection and Secret synchronization logic. A new constant ClusterAPINamespace is defined in the cluster configuration package. Corresponding unit tests validate the expanded namespace detection, conditional mirroring behavior, and idempotent Secret upsert operations.
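
The mirroring behavior described in the walkthrough can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `fakeCluster` is a hypothetical in-memory stand-in for the API server, and the signatures of `isClusterAPIEnabled` and `ensureCAPIUserDataSecret` are invented here. Only the helper names, the namespace, and the create / no-op / replace behavior come from the text.

```go
package main

import (
	"bytes"
	"fmt"
)

// clusterAPINamespace is the namespace named in the PR.
const clusterAPINamespace = "openshift-cluster-api"

// secretKey identifies a secret by namespace and name.
type secretKey struct{ namespace, name string }

// fakeCluster is a hypothetical in-memory stand-in for the Kubernetes API.
type fakeCluster struct {
	namespaces map[string]bool
	secrets    map[secretKey][]byte
}

// isClusterAPIEnabled treats the presence of the namespace as the signal that
// Cluster API support is active (assumed signature, not taken from the PR).
func (c *fakeCluster) isClusterAPIEnabled() bool {
	return c.namespaces[clusterAPINamespace]
}

// ensureCAPIUserDataSecret mirrors want into the Cluster API namespace and
// reports the action taken: "skipped", "created", "unchanged", or "updated".
func (c *fakeCluster) ensureCAPIUserDataSecret(name string, want []byte) string {
	if !c.isClusterAPIEnabled() {
		return "skipped"
	}
	key := secretKey{namespace: clusterAPINamespace, name: name}
	got, ok := c.secrets[key]
	switch {
	case !ok:
		c.secrets[key] = append([]byte(nil), want...)
		return "created"
	case bytes.Equal(got, want): // byte-level comparison, no string conversion
		return "unchanged"
	default:
		c.secrets[key] = append([]byte(nil), want...)
		return "updated"
	}
}

func main() {
	c := &fakeCluster{namespaces: map[string]bool{}, secrets: map[secretKey][]byte{}}
	fmt.Println(c.ensureCAPIUserDataSecret("windows-user-data", []byte("v1"))) // skipped
	c.namespaces[clusterAPINamespace] = true
	fmt.Println(c.ensureCAPIUserDataSecret("windows-user-data", []byte("v1"))) // created
	fmt.Println(c.ensureCAPIUserDataSecret("windows-user-data", []byte("v1"))) // unchanged
	fmt.Println(c.ensureCAPIUserDataSecret("windows-user-data", []byte("v2"))) // updated
}
```

Running the same call twice with identical data leaves the store untouched, which is the idempotent upsert the walkthrough refers to.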

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Go Best Practices & Build Tags | ⚠️ Warning | Controllers missing required //go:build !windows tags for Linux-only reconciliation on control planes. | Add //go:build !windows directive to controllers/secret_controller.go and controllers/secret_controller_test.go as first line before package declaration. |
| Windows Service Management | ❓ Inconclusive | PR changes focus on Kubernetes secret mirroring for CAPI support, not Windows service management. No SCM, reboot, or service configuration logic detected. | Clarify if Windows service management check applies to this PR or if check scope is misaligned with actual Kubernetes secret mirroring changes. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Security: Secrets, Ssh & Csr | ✅ Passed | Secret data handling is secure: no logging of userData/credentials, byte-level comparison without string conversion, error handling doesn't leak sensitive data, no SSH/CSR operations involved. |
| Kubernetes Controller Patterns | ✅ Passed | PR implements solid Kubernetes controller best practices with correct error handling, idempotent reconciliation logic, proper watch predicates, status condition updates, optimistic locking via ResourceVersion, comprehensive test validation, and appropriate finalizer handling. |
| Platform-Specific Requirements | ✅ Passed | PR introduces platform-neutral secret mirroring for CAPI MachineSets without platform-specific constraints or modifications. |


@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 2, 2026
@jrvaldes
Contributor Author

jrvaldes commented Apr 2, 2026

@coderabbitai full review

@coderabbitai

coderabbitai bot commented Apr 2, 2026

✅ Actions performed

Full review triggered.

@jrvaldes
Contributor Author

jrvaldes commented Apr 2, 2026

/approve cancel

@openshift-ci openshift-ci bot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 2, 2026
@jrvaldes
Contributor Author

jrvaldes commented Apr 2, 2026

/test ?

@jrvaldes
Contributor Author

jrvaldes commented Apr 2, 2026

/test vsphere-e2e-operator

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@controllers/secret_controller.go`:
- Around line 201-211: The reconcile path incorrectly returns after creating the
missing MAPI secret; update the error-handling around k8sapierrors.IsNotFound so
that when IsNotFound triggers you attempt Create via r.client.Create(ctx,
validUserData) and only return if that Create itself errors, but do not
log+return unconditionally afterward—ensure the r.log.Error(...) and return err
only run for non-NotFound retrieval errors (or create failures) so the code
proceeds to the subsequent CAPI mirroring logic in the same Reconcile pass.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 93241706-2ba8-428d-a1d6-81bc7ac40f8a

📥 Commits

Reviewing files that changed from the base of the PR and between 1293d5c and b2d8a2a.

📒 Files selected for processing (3)
  • controllers/secret_controller.go
  • controllers/secret_controller_test.go
  • pkg/cluster/config.go

Comment thread controllers/secret_controller.go Outdated
@jrvaldes
Contributor Author

jrvaldes commented Apr 2, 2026

The aws-e2e-olmv1-install periodic job has TechPreview enabled and should be able to validate this.

@jrvaldes
Contributor Author

jrvaldes commented Apr 2, 2026

/payload-job periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install

@openshift-ci
Contributor

openshift-ci bot commented Apr 2, 2026

@jrvaldes: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/dba3c060-2ebb-11f1-9589-3b015ec2a0e4-0

@jrvaldes
Contributor Author

jrvaldes commented Apr 2, 2026

/test images

@jrvaldes
Contributor Author

jrvaldes commented Apr 2, 2026

Fix proposed in the release repo to address the image promotion on the periodic jobs

@jrvaldes
Contributor Author

jrvaldes commented Apr 2, 2026

/payload-job periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install

@openshift-ci
Contributor

openshift-ci bot commented Apr 2, 2026

@jrvaldes: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/942b5570-2ecc-11f1-9f7b-b2ecc0b4486c-0

@jrvaldes
Contributor Author

jrvaldes commented Apr 3, 2026

/payload-job periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install

@openshift-ci
Contributor

openshift-ci bot commented Apr 3, 2026

@jrvaldes: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6fdde5e0-2f92-11f1-92c6-b9db8ea98f74-0

@jrvaldes
Contributor Author

jrvaldes commented Apr 3, 2026

/payload-job periodic-ci-openshift-windows-machine-config-operator-master-vsphere-e2e-operator-fips

@openshift-ci
Contributor

openshift-ci bot commented Apr 3, 2026

@jrvaldes: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-windows-machine-config-operator-master-vsphere-e2e-operator-fips

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/76b944e0-2f92-11f1-8a8c-6a9560b9d752-0

@jrvaldes
Contributor Author

jrvaldes commented Apr 3, 2026

/payload-with-prs periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install https://github.com/openshift/windows-machine-config-operator#3918

@openshift-ci
Contributor

openshift-ci bot commented Apr 3, 2026

@jrvaldes: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.

@jrvaldes
Contributor Author

jrvaldes commented Apr 3, 2026

/payload-job-with-prs periodic-ci-openshift-windows-machine-config-operator-master-aws-e2e-olmv1-install https://github.com/openshift/windows-machine-config-operator#3918

@jrvaldes
Contributor Author

jrvaldes commented Apr 6, 2026

/retest

@mansikulkarni96
Member

@jrvaldes I understand periodics cannot be run on this at the moment. Have you been able to test this locally?
Looks like service reconciliation tests are failing, is that related to this change?

@jrvaldes
Contributor Author

jrvaldes commented Apr 8, 2026

> @jrvaldes I understand periodics cannot be run on this at the moment. Have you been able to test this locally? Looks like service reconciliation tests are failing, is that related to this change?

Tested on a 4.22.0-0.ci-2026-04-08-005530 cluster with FeatureSet "TechPreviewNoUpgrade" enabled:

windows-user-data secret

oc get secret -A | grep windows-user-data
openshift-cluster-api windows-user-data Opaque 1 2m49s
openshift-machine-api windows-user-data

@jrvaldes
Contributor Author

jrvaldes commented Apr 8, 2026

/retest-required

- if err != nil && k8sapierrors.IsNotFound(err) {
+ if err != nil {
+     if !k8sapierrors.IsNotFound(err) {
+         r.log.Error(err, "error retrieving the secret", "name", secrets.UserDataSecret)
Member

We were not returning error before but creating it if deleted, why is this changing?

Contributor Author

Correct, and I tried to keep that behavior; note the negation (!) in !k8sapierrors.IsNotFound(err).

Because the CAPI secret is reconciled further down in the execution flow, the logic can no longer return early after a successful creation, so the check now handles both the NotFound case and its counterpart.

return err
}
// Secret created successfully - don't requeue
return nil
Member

so will this requeue now?

Contributor Author

return r.ensureCAPIUserDataSecret(ctx, validUserData)
}

// reconciliation successful, no need to requeue
Member

this comment may be redundant

Contributor Author

)

// newTestSecretReconciler returns a SecretReconciler backed by a fake client pre-seeded with initObjs.
func newTestSecretReconciler(initObjs ...client.Object) *SecretReconciler {
Member

We have not been adding unit tests for reconcilers, since they are tested as part of e2e tests. Is this a pattern we want to follow?

Contributor Author

Right. For this case there is no e2e test yet that covers spinning up a cluster with a MAPI machineset, so I proposed unit test coverage to ensure the namespace error handling is valid.

Do you anticipate any issues? I'd be happy to remove them. Let me know.

Member

No, I am okay with this; let's remove it once we can e2e test it.

@jrvaldes
Contributor Author

jrvaldes commented Apr 8, 2026

/test nutanix-e2e-operator

@mansikulkarni96
Member

/lgtm

@mansikulkarni96 mansikulkarni96 marked this pull request as ready for review April 9, 2026 14:28
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 9, 2026
@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 9, 2026
@openshift-ci openshift-ci bot requested a review from mansikulkarni96 April 9, 2026 14:30
@mansikulkarni96
Member

/approve

@openshift-ci
Contributor

openshift-ci bot commented Apr 9, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mansikulkarni96

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 9, 2026
When the ClusterAPIMachineManagement feature gate is enabled,
OpenShift provisions the openshift-cluster-api namespace and
CAPI-based MachineSets expect the windows-user-data bootstrap
secret to exist there. Before, WMCO only created this secret in the
openshift-machine-api namespace, causing CAPI-provisioned
Windows machines to remain stuck in Pending.

Mirror the secret to openshift-cluster-api whenever that
namespace exists, keeping it in sync on every reconcile.
Deleting the copy triggers automatic re-creation via the
existing Watches predicate, now updated to match both
namespaces.

Fixes OCPBUGS-38401
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 9, 2026
@mansikulkarni96
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 9, 2026
@openshift-merge-bot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 374b870 and 2 for PR HEAD 6519963 in total

@jrvaldes
Contributor Author

jrvaldes commented Apr 9, 2026

@openshift-ci
Contributor

openshift-ci bot commented Apr 9, 2026

@jrvaldes: Overrode contexts on behalf of jrvaldes: ci/prow/azure-e2e-operator

Details

In response to this:

/override ci/prow/azure-e2e-operator

passed before

https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_windows-machine-config-operator/3918/pull-ci-openshift-windows-machine-config-operator-master-azure-e2e-operator/2041199333927292928


@jrvaldes
Contributor Author

jrvaldes commented Apr 9, 2026

/override ci/prow/wicd-unit-vsphere

@openshift-ci
Contributor

openshift-ci bot commented Apr 9, 2026

@jrvaldes: Overrode contexts on behalf of jrvaldes: ci/prow/wicd-unit-vsphere


@openshift-ci
Contributor

openshift-ci bot commented Apr 9, 2026

@jrvaldes: all tests passed!

Full PR test history. Your PR dashboard.


@openshift-merge-bot openshift-merge-bot bot merged commit 03725d1 into openshift:master Apr 9, 2026
19 checks passed
@openshift-ci-robot

@jrvaldes: Jira Issue OCPBUGS-38401: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-38401 has been moved to the MODIFIED state.


@jrvaldes jrvaldes deleted the OCPBUGS-38401 branch April 11, 2026 13:35

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.


3 participants