OCPEDGE-2280: Add Adaptable Topology, reorganize topology enhancements#1905

Closed
jaypoulz wants to merge 7 commits into openshift:master from jaypoulz:adaptable-topology

@jaypoulz jaypoulz commented Dec 10, 2025

Retired in favor of #2008.

This enhancement introduces Adaptable topology, a new cluster-topology mode that enables clusters to dynamically adjust their behavior based on node count. This allows SingleReplica clusters to scale to multi-node configurations without redeployment.

Key features:

  • Automatic behavior adjustment as control-plane and worker nodes scale
  • One-way transition from SingleReplica to Adaptable topology
  • Operator compatibility declarations via OLM annotations
  • CLI command for safe topology transitions with compatibility checks
  • Shared utilities in library-go to ease operator implementation

The proposal includes complete workflow descriptions, API extensions, test plans, and version skew strategy. Future stages will add AutomaticQuorumRecovery (AQR) to enable DualReplica-based resiliency for two-node configurations.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Dec 10, 2025
@openshift-ci-robot

@jaypoulz: This pull request explicitly references no jira issue.



jaypoulz commented Dec 11, 2025

/retitle OCPEDGE-2280: Add Adaptable Topology, reorganize topology enhancements

@openshift-ci openshift-ci Bot changed the title from "NO-JIRA: Add Adaptable Topology enhancement proposal" to "OCPEDGE-2280: Add Adaptable Topology, reorganize topology enhancements" Dec 11, 2025

openshift-ci-robot commented Dec 11, 2025

@jaypoulz: This pull request references OCPEDGE-2280 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target either version "4.21." or "openshift-4.21.", but it targets "openshift-4.22" instead.


@jaypoulz
Contributor Author

/retest

Comment thread enhancements/edge-topologies/adaptable-topology.md
@openshift-bot

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci Bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 16, 2026
@openshift-bot

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci Bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 23, 2026
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 23, 2026
@jaypoulz
Contributor Author

/remove lifecycle-rotten

@jaypoulz
Contributor Author

/remove-lifecycle rotten

@openshift-ci openshift-ci Bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 23, 2026
@dhensel-rh
Contributor

Are there considerations needed for scaling the storage along with the nodes? This seems like a concern that would need to be tested. Adding nodes seems somewhat easy (I could be talking out of my hat), but scaling storage down as the nodes decrease seems like a challenge.

@dhensel-rh
Contributor

Is the Edge Enablement team responsible for testing the additions to the other team areas, or does that specific team assist in the testing responsibility?

| Serial tests | Monthly | Standard test suite (openshift/conformance/serial) on Adaptable topology clusters |
| Upgrade between z-streams | Weekly | Test upgrades on clusters running Adaptable topology |
| Upgrade between y-streams | Weekly | Test upgrades across minor versions on clusters running Adaptable topology |

Contributor

Whatever becomes of the openshift-test-private tests from QE. This is currently being looked at and moved into other areas.

Yeah these are meant to replicate the lanes we have for SNO in particular. I think the standard conformance suite and upgrade workflows will be mostly applicable here and acceptable for a first pass at this functionality

@jaypoulz can you resolve this one?

@openshift-bot

/lifecycle stale

@openshift-ci openshift-ci Bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 26, 2026
@openshift-bot

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci Bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 5, 2026
@jeff-roche

/remove-lifecycle rotten

Comment thread enhancements/edge-topologies/adaptable-topology.md Outdated
#### Risk: etcd Data Loss If Transitions Are Not Atomic

**Risk**: If CEO cannot make 1→3 or 3→1 etcd member transitions truly atomic,
data loss or corruption could occur.

Contributor

@tjungblu tjungblu Mar 27, 2026

how?

I think in this case the thought was about the 2-node transient state, and what happens if you lose one of the etcd members during that state. That being said, we can address this on the other threads about atomicity and scaling down.

Contributor Author

We should rephrase this a bit here. The concern is that "quorum loss could occur".

Data loss or corruption is really only a concern if split-brain happens, and that only happens if you lose quorum in the two node state and someone forces a new cluster on both sides.
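
To make the quorum concern concrete, here is a minimal sketch of the majority arithmetic behind it. It is illustrative only: the function names are made up for this comment and are not etcd or CEO APIs. The only fact it relies on is that etcd needs a majority of voting members to make progress.

```python
def quorum(members: int) -> int:
    """Votes needed for a majority of `members` voting etcd members."""
    if members < 1:
        raise ValueError("an etcd cluster needs at least one voting member")
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """Voting members that can be lost while a majority still remains."""
    return members - quorum(members)


# Sequential scale-up path discussed in this thread: 1 -> 2 -> 3 voting members.
for n in (1, 2, 3):
    print(f"members={n} quorum={quorum(n)} tolerated_failures={failures_tolerated(n)}")
```

With two voting members the quorum is two, so the transient 2-member state tolerates no member loss, which is why the risk is better described as quorum loss than data loss.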

Comment thread enhancements/edge-topologies/adaptable-topology.md Outdated
Add non-functional constraint clarifying no availability guarantee during
topology transitions, and document compact clusters (dual-role nodes) as
a supported transition path from SNO.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

openshift-ci Bot commented Apr 3, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from dgoodwin. For more information see the Code Review Process.


Comment thread enhancements/edge-topologies/adaptable-topology.md
jeff-roche and others added 2 commits April 20, 2026 15:29
…back

Drop atomic etcd member transitions in favor of the sequential bootstrap
pattern (1→2→3) for the first iteration. Add dedicated etcd scaling
mechanism section, platform:none scoping, scale-down CLI tooling,
two-node enforcement, and controlPlaneNodeCount Infrastructure status field.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix all pre-existing markdownlint-cli2 errors:
- MD060: align table separator pipe positions with headers
- MD060: add spaces in separator rows for compact-style tables
- MD013: wrap over-length line in topology audit section
- MD051: fix broken link fragment for bare metal risk section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jeff-roche

/retest

before the next is added.
The 2-member state (quorum=2) is transient; losing either member
during this window is fatal.
When scaling down below 3 control-plane nodes,

Contributor

May want to axe this given the meeting today.

Moved to future considerations

* Workload-aware topology decisions
(e.g., transitioning based on application requirements rather than node count)
* Supporting topology transitions for HyperShift clusters
* Implementing topology transitions for MicroShift deployments

Contributor

Do we want to add scale-down here for the time being?

Yes, I think this is the natural spot for it currently

Added.

* Automatically adjust infrastructure workload distribution based on
the number of worker nodes
* Provide a mechanism for operators to detect the topology behavior
through the Infrastructure API

Contributor

On the call today there was a mention of machine config being the authoritative source. Is the above line out of date, or am I misunderstanding the context?

We will expose controlPlaneNodeCount, which the machine config operator will be responsible for updating. I will clarify this point.

Actually I already did, it's below in the API Extensions > Infrastructure Config Changes section

**cluster administrator** is managing a cluster already running Adaptable topology.

**Non-functional constraint**: There is no availability guarantee during topology
transitions. Scaling control-plane or worker nodes is an explicit operational

Contributor

Why would scaling worker counts be treated as an operational action? I can see control plane scaling, but curious why workers are included here.

Yeah, I think this was a case of thinking too quickly. It only applies to CP nodes, as you mention, so I will drop the worker mention.

fixed

- Each new node joins as a learner and is promoted to a voting member before the next is added
- The 2-member state is transient — quorum requires both members, so losing either is fatal during this window
- Other operators adjust their behavior to match HighlyAvailable control-plane topology
5. When scaling down and crossing the 3→2 control-plane node threshold:

Contributor

For removal.

moved to future considerations

@dgoodwin
Contributor

/lgtm
/hold

Feel free to release hold when you feel it's ready to merge and proceed to PoC

@openshift-ci openshift-ci Bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Apr 21, 2026
…idation

Scale-down of control-plane and worker nodes is not a goal of this
enhancement. Changes:

- Add scale-down as a non-goal
- Remove scale-down details from current-scope sections:
  - etcd scale-down paragraph from "How Adaptable Topology Works"
  - 3→2 control-plane threshold from "Scaling Control-Plane Nodes"
  - 3→1 etcd scale-down sequence from "Behavior at Three Nodes"
  - Worker node scale-down steps from "Scaling Worker Nodes"
  - `oc adm topology scale-down` CLI command from "oc CLI Changes"
- Update non-functional constraint to scope to control-plane scaling only
- Narrow quorum loss risk to scale-up only (1→2→3)
- Remove scale-down test cases from the test plan
- Update workflow language from "adding or removing" to "adding"
- Add new "Future Considerations" section containing:
  - Scale-down operations (control-plane 3→1, worker thresholds, CLI)
  - Control-plane performance validation during scaling, mirroring
    assisted installer host validation checks (disk I/O, network
    latency, resource capacity)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 21, 2026

openshift-ci Bot commented Apr 21, 2026

@jaypoulz: all tests passed!


@dgoodwin
Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Apr 21, 2026
Comment on lines +962 to +964
**Ambiguous Target State**: Should transitioning to HighlyAvailable create a
3-node compact cluster or provision compute nodes for a 5-node cluster?
The end state is unclear.

Contributor

What happens today if I take a SNO cluster and create two additional Machines and label them up as control plane? Do they turn into control plane nodes?

Wondering if a transition from SNO to HA is actually feasible as a one direction change? Do we ever see people asking for the reverse?

I'm not entirely sure that creating a new name (Adaptable) removes any of the issues you describe here

Contributor Author

@jaypoulz jaypoulz Apr 28, 2026

> What happens today if I take a SNO cluster and create two additional Machines and label them up as control plane? Do they turn into control plane nodes?

Yes, they become control-plane nodes. However, operators such as ingress and console do not respond to the changed topology. They maintain the same number of replicas and filtering as they set when the cluster was deployed. Some operators, such as etcd and api-server, do actually respond as expected.

> Wondering if a transition from SNO to HA is actually feasible as a one direction change? Do we ever see people asking for the reverse?

It is technically possible, but you need to change the topology field, delete the configuration of your existing operators, and effectively reinitialize the controllers as though it was a new cluster. I am aware of only 1 ask for scale-down, but we've decided to mark that flow as out of scope for the initial delivery of this feature.

> I'm not entirely sure that creating a new name (Adaptable) removes any of the issues you describe here

It doesn't address those particular issues.

The issues it does address are as follows:

  1. Mindset - Topologies have always been designed to be static. The goal of Adaptable topology is to be able to think of topology as fluid. No longer are we thinking of OpenShift in terms of SNO vs Two Node vs HA Compact vs HA with Workers - instead there is just a number of control-plane nodes and infrastructure nodes that have been confirmed by MCO, and each component in the cluster is responsible for behaving appropriately according to the nodes available.
  2. Layered Products/3rd Party Operators - How does a layered product or 3rd party operator identify that it should be expected to adapt to changing control-plane and infrastructure node conditions? We need a way to be able to distinguish between the operational expectations of OpenShift-with-Topologies and just OpenShift that behaves appropriately regardless of its node composition.
  3. It's a 1-way switch - once you go adaptable, there's no going back. You are opting in to a new paradigm where node-resource counts define cluster behavior instead of install-time defined configurations.

The main compelling alternative would be to add a new field to infrastructure that is not an enum per se, but essentially tracks if you are operating with mutable topology or not. The tradeoff with this approach is that we now have to compute the effective topology and update it in the infrastructure config. For 3rd party operators and layered products, this always breaks their expectations around something that was always an invariant. Additionally, it preserves a second limitation of topologies as we have them today: having to look at a second API if you have differences in behavior that are grouped together by topology definitions. (E.g., a third party operator may want to define behavior differently on a cluster with 2 compute nodes (HA infrastructure topology) vs 3 nodes (HA infrastructure topology).) This last point is more of an academic concern related to storage offerings that have a majority quorum component like etcd, but I did want to include it since there is potential these could increase now that TNF is going GA.

Overall, if you see OpenShift for what it is today, I think it's easier to look at this and say: let's just architect this around the mechanic of updating the topology fields in-place because it's the least amount of disruption. What we are proposing is a greater vision for what OpenShift would become: eventually all clusters would be adaptable by default and would look at the replacement fields, controlPlaneNodes and infrastructureNodes, to determine their appropriate/desired topological behavior.
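
As a rough illustration of what "looking at node counts" could mean for a component, here is a hypothetical sketch. The replica rule and the function name are assumptions made up for this comment; they are not taken from the proposal, from library-go, or from any existing operator.

```python
def desired_replicas(available_nodes: int, max_replicas: int = 2) -> int:
    """Hypothetical rule: one replica per available node, capped at max_replicas.

    `available_nodes` stands in for a node-count field such as the proposed
    infrastructureNodes; the cap of 2 is an arbitrary illustrative default.
    """
    if available_nodes < 1:
        raise ValueError("at least one node is required to run a replica")
    return min(available_nodes, max_replicas)


# The component re-evaluates its replica count whenever the node count changes,
# instead of fixing it once from the install-time topology.
for nodes in (1, 2, 3, 5):
    print(f"nodes={nodes} replicas={desired_replicas(nodes)}")
```

The point is only that the input is a count the component re-reads, rather than an enum that was set at install time.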

Contributor

FWIW I don't have an issue with clusters becoming adaptable - the idea of allowing transition from SNO to HA makes sense. My issue I think is around whether we need a new API for this.

> delete the configuration of your existing operators

Can you expand on this point? Which operators need config deleting and reconfiguring manually?
How does adding a new adaptable topology solve this? Or is it just you now have to go to these components and build logic that says "if you see adaptable, you must ..."

> How does a layered product or 3rd party operator identify that it should be expected to adapt to changing control-plane and infrastructure node conditions? We need a way to be able to distinguish between the operational expectations of OpenShift-with-Topologies and just OpenShift that behaves appropriately regardless of its node composition.

Have you conducted research into exactly who consumes the topology field as it is today? And with that, how do the consumers behave when they see a particular value? Do you know what they do if they see a value that they don't currently recognise?

Have you found any examples where, upon restart, they wouldn't just adjust if they saw a new topology? Any examples where a single to multi node transition needs to actually be implemented?

Why does a controller looking at the size of the cluster and updating the topology field between the current available enum values not suffice here?

> It's a 1-way switch - once you go adaptable, there's no going back. You are opting in to a new paradigm where node-resource counts define cluster behavior instead of install-time defined configurations.

My counter to this would be once you go from SNO to HA, there's no going back. I think it's equivalent here to be honest. Because that's effectively what you're saying adaptable is - you might start as SNO, and then move to HA, or you might not. And the adaptable flag just says to the operators, be prepared. What if they were just always prepared?

> but essentially tracks if you are operating with mutable topology or not.

Or we make every cluster mutable

> For 3rd party operators and layered products, this always breaks their expectations around something that was always an invariant.

This project will break that invariant no matter how it is implemented

Member

> Topologies have always been designed to be static. The goal of Adaptable topology is to be able to think of topology as fluid. No longer are we thinking of OpenShift in terms of SNO vs Two Node vs HA Compact vs HA with Workers - instead there is just a number of control-plane nodes and infrastructure nodes that have been confirmed by MCO, and each component in the cluster is responsible for behaving appropriately according to the nodes available.

The problem is that the correct behaviour always depends on the user's intent, which isn't known to us. A cluster with 3 control plane nodes and 1 worker could represent a normally operating cluster, a degraded or misconfigured cluster to which the user should be alerted, an intentional cost/risk balancing measure, or a dangerous security failure that should be avoided at all costs. The difference is what the user intended to happen.

Historically we have 'solved' this by taking the cluster topology at install time as an indication of the user's intent. This approach is deficient in two ways. First, it makes it hard to change after the fact. But more importantly, it represents only a lossy signal of the user's intent.

This proposal addresses only the first, imho lesser, problem. The second it makes worse, by discarding even such meagre information as we had. This seems unlikely to improve the situation.

5. The cluster installs with behavior matching the effective topology for the initial control-plane, arbiter, and worker node counts
6. After installation completes, the cluster is ready to scale by adding nodes

*Note: The `adaptableTopology` flag is optional and defaults to `false`. Adaptable is intended to become the default topology mode in a future release once the feature reaches GA and has proven stable in production.*

Contributor

To @JoelSpeed's point, if adaptable becomes the default topology mode it feels like topology just became somewhat meaningless. Is this not possible to implement with the existing topologies, without the need for a new one and a layer of indirection to determine the actual topology?

Contributor Author

This is by design. I've explained my reasoning and the alternatives in my response to Joel.
That said, it wouldn't introduce a layer of indirection because it would be replaced by the infrastructure fields that just declare how many control-plane and infrastructure nodes the operators should be expecting. In time, topology as a field could be retired entirely, but it can be preserved for historical context.

@patrickdillon
Contributor

It seems like we're missing consideration of master schedulability: topology isn't just a calculation of node count, it also takes into account mastersSchedulable. infrastructureTopology is HighlyAvailable with 0 compute nodes if mastersSchedulable: true.

As a side note, we have an unresolved bug for correctly calculating the topology; ideally we would like to move the infrastructure topology calculation out of the installer, because the calculation depends too much on manifest editing, and the dependency handling is not user friendly.
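
A simplified sketch of why node count alone is not enough. This is not the installer's actual calculation: the function, its parameters, and the threshold of two schedulable hosts are assumptions chosen only to reproduce the case described above (0 compute nodes with mastersSchedulable: true).

```python
def infrastructure_topology(
    control_plane_nodes: int,
    compute_nodes: int,
    masters_schedulable: bool,
) -> str:
    """Illustrative infrastructureTopology rule that accounts for schedulable masters."""
    # Assumption: infrastructure workloads can land on compute nodes, plus
    # control-plane nodes only when they are marked schedulable.
    hosts = compute_nodes + (control_plane_nodes if masters_schedulable else 0)
    # Assumption: two or more eligible hosts are treated as highly available.
    return "HighlyAvailable" if hosts >= 2 else "SingleReplica"


# Same node counts, different answer depending on mastersSchedulable.
print(infrastructure_topology(3, 0, masters_schedulable=True))
print(infrastructure_topology(3, 1, masters_schedulable=False))
```

Any count-based field such as the proposed controlPlaneNodeCount would need a comparable input for schedulability to avoid misclassifying compact clusters.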

@jeff-roche

Note this EP is going to be closed in favor of #2008

1. The cluster creator prepares an `install-config.yaml` with the desired initial node count
2. The cluster creator sets `adaptableTopology: true` in the `install-config.yaml` (optional, defaults to `false`)
3. The cluster creator runs `openshift-install create cluster` to complete the installation
4. The installer validates the configuration and sets both `controlPlaneTopology` and `infrastructureTopology` to `Adaptable` in the Infrastructure config

Member

This requires code changes to every OLM operator, whereas if we had a separate field that a controller reads and then writes back the current topology to controlPlaneTopology and infrastructureTopology, it would be backwards-compatible.

Needless to say, many (most?) OLM operators are not controlled in any way by RH. And I think you could argue that this is in a sense breaking the API contract. Previously we've added new topology values for entirely new topologies, and there would have been OLM operators that could not yet operate on a new cluster of that type that had never existed before (e.g. if you have an OLM operator that runs on the control plane then we force you to explicitly decide whether it can run successfully on TNF now that that's a thing). But this would be the first example of changing the topology on cluster types that have long existed.

Furthermore, with all the logic residing in a library rather than a controller, any future bug fixes or improvements to the logic will also require a rollout to every OLM operator.
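
A minimal sketch of the controller-based alternative, to show the shape of it. Everything here is hypothetical: the opt-in flag, the `InfrastructureStatus` stand-in, and the 3-node threshold are assumptions for illustration, and two-node and arbiter variants are deliberately left out. Only the control-plane half is shown.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class InfrastructureStatus:
    """Stand-in for the two existing topology fields on the Infrastructure config."""
    control_plane_topology: str
    infrastructure_topology: str


def control_plane_topology_for(nodes: int) -> str:
    """Map a control-plane node count onto existing enum values (assumed threshold: 3)."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one control-plane node")
    return "HighlyAvailable" if nodes >= 3 else "SingleReplica"


def reconcile(
    status: InfrastructureStatus,
    adaptable_enabled: bool,
    control_plane_nodes: int,
) -> InfrastructureStatus:
    """Return the status a hypothetical controller would write back."""
    if not adaptable_enabled:
        # Clusters that did not opt in keep their install-time topology.
        return status
    return replace(
        status,
        control_plane_topology=control_plane_topology_for(control_plane_nodes),
    )


sno = InfrastructureStatus("SingleReplica", "SingleReplica")
print(reconcile(sno, adaptable_enabled=True, control_plane_nodes=3))
```

Existing consumers would only ever see values they already recognise, and fixes to the mapping would ship with the one controller instead of with every operator.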


`platform: none` will be supported for all node configurations.

`platform: baremetal` presents a challenge for single-node clusters.

Member

`platform: baremetal` is not supported on single-node clusters. You cannot install one. This is because it is not remotely useful for anything.

which is not useful and creates a point of failure for SNO deployments.
The Bare Metal Networking team will be consulted to determine if this
networking setup can be disabled for single-node clusters.
The goal is to support `platform: baremetal` for all node configurations
Member

Why?

```yaml
apiVersion: v1
baseDomain: example.com
adaptableTopology: true # optional, defaults to false
```

Member

I'm told that boolean fields always turn out to be a mistake.


The console will be updated to:
- Display operator compatibility status for Adaptable topology
- Provide a marketplace filter to show only operators that support Adaptable topology
Member

This seems like an own-goal that will prevent adoption of the feature.

Comment on lines +962 to +964
**Ambiguous Target State**: Should transitioning to HighlyAvailable create a
3-node compact cluster or provision compute nodes for a 5-node cluster?
The end state is unclear.
Member

Topologies have always been designed to be static. The goal of Adaptable topology is to be able to think of topology as fluid. No longer are we thinking of OpenShift in terms of SNO vs Two Node vs HA Compact vs HA with Workers - instead there is just a number of control-plane nodes and infrastructure nodes that have been confirmed by MCO, and each component in the cluster is responsible for behaving appropriately according to the nodes available.

The problem is that the correct behaviour always depends on the user's intent, which isn't known to us. A cluster with 3 control plane nodes and 1 worker could represent a normally operating cluster, a degraded or misconfigured cluster to which the user should be alerted, an intentional cost/risk balancing measure, or a dangerous security failure that should be avoided at all costs. The difference is what the user intended to happen.

Historically we have 'solved' this by taking the cluster topology at install time as an indication of the user's intent. This approach is deficient in two ways. First, it makes it hard to change after the fact. But more importantly, it represents only a lossy signal of the user's intent.

This proposal addresses only the first, imho lesser, problem. The second it makes worse, by discarding even such meagre information as we had. This seems unlikely to improve the situation.

Single Node OpenShift (SNO) clusters are candidates for
transitioning to Adaptable topology.
The primary use case is enabling SNO deployments to scale to
multi-node highly available configurations as requirements change.
Member

SNO clusters don't have load balancers, so there is a lot more to scaling up their control planes than simply adding nodes.

SNO clusters created via IBI depend on networking configuration within the node that cannot work in a multi-node cluster. You will need to have some way of dealing with this.

MicroShift and OpenShift architectures closer together.
MicroShift would benefit from operators having an adaptable topology mode
that handles topology changes via node updates.
A follow-up enhancement will address MicroShift to SNO transitions,
Member

This sounds incredible.

@jeff-roche

@zaneb @patrickdillon thanks for the reviews! After discussion at a recent arch call, we've shifted from this approach to a new one that is more robust and I think provides a way to address some of the concerns you both brought up:
#2008

@openshift-ci-robot

openshift-ci-robot commented May 12, 2026

@jaypoulz: This pull request references OCPEDGE-2280 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target either version "5.0." or "openshift-5.0.", but it targets "openshift-4.22" instead.

@jaypoulz
Contributor Author

Closing as retired in favor of #2008.

Labels

- do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
- jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
- lgtm: Indicates that a PR is ready to be merged.
