CNTRLPLANE-3460: Refactor defrag controller to lower impact of defrag by dusk125 · Pull Request #1618 · openshift/cluster-etcd-operator

dusk125 · 2026-05-13T19:06:28Z

Summary

Adds a fixed 5-minute time budget per defrag controller reconciliation loop
Before each member's defrag, the controller checks elapsed time; if the budget is exceeded, remaining members are skipped and deferred to the next sync cycle
Budget-skipped members do not count toward the degraded failure threshold — only real defrag errors trigger degradation

Test plan

Existing unit tests pass unchanged
New TestDefragBudgetExceeded test verifies budget enforcement with a near-zero budget (1ns): no defrags occur, a DefragControllerBudgetExceeded event is recorded, and the controller does not degrade
Manual verification in a test cluster with fragmented etcd members

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added a defragmentation cooldown to prevent repeated defrag operations within a short window.
Improvements
- Defragmentation now targets one fragmented member per run and orders attempts to prioritize highest fragmentation.
- Failure tracking and degraded-state handling simplified and made more deterministic; successful recovery clears degraded status.
Tests
- Added tests covering cooldown behavior and adjusted existing defrag-related tests.

The defrag controller now tracks elapsed time during each sync and skips remaining etcd members if the 5-minute budget is exceeded. Budget-skipped members are deferred to the next reconciliation loop and do not count toward the degraded failure threshold. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

openshift-ci · 2026-05-13T19:06:33Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

coderabbitai · 2026-05-13T19:06:42Z

Walkthrough

Adds a defragmentation cooldown to DefragController, refactors runDefrag to target one fragmented member per sync (filtering learners, sorting by fragmentation, leader last), updates failure/degraded handling, adjusts clearDegraded, updates fakeEtcdClient.Defragment, and adds/updates tests including TestDefragCooldown.

Changes

Defrag controller cooldown & refactor

Layer / File(s)	Summary
Cooldown fields, imports, constructor, and sync check `pkg/operator/defragcontroller/defragcontroller.go`	Adds imports for `cmp`/`slices`, `defaultDefragCooldown` constant, `DefragController` fields `defragCooldown` and `lastDefragTime`, initializes `defragCooldown` in `NewDefragController`, and makes `sync` skip defrag while cooldown is active.
runDefrag: single-member defrag, sorting and strict failure handling `pkg/operator/defragcontroller/defragcontroller.go`	Rewrites `runDefrag` to filter learner/unstarted members, build member/status mappings, sort non-leader endpoints by fragmentation descending (leader appended last), attempt defrag on the first fragmented member only, record attempt/success/failure events, wait for cluster recovery on success, update `lastDefragTime` and reset failure counters, and set degraded state on immediate errors. Removes previous multi-member partial-aggregate path.
clearDegraded update `pkg/operator/defragcontroller/defragcontroller.go`	`clearDegraded` no longer returns the status update error; it performs the update and logs warnings on failure.
Tests and fake Etcd helper `pkg/etcdcli/helpers.go`, `pkg/operator/defragcontroller/defragcontroller_test.go`	`fakeEtcdClient.Defragment` now updates stored `StatusResponse.DbSize` from `DbSizeInUse`. Tests updated to set `defragCooldown: 0` where needed, `TestNewDefragController` uses looped syncs, `TestNewDefragControllerMultiSyncs` adjusts error generation and loop counts, and new `TestDefragCooldown` validates cooldown blocking and reset behavior.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality	❓ Inconclusive	Custom check specifies Ginkgo test patterns (It blocks, BeforeEach/AfterEach) but PR uses standard Go table-driven tests with t.Run, not Ginkgo framework.	Check is inapplicable. Repository uses standard Go testing, not Ginkgo. Confirm check applies only to Ginkgo-based tests or provide standard Go test quality criteria.

✅ Passed checks (10 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names	✅ Passed	Project uses Go standard testing, not Ginkgo. All test names are stable and deterministic with no dynamic content. Check not applicable.
Microshift Test Compatibility	✅ Passed	No Ginkgo e2e tests were added. All test changes use standard Go unit test pattern (func TestXxx), not Ginkgo. Check not applicable to this PR.
Single Node Openshift (Sno) Test Compatibility	✅ Passed	Custom check is not applicable. The PR adds only standard Go unit tests (TestDefragCooldown in defragcontroller_test.go), not Ginkgo e2e tests. This check targets only new Ginkgo e2e tests.
Topology-Aware Scheduling Compatibility	✅ Passed	No scheduling constraints introduced. Only Go code changes. Defrag controller already checks ControlPlaneTopology and disables on SNO/HyperShift.
Ote Binary Stdout Contract	✅ Passed	No OTE Binary Stdout Contract violations detected. All files are library packages with no process-level functions. The fmt.Printf found is in a test function block, which is allowed.
Ipv6 And Disconnected Network Test Compatibility	✅ Passed	This PR adds only traditional Go unit tests (func Test*), not Ginkgo e2e tests. The custom check only applies to Ginkgo e2e tests (It(), Describe(), etc.), so it is not applicable here.
Title check	✅ Passed	The title accurately summarizes the main change: refactoring the defrag controller to reduce defragmentation impact through a cooldown mechanism and single-member-per-sync approach.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

openshift-ci · 2026-05-13T19:06:44Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ingvagabund for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/operator/defragcontroller/defragcontroller_test.go`:
- Around line 464-468: The test uses a tiny non-zero defragBudget which can be
flaky; set the DefragController's defragBudget to 0 (or switch the controller to
use a controllable/test clock) so budget exhaustion is deterministic. Locate
where defragBudget is set in the test setup (the variable named defragBudget
used to construct the controller) and replace 1 * time.Nanosecond with 0 (or
adapt the controller constructor to accept a mock clock and use that in the
test) so calling controller.sync(context.TODO(),
factory.NewSyncContext("defrag-controller", eventRecorder)) always observes an
exhausted budget.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e0de8aac-2c9a-42a2-9dee-693e18c04822

📥 Commits

Reviewing files that changed from the base of the PR and between c0614ca and 2c8da6e.

📒 Files selected for processing (2)

pkg/operator/defragcontroller/defragcontroller.go
pkg/operator/defragcontroller/defragcontroller_test.go

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

openshift-ci-robot · 2026-05-14T16:19:22Z

@dusk125: This pull request references CNTRLPLANE-3460 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Summary

Adds a fixed 5-minute time budget per defrag controller reconciliation loop

Before each member's defrag, the controller checks elapsed time; if the budget is exceeded, remaining members are skipped and deferred to the next sync cycle

Budget-skipped members do not count toward the degraded failure threshold — only real defrag errors trigger degradation

Test plan

Existing unit tests pass unchanged

New TestDefragBudgetExceeded test verifies budget enforcement with a near-zero budget (1ns): no defrags occur, a DefragControllerBudgetExceeded event is recorded, and the controller does not degrade

Manual verification in a test cluster with fragmented etcd members

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features

Added a configurable time budget for defragmentation that stops further member processing when exceeded.

Improvements

Enhanced tracking to report attempted vs. successful defragmentations.

Improved error accounting and more accurate partial-failure reporting when budget limits or mapping errors occur.

Tests

Added a test validating behavior when the defragmentation budget is immediately exhausted.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Rework the defrag controller to defrag at most one etcd member per reconciliation loop instead of all members in a single pass. After a successful defrag, a 1-hour cooldown prevents further defrags until the next cycle, spacing them out to be gentler on the cluster. Key changes: - Replace time budget with per-sync cooldown (defaultDefragCooldown = 1h) - Sort non-leader members by fragmentation descending, leader always last - Post-defrag: verify member responds via Status(), log before/after DbSize - Replace deprecated wait.Poll with wait.PollUntilContextTimeout - Fix fake Defragment() to update per-member status for multi-sync tests Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

openshift-ci · 2026-05-15T18:06:23Z

@dusk125: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name	Commit	Details	Required	Rerun command
ci/prow/e2e-metal-ipi-ovn-ipv6	`de3f881`	link	true	`/test e2e-metal-ipi-ovn-ipv6`

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

atiratree · 2026-05-20T13:08:41Z

-		if len(member.ClientURLs) == 0 {
-			// skip unstarted member
+	var (
+		etcdMembers    []*etcdserverpb.Member


this could be a hashmap

atiratree · 2026-05-20T13:10:52Z


+	// Sort non-leader members by fragmentation percentage descending so we
+	// defrag the most fragmented member first each cycle.
+	slices.SortFunc(endpointStatus, func(a, b *clientv3.StatusResponse) int {


can you please add a unit test for this sort?

atiratree · 2026-05-20T13:23:53Z


+	if !c.lastDefragTime.IsZero() && time.Since(c.lastDefragTime) < c.defragCooldown {
+		klog.V(4).Infof("Defrag cooldown active, %v remaining", c.defragCooldown-time.Since(c.lastDefragTime))
+		return nil


we could schedule a next defrag if we know the remaining time here

atiratree · 2026-05-20T13:27:14Z

 		return nil
 	}

+	if !c.lastDefragTime.IsZero() && time.Since(c.lastDefragTime) < c.defragCooldown {


Could it be a problem that a member could exceed the fragmentation limit for 1h in high churn cluster, since this backfoff / cooldown is member agnostic?

hmm maybe we can frame this cooldown differently.... e.g after a defrag, the cluster must have no unhealthy members for at least 10m before defragmenting the next member

atiratree · 2026-05-20T17:25:27Z

+		recorder.Eventf("DefragControllerDefragmentSuccess", "etcd member has been defragmented: %s, memberID: %d", member.Name, member.ID)
+
+		// Give cluster time to recover before the next sync defrags another member.
+		if err := wait.PollUntilContextTimeout(ctx, pollWaitDuration, pollTimeoutDuration, false,


This is not the best pattern for controllers. Could we just select a single member try to defrag it and use a queue for any backoff checks like checking the health?

I suppose the defrag is not that often so a 10m between each member defrag shouldn't be an issue?

line 144 already has that

if !etcdcli.IsClusterHealthy(memberHealth)

so we can have the backoff work through the sync loop for some period of time

tjungblu · 2026-05-26T13:26:45Z

 	numDefragFailures  int
 	defragWaitDuration time.Duration
+	defragCooldown     time.Duration
+	lastDefragTime     time.Time


you will have to persist this somewhere (operator conditions?), imagine you're restarting CEO within that hour.

tjungblu · 2026-05-26T13:40:40Z

+		recorder.Eventf("DefragControllerDefragmentAttempt", "Attempting defrag on member: %s, memberID: %x, dbSize: %d, dbInUse: %d, leader ID: %d", member.Name, member.ID, status.DbSize, status.DbSizeInUse, status.Leader)
+		if _, err := c.defragClient.Defragment(ctx, member); err != nil {
+			// Defrag can timeout if defragmentation takes longer than etcdcli.DefragDialTimeout.
+			errMsg := fmt.Sprintf("failed defrag on member: %s, memberID: %x: %v", member.Name, member.ID, err)


it might not indicate a failure in this case. We have two options: increase the timeout (plus some slack) based on the db size, or just always use some crazy value like 10 minutes.

Not sure it makes sense to poll the status instead, so have a short timeout always and just poll until we're back up and the size has been reduced.

okay I see you added the polling on success - but we need the polling on failure 😄

tjungblu · 2026-05-26T13:44:28Z

-			}
+		recorder.Eventf("DefragControllerDefragmentSuccess", "etcd member has been defragmented: %s, memberID: %d", member.Name, member.ID)
+
+		// Give cluster time to recover before the next sync defrags another member.


if the defrag call came back, then the cluster should be in healthy state and respond. I think the polling is a bit misplaced here.

openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 13, 2026

coderabbitai Bot reviewed May 13, 2026

View reviewed changes

Comment thread pkg/operator/defragcontroller/defragcontroller_test.go Outdated

Use zero defragBudget in test for deterministic budget exhaustion

b0a559c

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

dusk125 changed the title ~~WIP: Add time budget to defrag controller~~ WIP: CNTRLPLANE-3460: Add time budget to defrag controller May 14, 2026

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 14, 2026

dusk125 changed the title ~~WIP: CNTRLPLANE-3460: Add time budget to defrag controller~~ WIP: CNTRLPLANE-3460: Refactor defrag controller to lower impact of defrag May 15, 2026

dusk125 marked this pull request as ready for review May 15, 2026 14:22

dusk125 changed the title ~~WIP: CNTRLPLANE-3460: Refactor defrag controller to lower impact of defrag~~ CNTRLPLANE-3460: Refactor defrag controller to lower impact of defrag May 15, 2026

openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 15, 2026

openshift-ci Bot requested review from ardaguclu and p0lyn0mial May 15, 2026 14:22

atiratree reviewed May 20, 2026

View reviewed changes

tjungblu reviewed May 26, 2026

View reviewed changes

Conversation

dusk125 commented May 13, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

openshift-ci Bot commented May 13, 2026

Uh oh!

coderabbitai Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

openshift-ci Bot commented May 13, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

openshift-ci-robot commented May 14, 2026 • edited by openshift-ci Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

openshift-ci Bot commented May 15, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

dusk125 commented May 13, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 13, 2026 •

edited

Loading

openshift-ci-robot commented May 14, 2026 •

edited by openshift-ci Bot

Loading