@stas-openai

Description

This PR adds an opt-in configuration to Change Feed Processor (CFP) to allow acquiring more than one lease per lease-
acquire cycle, which can significantly improve rebalance/convergence time during scale-out and rolling deployments
(especially when hosts use ephemeral identities, e.g., Kubernetes rollouts).

Today, when multiple CFP workers exist, the default EqualPartitionsBalancingStrategy is intentionally conservative and
effectively attempts to take ownership of at most one lease per cycle. In large lease sets / high physical partition
counts, this can make rebalancing very gradual and can temporarily reduce processing throughput after deployments.
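To make the convergence cost concrete, here is a back-of-the-envelope sketch rather than SDK code. It assumes a new worker needs to pick up a fixed number of unowned or expired leases and takes at most `perCycle` of them in each acquire cycle. The class name, method name, and the figure of 250 leases are hypothetical illustration values, not measurements from this PR.

```java
// Hypothetical model for illustration; not part of azure-cosmos.
final class LeaseConvergenceSketch {

    /**
     * Number of acquire cycles needed to pick up targetLeases
     * when at most perCycle leases are taken in each cycle.
     */
    static int cyclesToConverge(int targetLeases, int perCycle) {
        if (targetLeases < 0 || perCycle <= 0) {
            throw new IllegalArgumentException("targetLeases must be >= 0 and perCycle must be > 0");
        }
        // Integer ceiling division.
        return (targetLeases + perCycle - 1) / perCycle;
    }

    public static void main(String[] args) {
        // One lease per cycle (the conservative case described above).
        System.out.println(cyclesToConverge(250, 1));  // prints 250
        // Up to 25 leases per cycle.
        System.out.println(cyclesToConverge(250, 25)); // prints 10
    }
}
```

Under these assumptions the number of cycles shrinks in proportion to the per-cycle cap, which is the effect the opt-in setting is meant to provide.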

Changes included:

  • Added ChangeFeedProcessorOptions#setMaxLeasesToAcquirePerCycle(int) (default 0 preserves legacy behavior).
  • Updated EqualPartitionsBalancingStrategy to honor this option for expired/unowned lease acquisition.
  • Wired the option through both CFP implementations (EPK and PK/Incremental).
  • Added unit tests covering strategy selection and ensuring the load balancer attempts all leases returned by the
    strategy.
  • Updated CHANGELOG.
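As a rough illustration of the new option's contract, the sketch below models a setter whose default of `0` keeps the legacy behavior. `CfpOptionsSketch` is a hypothetical stand-in, not the real `ChangeFeedProcessorOptions`, and rejecting negative values is an assumption: the PR does not spell out the exact validation rule.

```java
// Hypothetical stand-in for illustration; not the real ChangeFeedProcessorOptions.
final class CfpOptionsSketch {

    // 0 means "keep the legacy acquisition behavior".
    private int maxLeasesToAcquirePerCycle = 0;

    // Fluent setter. Assumption: negative values are rejected.
    CfpOptionsSketch setMaxLeasesToAcquirePerCycle(int maxLeasesToAcquirePerCycle) {
        if (maxLeasesToAcquirePerCycle < 0) {
            throw new IllegalArgumentException("maxLeasesToAcquirePerCycle must be >= 0");
        }
        this.maxLeasesToAcquirePerCycle = maxLeasesToAcquirePerCycle;
        return this;
    }

    int getMaxLeasesToAcquirePerCycle() {
        return this.maxLeasesToAcquirePerCycle;
    }

    public static void main(String[] args) {
        // Opt in to acquiring up to 10 leases per cycle.
        CfpOptionsSketch options = new CfpOptionsSketch().setMaxLeasesToAcquirePerCycle(10);
        System.out.println(options.getMaxLeasesToAcquirePerCycle());
    }
}
```

Keeping `0` as the "legacy" sentinel means existing callers see no behavior change unless they set the option explicitly.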

No Swagger regeneration involved.

All SDK Contribution checklist:

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously
    merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR,
    see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Copilot AI review requested due to automatic review settings December 28, 2025 15:04
@stas-openai requested review from a team and kirankumarkolli as code owners on December 28, 2025 15:04
@github-actions bot added the Community Contribution, Cosmos, and customer-reported labels on Dec 28, 2025
@github-actions
Contributor

Thank you for your contribution @stas-openai! We will review the pull request and get back to you soon.

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces an opt-in configuration (setMaxLeasesToAcquirePerCycle) to the Change Feed Processor (CFP) that allows acquiring multiple leases per balancing cycle, improving rebalance/convergence time during scale-out and rolling deployments. The default value of 0 preserves the legacy conservative behavior of acquiring at most one lease per cycle when multiple workers exist.

Key Changes:

  • Added maxLeasesToAcquirePerCycle field and accessors to ChangeFeedProcessorOptions with validation
  • Updated EqualPartitionsBalancingStrategy to honor the new option for expired/unowned lease acquisition while keeping the legacy 1-lease-per-cycle behavior for stealing
  • Wired the new option through both CFP implementations (EPK and PK/Incremental versions)

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| ChangeFeedProcessorOptions.java | Adds the new maxLeasesToAcquirePerCycle field with getter/setter and validation |
| IncrementalChangeFeedProcessorImpl.java | Passes the new option to the EqualPartitionsBalancingStrategy constructor |
| ChangeFeedProcessorImplBase.java | Passes the new option to the EqualPartitionsBalancingStrategy constructor |
| EqualPartitionsBalancingStrategy.java | Implements the multi-lease acquisition logic while maintaining legacy behavior for stealing |
| CHANGELOG.md | Documents the new feature in the changelog |
| PartitionLoadBalancerImplTests.java (pkversion) | Adds test verifying all leases returned by strategy are attempted |
| PartitionLoadBalancerImplTests.java (epkversion) | Adds test verifying all leases returned by strategy are attempted |
| EqualPartitionsBalancingStrategyTests.java | Adds comprehensive unit tests for the new multi-lease acquisition behavior |
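The load balancer tests listed above check that every lease returned by the strategy is attempted. The sketch below is a miniature, hypothetical version of that contract, not the SDK's `PartitionLoadBalancerImpl`: the `Strategy` interface, the `runCycle` method, and the choice to continue after a failed acquire are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical miniature of one load-balancing cycle; names are illustrative only.
final class LoadBalancerCycleSketch {

    /** Returns the leases this worker should try to take in the current cycle. */
    interface Strategy {
        List<String> selectLeasesToTake(List<String> allLeases);
    }

    /**
     * Attempts every lease the strategy returned and reports which ones were attempted.
     * Assumption: a failed acquire does not stop the loop.
     */
    static List<String> runCycle(Strategy strategy, List<String> allLeases, Predicate<String> tryAcquire) {
        List<String> attempted = new ArrayList<>();
        for (String lease : strategy.selectLeasesToTake(allLeases)) {
            attempted.add(lease);
            // The acquire result is ignored in this sketch; a real balancer would track ownership.
            tryAcquire.test(lease);
        }
        return attempted;
    }
}
```

A test against this sketch would hand `runCycle` a strategy that returns several leases and assert that the attempted list matches the strategy's output, even when one acquire fails.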

@github-actions
Contributor

github-actions bot commented Dec 28, 2025

API Change Check

APIView identified API level changes in this PR and created the following API reviews

com.azure:azure-cosmos

@stas-openai
Author

@microsoft-github-policy-service agree company="OpenAI"

@stas-openai changed the title from "Enable Configurable Multi-Lease Acquisition Per Cycle in Change Feed Processor" to "Enable configurable multi-lease acquisition per cycle in change feed processor" on Dec 28, 2025
@kushagraThapar
Member

Thank you, @stas-openai, for your contribution to improving the Change Feed Processor logic. We will review this PR in the upcoming week (once everyone is back from vacation) and get back to you with any comments. Thanks again; your contributions are deeply appreciated!

/**
* Maximum number of leases the instance will try to acquire in a single load balancing cycle.
* <p>
* A value of {@code 0} keeps the legacy behavior (which is intentionally conservative when multiple workers exist and
Member

NIT: Could you add a short comment explaining when and why a higher value is useful, and what the reasons are for not using a very high value? Basically, a bit of guidance on how to estimate a good value here.

Member

Never mind - saw that the comment for the public API has some guidance already.

### 4.77.0-beta.1 (Unreleased)

#### Features Added
* Added `ChangeFeedProcessorOptions#setMaxLeasesToAcquirePerCycle(int)` to allow faster acquisition of unused/expired leases during scale-out and rolling deployments (default `0` preserves legacy behavior).
Member

NIT - template should be followed (including link to the PR)

Member

@FabianMeiswinkel left a comment

Looks good to me - Thanks for the contribution!

}
// For multiple acquisitions, shuffle and take a random subset.
Collections.shuffle(expiredLeases, random);
this.logger.info("Found {} unused or expired leases; previous lease count for instance owner {} is {}, count of leases to target is {} and maxScaleCount {} ",
Member

These log lines can include maxLeasesToAcquirePerCycle as well.
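The shuffle-and-take-a-subset step in the quoted hunk, together with the suggested log field, could look roughly like the sketch below. This is not the code from `EqualPartitionsBalancingStrategy`: the class and method names are hypothetical, treating `0` as a cap of one lease is a simplification of the multi-worker legacy behavior described in the PR, and the log wording is only one way to include `maxLeasesToAcquirePerCycle`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch; not the actual EqualPartitionsBalancingStrategy code.
final class ExpiredLeasePickerSketch {

    /** Picks a random subset of expired leases, bounded by the leases needed and the per-cycle cap. */
    static List<String> pick(List<String> expiredLeases, int leasesNeeded,
                             int maxLeasesToAcquirePerCycle, Random random) {
        if (maxLeasesToAcquirePerCycle < 0) {
            throw new IllegalArgumentException("maxLeasesToAcquirePerCycle must be >= 0");
        }
        // Assumption: 0 (legacy) is modeled as "at most one lease per cycle".
        int cap = maxLeasesToAcquirePerCycle == 0 ? 1 : maxLeasesToAcquirePerCycle;
        int count = Math.max(0, Math.min(expiredLeases.size(), Math.min(leasesNeeded, cap)));

        // Shuffle a copy so the caller's list is left untouched, then take the first 'count' entries.
        List<String> shuffled = new ArrayList<>(expiredLeases);
        Collections.shuffle(shuffled, random);
        return new ArrayList<>(shuffled.subList(0, count));
    }

    /** Builds a log message that also reports the per-cycle cap, as the review comment suggests. */
    static String describe(int expiredCount, String owner, int ownedCount,
                           int targetCount, int maxLeasesToAcquirePerCycle) {
        return String.format(
            "Found %d unused or expired leases; previous lease count for instance owner %s is %d, "
                + "count of leases to target is %d and maxLeasesToAcquirePerCycle %d",
            expiredCount, owner, ownedCount, targetCount, maxLeasesToAcquirePerCycle);
    }
}
```

Shuffling before truncating means that concurrent workers are less likely to race for the same leases than if each took the first entries in a shared ordering.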

4 participants