feat(tonic-xds): add gRFC A42 ring-hash picker + member tracking#2695

Merged
YutaoMa merged 5 commits into
grpc:master from
madhurishgupta:madhurishgupta/a42-pr2-ring-hash-picker
Jun 23, 2026

@madhurishgupta

Contributor

Summary

Adds the gRFC A42 ring-hash load-balancing picker to the loadbalance stack, giving consistent-hash request affinity: requests carrying the same hash key are routed to the same backend.

What's included

  • RingHashPicker (pickers/ring_hash.rs)
    • Builds a hash ring over the cluster's full healthy-EDS membership with uniform per-member weighting.
    • Entries keyed xxh64("{addr}_{i}", 0) (XXH64, seed 0); the ring is held lock-free behind an ArcSwap and rebuilt only on membership change.
    • pick() reads RouteDecision.request_hash (per-request random fallback when absent), finds the ring position closest to that hash (first entry with hash ≥ request, wrapping), and walks clockwise to the first ready host — returning Unavailable if no ring host is ready.
  • LoadBalancer member tracking: tracks the full healthy-EDS member set (independent of connection/ejection state), rebuilding the picker's ring once per discovery drain.
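The lookup described above (first entry with hash ≥ request, wrapping, then a clockwise walk to the first ready host) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `RingEntry`'s fields, the `String` addresses, and the `HashSet` standing in for the ready set are all assumptions.

```rust
use std::collections::HashSet;

/// One ring slot: a hash position and the member that owns it.
/// (Field names and the String address type are assumptions for this sketch.)
#[derive(Debug, Clone, PartialEq)]
struct RingEntry {
    hash: u64,
    addr: String,
}

/// Finds the first entry with `hash >= request_hash` (wrapping to index 0),
/// then walks clockwise until it reaches a host present in `ready`.
/// `ring` must already be sorted by `hash`.
/// Returns `None` if the ring is empty or no ring host is ready.
fn pick<'a>(
    ring: &'a [RingEntry],
    ready: &HashSet<String>,
    request_hash: u64,
) -> Option<&'a str> {
    if ring.is_empty() {
        return None;
    }
    // partition_point returns ring.len() when every hash is smaller than
    // the request hash; the modulo wraps that case back to the first entry.
    let start = ring.partition_point(|e| e.hash < request_hash) % ring.len();
    (0..ring.len())
        .map(|step| &ring[(start + step) % ring.len()])
        .find(|e| ready.contains(&e.addr))
        .map(|e| e.addr.as_str())
}
```

A host that is in the ring but missing from `ready` (for example, an ejected one) is simply skipped by the walk, so its keys land on the next ready host clockwise.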

Behavior notes

  • Outlier detection composes with no ring-hash-specific code. The ring is built over members; picks resolve against the ready set. An ejected host stays in the ring but is absent from ready, so its keys fall through clockwise to the next ready host.

Testing

Added unit tests.
cargo fmt, cargo clippy, and cargo test -p tonic-xds are all clean.

Plan (A42 series)

  • Last PR: Request-hash computation + plumbing. (feat(tonic-xds): compute gRFC A42 request hash from header hash policy #2686)
  • This PR: Add gRFC A42 ring-hash picker + member tracking
  • Third PR: CDS wiring — parse lb_policy: RING_HASH and ring_hash_lb_config (validating hash_function == XX_HASH), and select the ring-hash picker from the cluster's LB policy.
  • Fourth PR: RDS wiring — parse RouteAction.hash_policy to populate the policy list (replacing the empty scaffold here).

Implements the ring-hash LB picker on the loadbalance/ stack, with ring
construction and the hash-position walk mirroring grpc-go's ringhash balancer:

- RingHashPicker: builds an A42-conformant ring (uniform per-member weighting) —
  size = smallest multiple of N >= min_ring_size, clamped to max_ring_size;
  entries keyed xxh64("{addr}_{i}", 0); ring held lock-free behind ArcSwap.
  pick() reads RouteDecision.request_hash (per-request random fallback), finds
  the ring position closest to that hash and walks clockwise to the first ready
  host (None/Unavailable if no ring host is ready).
- ChannelPicker::on_members_changed hook (default no-op; P2C inherits it),
  delegating to RingHashPicker::rebuild.
- LoadBalancer tracks `members` (full healthy-EDS set, independent of
  connection/ejection state) and rebuilds the picker's ring once per discovery
  drain. Outlier detection composes for free: ejected hosts stay in the ring
  but are not picked (not in `ready`).
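To make the sizing rule concrete, here is a rough sketch of the ring-size computation and the uniform per-member split. The function names are hypothetical, and the sketch assumes `emitted` tracks the cumulative number of entries written so far (the reviewed hunk shows only `target - emitted`).

```rust
/// Ring size: the smallest multiple of `n` that is >= `min_ring_size`,
/// clamped to `max_ring_size`. Returns 0 for an empty member set.
/// (Hypothetical helper; the PR may structure this differently.)
fn ring_size(n: u64, min_ring_size: u64, max_ring_size: u64) -> u64 {
    if n == 0 {
        return 0;
    }
    // At least one entry per member, then round up to a multiple of n.
    let rounded = min_ring_size.div_ceil(n).max(1).saturating_mul(n);
    rounded.min(max_ring_size)
}

/// Number of ring entries each of the `n` members receives. Member `k`
/// is given entries up to the cumulative target ceil(ring_size * (k + 1) / n),
/// so the counts differ by at most one and sum to `ring_size`.
fn entries_per_member(n: u64, ring_size: u64) -> Vec<u64> {
    let mut emitted = 0u64;
    (0..n)
        .map(|k| {
            // u128 arithmetic avoids overflow in the multiplication.
            let target =
                (ring_size as u128 * (k as u128 + 1)).div_ceil(n as u128) as u64;
            let count = target - emitted;
            emitted = target;
            count
        })
        .collect()
}
```

With 3 members and `min_ring_size = 1024`, the ring has 1026 entries, 342 per member. If the clamp cuts the ring to 1000 entries, the split becomes 334, 333, 333.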

Currently it supports uniform weighting and an eager-connect pick that selects
the first ready host. The remaining A42 connection semantics — IDLE-start with
connect-on-pick, queuing while CONNECTING, the TRANSIENT_FAILURE-aware walk,
weight-proportional rings, and aggregated-connectivity-state rules — are gated
on the load balancer's connection model and deferred. The picker is not yet
selected by lb_policy (default-wired in a later change).

Tests: 16 picker unit tests + 1 LoadBalancer member-tracking integration test.
@madhurishgupta madhurishgupta force-pushed the madhurishgupta/a42-pr2-ring-hash-picker branch from f9b2dcc to 575ebed Compare June 19, 2026 00:29

/// Ring-hash LB configuration (gRFC A42 `ring_hash_lb_config`).
#[derive(Debug, Clone, Copy)]
pub(crate) struct RingHashConfig {

Contributor

Maybe add a validation to RingHashConfig? This can prevent massive vector allocation for ring object when the config value is invalid.

Contributor Author

This validation is part of the third PR, which implements the CDS wiring.
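For reference, the kind of check being discussed might look something like the sketch below. Everything here is an assumption for illustration: the field names are inferred from the `min_ring_size` / `max_ring_size` options in the description, and both `validate` and `ASSUMED_RING_SIZE_CAP` are hypothetical, not taken from this PR or the planned follow-up.

```rust
/// Field names are assumed from the `min_ring_size` / `max_ring_size`
/// options mentioned in the PR description.
#[derive(Debug, Clone, Copy)]
struct RingHashConfig {
    min_ring_size: u64,
    max_ring_size: u64,
}

/// Hypothetical upper bound, chosen here only for illustration.
const ASSUMED_RING_SIZE_CAP: u64 = 8 * 1024 * 1024;

impl RingHashConfig {
    /// Rejects configs that would produce an empty or oversized ring,
    /// so a bad value fails fast instead of driving a huge allocation.
    fn validate(&self) -> Result<(), &'static str> {
        if self.min_ring_size == 0 {
            return Err("min_ring_size must be at least 1");
        }
        if self.min_ring_size > self.max_ring_size {
            return Err("min_ring_size must not exceed max_ring_size");
        }
        if self.max_ring_size > ASSUMED_RING_SIZE_CAP {
            return Err("max_ring_size exceeds the assumed cap");
        }
        Ok(())
    }
}
```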

@YutaoMa

YutaoMa commented Jun 23, 2026

Contributor

Is the plan to only support uniform weighting after the 4 PRs? A42 requires EDS and locality weighting. It's ok to defer it after this PR but to declare proper A42 support adding weight support is needed.

for (k, addr) in members.iter().enumerate() {
    let target = (ring_size as u128 * (k as u128 + 1)).div_ceil(n as u128) as u64;
    for i in 0..(target - emitted) {
        let key = format!("{addr}_{i}");

Contributor

can reuse the same String buffer in each iteration instead of allocating every time

Contributor Author

Makes sense, made the change

Build the per-entry ring key into a single reused String (clear + write!)
instead of allocating a fresh String each iteration, turning O(ring_size)
allocations per rebuild into O(1). The key contents are unchanged, so the
ring is identical (pinned-digest tests still pass).
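The clear + write! pattern from this commit looks roughly like the sketch below. The `ring_keys` helper and its `visit` callback are hypothetical, standing in for whatever the real loop does with each key (hashing it into the ring).

```rust
use std::fmt::Write;

/// Produces the keys "{addr}_0", "{addr}_1", ... into one reused buffer
/// and hands each to `visit`. The buffer is cleared, not reallocated, on
/// every iteration, so its capacity is kept across iterations.
/// (Hypothetical helper; in the PR the key is hashed rather than visited.)
fn ring_keys(addr: &str, count: usize, mut visit: impl FnMut(&str)) {
    let mut key = String::new();
    for i in 0..count {
        key.clear();
        // fmt::Write on String is infallible in practice.
        write!(key, "{addr}_{i}").expect("writing to a String cannot fail");
        visit(&key);
    }
}
```

The formatted contents match what `format!("{addr}_{i}")` would produce, which is why the ring comes out identical.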
/// entries. Each entry is keyed
/// `xxh64("{addr}_{i}", 0)`, `i` being the member's previous appearance
/// count, and entries are then sorted by hash.
fn build_ring(config: &RingHashConfig, members: &IndexSet<EndpointAddress>) -> Vec<RingEntry> {

Contributor

One idiomatic Rust style suggestion: use the newtype pattern, Ring(Vec<RingEntry>), so that only valid rings can be represented. Most of the build and pick logic can then move into methods on the Ring type, leaving the picker to do minimal delegation between the config and the ring.

Contributor Author

Thanks for mentioning this, it makes sense.
Created a new struct Ring with new and pick functions.

madhurishgupta and others added 2 commits June 23, 2026 14:44
Introduce `struct Ring(Vec<RingEntry>)` whose only constructor sorts the
entries, so a Ring is always sorted by hash and Ring::pick's binary search is
sound by construction rather than by an implicit convention. Move ring building
and the hash-position walk onto Ring; RingHashPicker now just extracts the
request hash and delegates to the loaded ring.
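A minimal sketch of the newtype idea, under assumed names: the inner Vec is private to the module, so the sorting constructor is the only way in and the binary search can rely on the order. The `first_at_or_after` method and the `RingEntry` fields are hypothetical and only illustrate the invariant; they are not the PR's `Ring::pick`.

```rust
mod ring {
    /// One ring slot (field names are assumptions for this sketch).
    #[derive(Debug, Clone, PartialEq)]
    pub struct RingEntry {
        pub hash: u64,
        pub addr: String,
    }

    /// Invariant: entries are sorted by `hash`. The inner Vec is private,
    /// so code outside this module can only build a Ring through `new`.
    #[derive(Debug)]
    pub struct Ring(Vec<RingEntry>);

    impl Ring {
        /// The only constructor: sorts the entries, establishing the invariant.
        pub fn new(mut entries: Vec<RingEntry>) -> Self {
            entries.sort_by_key(|e| e.hash);
            Ring(entries)
        }

        /// First entry with `hash >= request_hash`, wrapping to the start.
        /// The binary search is sound because `new` guarantees sorted order.
        pub fn first_at_or_after(&self, request_hash: u64) -> Option<&RingEntry> {
            if self.0.is_empty() {
                return None;
            }
            let idx = self.0.partition_point(|e| e.hash < request_hash) % self.0.len();
            self.0.get(idx)
        }
    }
}
```

Callers can pass entries in any order; an unsorted ring is unrepresentable outside the module.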
@madhurishgupta

Contributor Author

> Is the plan to only support uniform weighting after the 4 PRs? A42 requires EDS and locality weighting. It's ok to defer it after this PR but to declare proper A42 support adding weight support is needed.

Correct. Uniform weighting is an intentional initial scope for this PR.
Full A42 conformance needs weight-proportional rings and that's planned as a dedicated follow-up.

Add a TODO marking that full A42 sizes the ring by each endpoint's EDS and
locality weight; the picker currently uses uniform weights.
@YutaoMa YutaoMa merged commit 618389f into grpc:master Jun 23, 2026
26 checks passed