Skip to content

feat(tonic-xds): parse RING_HASH lb policy from CDS (gRFC A42)#2701

Open
madhurishgupta wants to merge 5 commits into
grpc:masterfrom
madhurishgupta:madhurishgupta/a42-pr3-cds-ring-hash
Open

feat(tonic-xds): parse RING_HASH lb policy from CDS (gRFC A42)#2701
madhurishgupta wants to merge 5 commits into
grpc:masterfrom
madhurishgupta:madhurishgupta/a42-pr3-cds-ring-hash

Conversation

@madhurishgupta

@madhurishgupta madhurishgupta commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds CDS support for the gRFC A42 ring-hash load-balancing policy. A Cluster with lb_policy: RING_HASH is now accepted and its ring_hash_lb_config validated into LbPolicy::RingHash(RingHashSettings { min_ring_size, max_ring_size }).

Validation rules (mirroring grpc-go's balancer/ringhash/config.go ordering — ceiling → min/max → cap):

  • NACK a hash_function other than XX_HASH (rejects MURMUR_HASH_2).
  • NACK any ring size above the 8M ceiling.
  • NACK min_ring_size > max_ring_size.
  • Default unset minimum_ring_size to 1024 and maximum_ring_size to 4096, then clamp both to the local cap of 4096.

Tests

  • RingHashSettings::validate unit tests in lb_policy: defaults; custom sizes within the cap; sizes clamped to the local cap; reject non-XX_HASH; reject size above the 8M ceiling; reject min > max.
  • cluster tests: a RING_HASH cluster validates end-to-end to LbPolicy::RingHash(..); an unsupported policy (MAGLEV) is still NACKed.
  • cargo fmt, clippy, and cargo test -p tonic-xds all clean.

Plan

Accept lb_policy: RING_HASH and parse ring_hash_lb_config into
LbPolicy::RingHash { min_ring_size, max_ring_size }:

- NACK a hash_function other than XX_HASH.
- NACK any ring size above the 8M ceiling, and NACK min_ring_size >
  max_ring_size (matching grpc-go).
- Default unset minimum_ring_size to 1024 and maximum_ring_size to 4096, then
  clamp both to the local cap of 4096.

Picker selection from lb_policy is deferred (no live Stack B construction yet).
…policy

Move the LbPolicy enum and ring_hash_lb_config parsing out of cluster.rs into a
dedicated lb_policy module. The RingHash variant now carries a RingHashSettings
payload whose validate() owns the gRFC A42 rules, so the validation lives with
the config it produces and is unit-testable on its own. cluster.rs re-exports
LbPolicy so existing cluster::LbPolicy paths are unchanged; ClusterResource just
delegates to RingHashSettings::validate.
@madhurishgupta madhurishgupta force-pushed the madhurishgupta/a42-pr3-cds-ring-hash branch from 7fdaaad to a921b5e Compare June 24, 2026 17:44
@madhurishgupta madhurishgupta marked this pull request as ready for review June 24, 2026 17:53
@YutaoMa YutaoMa self-assigned this Jun 24, 2026
Comment thread tonic-xds/src/xds/resource/lb_policy.rs Outdated
Comment thread tonic-xds/src/xds/resource/lb_policy.rs Outdated
Defaulting an unset maximum_ring_size directly to the local cap made a config
with min_ring_size in (4096, 8M] and an unset max get NACKed (min > max), which
grpc-go accepts: its CDS layer defaults unset max to 8M, so min <= max holds and
both are clamped to the cap later. Restore that two-step (default 8M, then clamp
to 4096) so the min > max check matches grpc-go.
@madhurishgupta madhurishgupta force-pushed the madhurishgupta/a42-pr3-cds-ring-hash branch from 9c21137 to e46263d Compare June 24, 2026 21:32
Comment thread tonic-xds/src/xds/resource/lb_policy.rs Outdated
Comment thread tonic-xds/src/xds/resource/lb_policy.rs Outdated
…ields

Make RingHashSettings fields private with getters so the only way to obtain a
value is RingHashSettings::validate — invalid (out-of-bounds) values can no
longer be constructed directly. Also collapse the duplicate 8M default/ceiling
constants into a single RING_HASH_SIZE_CEILING used for both the unset-max
default and the ceiling check.
@YutaoMa

YutaoMa commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Heads up: since this PR flips ring hash config from NACK to ACK, I'd like to wait until either the LB support is added, or the first version of tonic-xds is released, before merging this code. Reason being if we merge this now and have a version published before the LB is actually wired in, that version will ACK ring hash config but fail to do ring hash properly, resulting in silent failure from the POV of the control plane.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants