refactor: reconcile retry defaults and backoff across the two retry steps#97
Merged
Conversation
… and server-driven retries Three additive, opt-in enhancements to the retry layer: - Parse fractional Retry-After values. RetryAfterParser now reads the Retry-After delta-seconds form with toDoubleOrNull, so values like `Retry-After: 1.5` are honoured to nanosecond resolution instead of being silently ignored. Integer seconds and the HTTP-date form are unchanged, and the negative / NaN / non-finite guards plus the 365-day ceiling still apply. The numeric form now shares that ceiling with the date and Unix-epoch forms, so a far-future delta clamps rather than flowing through unbounded. Both retry layers benefit, since they delegate parsing here. - Optional per-attempt retry-count request header. RetrySettings gains an opt-in attemptHeaderName (default null). When set, the recovery-aware RetryStep stamps that header with the 1-based attempt ordinal on a per-attempt copy of the request, so servers and proxies can observe the retry count. The header is applied via Request.newBuilder, leaving the immutable template and any caller-supplied idempotency key untouched. - Server-driven retry predicate. New ServerOverrideRetryPredicate implements the composable HttpRetryConditionPredicate seam and honours an X-Should-Retry-style response header (configurable name): a truthy value forces a retry, a falsy value suppresses one, and an absent or unrecognised value defers to a delegate predicate. It is not installed by default; the default retryable-status set is unchanged, so 409 stays out of it unless a server explicitly opts a response in.
The stage-based DefaultRetryStep and the recovery-aware RetryStep shipped divergent retry defaults and two different backoff formulas. DefaultRetryStep used an 800ms base delay, a hard-coded `1L shl tryCount` schedule, and ±5% asymmetric jitter; RetryStep used a 200ms initial delay, a configurable 2.0 multiplier, and 0.2 symmetric jitter via BackoffCalculator. They also disagreed on retryable statuses: RetryUtils (and HttpException.retryable) treat 408 as retryable, but RetrySettings dropped it from the default set, so the two stacks retried different responses. Reconcile both around a single source of truth: - BackoffCalculator is now the only backoff implementation. DefaultRetryStep computes its exponential delay through it (via a RetrySettings view built from the options plus the shared multiplier/jitter constants), with the total-timeout deadline disabled since the stage-based step carries no budget. The duplicate shift-based formula, its jitter helper, and the now dead MAX_SHIFT_TRY_COUNT constant are removed. - The canonical defaults live as public constants on RetrySettings (DEFAULT_DELAY_MULTIPLIER, DEFAULT_JITTER, DEFAULT_MAX_ATTEMPTS, plus the existing delay constants). HttpRetryOptions references them, so its base delay (200ms), max delay (8s), and retry budget can no longer drift. Both stacks now default to 3 total sends (HttpRetryOptions.maxRetries=2, RetrySettings.maxAttempts=3). - 408 is retryable by default everywhere: it is added to RetrySettings.DEFAULT_RETRYABLE_STATUSES, matching RetryUtils and the stage-based classifier. Updates KDoc and docs/pipelines.md to state the shared values, adds tests pinning the defaults and the backoff sequence on both stacks, and refreshes the existing jitter/clamp assertions for the unified ±10% jitter. Regenerates the sdk-core API snapshot for the constant changes.
…y validation Stop the recovery-aware 408 test from leaking a single-thread scheduled executor: with zero delays the step never schedules a deferred wait, so omit the caller-supplied scheduler and let RetryStep route through its process-wide lazy daemon scheduler (which is never leaked per run). RetryStep does not own or shut down a caller-supplied scheduler, so constructing one inline left a non-daemon thread per test run. Replace the inline fully-qualified references (assertions, Executors, HttpException, Protocol, Method, Clock, Instant) with imports to match the rest of the suite, and add a test pinning that constructing DefaultRetryStep now validates delay magnitudes eagerly. Document that eager validation on the backoffSettings field and smooth the delay-precedence KDoc phrasing.
…econcile # Conflicts: # sdk-core/src/main/kotlin/org/dexpace/sdk/core/http/pipeline/steps/ServerOverrideRetryPredicate.kt # sdk-core/src/main/kotlin/org/dexpace/sdk/core/pipeline/step/retry/RetryAfterParser.kt # sdk-core/src/test/kotlin/org/dexpace/sdk/core/http/pipeline/steps/RetryStepTest.kt # sdk-core/src/test/kotlin/org/dexpace/sdk/core/pipeline/step/retry/RetryAfterParserTest.kt
…ryOptions multiplier/jitter knobs
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The two retry stacks — the stage-based
DefaultRetryStepand the recovery-awareRetryStep— had drifted apart: different defaults (base delay 800 ms vs 200 ms, ±5% asymmetric vs ±10% symmetric jitter, 4 vs 3 total sends) and two separate backoff formulas, and they disagreed on whether 408 is retryable.This reconciles them:
BackoffCalculator(deadline-aware, symmetric jitter) becomes the single backoff implementation; the stage-based step now computes its delay through it, and the duplicate shift-based formula and its jitter helper are removed.RetrySettings(initial 200 ms, max 8 s, multiplier 2.0, ±10% jitter, 3 total sends) and both stacks reference them, so they can't silently drift again.Tests pin the now-shared defaults and the backoff sequence on both stacks so future drift fails the build; the API snapshot adds the shared constants and drops a now-dead one.
Closes #38