161 changes: 88 additions & 73 deletions gems/smithy-client/lib/smithy-client/plugins/retry_errors.rb
@@ -6,66 +6,30 @@ module Plugins
# @api private
class RetryErrors < Plugin
option(
:retry_strategy,

Collaborator

Juli might have more context, but should the class option be removed? I think it allowed customers to implement their own custom retry strategies, but I'm not sure whether we still want to support that.

Contributor Author

The SEP doesn't require custom retry strategy configuration, and as far as I know the specific configurable options exposed in V3's retry plugin also apply only to legacy. I might well be missing something, but it seems V3 doesn't support a fully customizable strategy either.

Contributor

In V3, a customer could pass their own custom Proc to influence how backoffs work, but that is under the :legacy umbrella. We could accommodate this legacy support via the migration gem.

While the SEP does not mention custom retries, I did see some references in the Smithy docs: https://smithy.io/2.0/guides/client-guidance/retries.html

Customers who do NOT want to use the migration gem but still want custom retries could implement their own strategy by duck typing our retries interface (see below). Other options are replacing this plugin with their own, or even using custom welds.

Note: I added more context in a separate comment below on what we need to do to resolve this.
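
A minimal sketch of the duck-typing route described above, assuming the four-method interface this PR documents for `:retry_strategy`. `FixedDelayStrategy`, `SketchToken`, and `SketchErrorInfo` are hypothetical names used only for illustration; they are not part of smithy-client.

```ruby
# Hypothetical stand-ins; the real token and error-info classes live in smithy-client.
SketchToken = Struct.new(:retry_count, :retry_delay, :no_retry_reason)
SketchErrorInfo = Struct.new(:retryable) do
  def retryable?
    retryable
  end
end

# A custom strategy that retries with a constant delay and no retry quota.
class FixedDelayStrategy
  def initialize(max_attempts: 3, delay: 0.5)
    @max_attempts = max_attempts
    @delay = delay
  end

  def acquire_initial_retry_token(_token_scope = nil)
    SketchToken.new(0, 0, nil)
  end

  # Returns the token when a retry is allowed, or nil to stop retrying.
  def refresh_retry_token(token, error_info)
    return unless error_info.retryable?
    return if token.retry_count >= @max_attempts - 1

    token.retry_count += 1
    token.retry_delay = @delay
    token
  end

  def record_success(token)
    token
  end

  # This sketch keeps no internal state, so there is nothing to update.
  def request_bookkeeping(_error_info = nil); end
end

strategy = FixedDelayStrategy.new(max_attempts: 3)
token = strategy.acquire_initial_retry_token
error = SketchErrorInfo.new(true)
retries = 0
retries += 1 while (token = strategy.refresh_retry_token(token, error))
puts retries # 2 retries after the initial attempt
```

Because `max_attempts` includes the initial attempt, a value of 3 allows two retries before `refresh_retry_token` returns nil.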

:retry_mode,
Comment thread
jterapin marked this conversation as resolved.
default: 'standard',
doc_default: "'standard'",
doc_type: 'String, Class',
doc_type: String,
docstring: <<~DOCS)
The retry strategy to use when retrying errors. This can be one of the following:
* `standard` - A standardized retry strategy used by the AWS SDKs. This includes support
for retry quotas, which limit the number of unsuccessful retries a client can make.
* `adaptive` - An experimental retry strategy that includes all the functionality of the
`standard` strategy along with automatic client side throttling. This is a provisional
strategy that may change behavior in the future.
* Any instance of a class that implements the following methods:
- `acquire_initial_retry_token(token_scope)`
- `refresh_retry_token(retry_token, error_info)`
- `record_success(retry_token)`
Specifies which retry algorithm to use. Values are:

* `standard` - A standardized set of retry rules across the Smithy-based SDKs.
This includes support for retry quotas, which limit the number of
unsuccessful retries a client can make. This is the default
value if no retry mode is provided.

* `adaptive` - A retry mode that includes all the functionality of
`standard` mode along with automatic client side throttling.
DOCS

option(
:retry_max_attempts,
:max_attempts,
default: 3,
doc_type: Integer,
docstring: <<~DOCS)
The maximum number of attempts that will be made for a single request, including
the initial attempt. Used in the `standard` and `adaptive` retry strategies.
DOCS

option(
:retry_max_delay,
default: 20,
docstring: <<~DOCS)
The maximum delay, in seconds, between retry attempts. This option is ignored
if a custom `retry_backoff` is provided. Used in the `standard` and `adaptive`
retry strategies.
DOCS

option(
:retry_base_delay,
default: 2,
docstring: <<~DOCS)
The base delay, in seconds, used to calculate the exponential backoff for
retry attempts. This option is ignored if a custom `retry_backoff` is provided.
Used in the `standard` and `adaptive` retry strategies.
the initial attempt.
DOCS

option(
:retry_backoff,
doc_default: 'Smithy::Client::Retry::ExponentialBackoff.new',
rbs_type: 'Smithy::Client::Retry::ExponentialBackoff',
doc_type: '#call(attempts)',
docstring: <<~DOCS) do |config|
A callable object that calculates a backoff delay for a retry attempt. The callable
should accept a single argument, `attempts`, that represents the number of attempts
that have been made. Used in the `standard` and `adaptive` retry strategies.
DOCS
Retry::ExponentialBackoff.new(
retry_base_delay: config.retry_base_delay,
retry_max_delay: config.retry_max_delay
)
end

option(
:adaptive_retry_wait_to_fill,
default: true,
@@ -76,24 +40,54 @@ class RetryErrors < Plugin
not retry instead of sleeping.
DOCS

option(
:retry_strategy,
doc_type: 'Object',
docstring: <<~DOCS)
The retry strategy used by the client. If not provided, a default strategy is built
based on `:retry_mode` — either `Standard` or `Adaptive`.

A custom strategy must respond to:

Contributor

Nice!

* `#acquire_initial_retry_token` - returns a token
* `#refresh_retry_token(token, error_info)` - returns a token or nil
* `#record_success(token)` - records a successful request
* `#request_bookkeeping(error_info = nil)` - updates internal state
DOCS

REQUIRED_STRATEGY_METHODS = %i[
acquire_initial_retry_token
refresh_retry_token
record_success
request_bookkeeping
].freeze

def after_initialize(client)
config = client.config
config.retry_strategy =
case config.retry_strategy
when 'standard'
Retry::Standard.new(
max_attempts: config.retry_max_attempts,
backoff: config.retry_backoff
)
when 'adaptive'
Retry::Adaptive.new(
max_attempts: config.retry_max_attempts,
backoff: config.retry_backoff,
wait_to_fill: config.adaptive_retry_wait_to_fill
)
else
config.retry_strategy
end
if config.retry_strategy
validate_strategy(config.retry_strategy)
else
config.retry_strategy = build_strategy(config)
end
end

private

def validate_strategy(strategy)
missing = REQUIRED_STRATEGY_METHODS.reject { |m| strategy.respond_to?(m) }
return if missing.empty?

raise ArgumentError, "Custom retry_strategy must respond to: #{missing.join(', ')}"
end
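
The `validate_strategy` check can be exercised on its own. The sketch below copies the diff's logic into a standalone method; `IncompleteStrategy` is a hypothetical class that deliberately omits two of the required methods.

```ruby
REQUIRED_STRATEGY_METHODS = %i[
  acquire_initial_retry_token
  refresh_retry_token
  record_success
  request_bookkeeping
].freeze

# Raises when the object is missing any method of the retry strategy interface.
def validate_strategy(strategy)
  missing = REQUIRED_STRATEGY_METHODS.reject { |m| strategy.respond_to?(m) }
  return if missing.empty?

  raise ArgumentError, "Custom retry_strategy must respond to: #{missing.join(', ')}"
end

# Hypothetical strategy that implements only half of the interface.
class IncompleteStrategy
  def acquire_initial_retry_token(_scope = nil); end
  def refresh_retry_token(_token, _error_info); end
end

message =
  begin
    validate_strategy(IncompleteStrategy.new)
    nil
  rescue ArgumentError => e
    e.message
  end
puts message
```

Failing at client construction rather than on the first retry keeps the error close to the misconfiguration.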

def build_strategy(config)
case config.retry_mode
when 'standard'
Retry::Standard.new(max_attempts: config.max_attempts)
when 'adaptive'
Retry::Adaptive.new(max_attempts: config.max_attempts, wait_to_fill: config.adaptive_retry_wait_to_fill)
else
raise ArgumentError, "Must provide either 'standard' or 'adaptive' for retry_mode"
end
end

# @api private
@@ -108,19 +102,35 @@ def call(context)

def handle(context, retry_strategy, token)
response = track_feature(retry_strategy) { @handler.call(context) }
if (error = response.error)
return response unless retryable?(context.http_request)

error_info = Http::ErrorInspector.new(error, context.http_response)
token = retry_strategy.refresh_retry_token(token, error_info)
return response unless token
error_info = Http::ErrorInspector.new(response.error, context.http_response) if response.error
retry_strategy.request_bookkeeping(error_info)

Kernel.sleep(token.retry_delay)
else
unless response.error
retry_strategy.record_success(token)
return response
end
return response unless retryable?(context.http_request)

token = handle_error(context, retry_strategy, token, error_info)
return response unless token

retry_request(context, response, retry_strategy, token)
end

def handle_error(context, retry_strategy, token, error_info)
token = retry_strategy.refresh_retry_token(token, error_info)
return unless token

if token.no_retry_reason == :quota_exhausted
Kernel.sleep(token.retry_delay) if long_polling_operation?(context)
return
end

Kernel.sleep(token.retry_delay)
token
end
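
The branching in `handle_error` can be summarized as a pure decision function. This is a sketch under assumptions: `SketchToken` and `retry_decision` are hypothetical names, and the function returns the delay instead of calling `Kernel.sleep` so that it can be run without waiting.

```ruby
# Hypothetical token carrying only the fields handle_error reads.
SketchToken = Struct.new(:retry_delay, :no_retry_reason)

# Returns [seconds_to_sleep, retry?] for a refreshed token.
def retry_decision(token, long_polling:)
  # No token means the strategy declined to retry.
  return [0, false] unless token

  if token.no_retry_reason == :quota_exhausted
    # Long-polling operations still back off, but the request is not retried.
    return [long_polling ? token.retry_delay : 0, false]
  end

  [token.retry_delay, true]
end

p retry_decision(SketchToken.new(1.5, nil), long_polling: false)
p retry_decision(SketchToken.new(1.5, :quota_exhausted), long_polling: true)
p retry_decision(SketchToken.new(1.5, :quota_exhausted), long_polling: false)
```

Only the first case retries; the second sleeps without retrying, and the third returns immediately.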

def retry_request(context, response, retry_strategy, token)
reset_request(context)
reset_response(context, response)
context.retries += 1
@@ -141,6 +151,11 @@ def reset_response(context, response)
response.error = nil
end

# TODO: Revisit after trait is finalized.
def long_polling_operation?(context)
context.operation.traits.key?('smithy.api#longPoll')

Collaborator

Not sure how this is gonna look, but at least according to the SEP this trait is aws.api#longPoll.

Contributor

Well, I should add a revisit task for this and ClockSkew to our board if there isn't one already.

Contributor Author

@richardwang1124 At least on the Smithy website, it's under smithy.api. I guess we'll see where it gets namespaced when it's finalized: https://smithy.io/2.0/trait-index.html.

end
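
Until the namespace question in this thread is settled, one possible stopgap is to accept either trait ID. This is only a sketch: the constant name is hypothetical, `traits` is assumed to be a Hash keyed by trait ID as in the diff, and neither ID is confirmed as final.

```ruby
# Candidate trait IDs from the thread; which namespace ships is still undecided.
LONG_POLL_TRAIT_IDS = ['smithy.api#longPoll', 'aws.api#longPoll'].freeze

# True when the operation carries either candidate long-poll trait.
def long_polling_operation?(traits)
  LONG_POLL_TRAIT_IDS.any? { |id| traits.key?(id) }
end

puts long_polling_operation?({ 'aws.api#longPoll' => {} })    # true
puts long_polling_operation?({ 'smithy.api#readonly' => {} }) # false
```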

def track_feature(retry_strategy, &block)
case retry_strategy
when Retry::Standard then Features.track('RETRY_MODE_STANDARD', &block)
14 changes: 14 additions & 0 deletions gems/smithy-client/lib/smithy-client/plugins/stub_responses.rb
@@ -31,6 +31,20 @@ def add_handlers(handlers, config)
def after_initialize(client)

Collaborator

Why is this removed? In V3 retries are disabled when stub_responses is true since customers need to mock the response behavior.

Contributor Author

Instead of keeping this logic here, I moved it to add_handlers in the retry errors plugin. It's somewhat annoying because we use this same plugin on the AWS side as well, and this doesn't remove the AWS-specific retry plugin. Actually, an alternative and better way might be to have an AWS-specific stub_responses.rb plugin that replaces this one, so all the plugins to remove are still listed in one place. Let me know what you think.

Contributor

I personally think keeping all the stub_responses-related logic contained here makes the most sense, since all of its configuration can then be found in one place, but I'm open to other approaches. In the end, we should probably settle on one way.

Contributor Author

I agree with keeping all stub_responses-related logic in the same place. I think what I'll do is create an AWS-specific plugin and replace the generic one with the AWS-specific one via a default weld.

return unless client.config.stub_responses

remove_common_handlers(client)
remove_context_handlers(client)
end

private

def remove_common_handlers(client)
# Handlers removed when stubbing regardless of context.
# Subclasses should not override this method.
end

def remove_context_handlers(client)
# Context-specific handler removals. Override in subclasses
# to remove domain-specific handlers (e.g., domain-specific retry handler).
client.handlers.remove(RetryErrors::Handler)
end
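
The override pattern agreed on in the thread could look roughly like the sketch below. Everything here is a hypothetical stand-in (`SketchHandlerList`, `GenericStubResponses`, `AwsStubResponses`, and the handler symbols); it models only the subclass hook, not the real smithy-client plugin API.

```ruby
# Hypothetical minimal handler list; only #remove is modeled.
class SketchHandlerList
  attr_reader :handlers

  def initialize(handlers)
    @handlers = handlers.dup
  end

  def remove(handler)
    @handlers.delete(handler)
  end
end

# Stand-in for the generic plugin: removes the generic retry handler when stubbing.
class GenericStubResponses
  def after_initialize(handlers, stub_responses:)
    return unless stub_responses

    remove_context_handlers(handlers)
  end

  private

  def remove_context_handlers(handlers)
    handlers.remove(:generic_retry_handler)
  end
end

# Stand-in for an AWS-specific replacement that also removes its own retry handler.
class AwsStubResponses < GenericStubResponses
  private

  def remove_context_handlers(handlers)
    super
    handlers.remove(:aws_retry_handler)
  end
end

list = SketchHandlerList.new(%i[generic_retry_handler aws_retry_handler send_handler])
AwsStubResponses.new.after_initialize(list, stub_responses: true)
p list.handlers
```

With this shape, all stubbing-related removals stay in one plugin hierarchy, which is the property both reviewers wanted to keep.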

48 changes: 24 additions & 24 deletions gems/smithy-client/lib/smithy-client/retry/adaptive.rb
@@ -5,8 +5,6 @@ module Client
module Retry
# Adaptive retry strategy for retrying requests.
class Adaptive
# @option [#call] :backoff (ExponentialBackoff.new) A callable object that
# calculates a backoff delay for a retry attempt.
# @option [Integer] :max_attempts (3) The maximum number of attempts that
# will be made for a single request, including the initial attempt.
# @option [Boolean] :wait_to_fill When true, the request will sleep until
@@ -15,60 +13,62 @@ class Adaptive
# not retry instead of sleeping.
def initialize(options = {})
super()
@backoff = options[:backoff] || ExponentialBackoff.new(
base_delay: options[:base_delay],
max_delay: options[:max_delay]
)
@max_attempts = options[:max_attempts] || 3
@wait_to_fill = options[:wait_to_fill] || true
@wait_to_fill = options.fetch(:wait_to_fill, true)
@client_rate_limiter = ClientRateLimiter.new
@quota = Quota.new
@capacity_amount = 0
end
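
The change from `options[:wait_to_fill] || true` to `options.fetch(:wait_to_fill, true)` matters because `false || true` evaluates to `true`, so the old form could never honor an explicit `false`. A small standalone sketch, with helper method names that are illustrative only:

```ruby
# Old form: an explicit false is swallowed by `||`.
def wait_to_fill_with_or(options = {})
  options[:wait_to_fill] || true
end

# New form: the default applies only when the key is absent.
def wait_to_fill_with_fetch(options = {})
  options.fetch(:wait_to_fill, true)
end

puts wait_to_fill_with_or({ wait_to_fill: false })    # true
puts wait_to_fill_with_fetch({ wait_to_fill: false }) # false
puts wait_to_fill_with_fetch({})                      # true
```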

# @return [#call]
attr_reader :backoff

# @return [Integer]
attr_reader :max_attempts

# @return [Boolean]
attr_reader :wait_to_fill

# Updates internal state based on the response outcome.
# @param [Http::ErrorInspector, nil] error_info The error info, or nil on success.
def request_bookkeeping(error_info = nil)
is_throttle = error_info&.error_type == 'Throttling'
@client_rate_limiter.update_sending_rate(is_throttle)
end

def acquire_initial_retry_token(_token_scope = nil)
@client_rate_limiter.token_bucket_acquire(1, wait_to_fill: @wait_to_fill)
Token.new
end

def refresh_retry_token(retry_token, error_info)
def refresh_retry_token(retry_token, error_info) # rubocop:disable Metrics/AbcSize
return unless error_info.retryable?

@client_rate_limiter.update_sending_rate(
error_info.error_type == 'Throttling'
)
return if retry_token.retry_count >= @max_attempts - 1

@capacity_amount = @quota.checkout_capacity(error_info)
return unless @capacity_amount.positive?
@client_rate_limiter.token_bucket_acquire(1, wait_to_fill: @wait_to_fill)

capacity_amount = @quota.checkout_capacity(error_info)
delay = backoff.call(retry_token.retry_count, error_info)
retry_token.capacity_amount = capacity_amount

if capacity_amount.zero?
retry_token.retry_delay = delay
retry_token.no_retry_reason = :quota_exhausted
return retry_token
end

delay = compute_delay(error_info, retry_token.retry_count)
retry_token.retry_count += 1
retry_token.retry_delay = delay
retry_token.no_retry_reason = nil
retry_token
end

def record_success(retry_token)
@client_rate_limiter.update_sending_rate(false)
@quota.release(@capacity_amount)
@quota.release(retry_token.capacity_amount)
retry_token
end

private

def compute_delay(error_info, retry_count)
return @backoff.call(retry_count) unless error_info.hints[:retry_after]

[error_info.hints[:retry_after], @backoff.max_delay].min
def backoff
@backoff ||= ExponentialBackoff.new
end
end
end
39 changes: 25 additions & 14 deletions gems/smithy-client/lib/smithy-client/retry/exponential_backoff.rb
@@ -3,25 +3,36 @@
module Smithy
module Client
module Retry
# Default exponential backoff retry strategy for retrying requests.
# @api private
# Default exponential backoff for retrying requests.
class ExponentialBackoff
Comment thread
jterapin marked this conversation as resolved.
def initialize(options = {})
@base_delay = options[:base_delay] || 2
@max_delay = options[:max_delay] || 20
end

# @return [Numeric]
attr_reader :base_delay

# @return [Numeric]
attr_reader :max_delay
MAX_BACKOFF = 20
EXPONENTIAL_BASE = 2

# Calculates a delay based on exponential backoff strategy. Uses full jitter approach.
# @param [Integer] attempts
# @param [Smithy::Client::Http::ErrorInspector] error_info
# @return [Numeric] delay in seconds
def call(attempts)
delay = (@base_delay**attempts)
[delay, @max_delay].min * Kernel.rand
def call(attempts, error_info)
# From SEP: t_i = b * min(x * r^i, MAX_BACKOFF)
calculated_delay = backoff_scalar_x(error_info) * (EXPONENTIAL_BASE**attempts)
t_i = Kernel.rand * [calculated_delay, MAX_BACKOFF].min
apply_retry_after(t_i, error_info)
end

private

def apply_retry_after(t_i, error_info)
retry_after = error_info.hints[:retry_after]
return t_i unless retry_after

# Clamp retry delay to t_i < delay < t_i + 5 per SEP.
delay = [t_i, retry_after].max
[delay, t_i + 5].min
end

def backoff_scalar_x(error_info)
error_info.error_type == 'Throttling' ? 1 : 0.05
end
end
end
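
To make the formula in the new `ExponentialBackoff#call` concrete, the sketch below mirrors the diff's constants and clamping but takes the jitter value as an argument so the result is deterministic. `SketchErrorInfo`, `backoff_delay`, and the `jitter:` parameter are assumptions for illustration; the class in the diff uses `Kernel.rand` and `Http::ErrorInspector`.

```ruby
MAX_BACKOFF = 20
EXPONENTIAL_BASE = 2

# Hypothetical stand-in for Http::ErrorInspector.
SketchErrorInfo = Struct.new(:error_type, :hints)

# t_i = b * min(x * r^i, MAX_BACKOFF), where b is the jitter in [0, 1).
def backoff_delay(attempts, error_info, jitter:)
  x = error_info.error_type == 'Throttling' ? 1 : 0.05
  t_i = jitter * [x * (EXPONENTIAL_BASE**attempts), MAX_BACKOFF].min

  retry_after = error_info.hints[:retry_after]
  return t_i unless retry_after

  # Honor the server hint, but keep the delay within [t_i, t_i + 5].
  [[t_i, retry_after].max, t_i + 5].min
end

throttle = SketchErrorInfo.new('Throttling', {})
transient = SketchErrorInfo.new('Transient', {})
hinted = SketchErrorInfo.new('Throttling', { retry_after: 30 })

puts backoff_delay(3, throttle, jitter: 0.5)  # 0.5 * min(1 * 8, 20) = 4.0
puts backoff_delay(3, transient, jitter: 0.5) # 0.5 * min(0.05 * 8, 20) = 0.2
puts backoff_delay(3, hinted, jitter: 0.5)    # min(max(4.0, 30), 4.0 + 5) = 9.0
```

Non-throttling errors use the much smaller scalar of 0.05, so they back off far less than throttling errors at the same attempt count, and a large `retry_after` hint can extend the computed delay by at most 5 seconds.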
10 changes: 5 additions & 5 deletions gems/smithy-client/lib/smithy-client/retry/quota.rb
@@ -4,12 +4,12 @@ module Smithy
module Client
module Retry
# @api private
# Used in 'standard' and 'adaptive' retry modes.
# Used in :standard and :adaptive retry modes.
class Quota
INITIAL_RETRY_TOKENS = 500
RETRY_COST = 5
RETRY_COST = 14
NO_RETRY_INCREMENT = 1
TIMEOUT_RETRY_COST = 10
THROTTLING_RETRY_COST = 5

def initialize
@mutex = Mutex.new
@@ -23,8 +23,8 @@ def initialize
def checkout_capacity(error_info)
@mutex.synchronize do
capacity_amount =
if error_info.error_type == 'Transient'
TIMEOUT_RETRY_COST
if error_info.error_type == 'Throttling'
THROTTLING_RETRY_COST
else
RETRY_COST
end
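
For a sense of what the new costs mean in practice, here is a simplified, single-threaded sketch of a retry quota using the constants from this diff. `SketchQuota` is a hypothetical class; returning 0 when the bucket cannot cover the cost is an assumption inferred from the `capacity_amount.zero?` check in `Adaptive`, since the rest of the real mutex-guarded `Quota` class is not shown in this hunk.

```ruby
class SketchQuota
  INITIAL_RETRY_TOKENS = 500
  RETRY_COST = 14
  THROTTLING_RETRY_COST = 5

  attr_reader :available

  def initialize
    @available = INITIAL_RETRY_TOKENS
  end

  # Returns the amount checked out, or 0 when the bucket cannot cover the cost.
  def checkout_capacity(error_type)
    cost = error_type == 'Throttling' ? THROTTLING_RETRY_COST : RETRY_COST
    return 0 if cost > @available

    @available -= cost
    cost
  end
end

quota = SketchQuota.new
retries = 0
retries += 1 while quota.checkout_capacity('Transient').positive?
puts retries                               # 35 retries at 14 tokens each
puts quota.available                       # 10 tokens left
puts quota.checkout_capacity('Throttling') # 5, still affordable
```

With no successes returning capacity, a full bucket funds 35 non-throttling retries; the remaining 10 tokens can still cover throttling retries at 5 each.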