Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,10 @@ jobs:
version: '1.60.3'
- run: smithy --version

- name: Pin rdoc for JRuby
if: startsWith(matrix.ruby, 'jruby')
run: echo 'gem "rdoc", "< 8.0"' >> Gemfile

- name: Setup Ruby
uses: ruby/setup-ruby@v1
with:
Expand Down
4 changes: 4 additions & 0 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,10 @@ jobs:
steps:
- uses: actions/checkout@v4

- name: Pin rdoc for JRuby
if: startsWith(matrix.ruby, 'jruby')
run: echo 'gem "rdoc", "< 8.0"' >> Gemfile

- name: Setup Ruby
uses: ruby/setup-ruby@v1
with:
Expand Down
1 change: 1 addition & 0 deletions gems/smithy-client/lib/smithy-client.rb
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
require_relative 'smithy-client/log_param_formatter'
require_relative 'smithy-client/managed_file'
require_relative 'smithy-client/networking_error'
require_relative 'smithy-client/page_enumerator'
require_relative 'smithy-client/pageable_response'
require_relative 'smithy-client/param_converter'
require_relative 'smithy-client/param_validator'
Expand Down
36 changes: 36 additions & 0 deletions gems/smithy-client/lib/smithy-client/page_enumerator.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
# frozen_string_literal: true

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the custom enumerator with limited API surface to safeguard against unsafe Enumerable methods


module Smithy
module Client
# A controlled enumerator for paginated results. Exposes only safe
# enumeration methods that are meaningful for page/item iteration,
# preventing accidental / wasteful full-pagination calls like #count or #sort.
class PageEnumerator
def initialize(&block)
@block = block
end

def each(&consumer)
return self unless consumer

enum.each(&consumer)
end

def map(&) = enum.map(&)
def select(&) = enum.select(&)
def filter(&) = enum.select(&)
def flat_map(&) = enum.flat_map(&)
def reduce(*, &) = enum.reduce(*, &)
def first(*) = enum.first(*)

def take(val) = enum.take(val)
def lazy = enum.lazy

private

def enum
Enumerator.new(&@block)
end
Comment on lines +31 to +33

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current PageEnumerator creates a new Enumerator on every method call (map, select, first, etc.) via the private enum method. Is there a way to prevent this or was this intentional?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The allocation cost is negligible compared to the API calls each iteration triggers & returning brand new enumerator each time I think has a subtle benefit of thread safety with interleaved paging operations when it's used concurrently. But changing it to reuse enumerator is a one line change so if this is based on rubyist convention / you feel strongly for reusing, we can do that

end
end
end
53 changes: 42 additions & 11 deletions gems/smithy-client/lib/smithy-client/pageable_response.rb
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,13 @@ def initialize(response)
# This yields one response object per API call made. The SDK retrieves additional
# pages of data to complete the request.
#

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updates the PageableResponse module that gets added to Response to return the custom enumerator when not give a block, and if given a block, enumerate using the custom enumerator and yield results. As it has been, PageableResponse#each will paginate over pages instead of items due to uniqueness of AWS background discussed offline.

# When called without a block, `each_page` returns a {PageEnumerator} that supports
# chaining safe enumeration methods:
#
# weather.list_cities.each_page.map { |page| page.items.map(&:name) }
# weather.list_cities.each_page.first(3)
# weather.list_cities.each_page.flat_map { |page| page.items }
#
# If the operation allows for it, a selected item can be enumerated using
# `each_item`:
#
Expand Down Expand Up @@ -96,9 +103,21 @@ def next_page(params = {})
end

# Yields the current and each following response to the given block.
# When called without a block, returns a {PageEnumerator}.
# @yieldparam [Response] response
# @return [Enumerable, nil] Returns a new Enumerable if no block is given.
def each_page(&)
# @return [PageEnumerator, nil]
def each_page(&block)
unless block
return PageEnumerator.new do |y|
response = self
y << response
until response.last_page?
response = response.next_page
y << response
end
end
end

response = self
yield(response)
until response.last_page?
Expand All @@ -107,16 +126,28 @@ def each_page(&)
end

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about returning self here to enable chaining?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: What could be the usecase for returning self to allow chaining here? If user doesn't provide a block, they get enumerator they can chain to already, and if user provides a block, whatever the user needs to do would be specified within that block. Fwiw, V3 also returns nil instead of self for each when user provides a block: https://github.com/aws/aws-sdk-ruby/blob/313cb58ad76dac97c27ff6d63b99eaadd5ad4d00/gems/aws-sdk-core/lib/aws-sdk-core/pageable_response.rb#L188-L196

end

# Yields the current and each following item to the given block.
# Yields the current and each following response to the given block.
# When called without a block, returns a {PageEnumerator}.
# This is an alias for {#each_page}.
# @yieldparam [Response] response
# @return [PageEnumerator, nil]
def each(&)
each_page(&)
end

# Yields each item across all pages to the given block.
# When called without a block, returns a {PageEnumerator}.
# @yieldparam [Object] item
# @return [Enumerable, nil] Returns a new Enumerable if no block is given.
def each_item(&)
response = self
@paginator.items(response.data).each(&)
until response.last_page?
response = response.next_page
@paginator.items(response.data).each(&)
# @return [PageEnumerator, nil]
# @raise [NotImplementedError] if the service API does not define items.
def each_item(&block)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the API does not define items and customer calls .each_item, they'll get a NotImplementedError. Should we document this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, added a @raise doc comment.

unless block
return PageEnumerator.new do |y|
each_page { |page| @paginator.items(page.data).each { |item| y << item } }
end
end

each_page { |page| @paginator.items(page.data).each(&block) }
end

private
Expand All @@ -130,7 +161,7 @@ def next_page_params(params)
prev_tokens = @paginator.prev_tokens(context.params)
# Remove all previous tokens from original params
# Sometimes a token can be nil and merge would not include it.
new_params = context[:original_params].except(*prev_tokens)
new_params = context[:original_params].except(*prev_tokens.keys)
new_params.merge!(@paginator.next_tokens(data).merge(params))
end
end
Expand Down
95 changes: 95 additions & 0 deletions gems/smithy-client/spec/smithy-client/page_enumerator_spec.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
# frozen_string_literal: true

require_relative '../spec_helper'

module Smithy
module Client
describe PageEnumerator do
let(:enumerator) do
PageEnumerator.new do |y|
y << 'page1'
y << 'page2'
y << 'page3'
end
end

describe '#each' do
it 'yields each value when a block is given' do
results = []
enumerator.each { |v| results << v } # rubocop:disable Style/MapIntoArray
expect(results).to eq %w[page1 page2 page3]
end

it 'returns self when no block is given' do
expect(enumerator.each).to be(enumerator)
end
end

describe '#map' do
it 'transforms each value' do
expect(enumerator.map(&:upcase)).to eq %w[PAGE1 PAGE2 PAGE3]
end
end

describe '#select' do
it 'filters values' do
expect(enumerator.select { |v| v.include?('2') }).to eq ['page2']
end
end

describe '#filter' do
it 'filters values (alias for select)' do
expect(enumerator.filter { |v| v.include?('3') }).to eq ['page3']
end
end

describe '#flat_map' do
it 'maps and flattens values' do
expect(enumerator.flat_map { |v| [v, v] }).to eq %w[page1 page1 page2 page2 page3 page3]
end
end

describe '#reduce' do
it 'accumulates values' do
expect(enumerator.reduce('') { |acc, v| acc + v }).to eq 'page1page2page3'
end

it 'supports an initial value' do
expect(enumerator.reduce([]) { |acc, v| acc + [v] }).to eq %w[page1 page2 page3]
end
end

describe '#first' do
it 'returns the first value with no argument' do
expect(enumerator.first).to eq 'page1'
end

it 'returns the first n values with an argument' do
expect(enumerator.first(2)).to eq %w[page1 page2]
end
end

describe '#take' do
it 'returns the first n values' do
expect(enumerator.take(2)).to eq %w[page1 page2]
end
end

describe '#lazy' do
it 'returns a lazy enumerator' do
lazy = enumerator.lazy
expect(lazy).to be_a(Enumerator::Lazy)
expect(lazy.take(1).to_a).to eq ['page1']
end
end

describe 'blocked methods' do
%i[count sort min max tally to_a sum zip].each do |method|
it "does not respond to ##{method}" do
expect(enumerator.respond_to?(method)).to be false
end
end
end
end
end
end
Loading
Loading