
add pagination & remote search flow for remote storage#9708

Open
Light2Dark wants to merge 3 commits into main from sham/storage-inspector-pagination

Conversation

Collaborator

@Light2Dark Light2Dark commented May 28, 2026

📝 Summary

Fixes #9662

Screen.Recording.2026-05-28.at.10.24.22.AM.mov

📋 Pre-Review Checklist

  • For large changes, or changes that affect the public API: this change was discussed or approved through an issue, on Discord, or the community discussions (Please provide a link if applicable).
  • Any AI generated code has been reviewed line-by-line by the human PR author, who stands by it.
  • Video or media evidence is provided for any visual changes (optional).

✅ Merge Checklist

  • I have read the contributor guidelines.
  • Documentation has been updated where applicable, including docstrings for API changes.
  • Tests have been added for the changes made.


vercel Bot commented May 28, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| marimo-docs | Ready | Preview, Comment | May 28, 2026 8:35am |


@github-actions github-actions Bot added the bash-focus (Area to focus on during release bug bash) label May 28, 2026
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment


2 issues found across 16 files

Architecture diagram
sequenceDiagram
    participant UI as React UI (StorageInspector)
    participant Hook as useStorageEntries Hook
    participant State as Jotai Store (state.ts)
    participant API as Server API (commands.py)
    participant Backend as StorageBackend (storage.py)
    participant Obstore as Obstore (S3/GCS)
    participant Fsspec as FsspecFilesystem (local/SFTP)

    Note over UI,Fsspec: Storage List with Pagination

    UI->>Hook: Expand directory or section
    Hook->>State: Check cache (entriesByPath)
    alt Cache miss
        State-->>Hook: No cached entries
        Hook->>API: NEW: listEntries(namespace, prefix, limit, pageToken=null)
        API->>Backend: list_entries(prefix, limit, page_token)
        
        alt Obstore backend
            Backend->>Obstore: list_with_delimiter(prefix)
            Obstore-->>Backend: Raw entries (may be truncated at 1000)
            Backend->>Backend: _paginate_entries(offset=0, limit)
            Backend-->>API: StorageListResult(entries, nextPageToken, mayHaveMore)
        else Fsspec backend
            Backend->>Fsspec: ls(prefix, detail=True)
            Fsspec-->>Backend: File list
            Backend->>Backend: _paginate_entries(offset=0, limit)
            Backend-->>API: StorageListResult(entries, nextPageToken)
        end
        
        API-->>Hook: NEW: StorageEntriesNotification(entries, nextPageToken, mayHaveMore)
        Hook->>State: NEW: setEntries({entries, nextPageToken, mayHaveMore})
        State-->>Hook: entries, hasMore, mayHaveMore
        Hook-->>UI: Render entries + "Load more" button
    else Cache hit
        State-->>Hook: Cached entries + pagination metadata
        Hook-->>UI: Render entries
        alt hasMore (nextPageToken != null)
            UI->>Hook: Show "Load more" button
        else mayHaveMore (no token, but may exist)
            UI->>Hook: Show "May exist more" indicator
        end
    end

    Note over UI,Hook: Load More Click

    alt hasMore is true
        UI->>Hook: loadMore()
        Hook->>Hook: setIsLoadingMore(true)
        Hook->>API: NEW: listEntries(namespace, prefix, limit, pageToken=nextPageToken)
        API->>Backend: list_entries(prefix, limit, page_token)
        
        alt Obstore backend
            Backend->>Obstore: list_with_delimiter(prefix)
            Obstore-->>Backend: Raw entries
            Backend->>Backend: _paginate_entries(offset=parseInt(pageToken), limit)
            Backend-->>API: StorageListResult(entries, nextPageToken, mayHaveMore)
        else Fsspec backend
            Backend->>Fsspec: ls(prefix, detail=True)
            Fsspec-->>Backend: File list
            Backend->>Backend: _paginate_entries(offset=parseInt(pageToken), limit)
            Backend-->>API: StorageListResult(entries, nextPageToken)
        end
        
        API-->>Hook: NEW: StorageEntriesNotification(entries, nextPageToken, mayHaveMore)
        Hook->>State: NEW: setEntries({entries, append:true, nextPageToken, mayHaveMore})
        State-->>Hook: Appended entries + updated metadata
        Hook-->>UI: Render appended entries
        alt Has more pages
            UI->>Hook: Keep "Load more" button
        else No more pages but mayHaveMore
            UI->>Hook: Show "Maybe more" indicator
        end
    end

    Note over Hook,State: Error Handling

    alt Load more fails
        Hook->>Hook: catch error -> setLoadMoreError
        Hook-->>UI: Show error message with retry
        UI->>Hook: loadMore() retry
    end
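
The `_paginate_entries(offset, limit)` step in the diagram can be illustrated with a minimal sketch. It assumes the page token is simply the next offset rendered as a string, which the diagram's `parseInt(pageToken)` suggests but does not confirm. `Page` and `paginate_entries` are hypothetical names for illustration, not the PR's code.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Page:
    entries: list[str]
    next_page_token: Optional[str]


def paginate_entries(
    entries: Sequence[str], limit: int, page_token: Optional[str] = None
) -> Page:
    """Return one page of `entries`, treating the token as a stringified offset."""
    # Assumption: a missing token means "start from the beginning".
    offset = int(page_token) if page_token else 0
    end = offset + limit
    page = list(entries[offset:end])
    # Only hand out a token when something is left to fetch.
    next_token = str(end) if end < len(entries) else None
    return Page(entries=page, next_page_token=next_token)
```

A caller would pass `next_page_token` from one result as `page_token` for the next request and stop once it is `None`, which matches the "Load more" flow above.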

Comment thread marimo/_data/_external_storage/storage.py Outdated
Comment thread frontend/src/core/storage/state.ts Outdated
@Light2Dark Light2Dark changed the title implement pagination flow for remote storage implement pagination & remote search flow for remote storage May 28, 2026
@Light2Dark Light2Dark changed the title implement pagination & remote search flow for remote storage add pagination & remote search flow for remote storage May 28, 2026
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment


2 issues found across 6 files (changes from recent commits).


Comment thread frontend/src/core/storage/state.ts
Comment thread frontend/src/components/storage/storage-inspector.tsx
@Light2Dark Light2Dark added the bug (Something isn't working) label May 28, 2026
@Light2Dark Light2Dark marked this pull request as ready for review June 1, 2026 16:24
Copilot AI review requested due to automatic review settings June 1, 2026 16:25
Contributor

Copilot AI left a comment


Pull request overview

This PR addresses remote storage listings being effectively truncated (e.g., delimiter-based object store listings commonly cap results) by introducing a pagination model plus a “remote search continuation” flow, so users can browse and search beyond the first page of results (Fixes #9662).

Changes:

  • Add page-based listing support to the runtime storage command/notification flow (page_token request; next_page_token + may_have_more response metadata).
  • Implement offset-based pagination and provider-truncation detection in external storage backends (obstore/fsspec) via a StorageListResult return type.
  • Update tests (backend + runtime + frontend) to cover pagination metadata, “load more”, and remote-search continuation behavior.
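
As a rough illustration of how `next_page_token` and `may_have_more` might relate, here is a minimal sketch. It assumes that a raw listing which fills the provider's cap was probably truncated upstream, and that `may_have_more` is only raised on the last locally known page. `ListResult`, `list_page`, and `PROVIDER_CAP` are hypothetical names, and the detection rule is an assumption rather than the PR's actual logic.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: the provider silently caps a delimiter listing at this many entries.
PROVIDER_CAP = 1000


@dataclass
class ListResult:
    entries: list[str]
    next_page_token: Optional[str] = None
    may_have_more: bool = False


def list_page(
    raw: list[str],
    limit: int,
    page_token: Optional[str] = None,
    provider_cap: int = PROVIDER_CAP,
) -> ListResult:
    """Slice one page out of `raw` and flag possible upstream truncation."""
    offset = int(page_token) if page_token else 0
    end = offset + limit
    has_next = end < len(raw)
    # A listing that reaches the cap may be incomplete, so the final page
    # reports "may have more" instead of claiming the listing is complete.
    truncated = len(raw) >= provider_cap
    return ListResult(
        entries=raw[offset:end],
        next_page_token=str(end) if has_next else None,
        may_have_more=truncated and not has_next,
    )
```

Under this sketch a client shows "Load more" while a token is present, and falls back to a softer "may have more" hint only when the token is gone but the listing hit the cap.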

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| marimo/_data/_external_storage/models.py | Introduces StorageListResult and updates the storage backend interface to return paginated results. |
| marimo/_data/_external_storage/storage.py | Implements offset-based pagination helpers and backend-specific listing behavior, including may_have_more for provider truncation. |
| marimo/_runtime/commands.py | Extends StorageListEntriesCommand with page_token. |
| marimo/_runtime/callbacks/external_storage.py | Threads pagination parameters through to backends and emits pagination metadata in StorageEntriesNotification. |
| marimo/_messaging/notification.py | Adds next_page_token and may_have_more fields to StorageEntriesNotification. |
| marimo/_server/models/models.py | Ensures server request models forward page_token into runtime commands. |
| packages/openapi/api.yaml | Updates OpenAPI schemas for new pagination request/notification fields. |
| packages/openapi/src/api.ts | Updates generated TS API types to match the OpenAPI pagination schema changes. |
| frontend/src/core/storage/types.ts | Adds pagination metadata types and updates storageUrl signature. |
| frontend/src/core/storage/state.ts | Stores per-path pagination metadata and implements “load more” fetching/append behavior. |
| frontend/src/components/storage/storage-inspector.tsx | Adds UI/logic for “Load more”, “may have more” messaging, and remote-search continuation flow. |
| tests/_data/_external_storage/test_storage_models.py | Adds/updates backend tests for StorageListResult pagination tokens and provider-boundary signaling. |
| tests/_runtime/test_runtime_external_storage.py | Adds runtime-level tests for paging requests and notifications carrying pagination metadata. |
| frontend/src/core/storage/tests/types.test.ts | Updates tests for the new storageUrl signature/behavior. |
| frontend/src/core/storage/tests/state.test.ts | Updates storage state tests to validate pagination metadata storage and cache behavior. |
| frontend/src/core/storage/tests/useStorageEntries.test.tsx | Adds coverage for loadMore behavior, append semantics, and pagination metadata handling. |
| frontend/src/components/storage/tests/storage-inspector.test.ts | Adds coverage for remote-search prefix derivation and filtering behavior. |
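
The remote-search continuation flow mentioned for storage-inspector.tsx is not spelled out in this thread. One plausible reading is a loop that filters each page and keeps requesting pages while a token is returned. The sketch below models that idea in Python for consistency with the backend, even though the real logic lives in the TypeScript frontend. `search_remote`, the `FetchPage` signature, and the `max_pages` guard are all assumptions.

```python
from typing import Callable, Optional

# Hypothetical fetcher signature: (page_token) -> (entries, next_page_token).
FetchPage = Callable[[Optional[str]], tuple[list[str], Optional[str]]]


def search_remote(fetch_page: FetchPage, query: str, max_pages: int = 5) -> list[str]:
    """Filter page by page, continuing while the backend reports another page."""
    matches: list[str] = []
    token: Optional[str] = None
    needle = query.lower()
    # max_pages bounds the number of requests so a huge bucket cannot
    # trigger an unbounded crawl from a single search.
    for _ in range(max_pages):
        entries, token = fetch_page(token)
        matches.extend(e for e in entries if needle in e.lower())
        if token is None:
            break
    return matches
```

Capping the number of pages per search keeps the cost of one query predictable; a UI could offer to continue from the last token if the cap is reached.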

Light2Dark added a commit that referenced this pull request Jun 2, 2026
## 📝 Summary

These fields indicate whether a table's schemas have been loaded or are
truly empty.
This lets the frontend distinguish empty entries, which it can hide,
from lazy (not yet loaded) ones, which it shouldn't hide.

This change also helps future work, since it's important to
distinguish truly empty from not yet loaded. Another option was
considered: `list[DataTable] | None`, where `None` indicates not yet
loaded. However, that is semantically less clear, even though it is a
simpler code change.
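
To make the trade-off concrete, here is a minimal sketch of the explicit-flag approach. It assumes the list in question holds a schema's tables; `Schema`, `tables_loaded`, and `should_hide` are hypothetical names, since the commit message does not name the actual fields.

```python
from dataclasses import dataclass, field


@dataclass
class DataTable:
    name: str


@dataclass
class Schema:
    name: str
    tables: list[DataTable] = field(default_factory=list)
    # Hypothetical flag: False means "not fetched yet", not "no tables".
    tables_loaded: bool = False


def should_hide(schema: Schema) -> bool:
    """Hide only schemas that were loaded and turned out to be empty."""
    return schema.tables_loaded and not schema.tables
```

With the `list[DataTable] | None` alternative, every consumer would have to remember that `None` and `[]` mean different things, whereas the flag states the intent at the call site.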

This is in support of this PR
#9708

## 📋 Pre-Review Checklist

- [x] For large changes, or changes that affect the public API: this
change was discussed or approved through an issue, on
[Discord](https://marimo.io/discord?ref=pr), or the community
[discussions](https://github.com/marimo-team/marimo/discussions) (Please
provide a link if applicable).
- [x] Any AI generated code has been reviewed line-by-line by the human
PR author, who stands by it.
- [ ] Video or media evidence is provided for any visual changes
(optional).

## ✅ Merge Checklist

- [x] I have read the [contributor
guidelines](https://github.com/marimo-team/marimo/blob/main/CONTRIBUTING.md).
- [ ] Documentation has been updated where applicable, including
docstrings for API changes.
- [x] Tests have been added for the changes made.

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

Labels

bash-focus: Area to focus on during release bug bash
bug: Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Remote storage limits fo view

2 participants