fix(cursor): range-aware scan instead of blind 250k ROWID cap (#482) #512
Merged
…482) Large Cursor DBs were scanned as "the most-recent 250k bubbles by ROWID", which dropped in-range older sessions from long-range reports and warned even when the requested window fit comfortably. The bubble timestamp lives in the JSON value (no index), so the date filter forces a full decode per row, which is why a scan bound exists. Replace the blind cap with a range-aware paged scan: for DBs over the budget, page ROWID-descending, keep only rows within the window (createdAt > timeFloor), and stop once a full page falls past the window floor. The hard budget remains as a backstop for genuinely enormous in-range scans, and the "older sessions may be missing" warning now fires only when that budget is actually hit. Effect: short ranges decode far fewer rows and no longer warn; long ranges return the full window when it fits the budget; truncation keeps the newest in-range bubbles and warns only then. Small DBs are unchanged (un-paged query). Budget is overridable via CODEBURN_CURSOR_MAX_BUBBLES for tests.
Fixes #482. Opening for review (not merging).
Problem
On large Cursor DBs, the scan kept only "the most-recent 250k bubbles by ROWID" and warned unconditionally. ROWID roughly follows insertion order, not the requested time window, so the cap could drop older in-range sessions from long-range reports, and the warning fired even when the requested window fit well under the cap. The cap exists because the bubble timestamp lives inside the JSON value (no index), so the date filter forces a full JSON decode per row; multi-GB DBs were stalling for 30s or more.
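To make the failure mode concrete, here is a minimal sketch of a blind cap over in-memory rows. It is an illustration of the behavior described above, not the project's code: `CappedRow`, `readCreatedAt`, and `blindCapScan` are hypothetical names, and `createdAt` is assumed to be a numeric epoch-millisecond field in the JSON value.

```typescript
// Hypothetical row shape: ROWID plus the raw JSON text stored in the value column.
interface CappedRow {
  rowid: number;
  value: string;
}

// Assumption: createdAt is a numeric epoch-millisecond field inside the JSON value.
// Reading it needs a full JSON.parse per row because no index covers it.
function readCreatedAt(value: string): number | null {
  try {
    const parsed: unknown = JSON.parse(value);
    if (typeof parsed !== "object" || parsed === null) return null;
    const createdAt = (parsed as { createdAt?: unknown }).createdAt;
    return typeof createdAt === "number" ? createdAt : null;
  } catch {
    return null;
  }
}

// Blind cap: take the newest `cap` rows by ROWID first, filter by date afterwards,
// and warn whenever the DB is larger than the cap, whatever the window is.
function blindCapScan(
  all: CappedRow[],
  cap: number,
  timeFloor: number
): { kept: CappedRow[]; warned: boolean } {
  const newest = all.slice().sort((a, b) => b.rowid - a.rowid).slice(0, cap);
  const kept = newest.filter((row) => {
    const createdAt = readCreatedAt(row.value);
    return createdAt !== null && createdAt > timeFloor;
  });
  return { kept, warned: all.length > cap };
}

// Ten bubbles with ROWID 1..10 and createdAt = rowid * 1000.
const sampleRows: CappedRow[] = [];
for (let rowid = 1; rowid <= 10; rowid++) {
  sampleRows.push({ rowid, value: JSON.stringify({ createdAt: rowid * 1000 }) });
}

// Long range: ROWIDs 3..10 are in the window, but the cap of 4 keeps only 7..10.
const longRange = blindCapScan(sampleRows, 4, 2000);
// Short range: only ROWIDs 9..10 are in the window, yet the warning still fires.
const shortRange = blindCapScan(sampleRows, 4, 8000);
```

In this toy data the long-range call silently loses four in-range rows, and the short-range call warns even though its two rows fit well under the cap.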
Fix
Replace the blind ROWID cap with a range-aware paged scan for DBs over the budget:
Page ROWID-descending, keep only rows within the window (createdAt > timeFloor), and stop once a full page falls past the window floor (so short ranges decode far fewer rows).
Small DBs (≤ budget) keep the original un-paged query unchanged.
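The paged scan can be sketched as follows. This is a hedged illustration under stated assumptions, not the merged implementation: `BubbleRow`, `FetchPage`, `rangeAwareScan`, and `inMemoryPager` are hypothetical names, `createdAt` is assumed to be a numeric epoch-millisecond field, and the budget is assumed to cap the number of in-range rows kept (the PR text does not spell out exactly what the budget counts).

```typescript
// Hypothetical row shape: ROWID plus the raw JSON text of the bubble.
interface BubbleRow {
  rowid: number;
  value: string;
}

interface ScanResult {
  rows: BubbleRow[];  // in-range rows, newest first
  truncated: boolean; // true only when the budget was actually hit
  decoded: number;    // rows that had to be JSON-decoded
}

// Hypothetical pager: up to `limit` rows with rowid < beforeRowid, ROWID-descending.
type FetchPage = (beforeRowid: number, limit: number) => BubbleRow[];

// Assumption: createdAt is a numeric epoch-millisecond field in the JSON value.
function createdAtOf(row: BubbleRow): number | null {
  try {
    const parsed: unknown = JSON.parse(row.value);
    if (typeof parsed !== "object" || parsed === null) return null;
    const createdAt = (parsed as { createdAt?: unknown }).createdAt;
    return typeof createdAt === "number" ? createdAt : null;
  } catch {
    return null;
  }
}

function rangeAwareScan(
  fetchPage: FetchPage,
  timeFloor: number,
  pageSize: number,
  budget: number
): ScanResult {
  const rows: BubbleRow[] = [];
  let decoded = 0;
  let cursor = Number.MAX_SAFE_INTEGER;
  while (true) {
    const page = fetchPage(cursor, pageSize);
    // No rows left: the whole DB was scanned without hitting the budget.
    if (page.length === 0) return { rows, truncated: false, decoded };
    let inWindow = 0;
    for (const row of page) {
      cursor = row.rowid;
      decoded++;
      const createdAt = createdAtOf(row);
      if (createdAt === null || createdAt <= timeFloor) continue;
      inWindow++;
      // Backstop: an in-range row beyond the budget means real truncation.
      if (rows.length >= budget) return { rows, truncated: true, decoded };
      rows.push(row);
    }
    // A page with no in-range row is treated as past the window floor.
    if (inWindow === 0) return { rows, truncated: false, decoded };
  }
}

// In-memory stand-in for the database, used only for this illustration.
function inMemoryPager(all: BubbleRow[]): FetchPage {
  const newestFirst = all.slice().sort((a, b) => b.rowid - a.rowid);
  return (beforeRowid, limit) =>
    newestFirst.filter((row) => row.rowid < beforeRowid).slice(0, limit);
}

// 100 bubbles with ROWID 1..100 and createdAt = rowid * 1000.
const demoRows: BubbleRow[] = [];
for (let rowid = 1; rowid <= 100; rowid++) {
  demoRows.push({ rowid, value: JSON.stringify({ createdAt: rowid * 1000 }) });
}
const pager = inMemoryPager(demoRows);

const shortWindow = rangeAwareScan(pager, 90000, 10, 50); // 10 rows in range
const hugeWindow = rangeAwareScan(pager, 0, 10, 50);      // 100 rows in range, budget 50
const fullWindow = rangeAwareScan(pager, 0, 10, 1000);    // whole window fits the budget
```

With this toy data, the short window decodes only two pages and does not report truncation, while the oversized window keeps the newest 50 in-range rows and sets `truncated`, which is the point where the "older sessions may be missing" warning would fire.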
Behavior
Tests
New tests/providers/cursor-large-db-cap.test.ts (budget shrunk via CODEBURN_CURSOR_MAX_BUBBLES).
Full suite: 1209/1209. Existing cursor tests unchanged.
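One way an environment override such as CODEBURN_CURSOR_MAX_BUBBLES could be resolved is sketched below. The helper name `resolveBubbleBudget`, the fallback rules, and the 250000 default are assumptions for illustration (the default mirrors the old 250k cap; the PR does not state the new budget's value). The environment is passed in as a plain object so a test can shrink the budget without touching global state.

```typescript
// Assumed default, mirroring the previous 250k cap.
const DEFAULT_MAX_BUBBLES = 250000;

// Hypothetical helper: read the budget from an env-like map, falling back to the
// default when the variable is unset, not a number, or not a positive integer.
function resolveBubbleBudget(env: Record<string, string | undefined>): number {
  const raw = env["CODEBURN_CURSOR_MAX_BUBBLES"];
  if (raw === undefined) return DEFAULT_MAX_BUBBLES;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_BUBBLES;
}

// A test can shrink the budget so a small fixture DB exercises the paged path.
const testBudget = resolveBubbleBudget({ CODEBURN_CURSOR_MAX_BUBBLES: "50" });
```

In a Node-based test runner the same function could be called with `process.env`; that wiring is left out here to keep the sketch independent of the runtime.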