fix(cursor): range-aware scan instead of blind 250k ROWID cap (#482) #512

Merged
iamtoruk merged 1 commit into main from fix/cursor-range-aware-cap
Jun 18, 2026

@iamtoruk

Member

Fixes #482. Opening for review (not merging).

Problem

On large Cursor DBs, the scan kept only "the most-recent 250k bubbles by ROWID" and warned unconditionally. Because ROWID only approximates insertion order and is not tied to the requested window, this could drop older in-range sessions from long-range reports, and it warned even when the requested window fit well under the cap. The cap exists because the bubble timestamp lives inside the JSON value (no index), so the date filter forces a full JSON decode per row; multi-GB DBs were stalling for 30s+.
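
To illustrate why the date filter is expensive, here is a minimal sketch of a per-row window check. The `BubbleRow` shape, the helper names, and the accepted `createdAt` encodings (epoch milliseconds or an ISO-8601 string) are assumptions for illustration; the real schema and decoder are not shown in this PR.

```typescript
// Hypothetical row shape: the real Cursor schema is not shown in this PR.
interface BubbleRow {
  rowid: number;
  value: string; // JSON text; createdAt is assumed to live inside it
}

// Returns createdAt as epoch milliseconds, or null if the value cannot be decoded.
// Accepting both a number and an ISO-8601 string is an assumption.
function decodeCreatedAt(row: BubbleRow): number | null {
  try {
    const parsed: unknown = JSON.parse(row.value);
    if (typeof parsed !== "object" || parsed === null) return null;
    const raw = (parsed as { createdAt?: unknown }).createdAt;
    if (typeof raw === "number" && Number.isFinite(raw)) return raw;
    if (typeof raw === "string") {
      const ms = Date.parse(raw);
      return Number.isNaN(ms) ? null : ms;
    }
    return null;
  } catch {
    // Malformed JSON: treat the row as out of range.
    return null;
  }
}

// The window test from the PR text (createdAt > timeFloor). Every call pays for
// a full JSON.parse, which is the cost an index cannot remove.
function inWindow(row: BubbleRow, timeFloor: number): boolean {
  const createdAt = decodeCreatedAt(row);
  return createdAt !== null && createdAt > timeFloor;
}
```

Since no index can answer `createdAt > timeFloor`, the only way to bound the work is to bound how many rows reach this function.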

Fix

Replace the blind ROWID cap with a range-aware paged scan for DBs over the budget:

  • Page ROWID-descending, keep only rows within the window (createdAt > timeFloor), and stop once a full page falls past the window floor (so short ranges decode far fewer rows).
  • Keep a hard budget (default 250k) as a backstop for genuinely enormous in-range scans.
  • The "older sessions may be missing" warning now fires only when the budget is actually hit, not on every large DB.

Small DBs (≤ budget) keep the original un-paged query unchanged.
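
The paged scan described above can be sketched roughly as follows. This is not the code from the PR: `FetchPage`, `rangeAwareScan`, the page size, and the exact rule for setting `truncated` (flagged when an in-range row has to be dropped because the budget is full) are all assumptions made for illustration.

```typescript
interface Bubble {
  rowid: number;
  createdAt: number; // epoch ms, assumed already decoded from the JSON value
}

// Hypothetical page fetcher: returns up to `limit` rows with rowid < beforeRowid,
// ordered by ROWID descending.
type FetchPage = (beforeRowid: number, limit: number) => Bubble[];

interface ScanResult {
  bubbles: Bubble[];
  truncated: boolean; // true only when the budget forced an in-range row to be dropped
}

function rangeAwareScan(
  fetchPage: FetchPage,
  timeFloor: number,
  budget: number,
  pageSize: number,
): ScanResult {
  const kept: Bubble[] = [];
  let cursor = Number.MAX_SAFE_INTEGER;

  for (;;) {
    const page = fetchPage(cursor, pageSize);
    if (page.length === 0) break; // nothing left to read

    let inRange = 0;
    for (const bubble of page) {
      cursor = bubble.rowid; // the next page starts below the last row seen
      if (bubble.createdAt > timeFloor) {
        if (kept.length >= budget) {
          // Backstop: keep the newest in-range bubbles and report truncation.
          return { bubbles: kept, truncated: true };
        }
        kept.push(bubble);
        inRange++;
      }
    }

    // Stop once a whole page fell past the window floor, or the table is exhausted.
    if (inRange === 0 || page.length < pageSize) break;
  }

  return { bubbles: kept, truncated: false };
}
```

In this sketch a short window ends the loop after the first page with no in-range rows, so the remaining older rows are never fetched or decoded, and `truncated` is the only condition under which a caller would emit the "older sessions may be missing" warning.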

Behavior

  • Short ranges: fewer decodes, no spurious warning, no loss.
  • Long ranges that fit the budget: full window returned (previously could drop older in-range sessions).
  • Truly huge in-range scans: truncated to the budget keeping the newest in-range bubbles, warned only then.
  • Also improves the daily-cache backfill (long-range / menu-bar history) the same way.

Tests

New tests/providers/cursor-large-db-cap.test.ts (budget shrunk via CODEBURN_CURSOR_MAX_BUBBLES):

  1. In-range sessions with low ROWIDs are kept when the DB is over budget (old code returned 0 here).
  2. Over-budget DB but in-range fits the budget → full window returned, no truncation.
  3. Over budget → truncates to budget keeping the newest in-range bubbles (warns only then).
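
For context on how a test can shrink the budget, here is a minimal sketch of resolving it from the environment. Only the variable name `CODEBURN_CURSOR_MAX_BUBBLES` and the 250k default come from this PR; `resolveBubbleBudget`, the explicit `env` parameter, and the fallback for invalid values are assumptions.

```typescript
const DEFAULT_MAX_BUBBLES = 250000;

// Hypothetical helper: reads the budget override from an env-like record.
// Non-numeric, zero, or negative values fall back to the default (an assumption).
function resolveBubbleBudget(env: Record<string, string | undefined>): number {
  const raw = env["CODEBURN_CURSOR_MAX_BUBBLES"];
  if (raw === undefined) return DEFAULT_MAX_BUBBLES;
  const parsed = Number.parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_BUBBLES;
}
```

A test could then pass something like `{ CODEBURN_CURSOR_MAX_BUBBLES: "5" }` (or set the variable on `process.env`) so that a fixture with a handful of rows already counts as "over budget".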

Full suite: 1209/1209. Existing cursor tests unchanged.

…482)

Large Cursor DBs were scanned as "the most-recent 250k bubbles by ROWID",
which dropped in-range older sessions from long-range reports and warned even
when the requested window fit comfortably. The bubble timestamp lives in the
JSON value (no index), so the date filter forces a full decode per row, which
is why a scan bound exists.

Replace the blind cap with a range-aware paged scan: for DBs over the budget,
page ROWID-descending, keep only rows within the window (createdAt > timeFloor),
and stop once a full page falls past the window floor. The hard budget remains
as a backstop for genuinely enormous in-range scans, and the "older sessions
may be missing" warning now fires only when that budget is actually hit.

Effect: short ranges decode far fewer rows and no longer warn; long ranges
return the full window when it fits the budget; truncation keeps the newest
in-range bubbles and warns only then. Small DBs are unchanged (un-paged query).
Budget is overridable via CODEBURN_CURSOR_MAX_BUBBLES for tests.
@iamtoruk iamtoruk merged commit 9e997dd into main Jun 18, 2026
3 checks passed
@iamtoruk iamtoruk deleted the fix/cursor-range-aware-cap branch June 18, 2026 16:45
Development

Successfully merging this pull request may close these issues.

Cursor provider should not truncate reports at 250k bubbles