fix(cursor): range-aware scan instead of blind 250k ROWID cap (#482) by iamtoruk · Pull Request #512 · getagentseal/codeburn

iamtoruk · 2026-06-18T16:40:56Z

Fixes #482. Opening for review (not merging).

Problem

On large Cursor DBs, the scan kept only "the most-recent 250k bubbles by ROWID" and warned unconditionally. Because ROWID only approximates insertion order and is unrelated to the requested window, this could drop in-range older sessions from long-range reports, and it warned even when the requested window fit well under the cap. The cap exists because the bubble timestamp lives inside the JSON value (no index), so the date filter forces a full JSON decode per row; multi-GB DBs were stalling for 30s+.
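To make the failure mode concrete, here is a minimal in-memory sketch of a cap-then-filter scan. It is not the provider's code: `BubbleRow`, `isInWindow`, `blindCapScan`, and the sample data are hypothetical, and the data assumes that a couple of late-inserted rows carry older timestamps, which is one way ROWID order and time order can diverge.

```typescript
// Hypothetical row shape: the timestamp is only reachable by parsing `value`.
interface BubbleRow {
  rowid: number;
  value: string; // JSON text such as {"createdAt": 900}
}

// The date filter has to decode the JSON of every row it looks at.
function isInWindow(row: BubbleRow, timeFloor: number): boolean {
  const parsed = JSON.parse(row.value) as { createdAt?: unknown };
  return typeof parsed.createdAt === "number" && parsed.createdAt > timeFloor;
}

// Blind cap: keep the newest `cap` rows by ROWID first, filter by date second.
function blindCapScan(table: BubbleRow[], cap: number, timeFloor: number): BubbleRow[] {
  const newestByRowid = [...table].sort((a, b) => b.rowid - a.rowid).slice(0, cap);
  return newestByRowid.filter((row) => isInWindow(row, timeFloor));
}

const makeRow = (rowid: number, createdAt: number): BubbleRow => ({
  rowid,
  value: JSON.stringify({ createdAt }),
});

// Rows 1, 2, 3 and 5 are inside the window (createdAt > 500);
// rows 4 and 6 were inserted later but carry older timestamps.
const sampleTable: BubbleRow[] = [
  makeRow(1, 700),
  makeRow(2, 800),
  makeRow(3, 900),
  makeRow(4, 100),
  makeRow(5, 950),
  makeRow(6, 200),
];

// Four rows are in range and the cap is four, yet only rows 5 and 3 survive,
// because the cap was spent on ROWIDs 6..3 before the date filter ran.
const blindResult = blindCapScan(sampleTable, 4, 500).map((r) => r.rowid);
console.log(blindResult); // [ 5, 3 ]
```

In this toy data the in-range rows with ROWIDs 1 and 2 are lost even though the whole window would have fit under the cap.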

Fix

Replace the blind ROWID cap with a range-aware paged scan for DBs over the budget:

Page ROWID-descending, keep only rows within the window (createdAt > timeFloor), and stop once a full page falls past the window floor (so short ranges decode far fewer rows).
Keep a hard budget (default 250k) as a backstop for genuinely enormous in-range scans.
The "older sessions may be missing" warning now fires only when the budget is actually hit, not on every large DB.

Small DBs (≤ budget) keep the original un-paged query unchanged.
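The paged scan described above can be sketched roughly as follows, using an in-memory array in place of the SQLite table. Everything here is illustrative: `fetchPage`, `rangeAwareScan`, `ScanResult`, and the page size are hypothetical names and values, the budget is assumed to count in-range rows kept, and rows whose JSON has no numeric `createdAt` are assumed to be out of range.

```typescript
interface BubbleRow {
  rowid: number;
  value: string; // JSON text; the timestamp is inside it
}

interface ScanResult {
  rows: BubbleRow[]; // in-range rows, newest ROWID first
  decoded: number; // how many JSON values were parsed
  truncated: boolean; // true only when the budget was actually hit
}

function createdAtOf(row: BubbleRow): number | undefined {
  try {
    const parsed = JSON.parse(row.value) as { createdAt?: unknown };
    return typeof parsed.createdAt === "number" ? parsed.createdAt : undefined;
  } catch {
    return undefined; // unparsable rows are treated as out of range
  }
}

// Stand-in for a paged query of the form
// "... WHERE rowid < ? ORDER BY rowid DESC LIMIT ?".
function fetchPage(table: BubbleRow[], beforeRowid: number, pageSize: number): BubbleRow[] {
  return table
    .filter((r) => r.rowid < beforeRowid)
    .sort((a, b) => b.rowid - a.rowid)
    .slice(0, pageSize);
}

function rangeAwareScan(
  table: BubbleRow[],
  timeFloor: number,
  budget: number,
  pageSize: number,
): ScanResult {
  const kept: BubbleRow[] = [];
  let decoded = 0;
  let cursor = Number.POSITIVE_INFINITY;

  while (true) {
    const page = fetchPage(table, cursor, pageSize);
    if (page.length === 0) break; // table exhausted

    let inRangeOnPage = 0;
    for (const row of page) {
      cursor = row.rowid; // ends at the lowest ROWID on the page
      decoded++;
      const createdAt = createdAtOf(row);
      if (createdAt !== undefined && createdAt > timeFloor) {
        if (kept.length >= budget) {
          // Backstop: another in-range row exists beyond the budget.
          return { rows: kept, decoded, truncated: true };
        }
        kept.push(row);
        inRangeOnPage++;
      }
    }

    // A whole page with nothing in range: stop paging.
    if (inRangeOnPage === 0) break;
  }

  return { rows: kept, decoded, truncated: false };
}

// Ten rows with createdAt = rowid * 100; the window keeps createdAt > 650.
const pagedTable: BubbleRow[] = Array.from({ length: 10 }, (_, i) => ({
  rowid: i + 1,
  value: JSON.stringify({ createdAt: (i + 1) * 100 }),
}));

const fullWindow = rangeAwareScan(pagedTable, 650, 100, 2);
const overBudget = rangeAwareScan(pagedTable, 650, 3, 2);
console.log(fullWindow.rows.map((r) => r.rowid), fullWindow.decoded, fullWindow.truncated);
console.log(overBudget.rows.map((r) => r.rowid), overBudget.truncated);
```

With a generous budget the sketch returns rows 10 through 7 after decoding only six of the ten rows, and reports no truncation. With a budget of 3 it keeps the three newest in-range rows and sets `truncated`, which is the only case where a warning would be appropriate.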

Behavior

Short ranges: fewer decodes, no spurious warning, no loss.
Long ranges that fit the budget: full window returned (previously could drop older in-range sessions).
Truly huge in-range scans: truncated to the budget, keeping the newest in-range bubbles; the warning fires only in this case.
Also improves the daily-cache backfill (long-range / menu-bar history) the same way.

Tests

New tests/providers/cursor-large-db-cap.test.ts (budget shrunk via CODEBURN_CURSOR_MAX_BUBBLES):

In-range sessions with low ROWIDs are kept when the DB is over budget (old code returned 0 here).
Over-budget DB but in-range fits the budget → full window returned, no truncation.
Over budget → truncates to budget keeping the newest in-range bubbles (warns only then).

Full suite: 1209/1209. Existing cursor tests unchanged.
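For readers who want to reproduce the shrunken-budget setup, the override could be resolved along these lines. This is a sketch under assumptions, not the code in this PR: `resolveBubbleBudget` is a hypothetical helper, and falling back to the default on missing or invalid values is a guess at reasonable behavior. Only the variable name and the 250k default come from the description above.

```typescript
const DEFAULT_MAX_BUBBLES = 250000;

// Takes the environment as a parameter so it can be exercised without
// touching the real process environment.
function resolveBubbleBudget(env: Record<string, string | undefined>): number {
  const raw = env["CODEBURN_CURSOR_MAX_BUBBLES"];
  if (raw === undefined) return DEFAULT_MAX_BUBBLES;

  const parsed = Number.parseInt(raw, 10);
  // Assumption: non-numeric or non-positive values fall back to the default.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_BUBBLES;
}

const defaultBudget = resolveBubbleBudget({});
const testBudget = resolveBubbleBudget({ CODEBURN_CURSOR_MAX_BUBBLES: "5" });
const invalidBudget = resolveBubbleBudget({ CODEBURN_CURSOR_MAX_BUBBLES: "nope" });
console.log(defaultBudget, testBudget, invalidBudget);
```

In a Node-based test, the same helper would be called with `process.env` after setting the variable to a small number, so that a fixture of a few rows already counts as "over budget".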

…482)

Large Cursor DBs were scanned as "the most-recent 250k bubbles by ROWID", which dropped in-range older sessions from long-range reports and warned even when the requested window fit comfortably. The bubble timestamp lives in the JSON value (no index), so the date filter forces a full decode per row, which is why a scan bound exists.

Replace the blind cap with a range-aware paged scan: for DBs over the budget, page ROWID-descending, keep only rows within the window (createdAt > timeFloor), and stop once a full page falls past the window floor. The hard budget remains as a backstop for genuinely enormous in-range scans, and the "older sessions may be missing" warning now fires only when that budget is actually hit.

Effect: short ranges decode far fewer rows and no longer warn; long ranges return the full window when it fits the budget; truncation keeps the newest in-range bubbles and warns only then. Small DBs are unchanged (un-paged query). Budget is overridable via CODEBURN_CURSOR_MAX_BUBBLES for tests.

iamtoruk merged commit 9e997dd into main Jun 18, 2026
3 checks passed

iamtoruk deleted the fix/cursor-range-aware-cap branch June 18, 2026 16:45

iamtoruk mentioned this pull request Jun 18, 2026

Cursor provider should not truncate reports at 250k bubbles #482

Closed

iamtoruk merged 1 commit into main from fix/cursor-range-aware-cap
