
Make inserter batch size and parallelism configurable#2957

Closed
ff137 wants to merge 2 commits into
odota:masterfrom
ff137:feat/inserter-backpressure

Conversation

@ff137
Contributor

@ff137 ff137 commented May 14, 2026

What

  • Adds INSERTER_BATCH_SIZE and INSERTER_PARALLELISM (defaults 100 / 100) so operators can tune how many insert_queue rows are pulled per tick and how many insertMatch calls run at once, without code changes.
  • Replaces the hardcoded LIMIT 100 with a bound LIMIT ? driven by batch size.
  • Processes each tick in waves of at most INSERTER_PARALLELISM inserts (with defaults, that is still one wave of up to 100, matching previous behavior).
  • Keeps the maintainer’s 10s watchdog, which records inserter_timeout and calls process.exit(1) if a tick stalls; wraps the tick in try / finally so the timer is always cleared on success.
  • Fixes the invalid Error("no match in row: %s", …) usage and logs Promise.allSettled rejections per chunk.
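The wave processing and per-chunk rejection logging described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `chunk`, `runInWaves`, and the injected `insert` and `log` parameters are hypothetical names, and the real inserter would call insertMatch on rows fetched from insert_queue.

```typescript
// Hypothetical helpers for illustration only; names are not taken from the PR diff.
type InsertFn<T> = (row: T) => Promise<void>;

/** Split rows into consecutive chunks of at most `size` elements. */
function chunk<T>(rows: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < rows.length; i += size) {
    out.push(rows.slice(i, i + size));
  }
  return out;
}

/**
 * Run `insert` over `rows` in waves of at most `parallelism` concurrent calls.
 * Each wave is awaited in full before the next one starts.
 * Returns the number of inserts that rejected.
 */
async function runInWaves<T>(
  rows: T[],
  parallelism: number,
  insert: InsertFn<T>,
  log: (msg: string) => void = console.error,
): Promise<number> {
  if (!Number.isInteger(parallelism) || parallelism < 1) {
    throw new RangeError('parallelism must be a positive integer');
  }
  let failures = 0;
  for (const wave of chunk(rows, parallelism)) {
    // allSettled never rejects, so one failed insert does not abort the wave.
    const results = await Promise.allSettled(
      wave.map(async (row) => insert(row)),
    );
    for (const result of results) {
      if (result.status === 'rejected') {
        failures += 1;
        log(`insert failed: ${String(result.reason)}`);
      }
    }
  }
  return failures;
}
```

With a batch of 100 rows and a parallelism of 100, the loop runs a single wave, which is the sense in which the defaults match the previous behavior.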

Why

The pain point was too many concurrent inserts when the system is under load. Parallelism sets that peak; batch size sets how many rows are dequeued per tick. Exposing both as config makes it easy to lower concurrency when Postgres is saturated, while the defaults stay at the old 100-wide rate so ingest keeps up when healthy.
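As a rough sketch of how the two settings might be read with their 100 / 100 defaults: the helper names `readPositiveInt` and `loadInserterConfig` are hypothetical, the fallback-on-invalid-value behavior is an assumption rather than something the PR states, and the project's actual config module may work differently.

```typescript
// Hypothetical config loader; only the two variable names and the defaults
// (100 / 100) come from the PR description.
interface InserterConfig {
  batchSize: number;   // rows pulled from insert_queue per tick
  parallelism: number; // maximum concurrent inserts per wave
}

/** Parse a positive integer, falling back when the value is missing or invalid. */
function readPositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === '') {
    return fallback;
  }
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

/** Build the inserter settings from an environment-like record. */
function loadInserterConfig(
  env: Record<string, string | undefined>,
): InserterConfig {
  return {
    batchSize: readPositiveInt(env['INSERTER_BATCH_SIZE'], 100),
    parallelism: readPositiveInt(env['INSERTER_PARALLELISM'], 100),
  };
}
```

In a Node process this would be called as `loadInserterConfig(process.env)`. Taking the environment as a parameter keeps the sketch easy to test.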

@howardchung
Member

I think we need the existing rate numbers to keep up with new matches. But there's a good point about the timeout not cancelling existing stalled inserts, so we probably need to kill the process and restart if that happens.

@howardchung
Member

I made a separate change to kill the process on timeout. Let's see if that helps
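For readers following along, a whole-tick watchdog of the kind discussed here might look roughly like the sketch below. It is an assumption-laden illustration, not the maintainer's change: `withWatchdog` is a hypothetical name, and the timeout action is injected as a callback so the sketch stays testable. Per the PR description, the real callback records inserter_timeout and calls process.exit(1).

```typescript
/**
 * Run `tick` under a timer. If the tick has not settled within `timeoutMs`,
 * `onTimeout` fires. The finally block clears the timer whether the tick
 * resolves or rejects, so a finished tick can never trigger the callback.
 */
async function withWatchdog<T>(
  timeoutMs: number,
  onTimeout: () => void,
  tick: () => Promise<T>,
): Promise<T> {
  const timer = setTimeout(onTimeout, timeoutMs);
  try {
    return await tick();
  } finally {
    clearTimeout(timer);
  }
}
```

A timer alone cannot cancel in-flight inserts, which is why the discussion above settles on restarting the process rather than only rejecting a Promise.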

@ff137 ff137 force-pushed the feat/inserter-backpressure branch 2 times, most recently from f6ff86b to bf8ac71 Compare May 15, 2026 15:00
ff137 and others added 2 commits May 15, 2026 17:01
Defaults 100/100 match prior inserter throughput (keeps current ingest rate).
Operators may lower parallelism if Postgres is saturated.

Co-authored-by: Cursor <cursoragent@cursor.com>
Fetch insert_queue with LIMIT from INSERTER_BATCH_SIZE; run insertMatch in
waves of at most INSERTER_PARALLELISM. Keep the upstream whole-tick watchdog
(10s, redisCount inserter_timeout, process.exit) so stalled work restarts the
process instead of only rejecting a Promise.

Co-authored-by: Cursor <cursoragent@cursor.com>
@ff137 ff137 force-pushed the feat/inserter-backpressure branch from bf8ac71 to a720e8a Compare May 15, 2026 15:01
@ff137 ff137 changed the title from "Limit how many matches the inserter processes at once" to "Make inserter batch size and parallelism configurable" May 15, 2026
@ff137
Contributor Author

ff137 commented May 15, 2026

@howardchung I've rebased to keep the new timeout change, and I've also updated config to preserve existing rates.

The main change now is being able to tune two knobs without code edits: the batch size per tick and the number of parallel inserts.

The batch size can be increased so that more matches get pulled from insert_queue per tick, and the number of parallel inserts can be lowered so that Postgres pressure is reduced. I think that's the direction to go in: keep total matches per batch high while reducing concurrent inserts. But it will take experimentation to find the sweet spot.
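To make the trade-off concrete, here is a small arithmetic sketch. `planTick` is a hypothetical helper, the 500 / 25 values in the usage note are made-up illustration numbers rather than a recommendation, and the calculation assumes a full batch is fetched and that the 10s whole-tick watchdog mentioned in this PR applies to all waves together.

```typescript
interface TickPlan {
  waves: number;           // sequential waves needed for one full batch
  peakConcurrency: number; // most inserts in flight at any moment
  perWaveBudgetMs: number; // average time each wave may take before the watchdog fires
}

/** Derive wave count and timing budget from the two knobs. */
function planTick(
  batchSize: number,
  parallelism: number,
  watchdogMs: number = 10000,
): TickPlan {
  const inputs = [batchSize, parallelism, watchdogMs];
  if (!inputs.every((n) => Number.isInteger(n) && n > 0)) {
    throw new RangeError('all arguments must be positive integers');
  }
  const waves = Math.ceil(batchSize / parallelism);
  return {
    waves,
    peakConcurrency: Math.min(batchSize, parallelism),
    perWaveBudgetMs: watchdogMs / waves,
  };
}
```

With the defaults, `planTick(100, 100)` gives one wave with the full 10000 ms budget. A hypothetical `planTick(500, 25)` gives 20 waves at a peak of 25 concurrent inserts, leaving an average of 500 ms per wave before the whole-tick watchdog would fire, so raising batch size while lowering parallelism also tightens the per-wave timing.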

@ff137 ff137 closed this May 17, 2026
@ff137 ff137 deleted the feat/inserter-backpressure branch May 17, 2026 18:39

2 participants