Make inserter batch size and parallelism configurable #2957
I think we need the existing rate numbers to keep up with new matches. But it's a good point that the timeout doesn't cancel existing stalled inserts, so we probably need to kill the process and restart if that happens.
I made a separate change to kill the process on timeout. Let's see if that helps.
Force-pushed: f6ff86b to bf8ac71
Defaults 100/100 match prior inserter throughput (keeps current ingest rate). Operators may lower parallelism if Postgres is saturated.
Co-authored-by: Cursor <cursoragent@cursor.com>
Fetch insert_queue with LIMIT from INSERTER_BATCH_SIZE; run insertMatch in waves of at most INSERTER_PARALLELISM. Keep the upstream whole-tick watchdog (10s, redisCount inserter_timeout, process.exit) so stalled work restarts the process instead of only rejecting a Promise.
Co-authored-by: Cursor <cursoragent@cursor.com>
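For readers following the watchdog discussion, here is a minimal sketch of a whole-tick timer wrapped in `try`/`finally`. It is not the code from this PR: `runTickWithWatchdog`, `tick`, and `onTimeout` are hypothetical names, and the timeout action is injected so the sketch does not assume the signature of `redisCount`.

```typescript
// Hypothetical sketch: arm a timer for the whole tick and always disarm it.
// If the tick stalls past timeoutMs, onTimeout runs (in the real inserter this
// would presumably count inserter_timeout and exit the process).
async function runTickWithWatchdog(
  tick: () => Promise<void>,
  timeoutMs: number,
  onTimeout: () => void,
): Promise<void> {
  const timer = setTimeout(onTimeout, timeoutMs);
  try {
    await tick();
  } finally {
    // finally runs whether the tick resolved or threw, so a finished tick
    // can never trigger the watchdog later.
    clearTimeout(timer);
  }
}
```

Exiting the process, rather than rejecting a Promise, matters because a rejected timeout Promise does not cancel inserts that are already in flight; a restart does.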
Force-pushed: bf8ac71 to a720e8a
@howardchung I've rebased to keep the new timeout change, and I've also updated the config to preserve existing rates. The main change now is being able to tune two knobs without code edits: the batch size per tick and the number of parallel inserts. The batch size can be increased so more matches get pulled from
What
- Adds `INSERTER_BATCH_SIZE` and `INSERTER_PARALLELISM` (defaults 100 / 100) so operators can tune how many `insert_queue` rows are pulled per tick and how many `insertMatch` calls run at once, without code changes.
- Replaces `LIMIT 100` with a bound `LIMIT ?` driven by batch size.
- Runs `insertMatch` in waves of at most `INSERTER_PARALLELISM` inserts (with defaults, that is still one wave of up to 100, matching previous behavior).
- Keeps the whole-tick watchdog that counts `inserter_timeout` and calls `process.exit(1)` if a tick stalls; wraps the tick in `try`/`finally` so the timer is always cleared on success.
- Fixes the `Error("no match in row: %s", …)` usage and logs `Promise.allSettled` rejections per chunk.

Why
The pain was too many concurrent inserts when the system is under load. Parallelism sets that peak, while batch size is how many rows you dequeue per tick. Exposing both as config makes it easy to lower concurrency when Postgres is saturated, while the defaults stay at the old 100-wide rate so ingest keeps up when healthy.
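To make the two knobs concrete, here is a rough sketch of how a tick could apply them. It is an illustration under assumptions, not the diff in this PR: `readPositiveInt`, `chunk`, and `runInWaves` are hypothetical helpers, the environment is passed in as a plain object, and `worker` stands in for `insertMatch`.

```typescript
// Hypothetical helper: read a positive integer setting, else use the default.
function readPositiveInt(
  env: Record<string, string | undefined>,
  name: string,
  fallback: number,
): number {
  const parsed = Number.parseInt(env[name] || "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Split rows into consecutive chunks of at most `size` items.
function chunk<T>(rows: readonly T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < rows.length; i += size) {
    chunks.push(rows.slice(i, i + size));
  }
  return chunks;
}

// Run `worker` over `rows` one wave at a time, so at most `parallelism`
// calls are in flight. Rejections are logged per wave and counted, and
// they do not stop later waves.
async function runInWaves<T>(
  rows: readonly T[],
  parallelism: number,
  worker: (row: T) => Promise<void>,
): Promise<number> {
  let failures = 0;
  for (const wave of chunk(rows, parallelism)) {
    const results = await Promise.allSettled(wave.map((row) => worker(row)));
    for (const result of results) {
      if (result.status === "rejected") {
        failures += 1;
        console.error("insert failed:", result.reason);
      }
    }
  }
  return failures;
}
```

With the defaults, `readPositiveInt(env, "INSERTER_BATCH_SIZE", 100)` and `readPositiveInt(env, "INSERTER_PARALLELISM", 100)` both yield 100, so a full batch is a single wave. Lowering only the parallelism to 25 would process the same 100 rows in four waves, which caps concurrent inserts without changing how many rows are dequeued per tick.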