
Rewrite of worker pool with multi-task dispatch per worker#26

Open
tcely wants to merge 4 commits into kikkia:master from tcely:tcely-worker-pool

Conversation

@tcely
Contributor

@tcely tcely commented Jan 6, 2026

PR Type

Enhancement


Description

  • Refactor worker pool to process multiple tasks per worker lifecycle

  • Implement per-worker message limits with automatic worker rotation

  • Add robust error handling, timeouts, and recovery mechanisms

  • Introduce deque-based task queue with pluggable implementations

  • Add comprehensive worker lifecycle management and in-flight task tracking


Diagram Walkthrough

flowchart LR
  A["Task Queue<br/>Deque-based"] -->|dispatch| B["Idle Worker<br/>with Budget"]
  B -->|postMessage| C["Worker Process"]
  C -->|message event| D["Task Resolution<br/>resolve/reject"]
  D -->|releaseWorker| E{Budget<br/>Remaining?}
  E -->|Yes| F["Return to Idle Pool"]
  E -->|No| G["Retire & Replace"]
  F -->|fillWorkers| H["Create New Worker<br/>if needed"]
  C -->|error/crash| I["Reject Task<br/>& Retire Worker"]
  I -->|scheduleRefill| H

File Walkthrough

Relevant files
Enhancement
taskQueueDeque.ts
Add pluggable deque-based task queue implementation           

src/taskQueueDeque.ts

  • New file implementing pluggable deque-based task queue
  • Supports two implementations: @alg/deque and @korkje/deque
  • Adapters provide unified TaskQueue interface with push(), shift(), and
    length
  • Environment variable TASK_QUEUE_DEQUE_IMPL controls which
    implementation is used
+62/-0   
types.ts
Update types for worker limits and task tracking                 

src/types.ts

  • Replace WorkerWithStatus with WorkerWithLimit interface tracking
    message budget
  • Add TaskQueue interface for queue abstraction
  • Add InFlight type to track in-flight tasks with timeout IDs
  • Add SafeCallOptions type for error handling and logging configuration
+30/-3   
utils.ts
Add safe function call utility with error handling             

src/utils.ts

  • Add safeCall() function for safe function invocation with error
    handling
  • Implement looksLikeSafeCallOptions() type guard for options validation
  • Support optional logging, error callbacks, and label-based error
    tracking
  • Preserve caller's this context using Reflect.apply()
+65/-1   
workerPool.ts
Refactor worker pool with budgets and lifecycle management

src/workerPool.ts

  • Complete refactor of worker pool with multi-task dispatch per worker
  • Implement per-worker message budgets with MESSAGES_LIMIT configuration
  • Add comprehensive in-flight task tracking with timeout protection
  • Implement worker lifecycle management: idle pool, retirement, and
    replacement
  • Add recovery mechanism with exponential backoff for pool failures
  • Implement permanent message/error/crash handlers on workers
  • Add task age validation and queue draining on fatal errors
  • Replace simple dispatch loop with fillWorkers() and dispatch()
    functions
  • Track idle workers using both Set and Stack for efficient access
  • Add helper functions: releaseWorker(), setIdle(), setInFlight(),
    clearInFlight(), takeIdleWorker()
+427/-30
Documentation
README.md
Add comprehensive environment variables documentation       

src/README.md

  • New documentation file documenting all environment variables
  • Document TASK_QUEUE_DEQUE_IMPL for queue implementation selection
  • Document MAX_THREADS for worker pool concurrency control
  • Document MESSAGES_LIMIT for per-worker message budget configuration
  • Document other existing variables: API_TOKEN, HOST, PORT, HOME,
    XDG_CACHE_HOME, cache size limits
+90/-0   

@tcely tcely force-pushed the tcely-worker-pool branch from f073ba1 to 2d7bc89 on January 7, 2026 at 21:33
@kikkia
Owner

kikkia commented Jan 13, 2026

Hey there, I took a look a while back and have been thinking on it: what is the benefit here vs. the current implementation? The current implementation is simple and easily understandable.

Since these jobs are only used for the first request on a new script, it's rare that they happen. The public instance only sees a couple of those per week despite serving about 200-300k deciphers a day.

Unless there is some big improvement or bug that I have not seen or experienced at my scale, then I could be open to it, but I think the simplicity and ease of understanding is preferable.

@tcely
Contributor Author

tcely commented Jan 13, 2026

I have also seen very few actual new players since I put up a server and pointed yt-dlp at it.

I'd very much like to see dispatch use a loop to assign more than one task, rather than calling dispatch again recursively.
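The loop idea above can be sketched roughly like this (names such as `dispatchLoop`, `takeIdleWorker`, and `assign` are illustrative stand-ins, not the PR's exact API):

```typescript
// Sketch: assign tasks to every available idle worker in one pass,
// instead of handing out one task per recursive dispatch() call.
function dispatchLoop<T>(
  queue: { shift(): T | undefined; length: number },
  takeIdleWorker: () => object | undefined,
  assign: (worker: object, task: T) => void,
): void {
  while (queue.length > 0) {
    const worker = takeIdleWorker();
    if (worker === undefined) break; // no idle capacity; wait for a release
    const task = queue.shift()!; // length > 0 guarantees an element
    assign(worker, task);
  }
}
```

The loop drains as much of the queue as current idle capacity allows, then returns; the next worker release re-enters it.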


A benefit of the new worker-pool implementation is robustness and liveness under edge cases that can otherwise lead to stuck requests, silent leaks, or long-lived degraded behavior. In other words: it’s about “never getting wedged.”

Concretely, the new code adds protections that the current simple model doesn’t provide:

1) Hard liveness guarantees (prevents “hung forever” requests)

New behavior

  • Each in-flight task gets an explicit timeout (IN_FLIGHT_TIMEOUT_MS = 60m). If a worker never responds (crash, deadlock, event loop stall, etc.), the task is rejected and the worker is retired.
  • Tasks sitting in the queue too long are rejected (MAX_TASK_AGE_MS = 30m) rather than being served extremely late.
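The timeout mechanics can be sketched like this (a minimal illustration; `armTimeout`, `settle`, and the `InFlightEntry` shape are stand-ins, not the PR's exact API, though the PR's `InFlight` type does track timeout IDs):

```typescript
// Sketch of the in-flight timeout idea: if the worker never posts back,
// the task is rejected and the worker retired.
const IN_FLIGHT_TIMEOUT_MS = 60 * 60 * 1000; // 60 minutes, as above

type InFlightEntry = {
  reject: (err: Error) => void;
  timeoutId: ReturnType<typeof setTimeout>;
};

function armTimeout(
  reject: (err: Error) => void,
  retire: () => void,
  ms: number = IN_FLIGHT_TIMEOUT_MS,
): InFlightEntry {
  const timeoutId = setTimeout(() => {
    reject(new Error("in-flight task timed out"));
    retire(); // the worker may be wedged; terminate and replace it
  }, ms);
  return { reject, timeoutId };
}

// A normal worker response must clear the timer before settling.
function settle(entry: InFlightEntry): void {
  clearTimeout(entry.timeoutId);
}
```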

Current behavior risk

  • If a worker gets into a bad state and never posts back, that request can hang indefinitely, and the pool can slowly degrade (depending on how many workers get wedged).

Even if “new script” events are rare, the impact of a single hung request can be high (clients waiting forever, connection accumulation, etc.).

2) Worker recycling via message budgets (prevents long-lived memory/GC creep)

New behavior

  • Workers have a configurable message budget (MESSAGES_LIMIT, default 10_000). When exhausted, the worker is terminated and replaced.
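The budget mechanism boils down to a counter per worker (a minimal sketch; the `messagesRemaining` field name and `consumeBudget` helper are guesses, though `WorkerWithLimit` is the PR's interface name):

```typescript
// Sketch: per-worker message budget. The PR reads the limit from the
// MESSAGES_LIMIT environment variable; 10_000 is the documented default.
const MESSAGES_LIMIT = 10_000;

interface WorkerWithLimit {
  messagesRemaining: number;
}

// Returns true if the worker still has budget and may return to the
// idle pool; false means it should be terminated and replaced.
function consumeBudget(worker: WorkerWithLimit): boolean {
  worker.messagesRemaining -= 1;
  return worker.messagesRemaining > 0;
}
```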

Why it matters even if first-script is rare

  • These workers execute fairly heavy JavaScript parsing / transforms and may retain memory unintentionally (library/global caches, V8 isolates, fragmentation, etc.). Recycling prevents “one bad long-lived isolate” behavior.
  • This is a common production tactic for long-running worker processes.

If scripts are rarely processed at your scale, the budget will rarely tick down, so the overhead is near zero, yet it still protects against "one worker that got large or leaky over months."

For very small scale situations, the limit can be significantly lowered.

3) Persistent handlers + in-flight tracking = fewer “state bugs”

New behavior

  • Permanent message, messageerror, and error handlers are attached once per worker.
  • In-flight tasks are tracked by worker in a WeakMap, and “stray messages” cause worker retirement (protects correctness).

Current behavior risk

  • The old code attaches a new message handler per task and removes it on response. That’s simple, but if anything goes wrong (double message, exception before remove, unexpected payload), it’s easier to end up in weird states or leak handlers.
  • The new code trades simplicity for a stronger correctness model: “worker has at most 1 in-flight task, tracked explicitly.”

4) Safer task settle (prevents crashes caused by userland reject/resolve)

New behavior

  • safeCall() is used to invoke resolve/reject defensively. If user code throws during resolve/reject, the pool keeps working.
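The core of the idea is tiny (a reduced sketch; the PR's `safeCall()` in src/utils.ts additionally supports logging options, error callbacks, and `Reflect.apply()` for `this` preservation):

```typescript
// Sketch of the defensive-settle idea: a throwing resolve/reject
// callback must not disrupt the pool's own bookkeeping.
function safeCall<T>(fn: (...args: unknown[]) => T, ...args: unknown[]): T | undefined {
  try {
    return fn(...args);
  } catch (err) {
    console.error("safeCall: callback threw", err);
    return undefined; // swallow the error; the pool keeps working
  }
}
```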

Current behavior risk

  • If a consumer of execInPool() throws from a resolve path (it happens more than is often expected), the worker-pool logic can be disrupted.

5) Backoff-based recovery scaffolding

The scheduleRefillAndDispatch() path wraps refill/dispatch in a try/catch and has backoff retry. This is effectively a “pool self-heals” loop if internal bookkeeping gets into a bad state.

I agree this is heavier than needed for normal operation, but it's specifically targeted at the hard-to-debug "pool got into a broken bookkeeping state" incidents.
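The self-healing loop amounts to retrying the refill/dispatch step with growing delays (a sketch; the function name, attempt count, and delays are illustrative, not the PR's exact values):

```typescript
// Sketch: retry a refill/dispatch step with exponential backoff if the
// pool's internal bookkeeping throws.
async function scheduleWithBackoff(
  step: () => void,
  maxAttempts = 5,
  baseDelayMs = 100,
): Promise<void> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      step();
      return; // pool bookkeeping is healthy again
    } catch {
      const delay = baseDelayMs * 2 ** attempt; // 100ms, 200ms, 400ms, ...
      await new Promise((r) => setTimeout(r, delay));
    }
  }
}
```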

6) Deque selection is mostly about avoiding pathological array behavior

This PR adds TaskQueue + createTaskQueue() with selectable deque backend.

  • In your workload, queue depth probably stays tiny, so it won’t matter.
  • But it prevents worst-case O(n) behavior from shift/unshift patterns if queue grows (e.g., transient spikes, upstream slowdown, worker crash causing backlog).
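For reference, the `TaskQueue` surface is just `push()`, `shift()`, and `length`; a plain-array fallback makes the O(n) concern concrete (sketch only; the real `createTaskQueue()` selects `@alg/deque` or `@korkje/deque` via `TASK_QUEUE_DEQUE_IMPL`):

```typescript
// Sketch of the TaskQueue abstraction with a naive array backing.
// Array#shift is O(n) in the worst case; a real deque makes it O(1),
// which matters only if the queue ever grows large.
interface TaskQueue<T> {
  push(item: T): void;
  shift(): T | undefined;
  readonly length: number;
}

function createArrayTaskQueue<T>(): TaskQueue<T> {
  const items: T[] = [];
  return {
    push: (item) => { items.push(item); },
    shift: () => items.shift(), // the O(n) operation a deque avoids
    get length() { return items.length; },
  };
}
```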

Tradeoff: You’re right about simplicity

The current implementation is extremely easy to reason about. The new one is more complex and has more moving parts (idle stack/set, in-flight maps, retire sets, timers, recovery state).

If your primary concern is maintainability and you haven’t seen worker wedging / memory creep, a reasonable compromise would be:

  • Keep the old structure, but cherry-pick two high-value safety features:
    1. In-flight timeout per task (prevents indefinite hangs)
    2. Worker message budget recycling (prevents long-lived isolate creep)
  • Optionally keep safeCall() around (very small surface area, good safety win).

That would retain simplicity while still addressing the biggest “production footguns.”

Bottom line

At your scale, you likely won’t see a throughput improvement because this path is cold most of the time. The benefit is mainly:

  • bounded hangs (timeouts),
  • bounded queue staleness (max age),
  • bounded worker lifetime (message budgets),
  • fewer “pool stuck forever” failure modes (explicit tracking + recovery).

If those failure modes aren't a concern for your deployment, I'd recommend a reduced-scope version (a loop in dispatch instead of recursion, plus timeouts and budget recycling) to preserve readability while still getting the reliability wins.

