- Bun clustering (native, pm2 and bm2)
This repository demonstrates that clustering with Bun and pm2/bm2 doesn't work as expected.
The setup is very simple: a bare server responds to requests made to /api, where a cpuIntensiveWork() call is made. cpuIntensiveWork() blocks that server's app for 1s at 100% CPU. When another request is made to the same endpoint, the load balancer should route the request to the next available cluster instance (three are available).
A simple HTML page is served by the same server under /, it simply fires three simultaneous requests to /api, which should all resolve in ~1s (indicating each instance of the cluster received each request separately and responded in parallel). Requests completing sequentially at 1s, 2s, 3s indicate that all requests are being handled by the same instance.
A node application running under pm2 is clustered as expected, routing requests to each instance:
- Request 2, fetch completed in 1007ms: Request 2 was resolved by nodeapp-2 and took 1001ms to complete
- Request 1, fetch completed in 1008ms: Request 1 was resolved by nodeapp-1 and took 1001ms to complete
- Request 3, fetch completed in 1015ms: Request 3 was resolved by nodeapp-0 and took 1000ms to complete
Both pm2-beta and bm2 fail to properly cluster and route the requests:
- Request 1, fetch completed in 1005ms: Request 1 was resolved by undefined-0 and took 1000ms to complete
- Request 2, fetch completed in 2005ms: Request 2 was resolved by undefined-0 and took 1000ms to complete
- Request 3, fetch completed in 3004ms: Request 3 was resolved by undefined-0 and took 1000ms to complete
These are the same results one gets when running the server unclustered.
To rule out pm2/bm2 as the cause, two custom clustering scripts were written for Bun.
Spawns N worker processes directly using Bun.spawn(). Since Bun.serve() uses reusePort: true, all workers bind to the same port and rely on the OS kernel to distribute incoming connections via SO_REUSEPORT.
Run with:
cd bun
bun "native clustering/cluster-runner.ts"
navigate to http://localhost:5555.
Uses Bun's implementation of Node.js's node:cluster module, which is supposed to run a primary process that accepts all connections and distributes them to workers via IPC in round-robin.
Run with:
cd bun
bun "native clustering/cluster.ts"
navigate to http://localhost:5555.
Both approaches fail completely — all requests are handled by the same instance:
$ bun "native clustering/cluster-runner.ts"
Started 3 workers
Request 1 received by 'bunapp-1'
Request 2 received by 'bunapp-1'
Request 3 received by 'bunapp-1'
$ bun "native clustering/cluster.ts"
Primary 73257 running, forking 3 workers
Request 1 received by worker 73260 (instance 2)
Request 2 received by worker 73260 (instance 2)
Request 3 received by worker 73260 (instance 2)
An initial hypothesis was that the browser might be reusing a single TCP connection for all 3 requests (HTTP keep-alive), causing them to be handled sequentially by the same instance. Two things were investigated:
-
Connection: closeheader:index.htmlsends this header with each fetch to hint the server to close the connection after each response. However,Connectionis a forbidden header name per the Fetch spec — browsers silently drop it, so it has no effect. -
Terminal requests: To bypass the browser entirely, the same 3 simultaneous requests were made from separate terminal tabs using
curl. The results were identical — all requests still went to the same instance.
This ruled out browser connection reuse as the cause. The issue is in how the OS and Bun distribute connections across workers.
cluster-runner.ts distributes requests across multiple workers, but unevenly. Over 10+ runs, the pattern is consistent: hash collisions reliably send 2 requests to one worker and 1 to another, leaving the third worker idle:
$ bun "native clustering/cluster-runner.ts"
Started 3 workers
Request 1 received by 'bunapp-0'
Request 3 received by 'bunapp-1'
Request 2 received by 'bunapp-0' ← should have gone to bunapp-2
In the browser this shows up as request 2 completing in 2s instead of 1s, because it had to wait behind request 1 on the same worker.
cluster.ts (node:cluster) behaves differently from cluster-runner.ts on Linux. Over 10+ runs it sometimes distributes all 3 requests to 3 separate workers correctly, and sometimes exhibits the same collision pattern:
$ bun "native clustering/cluster.ts"
Primary 2658669 running, forking 3 workers
Request 3 received by worker 2658683 (instance 1)
Request 1 received by worker 2658681 (instance 0)
Request 2 received by worker 2658681 (instance 0) ← should have gone to instance 2
The occasional correct distribution suggests Bun's node:cluster does implement some form of master-side distribution, but it is not reliable — unlike Node.js's SCHED_RR which distributes correctly on every run.
As a control, the same node:cluster approach was implemented for Node in node+npm/cluster.js. Unlike bun/native clustering/cluster.ts, this runs on real Node.js where SCHED_RR is fully implemented.
Run with:
cd node+npm
node cluster.js
navigate to http://localhost:5556.
Results are consistent across all runs — every request goes to a separate worker and all complete in ~1s:
- Request 1, fetch completed in 1007ms: Request 1 was resolved by nodeapp-0 and took 1000ms to complete
- Request 2, fetch completed in 1014ms: Request 2 was resolved by nodeapp-2 and took 1001ms to complete
- Request 3, fetch completed in 1014ms: Request 3 was resolved by nodeapp-1 and took 1001ms to complete
This confirms that the issue is specific to Bun's node:cluster implementation, not to the test setup.
On macOS (BSD-derived), SO_REUSEPORT allows multiple sockets to bind to the same port but does not distribute connections across them. All connections go to whichever socket calls accept() first. When 3 requests arrive in rapid succession, the same worker accepts all 3 from its queue before its event loop gets blocked by cpuIntensiveWork.
On Linux, SO_REUSEPORT was specifically designed for load balancing: the kernel hashes each connection's 4-tuple (src_ip, src_port, dst_ip, dst_port) and maps it to one of the listening sockets. This distributes connections, but the distribution is hash-based, not round-robin. With only 3 simultaneous connections, hash collisions are common — two connections can easily hash to the same worker, leaving another worker idle.
In real Node.js, the cluster module uses SCHED_RR (round-robin) by default on non-Windows: the primary process owns the single listening socket, accepts every connection itself, and hands the socket fd to workers in strict rotation — one per worker, no hashing. This is why Node + pm2 achieves perfect distribution on every run.
Bun's node:cluster is only partially implemented. On macOS it falls back entirely to SO_REUSEPORT. On Linux it sometimes distributes correctly across all workers, but is not reliable — suggesting the master-side distribution logic exists but has a race or timing issue that allows connections to bypass it.
| Approach | Mechanism | macOS | Linux |
|---|---|---|---|
| Node + pm2 cluster | Master accepts + round-robin IPC | ✅ All 3 workers, ~1s each | ✅ All 3 workers, ~1s each |
Node cluster.js (native) |
Master accepts + round-robin IPC | ✅ All 3 workers, ~1s each | ✅ All 3 workers, ~1s each |
| Bun + pm2-beta cluster | SO_REUSEPORT | ❌ 1 worker, sequential | |
| Bun + bm2 cluster | SO_REUSEPORT | ❌ 1 worker, sequential | |
Bun cluster-runner.ts |
SO_REUSEPORT | ❌ 1 worker, sequential | |
Bun cluster.ts (node:cluster) |
Partial master + SO_REUSEPORT fallback | ❌ 1 worker, sequential |
The fix needs to be in Bun's node:cluster implementation: the primary process must reliably own the listening socket and distribute connections to workers in round-robin on every run, rather than falling back to SO_REUSEPORT.
The Bun version:
cd bun
bun index.ts
navigate to http://localhost:5555.
The node version is more or less the same:
cd node+npm
node index.js
navigate to http://localhost:5556.
The Bun versions:
cd bun
bun "native clustering/cluster-runner.ts" # Bun.spawn + reusePort approach
bun "native clustering/cluster.ts" # node:cluster approach
navigate to http://localhost:5555.
The node version:
cd node+npm
node cluster.js # node:cluster approach
navigate to http://localhost:5556.
The Bun version:
cd bun
npm -g uninstall pm2
bun -g install pm2-beta
pm2 update
pm2 start ecosystem.config.js
navigate to http://localhost:5555.
Compare the results with the node version:
pm2 delete all
bun -g uninstall pm2-beta
npm -g install pm2
pm2 update
cd node+npm
pm2 start ecosystem.config.cjs
navigate to http://localhost:5556.
cd bun
bun -g install bm2
bm2 start bm2.ecosystem.config.ts
navigate to http://localhost:5557.
Since pm2 cluster mode is broken for Bun (see root cause analysis below), the correct approach is to run each Bun instance as an independent process on its own port (fork mode) and put nginx in front as the actual load balancer.
All three files live under bun/forking/:
bun/forking/index.ts — A copy of index.ts adapted for fork mode. Instead of relying on SO_REUSEPORT, each instance binds a dedicated port derived from NODE_APP_INSTANCE (set automatically by pm2):
const BASE_PORT = 5556;
const instance = parseInt(process.env.NODE_APP_INSTANCE ?? "0");
const port = BASE_PORT + instance;With 3 instances, workers listen on ports 5556, 5557, and 5558.
bun/forking/ecosystem.config.cjs — pm2 ecosystem config using exec_mode: "fork" and 3 instances. Uses .cjs extension with module.exports syntax. pm2 must be invoked from inside bun/forking/ so that script: "index.ts" resolves correctly — pm2 resolves script paths relative to the working directory where it is invoked, not relative to the config file.
bun/forking/nginx.conf — nginx upstream config that load balances across the three worker ports using least_conn, the correct strategy when response times vary significantly (e.g. a GraphQL server with queries ranging from milliseconds to several seconds). Unlike round-robin, least_conn routes each new request to whichever worker currently has the fewest active connections, avoiding the situation where a slow request blocks subsequent ones on the same worker.
To run:
# 1. Start the Bun workers via pm2 — must run from inside bun/forking/
cd bun/forking
pm2 start ecosystem.config.cjs
# 2. Load the nginx config (adjust path for your nginx setup)
nginx -c $(pwd)/nginx.conf
navigate to http://localhost:5555 (nginx) — requests are distributed across workers on :5556, :5557, :5558.
pm2 still provides logging, auto-restart on crash, and memory/CPU monitoring for each worker process. nginx owns all load balancing decisions.
Note: nginx's
proxy_next_upstream error timeoutdirective is included but should be removed for non-idempotent operations (e.g. GraphQL mutations) to avoid a failed request being silently retried on another worker.
Inspecting Bun's source code reveals the precise cause of the failure.
Node's round-robin mechanism works by intercepting net.Server.listen() calls in worker processes via cluster._getServer() (src/js/internal/cluster/child.ts). When a worker calls net.Server.listen(), the cluster module:
- Sends a
"queryServer"IPC message to the primary - The primary creates a single
RoundRobinHandle(src/js/internal/cluster/RoundRobinHandle.ts) that owns the real listening socket - The worker receives a faux handle — a plain JS object with no real socket behind it
- Connections arrive at the primary's socket →
distribute()→handoff()→ sent to workers via IPC
Workers never bind a real socket. All connection distribution is explicit, round-robin, controlled by the primary.
Bun.serve() is a native Zig API that creates a TCP socket directly, completely bypassing net.Server. Because it never calls net.Server.listen(), it never triggers cluster._getServer(). As a result:
- The primary never creates a
RoundRobinHandle - Workers never enter the round-robin system
- Each worker binds its own independent real socket on the port
In src/bun.js/api/server/ServerConfig.zig:406:
// If this is a node:cluster child, let's default to SO_REUSEPORT.
// That way you don't have to remember to set reusePort: true in Bun.serve() when using node:cluster.
.reuse_port = env.get("NODE_UNIQUE_ID") != null,When a process is a cluster worker (detected by NODE_UNIQUE_ID being set), Bun.serve() automatically enables SO_REUSEPORT. This is a convenience workaround: since the round-robin mechanism never activates, at least let the OS distribute connections at the kernel level. But as demonstrated, this produces hash-based distribution on Linux (unreliable) and no distribution at all on macOS.
The "sometimes works" behaviour seen with cluster.ts on Linux is not the cluster module working — it is purely the SO_REUSEPORT hash occasionally distributing 3 connections to 3 different workers by chance.
Even if Bun.serve() were fixed to call cluster._getServer(), the round-robin mechanism still could not work — because the IPC handle transfer in src/bun.js/node/node_cluster_binding.zig is entirely unimplemented.
In Node.js, when the primary accepts a connection it passes the raw TCP socket (file descriptor) to the chosen worker via IPC using Unix SCM_RIGHTS fd passing. The worker then adopts that fd into its own event loop and handles the request. In Bun's IPC layer this mechanism does not exist:
sendHelperChild throws if a handle is passed (node_cluster_binding.zig:30):
if (!handle.isNull()) {
return globalThis.throw("passing 'handle' not implemented yet", .{});
}sendHelperPrimary silently discards the handle (node_cluster_binding.zig:205):
_ = handle; // silently ignored
const success = ipc_data.serializeAndSend(globalThis, message, .internal, .null, null);All IPC callbacks always deliver null as the handle (node_cluster_binding.zig:136-139, 146-149):
event_loop.runCallback(callback, globalThis, this.worker.get().?, &.{
message,
.null, // handle — always null, never a real socket fd
});This means RoundRobinHandle's handoff() call in RoundRobinHandle.ts — which is supposed to send a live TCP socket to the chosen worker — is effectively dead code. Even if the primary accepted a connection and selected a worker via round-robin, the socket would be dropped at the Zig IPC layer before reaching the worker. The newconn IPC message arrives at the worker's onconnection() handler in child.ts with a null handle, so no socket is ever adopted.
Fixing this correctly requires three layers of work:
-
Native IPC fd transfer in
node_cluster_binding.zig: implement UnixSCM_RIGHTSfd passing so that a live TCP socket fd can be sent from the primary to a worker process over the IPC channel. The"passing 'handle' not implemented yet"error at line 30 must be replaced with real fd serialization. -
Bun.serve()cluster integration: whenNODE_UNIQUE_IDis set (i.e., the process is a cluster worker),Bun.serve()must callcluster._getServer()the same waynet.Server.listen()does — sending aqueryServerIPC message to the primary and receiving a faux handle in return. The relevant files:src/js/internal/cluster/child.ts—cluster._getServer()hook needs to be reachable fromBun.serve()src/bun.js/api/server/ServerConfig.zig—Bun.serve()needs to call into the cluster module instead of defaulting toSO_REUSEPORT
-
uWS socket adoption: once a worker receives a live fd via IPC, it must adopt that fd into its uWebSockets event loop. The existing
SocketContext.adoptSocket()API provides this capability, but it is not yet wired into the cluster path.
Once all three layers are in place, the SO_REUSEPORT auto-enable in ServerConfig.zig at line 406 could be removed, since workers would no longer need to bind their own sockets.