What improved system performance
- (Most important) Tune the batcher per service: the batcher timeout sometimes accounts for most of a request's latency
- Use pprof and OpenTelemetry to find aggregate CPU/block times and to break down latency per request
- Keep exec's serial runCoordinator light
- Switch from HTTP to raw TCP: network cost is nontrivial, especially for a replication protocol
- Disable OpenTelemetry and heavy logging while testing
- Have the middle service fan out to only the primary batcher of the backend service
- If you test on a Mac, keep it plugged in