04 Event Loop Design
Keywords: Event Loop, Reactor Pattern, Proactor Pattern, Netty, Java NIO, Selector, Epoll, Kqueue, IOCP, Non-blocking I/O, Thread-Per-Request, Boss/Worker Threads, Dispatch Loop, Backpressure, Throughput, Latency, Tail Latency, Fairness, Starvation, Mechanical Sympathy, Event-Driven Architecture, Concurrency, Saturation
An event loop is one of the most important architectural patterns in high-performance systems.
It is the mechanism that allows a small number of threads to coordinate a very large number of events efficiently.
Instead of asking:
Which thread should handle this request?
a reactive system asks:
What is ready right now?
This is the core idea behind:
- Java NIO
- Netty
- Vert.x
- Spring WebFlux
- reactive servers
- high-concurrency gateways
- low-latency messaging systems
- websocket platforms
- proxy servers
- streaming systems
The event loop exists because thread-per-request architectures eventually hit a wall:
- thread explosion
- context switching overhead
- memory pressure
- unstable tail latency
- poor scalability under idle connections
- excessive scheduler contention
A well-designed event loop is not just a loop.
It is a control plane for:
- I/O coordination
- readiness detection
- task dispatch
- overload protection
- fairness
- latency control
- CPU efficiency
- connection management
This page explains how to design event loops that are fast, safe, and production-grade.
The C10K Problem: Event loops are the definitive architectural answer to the challenge of handling 10,000+ concurrent connections efficiently on a single machine.
To understand why the event loop exists, you must understand what it replaces.
Traditional servlet-based systems often use this architecture:
1 Request
↓
1 Thread
↓
Blocking I/O
↓
Business Logic
↓
Response
This looks simple, but it becomes expensive at scale.
- each thread consumes memory
- each thread competes for CPU scheduling
- blocked threads waste resources
- idle connections still occupy thread slots
- context switching becomes dominant
- stack memory grows quickly
- throughput drops under load
A single Java thread may consume around 1 MB of stack memory in many real deployments.
10,000 concurrent idle connections can therefore consume enormous memory just to wait.
A non-blocking system uses a different architecture:
Connections
↓
Event Loop
↓
Ready Events
↓
Dispatch / Handoff
↓
Business Processing
↓
Response
Instead of waiting on every connection, the event loop monitors readiness and reacts only when there is actual work.
This creates several advantages:
- fewer threads
- lower context switching overhead
- lower memory usage
- better scalability with many idle connections
- more predictable resource consumption
| Feature | Thread-Per-Request | Event Loop |
|---|---|---|
| Blocking | Yes | No |
| Context Switching | High | Very Low |
| Memory Footprint | Heavy | Lightweight |
| Scalability | Limited by threads | Limited by CPU, network, and downstream capacity |
| Fairness | Depends on scheduling | Explicitly designed |
| Tail Latency | Often unstable under load | Can be tightly controlled |
| Complexity | Simpler linear code | More architectural discipline required |
The event loop is the implementation of the Reactor Pattern.
The Reactor responds to I/O events by dispatching them to the appropriate handler.
Architecture:
I/O Source
↓
Event Demultiplexer
↓
Event Loop / Reactor
↓
Handler
↓
Business Logic
The selector is the event demultiplexer.
The event loop reads the ready set and sends work to handlers.
This is the foundation of most high-performance non-blocking systems in Java.
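The reactor flow above can be sketched in plain Java NIO. This is a minimal, illustrative skeleton (class and method names are invented for the example), not a production reactor:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Minimal reactor skeleton: the Selector is the event demultiplexer,
// the loop reads the ready set and dispatches to handlers.
public class MiniReactor {
    public static Selector openReactor(int port) throws IOException {
        Selector selector = Selector.open();               // event demultiplexer
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(port));          // port 0 = any free port
        server.configureBlocking(false);                   // required before registering
        server.register(selector, SelectionKey.OP_ACCEPT); // interest: new connections
        return selector;
    }

    // One non-blocking pass of the loop: report how many channels are ready.
    public static int pollOnce(Selector selector) throws IOException {
        return selector.selectNow(); // production loops use select() to block instead
    }
}
```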
Visual 1.2: Reactor pattern (readiness-based) vs Proactor pattern (completion-based).
While the Event Loop is the heart of the Reactor pattern, it's important to distinguish it from its cousin:
- Reactor (Java NIO / Netty): The Loop waits for a resource to become ready (e.g., "data is available to read"). You perform the actual I/O.
- Proactor (Windows IOCP / AIO): You tell the OS to perform the I/O in the background. The Loop is notified only when the operation is complete.
Note: Java's high-performance networking is almost entirely based on the Reactor pattern due to OS portability.
A single thread accepts connections, reads data, processes it, and writes the response.
This model is famously used in systems such as:
- Redis
- Node.js-style single-loop designs
- some embedded or specialized high-performance services
Strengths:
- zero lock contention inside the loop
- simple state reasoning
- predictable ordering

Weaknesses:
- limited multi-core utilization
- one slow task can freeze all connections assigned to the loop
- not ideal for mixed I/O + CPU workloads
If processing one event takes 1 second, all other connections handled by that loop wait.
This is the architecture commonly used by Netty.
It separates the accepting phase from the processing phase.
Architecture flow:
Client
↓
Boss Event Loop Group
↓
Accept Connection
↓
Register Channel with Worker Event Loop Group
↓
Worker Handles Read / Write / Dispatch
Boss Event Loop Group:
- usually a very small pool
- often just 1 thread
- listens for incoming connections
- handles `OP_ACCEPT`
- hands accepted channels to workers

Worker Event Loop Group:
- handles actual read/write readiness
- processes `OP_READ`, `OP_WRITE`, `OP_CONNECT`
- usually sized based on CPU cores
- keeps each channel bound to a stable loop for locality
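The boss/worker split can be illustrated in plain JDK NIO (Netty's real implementation is far more elaborate; all class and field names here are invented for this sketch):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Sketch: the boss selector only accepts connections, then registers
// each accepted channel with a separate worker selector.
public class BossWorkerSketch {
    public final Selector boss;
    public final Selector worker;
    public final ServerSocketChannel server;

    public BossWorkerSketch() throws IOException {
        boss = Selector.open();
        worker = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(0));         // any free port
        server.configureBlocking(false);
        server.register(boss, SelectionKey.OP_ACCEPT); // boss only cares about accepts
    }

    // One boss iteration: accept ready connections, hand them to the worker.
    public int acceptPending() throws IOException {
        int handed = 0;
        boss.selectNow();
        var it = boss.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();                               // avoid reprocessing the same event
            if (key.isAcceptable()) {
                SocketChannel ch = server.accept();
                ch.configureBlocking(false);
                ch.register(worker, SelectionKey.OP_READ); // the worker loop now owns reads
                handed++;
            }
        }
        return handed;
    }
}
```

In a real server, a pool of worker selectors (one per thread) would share accepted channels; here a single worker selector stands in for the whole group.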
An event loop is literally a repeating control cycle.
Simplified pseudo-code:

```java
while (!isShutdown) {
    selector.select();                                     // block until events or wakeup
    Set<SelectionKey> readyKeys = selector.selectedKeys(); // select() returns a count; the keys come from selectedKeys()
    processSelectedKeys(readyKeys);
    runAllTasks();
}
```
Visual 1.3: The high-performance cycle: Polling events → Dispatching handlers → Running scheduled tasks.
The loop does three major things:
- waits for events from the OS,
- processes ready I/O events,
- runs scheduled asynchronous tasks.
This is the mechanical core of the reactor.
A production-grade Event Loop doesn't just process I/O; it must manage tasks fairly.
- Task Slicing: To prevent one massive task from "starving" others, Netty uses an `ioRatio`. It limits how much time is spent on non-I/O tasks vs. I/O events in a single cycle.
- The "JDK Epoll Bug" Fix: There is a famous bug where `Selector.select()` wakes up for no reason, causing 100% CPU usage. Netty detects this "spinning" and automatically rebuilds the Selector on the fly.
Event loops achieve mechanical sympathy by leveraging OS-level mechanisms like epoll (Linux) and kqueue (macOS). This ensures zero wasted CPU cycles on idle connections.
The event loop usually sits on top of a Selector.
The selector detects readiness.
The event loop decides what to do with it.
Relationship:
Selector = readiness detection
Event Loop = dispatch and control
A selector by itself is not enough.
The event loop gives it policy, fairness, and lifecycle control.
A production event loop usually follows this lifecycle:
The loop blocks efficiently while waiting for readiness notifications.
In Java NIO, this is usually done through:
```java
selector.select();
```

This means:
- sleep without burning CPU
- wake up when channels are ready
- continue only when work exists
Once awakened, the loop retrieves ready events.
Example:
```java
Set<SelectionKey> selectedKeys = selector.selectedKeys();
```

These keys represent channels that are ready for operation.
Each key is inspected and sent to the correct handler.
Possible event types:
- accept
- read
- write
- connect
The event loop must make dispatch decisions quickly.
Processed keys must be removed from the selected set.
Otherwise, the same event may be processed repeatedly.
This is one of the most common bugs in NIO-based systems.
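A minimal demonstration of the correct draining pattern (names are illustrative):

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;

// Demonstrates why processed keys must be removed from the selected set:
// the Selector never removes them for you.
public class SelectedKeyDemo {
    public static int drain(Selector selector) throws IOException {
        selector.selectNow();
        int processed = 0;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();   // critical: otherwise the same key is seen again next cycle
            processed++;
        }
        return processed;
    }
}
```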
The loop continues forever or until shutdown is requested.
Java NIO event loops usually work with these readiness events:
| Event | Meaning |
|---|---|
| OP_ACCEPT | A new incoming connection is ready |
| OP_CONNECT | A connection has been established |
| OP_READ | Data is available to read |
| OP_WRITE | Socket is ready to accept more data |
Each event type requires different handling logic.
The loop must route each one correctly.
This distinction is critical:
- readiness means the channel can be operated on
- work means the actual processing you do after readiness
The event loop detects readiness.
The handler performs work.
Do not confuse them.
If you mix readiness detection with heavy processing, your loop becomes slow and unstable.
The event loop is not just a polling machine.
It is a dispatching machine.
Typical structure:
Selector
↓
Ready Key Set
↓
Event Dispatcher
↓
Handler
↓
Task Queue or Worker Pool
A good dispatch model separates:
- I/O coordination
- protocol parsing
- business logic
- persistence
- downstream calls
This separation keeps the loop responsive.
Event loops scale well because they reduce:
- thread count
- blocking time
- scheduling overhead
- memory usage
- lock contention
Instead of many idle threads, you have a smaller number of active coordination threads.
This is especially valuable in workloads with:
- many idle connections
- bursty traffic
- chat systems
- gateways
- proxy services
- websocket servers
- streaming platforms
- SSE connections
- IoT device fleets
These are not the same thing.
Event Loop
Purpose:
Manage readiness and dispatch
Best for:
- I/O coordination
- non-blocking sockets
- readiness polling

Thread Pool
Purpose:
Execute independent tasks
Best for:
- CPU-bound work
- blocking work
- background processing
- parallel computation
A strong architecture usually combines both:
Event Loop → Worker Pool → Business Logic
A production-grade architecture often looks like this:
Client Connections ➔ Event Loop ➔ Dispatch ➔ Worker Thread Pool ➔ Business Logic ➔ Response
This separation is critical.
If the event loop starts doing business logic itself, performance degrades quickly.
If the worker pool is unbounded, overload spreads.
If the dispatch layer is slow, latency grows.
Everything matters.
- Visual 1.1: Side-by-side comparison of Thread-Per-Request vs Event Loop.
- Visual 1.2: Netty's Boss/Worker threading model architecture.
- Visual 1.3: Data flow of Zero-Copy (Disk to NIC bypassing JVM heap).
Blocking inside an event loop is catastrophic.
Examples of blocking operations:
- database calls
- file I/O
- network calls to other services
- long CPU tasks
- sleep calls
- synchronous remote APIs
- slow logging sinks
Why it is dangerous:
- the event loop cannot service other channels
- tail latency increases
- queue depth grows
- throughput collapses
- timeouts cascade
- one connection can freeze thousands
This is one of the most common production mistakes in event-driven systems.
Visual 1.4: Impact of blocking on the loop vs. isolating tasks to dedicated worker pools.
❌ Anti-Pattern Example: EventLoop-1 handles 1,000 connections. Connection A performs a blocking JDBC call taking 5 seconds. Result: All other 999 connections freeze completely for those 5 seconds.
✅ Best Practice: Always offload blocking work to a dedicated worker pool, ideally bounded so that overload stays visible instead of hiding in an ever-growing queue.
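A sketch of that handoff using a bounded JDK thread pool (pool sizes and names are illustrative, and the "query" is simulated rather than a real JDBC call):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: blocking work (e.g., a JDBC call) is offloaded to a dedicated,
// bounded pool so the event loop thread never waits on it.
public class BlockingOffload {
    static final ExecutorService BLOCKING_POOL = new ThreadPoolExecutor(
            4, 4, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(100),              // bounded: overload stays visible
            new ThreadPoolExecutor.CallerRunsPolicy()); // push back instead of failing silently

    public static CompletableFuture<String> queryAsync(String sql) {
        // The loop thread calls this and returns to select() immediately.
        return CompletableFuture.supplyAsync(() -> {
            // pretend this is the slow blocking call
            return "rows-for:" + sql;
        }, BLOCKING_POOL);
    }
}
```

The completion callback can then hand the result back to the owning event loop, so channel state is still touched by only one thread.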
An event loop must handle work fairly.
If one connection produces too much work, it can monopolize the loop.
Problems include:
- one client starving others
- hot channels dominating the ready set
- uneven latency
- processing bias
- unfair wakeup patterns
Good event loop design uses:
- bounded work per iteration
- fair dispatching
- task slicing
- handoff to workers when needed
Backpressure is essential.
Without it, the event loop can accept more work than it can handle.
Symptoms of missing backpressure:
- queue growth
- memory growth
- buffer buildup
- increased latency
- collapse under load
Backpressure mechanisms include:
- bounded queues
- limited per-connection work
- rejected tasks
- write interest toggling
- adaptive throttling
- rate limiting
- dropping low-priority work
Event loops must stay stable under overload.
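One of the simplest backpressure building blocks is a bounded queue whose rejection is visible to the caller. A minimal sketch (capacity and names are illustrative):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Backpressure sketch: a bounded per-connection outbound queue that
// rejects instead of growing without limit.
public class BackpressureSketch {
    private final BlockingQueue<byte[]> outbound;

    public BackpressureSketch(int capacity) {
        this.outbound = new ArrayBlockingQueue<>(capacity);
    }

    // offer() returns false when full — the caller can drop, throttle, or close.
    public boolean enqueue(byte[] frame) {
        return outbound.offer(frame);
    }

    public int depth() { return outbound.size(); }
}
```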
OP_WRITE is often misunderstood.
A socket is frequently writable.
If you keep write interest enabled all the time, the selector may wake continuously even when there is no meaningful work.
This can lead to:
- busy loops
- CPU spikes
- repeated wakeups
- wasted scheduling cycles
- self-inflicted overload
Correct strategy:
- enable write interest only when there is queued outbound data
- disable it after flushing the buffer
This is one of the most important optimization rules in NIO-based event loops.
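The toggle rule can be expressed as a small helper (illustrative sketch):

```java
import java.nio.channels.SelectionKey;

// Sketch of the OP_WRITE toggle rule: register write interest only while
// there is queued outbound data, and clear it after flushing.
public class WriteInterestSketch {
    public static void setWriteInterest(SelectionKey key, boolean hasPendingData) {
        int ops = key.interestOps();
        if (hasPendingData) {
            key.interestOps(ops | SelectionKey.OP_WRITE);  // wake when writable
        } else {
            key.interestOps(ops & ~SelectionKey.OP_WRITE); // stop spurious wakeups
        }
    }
}
```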
A single loop iteration should not try to do everything.
A production event loop often slices work into smaller units.
Example:
- accept a connection
- read a fixed amount of data
- queue the rest
- return to the loop
Why this matters:
- prevents starvation
- keeps latency predictable
- improves fairness
- avoids monopolization by one channel
Large tasks should be chunked and offloaded.
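A sketch of bounded work per iteration (the limit and names are illustrative):

```java
import java.util.Queue;

// Task-slicing sketch: process at most maxPerIteration items per loop pass
// so one busy channel cannot monopolize the loop.
public class TaskSliceSketch {
    public static int runSlice(Queue<Runnable> tasks, int maxPerIteration) {
        int ran = 0;
        Runnable task;
        while (ran < maxPerIteration && (task = tasks.poll()) != null) {
            task.run();
            ran++;
        }
        return ran; // leftover tasks wait for the next iteration
    }
}
```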
The event loop itself is a stateful machine.
Common states:
| State | Meaning |
|---|---|
| Starting | Initial setup |
| Running | Normal operation |
| Draining | Finishing queued work |
| Stopping | No new work accepted |
| Terminated | Shutdown complete |
Good systems define clear transitions.
Without clear lifecycle control, shutdown becomes messy.
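A minimal sketch of explicit lifecycle transitions (the forward-only rule here is a simplification for illustration; real loops may allow e.g. Running → Stopping directly):

```java
// Lifecycle sketch: an explicit state enum that permits only legal transitions.
public class LoopLifecycle {
    public enum State { STARTING, RUNNING, DRAINING, STOPPING, TERMINATED }

    private State state = State.STARTING;

    // Only the next state in the sequence is accepted.
    public synchronized boolean transitionTo(State next) {
        boolean legal = next.ordinal() == state.ordinal() + 1;
        if (legal) state = next;
        return legal;
    }

    public synchronized State state() { return state; }
}
```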
A proper event loop must stop cleanly.
Graceful shutdown should:
- stop accepting new events,
- finish or cancel existing work,
- close channels safely,
- release selector resources,
- terminate workers if needed.
A poor shutdown can leave:
- half-open sockets
- leaked selectors
- lingering threads
- resource leaks
- inconsistent state
Shutdown is part of design, not an afterthought.
```java
// Standard Netty graceful shutdown
public void shutdown() {
    bossGroup.shutdownGracefully();   // stop accepting new connections
    workerGroup.shutdownGracefully(); // drain pending tasks, close channels, release selectors
}
```

There are several event-loop topologies.
One event loop handles all events.
Strengths:
- simple
- easy to understand
- lower coordination overhead

Weaknesses:
- limited scalability
- can become a bottleneck
- poor multi-core utilization
Multiple event loops share the load.
Strengths:
- better scaling on multi-core systems
- higher throughput
- more isolation
- lower per-loop pressure

Weaknesses:
- more complex
- requires better coordination
- more careful channel assignment
Large systems often use a multi-reactor model.
There are different threading approaches around event loops.
Single loop with a worker pool:
- one loop for readiness
- worker pool for heavy tasks

This is common and practical.

Multiple loops:
- each loop handles a subset of channels
- better hardware utilization
- more complexity
- strong cache locality

This is common in high-performance frameworks.

Specialized pools:
- I/O loop
- CPU pool
- blocking pool
- scheduled pool

This is often the most production-friendly design.
The Reactor pattern is not the only event-based model.
Reactor:
- waits for readiness
- dispatches when channels are ready
- common in Java NIO and Netty

Proactor:
- starts asynchronous operations
- receives completion notifications
- common in completion-based I/O systems
The architectural difference is subtle but important:
- Reactor: “tell me when it is ready”
- Proactor: “tell me when it is done”
The event loop achieves mechanical sympathy because it maps well to modern operating system capabilities.
Instead of asking 10,000 sockets:
Are you ready?
Are you ready?
Are you ready?
the loop uses OS-level mechanisms like:
- epoll on Linux
- kqueue on macOS / BSD
- IOCP on Windows in completion-oriented models
This avoids wasting CPU cycles on idle connections.
The OS is asked to do readiness tracking efficiently.
Why does Netty often keep a specific connection tied to the same event loop thread?
Because of CPU cache locality.
If a connection is processed by the same thread repeatedly:
- data stays warm in L1/L2 caches
- less cache invalidation
- less memory traffic
- better branch prediction
- lower latency
If a connection bounces between threads:
- caches are constantly invalidated
- memory access becomes slower
- overhead rises sharply
Stable thread-to-connection affinity is often a major performance win.
Visual 1.5: Zero-Copy flow moving data from disk to NIC bypassing the JVM heap.
To push performance to the absolute limit, the Event Loop utilizes:
- Zero-Copy I/O: Using `FileChannel.transferTo()`, data moves directly from disk to the network buffer without ever entering the JVM Heap. This saves CPU cycles and memory bandwidth.
- Event Batching: Instead of waking up for every single packet, the loop can gather multiple ready events in one `poll()` call, drastically reducing the cost of system calls (syscalls).
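A minimal `transferTo()` sketch; the target here is a file channel for demonstration, but the same call applies to a socket channel:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Zero-copy sketch: FileChannel.transferTo lets the kernel move bytes
// without staging them through a JVM-heap buffer.
public class ZeroCopySketch {
    public static long transfer(Path source, WritableByteChannel target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long sent = 0, size = in.size();
            while (sent < size) {
                // transferTo may move fewer bytes than requested, so loop until done
                sent += in.transferTo(sent, size - sent, target);
            }
            return sent;
        }
    }
}
```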
Common fatal mistakes:
- Blocking inside the loop: this is the most dangerous mistake.
- Running heavy CPU work inline: this increases tail latency and harms fairness.
- Not removing processed keys from the selected set: this leads to repeated handling and busy loops.
- Leaving write interest enabled permanently: this can cause endless wakeups.
- Using unbounded handoff queues: this leads to hidden overload and memory growth.
- Mixing business logic into the loop: this makes the loop fragile and hard to scale.
- Running everything on a single loop: this wastes available cores and creates a bottleneck.
A production event loop should be observable.
Important metrics:
| Metric | Meaning |
|---|---|
| Loop Iteration Time | How long each cycle takes |
| Ready Key Count | How many events are detected |
| Dispatch Time | How long handling takes |
| Queue Depth | How much work is waiting |
| Rejection Count | How often overload happens |
| Wakeup Count | How often the loop is interrupted |
| Tail Latency | Worst-case response behavior |
| Busy Loop Rate | Indicator of accidental spinning |
| Idle Time | Whether the loop is underutilized or sleeping appropriately |
Metrics reveal whether the loop is healthy.
Java 21 introduced Virtual Threads, changing the landscape. How do they compare?
- Event Loops: Still the gold standard for Network Proxies, Gateways, and Message Brokers where maximum throughput and fine-grained control over I/O are required.
- Virtual Threads: The best choice for Standard CRUD/Business APIs. They allow you to write simple, blocking code that scales like an Event Loop.
The Hybrid Rule: Use Event Loops (Netty) for your infrastructure/networking layer and consider Virtual Threads for your heavy business logic layer.
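A sketch of the virtual-thread style (requires Java 21+; the task count and sleep duration are illustrative):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Virtual-thread sketch: plain blocking-style code, one cheap virtual
// thread per task, no explicit event loop in the application code.
public class VirtualThreadSketch {
    public static int runBlockingTasks(int count) {
        AtomicInteger done = new AtomicInteger();
        try (ExecutorService vt = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < count; i++) {
                vt.submit(() -> {
                    try { Thread.sleep(10); } catch (InterruptedException ignored) { }
                    done.incrementAndGet(); // stands in for blocking business logic
                });
            }
        } // close() waits for all submitted tasks to finish
        return done.get();
    }
}
```

Under the hood the JVM still multiplexes these virtual threads onto a small set of carrier threads, which is why blocking code scales here without an application-level loop.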
A chat application built on Spring Boot with a traditional thread-per-request server model needed to support 100,000 concurrent WebSockets.
At around 8,000 users, the JVM threw:
`OutOfMemoryError: unable to create new native thread`
The system was using huge amounts of memory just to keep idle WebSocket threads alive.
The team migrated to an event-driven architecture using Netty via a reactive stack.
- memory usage dropped dramatically
- context switching overhead disappeared
- throughput stabilized
- the system could handle far more idle long-lived connections
- the architecture became suitable for websocket-style workloads
For long-lived, mostly idle connections like:
- WebSockets
- SSE
- chat sessions
- real-time dashboards
event loops are often the correct architectural choice.
A strong event loop design usually follows these rules:
- keep the loop lightweight
- never block in the loop
- bound work per iteration
- hand off expensive work
- use backpressure
- make shutdown explicit
- monitor queue growth
- prioritize fairness
- separate I/O from business logic
- keep per-connection state minimal
- preserve cache locality
- avoid unnecessary wakeups
- use specialized pools for blocking work
These rules are what make event-driven systems stable.
If your event-driven system is slow, check for these fatal mistakes:
- Hidden blocking: using `InputStream`, `URLConnection`, JDBC, or other blocking APIs inside a reactive chain.
- Synchronous logging: writing logs to a slow sink inside the event loop.
- God threads: having one loop do everything instead of using available cores.
- Lack of backpressure: accepting more work than downstream systems can process.
- Unbounded handoff queues: hiding overload until memory fails.
- Mixing business logic with readiness logic.
- Enabling write readiness permanently.
- Doing large CPU work inline.
- Ignoring per-connection fairness.
If your Event Loop is struggling, use these professional diagnostic tools:
| Tool/Flag | Purpose | Key Metric |
|---|---|---|
| `-Dio.netty.eventLoopThreads=N` | Manual Thread Tuning | Compare throughput vs. core count. |
| `jcmd <pid> Thread.print` | Thread Dump | Look for BLOCKED states in `nioEventLoop` threads. |
| async-profiler | CPU Profiling | Check for "Selector Spinning" or high `select()` time. |
| JFR (Flight Recorder) | Latency Analysis | Look for "Socket Read" events exceeding your p99 targets. |
Visual 1.6: Visualizing bottlenecks using async-profiler flame graphs and JFR timelines.
- Visual 1.1: A side-by-side of 1,000 threads (heavy) vs. 1 Event Loop (light).
- Visual 1.2: A diagram showing "Boss" passing a key to a "Worker" queue.
- Visual 1.3: A "Zero-Copy" flow: Disk → Kernel Buffer → NIC (bypassing User Space).
Event loops are foundational in:
- Netty
- Vert.x
- reactive servers
- websocket gateways
- message brokers
- low-latency trading systems
- API gateways
- proxy servers
- streaming systems
- high-concurrency microservices
If the system has many concurrent connections and a small number of active workers, event loops are often the right tool.
Continue exploring:
- 01-NIO-Selector-Architecture
- 01-NIO-Blocking-vs-NonBlocking
- 02-Thread-Pool-Mechanics
- 02-ExecutorService-Internals
- 04-Backpressure-Strategies
- 04-Performance-Overview
- 01-NIO-Channel-Buffer-Model
An event loop is not just a programming construct.
It is an architectural boundary.
It decides:
- what happens immediately
- what gets deferred
- what gets handed off
- what gets rejected
- what gets delayed
- what gets protected from overload
- what gets kept hot in cache
- what gets isolated into workers
The best engineers do not just write loops.
They design control systems that keep the system fast, fair, and stable under load.