[C#] Add shrike - epoll engine, RCA=true async handler #832

Open
MDA2AV wants to merge 1 commit into main from add-shrike
MDA2AV commented Jun 7, 2026

Description

shrike is a from-scratch C# epoll engine whose handler loop is backed by IValueTaskSource (IVTS) with RunContinuationsAsynchronously = true (RCA=true). It is an engine-tier entry serving the baseline, pipelined, and limited-conn tests, and it is the RCA=true counterpart to the RCA=false io_uring engines (minima / minima-sync).

Architecture (fully async-work-ready)

  • The worker thread is a pure I/O pump: epoll_wait → recv-drain → SignalReadable. It runs no handler code.
  • The per-connection handler resumes on the thread pool (RCA=true). It can await arbitrary work and still respond.
  • FlushAsync calls send() directly, which is thread-safe, so an off-worker handler sends without handing off back to the worker. This is the epoll advantage: send() is a syscall any thread can make.
  • Per-worker SO_REUSEPORT, pooled connections, native slab buffers, hand-rolled HTTP/1.1.
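
The worker/handler split in the bullets above can be sketched with `ManualResetValueTaskSourceCore<T>`. This is a minimal illustration, not the PR's code: `ReadableSignal`, `WaitReadableAsync`, and `RcaDemo` are hypothetical names, and the "worker" here is a plain thread that signals a byte count instead of running epoll_wait and recv. It assumes a console app with no SynchronizationContext.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// Hypothetical IVTS-backed readiness signal (names are illustrative only).
sealed class ReadableSignal : IValueTaskSource<int>
{
    // Mutable struct, so the field must not be readonly.
    // RunContinuationsAsynchronously = true is the "RCA=true" switch.
    private ManualResetValueTaskSourceCore<int> _core =
        new ManualResetValueTaskSourceCore<int> { RunContinuationsAsynchronously = true };

    // Handler side: await until the worker signals.
    public ValueTask<int> WaitReadableAsync() => new ValueTask<int>(this, _core.Version);

    // Worker side: completes the pending wait; the continuation is queued, not inlined.
    public void SignalReadable(int bytes) => _core.SetResult(bytes);

    int IValueTaskSource<int>.GetResult(short token)
    {
        int result = _core.GetResult(token);
        _core.Reset(); // make the source reusable for the next wait
        return result;
    }

    ValueTaskSourceStatus IValueTaskSource<int>.GetStatus(short token) => _core.GetStatus(token);

    void IValueTaskSource<int>.OnCompleted(Action<object?> continuation, object? state,
        short token, ValueTaskSourceOnCompletedFlags flags)
        => _core.OnCompleted(continuation, state, token, flags);
}

static class RcaDemo
{
    // Returns the worker thread id, the thread id the handler resumed on,
    // and whether that thread belongs to the thread pool.
    public static (int Worker, int Handler, bool OnPool) Run()
    {
        var signal = new ReadableSignal();
        Task<(int, bool)> handler = HandleAsync(signal); // runs to the first await, then parks

        int workerId = 0;
        var worker = new Thread(() =>
        {
            workerId = Environment.CurrentManagedThreadId;
            signal.SignalReadable(128); // stand-in for "recv-drain produced 128 bytes"
        });
        worker.Start();
        worker.Join();

        var (handlerId, onPool) = handler.GetAwaiter().GetResult();
        return (workerId, handlerId, onPool);
    }

    private static async Task<(int, bool)> HandleAsync(ReadableSignal signal)
    {
        await signal.WaitReadableAsync();
        return (Environment.CurrentManagedThreadId, Thread.CurrentThread.IsThreadPoolThread);
    }
}
```

Under those assumptions, `RcaDemo.Run()` should report a handler thread that is a pool thread and differs from the worker thread, which is the property that keeps handler code off the I/O pump.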

Handler (Program.cs)

Hand-rolled over the recv buffer: request line, Content-Length and chunked bodies, keep-alive, pipelining (batched per drain), fragmented-read reassembly. Connection: close sends a FIN via shutdown(SHUT_WR).
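
As a rough sketch of the pipelining and fragmented-read handling described above, the scan below counts the complete requests in a receive buffer and reports how many bytes they occupy, leaving any trailing fragment for the next read. `PipelineScan` and `DrainComplete` are hypothetical names, not taken from Program.cs, and the sketch handles Content-Length bodies only (no chunked encoding, and it allocates, which a real hot path would avoid).

```csharp
#nullable enable
using System;
using System.Text;

static class PipelineScan
{
    private static readonly byte[] HeaderEnd = Encoding.ASCII.GetBytes("\r\n\r\n");

    // Returns the number of complete requests at the front of buf.
    // consumed is the total byte length of those requests; bytes past it
    // are a fragment that must wait for more data.
    public static int DrainComplete(ReadOnlySpan<byte> buf, out int consumed)
    {
        int count = 0;
        consumed = 0;
        while (true)
        {
            ReadOnlySpan<byte> rest = buf.Slice(consumed);
            int headEnd = rest.IndexOf(new ReadOnlySpan<byte>(HeaderEnd));
            if (headEnd < 0) break;                      // head is still fragmented
            int bodyLen = ContentLength(rest.Slice(0, headEnd));
            int total = headEnd + HeaderEnd.Length + bodyLen;
            if (rest.Length < total) break;              // body is still fragmented
            consumed += total;
            count++;
        }
        return count;
    }

    // Finds a Content-Length header in the head; returns 0 when absent.
    private static int ContentLength(ReadOnlySpan<byte> head)
    {
        string text = Encoding.ASCII.GetString(head);
        foreach (string line in text.Split("\r\n"))
        {
            int colon = line.IndexOf(':');
            if (colon <= 0) continue;                    // request line or malformed header
            string name = line.Substring(0, colon).Trim();
            if (name.Equals("Content-Length", StringComparison.OrdinalIgnoreCase))
                return int.Parse(line.Substring(colon + 1).Trim());
        }
        return 0;
    }
}
```

A handler built this way would respond to `count` requests in one batch, then move the unconsumed tail to the front of the buffer before the next read.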

| Endpoint | Response |
| --- | --- |
| GET/POST /baseline11?a=&b= | text/plain: a + b (+ POST body) |
| GET /pipeline | text/plain: ok |
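
For illustration, the /baseline11 response could be computed as below. This is a sketch under assumptions: `Baseline.Respond` is a hypothetical helper, and it assumes `a`, `b`, and the POST body are decimal integers that are summed, which is one reading of "a + b (+ POST body)".

```csharp
#nullable enable
using System;
using System.Globalization;

static class Baseline
{
    // query: the raw query string without the leading '?', e.g. "a=1&b=2".
    // body: the POST body text, or null for a GET.
    public static string Respond(string query, string? body)
    {
        long sum = 0;
        foreach (string pair in query.Split('&', StringSplitOptions.RemoveEmptyEntries))
        {
            int eq = pair.IndexOf('=');
            if (eq <= 0) continue;                       // skip malformed pairs
            string key = pair.Substring(0, eq);
            if (key == "a" || key == "b")
                sum += long.Parse(pair.Substring(eq + 1), CultureInfo.InvariantCulture);
        }
        if (!string.IsNullOrWhiteSpace(body))
            sum += long.Parse(body.Trim(), CultureInfo.InvariantCulture); // assumed numeric body
        return sum.ToString(CultureInfo.InvariantCulture);
    }
}
```

Under that reading, `Baseline.Respond("a=1&b=2", null)` returns `"3"` and `Baseline.Respond("a=1&b=2", "10")` returns `"13"`.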

One vendored fix

EPOLLIN is made one-shot (re-armed by the handler in ReadAsync). Because the handler runs off-worker (RCA=true), the original edge-triggered EPOLLIN let the worker recv-drain the connection buffer while the handler was still parsing it — a data race exposed by fragmented requests. One-shot serializes recv against the handler.
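
The serialization that one-shot arming provides can be modeled in managed code with a single flag. This models the discipline only: the engine itself relies on the kernel's EPOLLONESHOT behaviour (after an event is delivered, the descriptor stays disabled until it is re-armed with epoll_ctl), and `OneShotGate` is a hypothetical name, not a type from the PR.

```csharp
using System.Threading;

// A managed stand-in for one-shot arming; not real epoll.
sealed class OneShotGate
{
    // 1 = armed (the worker may drain the socket once),
    // 0 = disarmed (the handler owns the buffer).
    private int _armed = 1;

    // Worker side: returns true at most once per arming,
    // like a single EPOLLIN event under EPOLLONESHOT.
    public bool TryFire() => Interlocked.Exchange(ref _armed, 0) == 1;

    // Handler side: called when parsing is finished and more data is wanted
    // (the PR re-arms from ReadAsync).
    public void Rearm() => Volatile.Write(ref _armed, 1);
}
```

With this gate, a second `TryFire()` before `Rearm()` returns false, so a worker cannot drain into the buffer while the handler is still parsing it.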

Verification

  • validate.sh: 14/14, including every TCP-fragmentation case (split request line / headers / body bytes) and chunked.
  • 8000-request keep-alive load (8 conns × 1000, Content-Length reads): 0 drops.

epoll (not io_uring); seccomp isn't required, but validate.sh enables it unconditionally.

MDA2AV commented Jun 7, 2026

/benchmark -f shrike

github-actions Bot commented Jun 7, 2026

👋 /benchmark request received. A collaborator will review and approve the run.

github-actions Bot commented Jun 7, 2026

Benchmark Results

Framework: shrike | Test: all tests

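| Test | Conn | RPS | CPU | Mem | Δ RPS | Δ Mem |
| --- | --- | --- | --- | --- | --- | --- |
| baseline | 512 | 2,500,390 | 6615.2% | 88MiB | NEW | NEW |
| baseline | 4096 | 2,620,801 | 6492.7% | 110MiB | NEW | NEW |
| pipelined | 512 | 37,090,147 | 6636.6% | 54MiB | NEW | NEW |
| pipelined | 4096 | 34,745,632 | 6509.2% | 113MiB | NEW | NEW |
| limited-conn | 512 | 4,369 | 0.0% | 0MiB | NEW | NEW |
| limited-conn | 4096 | 60,786 | 0.0% | 0MiB | NEW | NEW |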
Full log
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency     78us     64us    129us    237us    483us

  24201 requests in 5.00s, 21846 responses
  Throughput: 4.37K req/s
  Bandwidth:  281.44KB/s
  Status codes: 2xx=21846, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 21825 / 21846 responses (99.9%)
  Reconnects: 4588796
  Errors: connect 4583749, read 0, timeout 0
  Per-template: 7310,7195,7320
  Per-template-ok: 7310,7195,7320
[info] CPU 0.0% | Mem 0MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     512 (8/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency      0us      0us      0us      0us      0us

  0 requests in 5.00s, 0 responses
  Throughput: 0 req/s
  Bandwidth:  0B/s
  Status codes: 2xx=0, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 0 / 0 responses (0.0%)
  Reconnects: 4994943
  Errors: connect 4994924, read 0, timeout 0
  Per-template: 0,0,0
  Per-template-ok: 0,0,0
[info] CPU 0.0% | Mem 0MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     512 (8/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency      0us      0us      0us      0us      0us

  0 requests in 5.00s, 0 responses
  Throughput: 0 req/s
  Bandwidth:  0B/s
  Status codes: 2xx=0, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 0 / 0 responses (0.0%)
  Reconnects: 5114213
  Errors: connect 5114188, read 0, timeout 0
  Per-template: 0,0,0
  Per-template-ok: 0,0,0
[info] CPU 0.0% | Mem 0MiB

=== Best: 4369 req/s (CPU: 0.0%, Mem: 0MiB) ===
[info] input BW: 345.59KB/s (avg template: 81 bytes)
[info] saved results/limited-conn/512/shrike.json
httparena-bench-shrike
httparena-bench-shrike

==============================================
=== shrike / limited-conn / 4096c (tool=gcannon) ===
==============================================
[info] waiting for server...
[info] server ready

[run 1/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   2.36ms    611us   3.29ms   68.80ms   115.50ms

  330055 requests in 5.00s, 303933 responses
  Throughput: 60.76K req/s
  Bandwidth:  3.82MB/s
  Status codes: 2xx=303933, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 303929 / 303933 responses (100.0%)
  Reconnects: 4225104
  Errors: connect 4165026, read 4, timeout 0
  Per-template: 101465,100612,101856
  Per-template-ok: 101465,100612,101856
[info] CPU 0.0% | Mem 0MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency      0us      0us      0us      0us      0us

  0 requests in 5.00s, 0 responses
  Throughput: 0 req/s
  Bandwidth:  0B/s
  Status codes: 2xx=0, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 0 / 0 responses (0.0%)
  Reconnects: 4909332
  Errors: connect 4909327, read 0, timeout 0
  Per-template: 0,0,0
  Per-template-ok: 0,0,0
[info] CPU 0.0% | Mem 0MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency      0us      0us      0us      0us      0us

  1 requests in 5.00s, 0 responses
  Throughput: 0 req/s
  Bandwidth:  16B/s
  Status codes: 2xx=0, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 0 / 0 responses (0.0%)
  Reconnects: 4896386
  Errors: connect 4896383, read 0, timeout 0
  Per-template: 0,0,0
  Per-template-ok: 0,0,0
[info] CPU 0.0% | Mem 0MiB

=== Best: 60786 req/s (CPU: 0.0%, Mem: 0MiB) ===
[info] input BW: 4.70MB/s (avg template: 81 bytes)
[info] saved results/limited-conn/4096/shrike.json
httparena-bench-shrike
httparena-bench-shrike
[info] skip: shrike does not subscribe to json
[info] skip: shrike does not subscribe to json-comp
[info] skip: shrike does not subscribe to json-tls
[info] skip: shrike does not subscribe to upload
[info] skip: shrike does not subscribe to api-4
[info] skip: shrike does not subscribe to api-16
[info] skip: shrike does not subscribe to static
[info] skip: shrike does not subscribe to async-db
[info] skip: shrike does not subscribe to crud
[info] skip: shrike does not subscribe to fortunes
[info] skip: shrike does not subscribe to baseline-h2
[info] skip: shrike does not subscribe to static-h2
[info] skip: shrike does not subscribe to baseline-h2c
[info] skip: shrike does not subscribe to json-h2c
[info] skip: shrike does not subscribe to baseline-h3
[info] skip: shrike does not subscribe to static-h3
[info] skip: shrike does not subscribe to gateway-64
[info] skip: shrike does not subscribe to gateway-h3
[info] skip: shrike does not subscribe to production-stack
[info] skip: shrike does not subscribe to unary-grpc
[info] skip: shrike does not subscribe to unary-grpc-tls
[info] skip: shrike does not subscribe to stream-grpc
[info] skip: shrike does not subscribe to stream-grpc-tls
[info] skip: shrike does not subscribe to echo-ws
[info] skip: shrike does not subscribe to echo-ws-pipeline
[info] rebuilding site/data/*.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/frameworks.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/current.json
[info] done
[info] restoring loopback MTU to 65536

A C# epoll engine with an IVTS-backed, RCA=true async handler, fixed the
Tokio way: the worker is a pure readiness notifier (epoll -> SignalReadable;
it never touches the socket) and the handler does its own recv() on the
thread pool (Connection.DoRecv). Only the handler touches the recv buffer,
so there is no driver/handler race — the principled fix for the data race
the worker-recv-then-handoff design had under RCA=true.

Mirrors Tokio/mio (reactor flips readiness + wakes the task; the task reads
on its worker thread). Serves baseline, pipelined, limited-conn; hand-rolled
HTTP/1.1 (CL + chunked bodies, keep-alive, pipelining, fragmented reads);
Connection: close sends a FIN. Validated 14/14 (every TCP-fragmentation
case), 0/100 fragmented POST, 8000 keep-alive with zero drops.
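
The readiness-notifier design in this commit message can be sketched with ordinary .NET sockets. This is only an illustration: `ReadinessDemo` is a hypothetical name, `Socket.Poll` stands in for the epoll worker, and a `TaskCompletionSource` created with `RunContinuationsAsynchronously` stands in for the IVTS-backed signal. The point it shows is that the notifier thread never calls recv; only the handler does.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

static class ReadinessDemo
{
    public static async Task<string> RunAsync()
    {
        // Build a connected loopback pair: client <-> server.
        using var listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        listener.Listen(1);
        using var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        client.Connect(listener.LocalEndPoint!);
        using Socket server = listener.Accept();

        var readable = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        // "Worker": waits for readiness and wakes the handler.
        // It never reads from the socket.
        var worker = new Thread(() =>
        {
            server.Poll(-1, SelectMode.SelectRead); // negative timeout = wait indefinitely
            readable.SetResult(true);
        });
        worker.Start();

        client.Send(Encoding.ASCII.GetBytes("GET /pipeline HTTP/1.1\r\n\r\n"));

        // "Handler": resumes off the worker thread and does its own recv,
        // so it is the only code that touches the receive buffer.
        await readable.Task;
        var buf = new byte[256];
        int n = server.Receive(buf);
        worker.Join();
        return Encoding.ASCII.GetString(buf, 0, n);
    }
}
```

Because the buffer has a single owner, no ordering between worker and handler needs to be enforced beyond the wake-up itself.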

MDA2AV commented Jun 7, 2026

/benchmark -f shrike

github-actions Bot commented Jun 7, 2026

👋 /benchmark request received. A collaborator will review and approve the run.

MDA2AV commented Jun 7, 2026

/benchmark -f shrike-tokio

github-actions Bot commented Jun 7, 2026

👋 /benchmark request received. A collaborator will review and approve the run.

github-actions Bot commented Jun 7, 2026

Benchmark Results

Framework: shrike-tokio | Test: all tests

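| Test | Conn | RPS | CPU | Mem | Δ RPS | Δ Mem |
| --- | --- | --- | --- | --- | --- | --- |
| baseline | 512 | 2,449,996 | 6460.2% | 89MiB | NEW | NEW |
| baseline | 4096 | 2,702,650 | 6637.8% | 118MiB | NEW | NEW |
| pipelined | 512 | 35,509,203 | 6655.0% | 58MiB | NEW | NEW |
| pipelined | 4096 | 36,019,414 | 6573.6% | 123MiB | NEW | NEW |
| limited-conn | 512 | 13,621 | 0.0% | 0MiB | NEW | NEW |
| limited-conn | 4096 | 373,379 | 35.8% | 160MiB | NEW | NEW |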
Full log
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency    282us    199us    540us   1.36ms   2.33ms

  73242 requests in 5.00s, 68109 responses
  Throughput: 13.62K req/s
  Bandwidth:  877.71KB/s
  Status codes: 2xx=68109, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 68073 / 68109 responses (99.9%)
  Reconnects: 4726371
  Errors: connect 4714044, read 0, timeout 0
  Per-template: 22864,22554,22655
  Per-template-ok: 22864,22554,22655
[info] CPU 0.0% | Mem 0MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     512 (8/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency      0us      0us      0us      0us      0us

  0 requests in 5.00s, 0 responses
  Throughput: 0 req/s
  Bandwidth:  0B/s
  Status codes: 2xx=0, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 0 / 0 responses (0.0%)
  Reconnects: 5207324
  Errors: connect 5207304, read 0, timeout 0
  Per-template: 0,0,0
  Per-template-ok: 0,0,0
[info] CPU 0.0% | Mem 0MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     512 (8/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency      0us      0us      0us      0us      0us

  0 requests in 5.00s, 0 responses
  Throughput: 0 req/s
  Bandwidth:  0B/s
  Status codes: 2xx=0, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 0 / 0 responses (0.0%)
  Reconnects: 5392237
  Errors: connect 5392215, read 0, timeout 0
  Per-template: 0,0,0
  Per-template-ok: 0,0,0
[info] CPU 0.0% | Mem 0MiB

=== Best: 13621 req/s (CPU: 0.0%, Mem: 0MiB) ===
[info] input BW: 1.05MB/s (avg template: 81 bytes)
[info] saved results/limited-conn/512/shrike-tokio.json
httparena-bench-shrike-tokio
httparena-bench-shrike-tokio

==============================================
=== shrike-tokio / limited-conn / 4096c (tool=gcannon) ===
==============================================
[info] waiting for server...
[info] server ready

[run 1/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency   2.89ms   2.53ms   3.79ms   10.70ms   74.90ms

  2037836 requests in 5.00s, 1866895 responses
  Throughput: 373.20K req/s
  Bandwidth:  23.49MB/s
  Status codes: 2xx=1866895, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 1866895 / 1866895 responses (100.0%)
  Reconnects: 3642429
  Errors: connect 3296930, read 0, timeout 0
  Per-template: 623069,619851,623975
  Per-template-ok: 623069,619851,623975
[info] CPU 35.8% | Mem 160MiB

[run 2/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency      0us      0us      0us      0us      0us

  1 requests in 5.00s, 0 responses
  Throughput: 0 req/s
  Bandwidth:  12B/s
  Status codes: 2xx=0, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 0 / 0 responses (0.0%)
  Reconnects: 4630693
  Errors: connect 4630685, read 0, timeout 0
  Per-template: 0,0,0
  Per-template-ok: 0,0,0
[info] CPU 0.0% | Mem 0MiB

[run 3/3]
gcannon v0.5.3
  Target:    localhost:8080/
  Threads:   64
  Conns:     4096 (64/thread)
  Pipeline:  1
  Req/conn:  10
  Templates: 3
  Expected:  200
  Duration:  5s


  Thread Stats   Avg      p50      p90      p99    p99.9
    Latency      0us      0us      0us      0us      0us

  0 requests in 5.00s, 0 responses
  Throughput: 0 req/s
  Bandwidth:  0B/s
  Status codes: 2xx=0, 3xx=0, 4xx=0, 5xx=0
  Latency samples: 0 / 0 responses (0.0%)
  Reconnects: 5195455
  Errors: connect 5195432, read 0, timeout 0
  Per-template: 0,0,0
  Per-template-ok: 0,0,0
[info] CPU 0.0% | Mem 0MiB

=== Best: 373379 req/s (CPU: 35.8%, Mem: 160MiB) ===
[info] input BW: 28.84MB/s (avg template: 81 bytes)
[info] saved results/limited-conn/4096/shrike-tokio.json
httparena-bench-shrike-tokio
httparena-bench-shrike-tokio
[info] skip: shrike-tokio does not subscribe to json
[info] skip: shrike-tokio does not subscribe to json-comp
[info] skip: shrike-tokio does not subscribe to json-tls
[info] skip: shrike-tokio does not subscribe to upload
[info] skip: shrike-tokio does not subscribe to api-4
[info] skip: shrike-tokio does not subscribe to api-16
[info] skip: shrike-tokio does not subscribe to static
[info] skip: shrike-tokio does not subscribe to async-db
[info] skip: shrike-tokio does not subscribe to crud
[info] skip: shrike-tokio does not subscribe to fortunes
[info] skip: shrike-tokio does not subscribe to baseline-h2
[info] skip: shrike-tokio does not subscribe to static-h2
[info] skip: shrike-tokio does not subscribe to baseline-h2c
[info] skip: shrike-tokio does not subscribe to json-h2c
[info] skip: shrike-tokio does not subscribe to baseline-h3
[info] skip: shrike-tokio does not subscribe to static-h3
[info] skip: shrike-tokio does not subscribe to gateway-64
[info] skip: shrike-tokio does not subscribe to gateway-h3
[info] skip: shrike-tokio does not subscribe to production-stack
[info] skip: shrike-tokio does not subscribe to unary-grpc
[info] skip: shrike-tokio does not subscribe to unary-grpc-tls
[info] skip: shrike-tokio does not subscribe to stream-grpc
[info] skip: shrike-tokio does not subscribe to stream-grpc-tls
[info] skip: shrike-tokio does not subscribe to echo-ws
[info] skip: shrike-tokio does not subscribe to echo-ws-pipeline
[info] rebuilding site/data/*.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/frameworks.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/baseline-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/limited-conn-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-4096.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/pipelined-512.json
[updated] /home/diogo/actions-runner/_work/HttpArena/HttpArena/site/data/current.json
[info] done
[info] restoring loopback MTU to 65536
