fix(dataplane): handle IPv4 fragments like Go decode4 + filter pre() #253
The inbound packet filter had no IP fragment awareness. etherparse leaves the transport header `None` for any fragment (more-fragments set or a non-zero offset), so the dataplane read dst_port = 0 for it. A normal ACL rule (e.g. tcp/443) never contains port 0, so EVERY fragment was silently dropped. That breaks inbound traffic that arrives fragmented (large UDP such as DNS-over-UDP over MTU or QUIC, or any TCP/UDP fragmented by a low-MTU path), which is realistic on the 1280-MTU overlay, and diverges from Go, which passes valid later fragments through.

Mirror Go net/packet.decode4 + wgengine/filter pre() on the IPv4 path: read the fragment offset + more-fragments flag from the base header and classify before the ACL:

- a non-first fragment at offset >= MIN_FRAG_BLKS (10 blocks = 80 bytes, Go minFragBlks) is ACCEPTED ahead of the ACL. Go maps it to ipproto.Fragment, which pre() admits. Stateless pass-through: the receiver's kernel discards it on reassembly timeout if the head fragment was filtered;
- a non-first fragment at a smaller offset is DROPPED: it could overlap a transport header (RFC 1858), which Go demotes to unknown and drops;
- a first fragment (offset 0) defers to the normal proto-switch/ACL on its real parsed port, except that a fragmented TSMP (offset 0 with MF set) is dropped, since without the whole message it can't be a valid inter-node control packet (Go disallows it).

No wrongful-accept: the classic fragment bypass (a later fragment evading L4 port matching) is not introduced. A fragment is admitted only at a Go-permitted offset, exactly as Go's pre() does. IPv6 fragment extension headers are out of scope (the tailnet is IPv4-only by default; a v6 fragment can't reach this path). Found by an adversarial packetfilter audit (tsr-u5mw).

Tests: a valid later fragment is accepted under a deny-all ACL (proving the pre() pass-through, not the ACL); a low-offset fragment is dropped; a first fragment defers to the ACL; a fragmented TSMP is dropped while a non-fragmented TSMP still bypasses the ACL.

Signed-off-by: Sergio <sergio@geiser.cloud>
Per review (PR #253): lock the branch ordering. A TSMP fragment at offset >= MIN_FRAG_BLKS is accepted via the offset-based fragment pass-through, NOT dropped by the fragmented-TSMP rule (which is offset-0 only). This proves the later-fragment branch is proto-independent and wins over the TSMP-specific logic, matching Go, which maps any offset >= minFragBlks to ipproto.Fragment regardless of the L4 proto byte.

Signed-off-by: Sergio <sergio@geiser.cloud>
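The branch ordering locked in by this commit can be illustrated with a minimal sketch. This is not the code from the PR: `FragVerdict`, `classify_fragment`, and the `PROTO_TSMP` constant are hypothetical names chosen for illustration, and the sketch assumes the fragment offset and MF flag have already been read from the IPv4 header.

```rust
/// Go's minFragBlks: (60 + 20) / 8 = 10 eight-byte blocks, i.e. 80 bytes.
const MIN_FRAG_BLKS: u16 = (60 + 20) / 8;

/// Assumed IP protocol number for TSMP (illustrative constant, not taken from the PR).
const PROTO_TSMP: u8 = 99;

/// Hypothetical verdict type for the pre-ACL fragment check.
#[derive(Debug, PartialEq, Eq)]
enum FragVerdict {
    /// Later fragment admitted ahead of the ACL.
    AcceptFragment,
    /// Dropped before the ACL is consulted.
    Drop,
    /// Nothing decided here; continue with the normal proto-switch/ACL path.
    NotDecided,
}

/// Classify a packet by its fragment fields before any per-protocol logic.
/// `offset_blks` is the fragment offset in 8-byte blocks.
fn classify_fragment(proto: u8, offset_blks: u16, more_fragments: bool) -> FragVerdict {
    // Branch 1: non-first fragments are decided purely by offset,
    // without looking at the L4 protocol byte.
    if offset_blks > 0 {
        return if offset_blks >= MIN_FRAG_BLKS {
            FragVerdict::AcceptFragment
        } else {
            // A small offset could overlap a transport header (RFC 1858).
            FragVerdict::Drop
        };
    }
    // Branch 2: only reached at offset 0. A first fragment of a TSMP
    // message (MF set) cannot be a complete control packet.
    if more_fragments && proto == PROTO_TSMP {
        return FragVerdict::Drop;
    }
    FragVerdict::NotDecided
}
```

Because the offset check runs first, a TSMP fragment at offset 10 takes the pass-through branch and never reaches the TSMP-specific rule, which is the ordering the added test is meant to pin down.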
Release 0.39.0. Bundles the parity + anti-leak batch: magicsock CallMeMaybe immediate-ping (#246), peerless-STUN-stop (#247), STUN SOFTWARE+FINGERPRINT (#241); MagicDNS RD/RA+compression (#242) + SERVFAIL-not-NXDOMAIN (#248); DERP send rate-limit (#249); IPv4 fragment handling (#253); control Hostinfo/ProtoPortRange/NetInfo wire fixes (#244/#245); netcheck StunProber deletion (#250); forwarder subnet-SSRF doc+test (#252); control panic-hardening (#254).
What
The inbound packet filter had no IP fragment awareness. `etherparse` leaves the transport header `None` for any fragment (MF set or non-zero offset), so the dataplane read `dst_port = 0`. A normal ACL rule (e.g. `tcp/443`) never contains port 0, so every fragment was silently dropped. That breaks inbound traffic arriving fragmented (large UDP such as DNS-over-UDP over MTU or QUIC, or any TCP/UDP fragmented by a low-MTU path), which is realistic on the 1280-MTU overlay, and diverges from Go, which passes valid later fragments through. Found by an adversarial packetfilter audit (tsr-u5mw).

How: mirror Go `net/packet.decode4` + `wgengine/filter.pre()`
On the IPv4 inbound path, read the fragment offset + more-fragments flag from the base header (`b[6:8]`) and classify before the ACL:
- Non-first fragment at offset >= `MIN_FRAG_BLKS` (10 blocks = 80 bytes, Go `minFragBlks = (60+20)/8`) → ACCEPT ahead of the ACL. Go maps it to `ipproto.Fragment`, which `pre()` admits. Stateless pass-through: the receiver's kernel discards it on reassembly timeout if the head fragment was filtered.
- Non-first fragment at offset < `MIN_FRAG_BLKS` → DROP. Could overlap a transport header (RFC 1858); Go demotes it to `unknown` and drops.
- First fragment (offset 0) → defers to the normal proto-switch/ACL on its real parsed port, except that a fragmented TSMP (offset 0 with MF set) is dropped, since Go disallows it.

Not a bypass
The classic fragment bypass (a later fragment evading L4 port matching) is not introduced. A fragment is admitted only at a Go-permitted offset, exactly as Go's `pre()` does. The change makes the filter less restrictive only for the valid-later-fragment case Go already accepts; everything else stays dropped. IPv6 fragment extension headers are out of scope (IPv4-only tailnet; a v6 fragment can't reach this path).

Tests
`ipv4_fragment_handling_matches_go_decode4`: valid later fragment accepted under a deny-all ACL (proves it's the `pre()` pass-through, not the ACL); low-offset fragment dropped (RFC 1858); first fragment defers to the ACL on its port; fragmented TSMP dropped while non-fragmented TSMP still bypasses the ACL.

Local gates: `cargo test -p geiserx_ts_dataplane` (10) + `ts_runtime` (328), `clippy -D warnings` (0), `fmt`, `cargo run -p checks` (anti-leak guard) all green.

Created using Claude Code (Opus 4.8)
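For readers unfamiliar with the header layout, the read of `b[6:8]` described under "How" might look roughly like the sketch below. The function names `ipv4_frag_fields` and `frag_offset_bytes` are hypothetical and do not come from the PR; the bit layout (3 flag bits followed by a 13-bit offset counted in 8-byte blocks) is the standard IPv4 one.

```rust
/// Mask for the more-fragments (MF) bit in the 16-bit flags/offset word.
const IPV4_MF: u16 = 0x2000;
/// Mask for the 13-bit fragment offset (counted in 8-byte blocks).
const IPV4_FRAG_OFFSET_MASK: u16 = 0x1FFF;

/// Read (fragment offset in 8-byte blocks, more-fragments flag) from the
/// base IPv4 header. Returns None if `b` is shorter than the fixed
/// 20-byte header.
fn ipv4_frag_fields(b: &[u8]) -> Option<(u16, bool)> {
    if b.len() < 20 {
        return None;
    }
    // Bytes 6 and 7 hold the flags and fragment offset in network byte order.
    let word = u16::from_be_bytes([b[6], b[7]]);
    let more_fragments = (word & IPV4_MF) != 0;
    let offset_blks = word & IPV4_FRAG_OFFSET_MASK;
    Some((offset_blks, more_fragments))
}

/// Convert an offset in 8-byte blocks to bytes (max 8191 * 8 = 65528).
fn frag_offset_bytes(offset_blks: u16) -> u32 {
    u32::from(offset_blks) * 8
}
```

A packet counts as a fragment when either returned field is non-zero or set, which corresponds to the condition under which `etherparse` leaves the transport header as `None`. An offset of 10 blocks corresponds to the 80-byte threshold quoted for `MIN_FRAG_BLKS`.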