Skip to content

tun: coalesce via scatter-gather writes#52

Open
illotum wants to merge 2 commits into
tailscalefrom
illotum/scatter-gather-tun-writes
Open

tun: coalesce via scatter-gather writes#52
illotum wants to merge 2 commits into
tailscalefrom
illotum/scatter-gather-tun-writes

Conversation

@illotum
Copy link
Copy Markdown

@illotum illotum commented Apr 2, 2026

On Linux NativeTun will now write a vector of coalesced buffers via writev(2) instead of copying fragments into a single buffer:

Updates tailscale/corp#36989

This comment was marked as outdated.

@illotum illotum force-pushed the illotum/scatter-gather-tun-writes branch 6 times, most recently from 3ecf72e to 8ddcb0d Compare April 8, 2026 04:24
@illotum illotum requested a review from Copilot April 8, 2026 05:18
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tun/tun_linux.go Outdated
Comment thread tun/tun_linux.go
Comment thread tun/tun_linux.go
@illotum illotum force-pushed the illotum/scatter-gather-tun-writes branch 2 times, most recently from b3775d9 to fe0015f Compare April 8, 2026 17:09
@illotum illotum marked this pull request as ready for review April 8, 2026 17:20
@illotum illotum force-pushed the illotum/scatter-gather-tun-writes branch from fe0015f to 0c2c411 Compare April 8, 2026 17:20
@illotum illotum requested a review from jwhited April 8, 2026 17:21
@illotum illotum force-pushed the illotum/scatter-gather-tun-writes branch from 0c2c411 to 4a423c5 Compare April 13, 2026 02:16
On Linux NativeTun will write a vector of
coalesced buffers via writev(2) instead of copying
fragments into a single buffer.

Updates tailscale/corp#36989

Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
Change-Id: Id1e9cd3cc9435c240b7a28ae6528cd7e6a6a6964
@illotum illotum force-pushed the illotum/scatter-gather-tun-writes branch from 4a423c5 to 407fbf5 Compare April 13, 2026 02:18
Comment thread tun/offload_linux.go
Copy link
Copy Markdown
Member

@jwhited jwhited left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits. This is a great improvement over the previous memcpy, and simplifies things for us in the future.

Comment thread tun/offload_linux.go
Comment on lines +78 to +82
virtioNetHdrIdx = 0
emptyVectorLen = 1
headPacketIdx = 1
singlePacketLen = 2
coalescedPacketsIdx = 2
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a common prefix on these constants? Perhaps iovsVirtioNetHdrIndex... or similar.

Comment thread tun/offload_linux.go
Comment on lines +78 to +82
virtioNetHdrIdx = 0
emptyVectorLen = 1
headPacketIdx = 1
singlePacketLen = 2
coalescedPacketsIdx = 2
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one took me a moment to understand: iovsCoalescedPacketsStartingIdx?

Comment thread tun/offload_linux.go
Comment on lines +91 to +93
// - iovs[i][0] is the pre-allocated virtio header
// - iovs[i][1] is the head packet with transport headers
// - iovs[i][2:] are coalesced payload fragments
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we clarify iovs[i] len? i.e. is len(iovs[i]) always at least 1? What can the reader assume about these lengths, if anything.

Comment thread tun/offload_linux.go
// Returns the index and a pointer to the new item's [][]byte.
//
// Do not use the returned pointer after the next call to next.
func (w *groToWrite) next() (int, *[][]byte) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All 3 callsites are doing the same thing, can we simplify this to something like:

  func (w *groToWrite) appendIov(pkt []byte) int {
      n := len(w.iovs)
      if n < w.allocated {
          w.iovs = w.iovs[:n+1]
      } else {
          iov := make([][]byte, 1, conn.IdealBatchSize)
          iov[virtioNetHdrIdx] = make([]byte, virtioNetHdrLen)
          w.iovs = append(w.iovs, iov)
          w.allocated++
      }
      w.iovs[n] = append(w.iovs[n], pkt)
      return n
  }

That way we don't have to deal with passing a pointer and worrying about its lifetime / slice headers.

Comment thread tun/offload_linux.go
func udpPacketsCanCoalesce(pkt []byte, iphLen uint8, gsoSize uint16, item udpGROItem, bufs [][]byte, bufsOffset int) canCoalesce {
pktTarget := bufs[item.bufsIndex][bufsOffset:]
// described by item. iphLen and gsoSize describe pkt.
func udpPacketsCanCoalesce(pkt []byte, iphLen uint8, gsoSize uint16, item udpGROItem, wi *groToWrite) canCoalesce {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iphLen is unused

Comment thread tun/offload_linux.go
// We cannot have a larger packet following a smaller one.
return coalesceUnavailable
}
if int(item.iphLen)+udphLen+int(item.payloadLen)+int(gsoSize) > maxUint16 {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this used to be implicit cap() and would have been easy to miss.

Comment thread tun/offload_linux.go
if len(wi.iovs[item.outputIdx]) >= maxScatterGatherFragments {
return coalesceUnavailable
}
pktTarget := wi.iovs[item.outputIdx][headPacketIdx]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as udp func, headPacket?

Comment thread tun/offload_linux_test.go
t.Errorf("duplicate toWrite value: %d", writeI)
}
seenWriteI[writeI] = true
wi := newGROToWrite()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe we have automated fuzzing setup. Did you fuzz handleGRO?

Comment thread tun/offload_linux_test.go
}
}

func Test_tcpPacketsCanCoalesce(t *testing.T) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, increasing coverage.

Comment thread tun/tun_linux.go
err := tun.tunRawConn.Write(func(fd uintptr) bool {
for {
n, werr = unix.Writev(int(fd), nb)
if werr == syscall.EINTR {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps unix.Write() returns these in a way that exact comparison is valid, but shouldn't we be using errors.Is() here and below for consistency?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants