tun: coalesce via scatter-gather writes#52
Conversation
3ecf72e to
8ddcb0d
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
b3775d9 to
fe0015f
Compare
fe0015f to
0c2c411
Compare
0c2c411 to
4a423c5
Compare
On Linux NativeTun will write a vector of coalesced buffers via writev(2) instead of copying fragments into a single buffer. Updates tailscale/corp#36989 Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com> Change-Id: Id1e9cd3cc9435c240b7a28ae6528cd7e6a6a6964
4a423c5 to
407fbf5
Compare
jwhited
left a comment
There was a problem hiding this comment.
Mostly nits. This is a great improvement over the previous memcpy, and simplifies things for us in the future.
| virtioNetHdrIdx = 0 | ||
| emptyVectorLen = 1 | ||
| headPacketIdx = 1 | ||
| singlePacketLen = 2 | ||
| coalescedPacketsIdx = 2 |
There was a problem hiding this comment.
Can we use a common prefix on these constants? Perhaps iovsVirtioNetHdrIndex... or similar.
| virtioNetHdrIdx = 0 | ||
| emptyVectorLen = 1 | ||
| headPacketIdx = 1 | ||
| singlePacketLen = 2 | ||
| coalescedPacketsIdx = 2 |
There was a problem hiding this comment.
This one took me a moment to understand: iovsCoalescedPacketsStartingIdx?
| // - iovs[i][0] is the pre-allocated virtio header | ||
| // - iovs[i][1] is the head packet with transport headers | ||
| // - iovs[i][2:] are coalesced payload fragments |
There was a problem hiding this comment.
Can we clarify iovs[i] len? i.e. is len(iovs[i]) always at least 1? What can the reader assume about these lengths, if anything.
| // Returns the index and a pointer to the new item's [][]byte. | ||
| // | ||
| // Do not use the returned pointer after the next call to next. | ||
| func (w *groToWrite) next() (int, *[][]byte) { |
There was a problem hiding this comment.
All 3 callsites are doing the same thing, can we simplify this to something like:
func (w *groToWrite) appendIov(pkt []byte) int {
n := len(w.iovs)
if n < w.allocated {
w.iovs = w.iovs[:n+1]
} else {
iov := make([][]byte, 1, conn.IdealBatchSize)
iov[virtioNetHdrIdx] = make([]byte, virtioNetHdrLen)
w.iovs = append(w.iovs, iov)
w.allocated++
}
w.iovs[n] = append(w.iovs[n], pkt)
return n
}That way we don't have to deal with passing a pointer and worrying about its lifetime / slice headers.
| func udpPacketsCanCoalesce(pkt []byte, iphLen uint8, gsoSize uint16, item udpGROItem, bufs [][]byte, bufsOffset int) canCoalesce { | ||
| pktTarget := bufs[item.bufsIndex][bufsOffset:] | ||
| // described by item. iphLen and gsoSize describe pkt. | ||
| func udpPacketsCanCoalesce(pkt []byte, iphLen uint8, gsoSize uint16, item udpGROItem, wi *groToWrite) canCoalesce { |
| // We cannot have a larger packet following a smaller one. | ||
| return coalesceUnavailable | ||
| } | ||
| if int(item.iphLen)+udphLen+int(item.payloadLen)+int(gsoSize) > maxUint16 { |
There was a problem hiding this comment.
Nice, this used to be implicit cap() and would have been easy to miss.
| if len(wi.iovs[item.outputIdx]) >= maxScatterGatherFragments { | ||
| return coalesceUnavailable | ||
| } | ||
| pktTarget := wi.iovs[item.outputIdx][headPacketIdx] |
| t.Errorf("duplicate toWrite value: %d", writeI) | ||
| } | ||
| seenWriteI[writeI] = true | ||
| wi := newGROToWrite() |
There was a problem hiding this comment.
I don't believe we have automated fuzzing setup. Did you fuzz handleGRO?
| } | ||
| } | ||
|
|
||
| func Test_tcpPacketsCanCoalesce(t *testing.T) { |
| err := tun.tunRawConn.Write(func(fd uintptr) bool { | ||
| for { | ||
| n, werr = unix.Writev(int(fd), nb) | ||
| if werr == syscall.EINTR { |
There was a problem hiding this comment.
Perhaps unix.Write() returns these in a way that exact comparison is valid, but shouldn't we be using errors.Is() here and below for consistency?
On Linux NativeTun will now write a vector of coalesced buffers via writev(2) instead of copying fragments into a single buffer:
Updates tailscale/corp#36989