While profiling goffi, I noticed that it consistently makes one heap allocation per FFI call.
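This occurs in internal/syscall/syscall_unix_*.go because runtime_cgocall is imported via //go:linkname but lacks the //go:noescape compiler directive. The declaration has no Go body for escape analysis to inspect, so the compiler has to assume that the pointer passed to it may be retained. As a result, the local args := syscallArgs{...} struct escapes to the heap every time CallNFloat is executed.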
Adding //go:noescape directly above the runtime_cgocall declaration in each architecture-specific file should prevent this escape, keep args on the stack, and make FFI calls allocation-free.