JIT: Secondary frame pointer for x64 unoptimized methods #128795

Draft
AndyAyersMS wants to merge 4 commits into dotnet:main from AndyAyersMS:SecondaryFramePointer

@AndyAyersMS

No description provided.

AndyAyersMS and others added 4 commits May 28, 2026 17:55
On x64, addressing modes only encode a signed 8-bit displacement cheaply
(disp8, -128..+127); larger offsets require a 4-byte disp32. Methods with
large stack frames therefore emit many oversized stack-local references.

This adds an optional secondary frame/stack pointer, reserved in a
callee-saved register (RBX), offset by a configurable number of bytes from
the primary base. Stack locals that fall outside disp8 range of the primary
base but inside disp8 range of the secondary pointer are rewritten to use
the secondary pointer, shrinking those references from disp32 to disp8.
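
As a rough illustration of the band test described above, the following sketch models the choice between the primary base and the secondary pointer. It is not the JIT's code: `FitsDisp8`, `DispBytes`, `ChooseBase`, and `StackRef` are hypothetical names, and `secondaryDelta` is assumed to be the secondary pointer's value minus the primary base's value (for example -0x100 when rbx = rbp - 0x100, or +0x100 when rbx = rsp + 0x100).

```cpp
#include <cassert>
#include <cstdint>

// True when a displacement fits the 1-byte signed form (disp8, -128..+127).
bool FitsDisp8(int32_t disp)
{
    return disp >= -128 && disp <= 127;
}

// Displacement bytes for a [base+disp] reference: 1 for disp8, 4 for disp32.
// Simplification: the zero-displacement form is ignored.
int DispBytes(int32_t disp)
{
    return FitsDisp8(disp) ? 1 : 4;
}

// Hypothetical result type: which base to use and the displacement to encode.
struct StackRef
{
    bool    usesSecondary; // true: encode as [rbx+disp]; false: keep the primary base
    int32_t disp;          // displacement that would be encoded
};

// Picks the secondary pointer only when the primary needs disp32 and the
// secondary reaches the same address with disp8.
StackRef ChooseBase(int32_t primaryDisp, int32_t secondaryDelta)
{
    assert(secondaryDelta != 0); // a zero delta would make the secondary pointer useless
    if (FitsDisp8(primaryDisp))
    {
        return {false, primaryDisp};
    }
    int32_t secondaryDisp = primaryDisp - secondaryDelta;
    if (FitsDisp8(secondaryDisp))
    {
        return {true, secondaryDisp};
    }
    return {false, primaryDisp};
}
```

Under these assumptions, `ChooseBase(-0x88, -0x100)` selects the secondary pointer with displacement 0x78, so the access needs 1 displacement byte instead of 4.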

Gated behind the default-off config JitSecondFramePtr (the byte offset;
0x100 = 256) and restricted to OptimizationDisabled() (MinOpts/Tier0) on
x64 only. LSRA reserves the register; codegen sets it up in the prolog
after unwindEndProlog(); emit rewrites eligible SV refs to [rbx+disp8].

EH/funclet support: EH methods always use RBP frames on x64. The secondary
pointer is re-established (lea rbx,[rbp-offset]) only in filter funclet
prologs, since the VM's CallEHFilterFunclet restores only RBP. Catch/
finally/fault funclets need no re-establishment because CallEHFunclet
restores all nonvolatiles (including RBX) from the establisher context.
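
The funclet rule above can be summarized in a small sketch. This only models the decision as the commit message states it; `FuncletKind`, `NeedsSecondaryReestablish`, and `FuncletPrologFixup` are hypothetical names, and the returned string is just a textual stand-in for the emitted instruction.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical classification of x64 funclets for this illustration.
enum class FuncletKind { Filter, Catch, Finally, Fault };

// Per the description, only filter funclets are entered with RBX not restored,
// because the filter call path restores only RBP.
bool NeedsSecondaryReestablish(FuncletKind kind)
{
    return kind == FuncletKind::Filter;
}

// Returns the text of the prolog fixup for a funclet, or an empty string when
// no fixup is needed. 'offset' is the configured secondary-pointer offset.
std::string FuncletPrologFixup(FuncletKind kind, unsigned offset)
{
    assert(offset != 0); // an offset of 0 means the feature is off
    if (!NeedsSecondaryReestablish(kind))
    {
        return std::string();
    }
    char buf[64];
    std::snprintf(buf, sizeof(buf), "lea rbx, [rbp-0x%X]", offset);
    return buf;
}
```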

SuperPMI asmdiffs across all x64 collections (JitSecondFramePtr=0x100):
overall -7,353,695 bytes, 100% in MinOpts, FullOpts unchanged (0 diffs).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When an access is redirected through the secondary frame pointer
(REG_OPT_RSVD2/RBX) the emitted bytes use [rbx+disp8], but emitDispFrameRef
previously still printed the canonical [rbp/rsp+disp], so the listing did not
match the encoded instruction. Print the actual [rbx+disp8] operand and append
the canonical reference as a parenthesized suffix, e.g.

    mov qword ptr [rbx+0x78] (rbp-0x88), rax

Plumb the instruction through emitDispFrameRef so it can reuse
emitIsSecondFramePtrCandidate (the same decision emitOutputSV makes), keeping
display and emitted bytes in lockstep. Display-only change; no codegen impact.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a stack access is redirected through the secondary frame pointer
(REG_OPT_RSVD2), the disassembly now prints the real [rbx+disp8] operand
and emits the canonical frame reference (e.g. rbp-0x88) as an end-of-line
';' comment, rather than an inline parenthetical that could be misread as
sitting between operands.
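
A minimal sketch of the listing format described above, assuming the operand is printed first and the canonical reference follows as a trailing ';' comment. `FormatFrameRef` and `FormatRedirectedStore` are hypothetical helpers, not the emitter's functions, and the exact spacing of the real listing may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a base register plus signed displacement, e.g. "rbp-0x88" or "rbx+0x78".
std::string FormatFrameRef(const char* base, int32_t disp)
{
    assert(base != nullptr);
    // Take the magnitude in unsigned arithmetic so INT32_MIN does not overflow.
    unsigned magnitude = (disp < 0) ? 0u - static_cast<unsigned>(disp) : static_cast<unsigned>(disp);
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%s%c0x%X", base, disp < 0 ? '-' : '+', magnitude);
    return buf;
}

// Builds a listing line for a redirected 64-bit store: the real [rbx+disp8]
// operand, then the canonical rbp-relative reference as an end-of-line comment.
std::string FormatRedirectedStore(int32_t secondaryDisp, int32_t canonicalDisp, const char* srcReg)
{
    return "mov qword ptr [" + FormatFrameRef("rbx", secondaryDisp) + "], " + srcReg + " ; " +
           FormatFrameRef("rbp", canonicalDisp);
}
```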

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The LSRA-time reservation keyed only off total frame size, so ~27% of
methods reserved RBX (push/lea/pop plus unwind data) but never used it: no
local actually landed in the secondary disp8 band. The band can't be tested
at LSRA, since REGALLOC-layout offsets have no base-register flag and are
inflated by an over-estimated callee-save area.

Reserve RBX at LSRA only as a cheap candidate (out of allocation), then make
the precise band-occupancy decision in genFinalizeFrame once FINAL offsets
are known. If nothing lands in the band, cancel the reservation so no
push/lea/unwind is emitted; otherwise mark the register modified and redo the
frame layout to account for the push.
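
The two-phase decision above might be modeled as follows. This is a simplified sketch, not the logic of genFinalizeFrame: `SecondaryFpState`, `AnyLocalInSecondaryBand`, and `FinalizeSecondaryFp` are hypothetical names, offsets are assumed to be relative to the primary base, and the re-layout step (which could shift offsets once the push is accounted for) is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// True when a displacement fits the 1-byte signed form (disp8, -128..+127).
bool FitsDisp8(int32_t disp)
{
    return disp >= -128 && disp <= 127;
}

// Hypothetical state carried from register allocation to frame finalization.
struct SecondaryFpState
{
    bool candidate;   // register was held out of allocation at LSRA time
    bool active;      // final decision: the secondary pointer will be set up
    bool regModified; // register must be saved/restored and reported in unwind data
};

// True when some local needs disp32 from the primary base but is within disp8
// of the secondary pointer. secondaryDelta = secondary value minus primary base value.
bool AnyLocalInSecondaryBand(const std::vector<int32_t>& finalOffsets, int32_t secondaryDelta)
{
    assert(secondaryDelta != 0); // a zero delta would mean the feature is off
    for (int32_t offset : finalOffsets)
    {
        if (!FitsDisp8(offset) && FitsDisp8(offset - secondaryDelta))
        {
            return true;
        }
    }
    return false;
}

// Keeps the reservation only if it pays off; otherwise cancels it so that no
// push, lea, or unwind entry would be emitted.
SecondaryFpState FinalizeSecondaryFp(SecondaryFpState state, const std::vector<int32_t>& finalOffsets,
                                     int32_t secondaryDelta)
{
    if (!state.candidate)
    {
        return state;
    }
    if (AnyLocalInSecondaryBand(finalOffsets, secondaryDelta))
    {
        state.active      = true;
        state.regModified = true; // the real JIT would redo the frame layout here
    }
    else
    {
        state.candidate = false; // cancel the reservation
    }
    return state;
}
```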

aspnet2 asmdiffs: unused-RBX setups drop from 13 to 0; size win improves from
-3864 to -3982 bytes. No replay failures.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 30, 2026 02:27
github-actions bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) May 30, 2026
@dotnet-policy-service

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copilot AI left a comment

Pull request overview

Adds an AMD64-only "secondary frame pointer" optimization (gated by JitConfig.JitSecondFramePtr, default 0x100) for unoptimized methods with large frames. RBX is reserved during LSRA, conditionally established (RBP - offset or RSP + offset) in the prolog after a profitability check on FINAL frame offsets, restored in filter funclets, and used by the xarch emitter to redirect far stack accesses from [rbp/rsp + disp32] to [rbx + disp8] (saving 3 bytes per redirected access). Disassembly is updated to show the redirected operand plus a trailing canonical-frame-reference comment, and emitDispFrameRef gains an instruction parameter on all targets.

Changes:

  • Reserve RBX (REG_OPT_RSVD2) as candidate secondary FP during LSRA when MinOpts, large frame, EH-compatible, non-OSR; finalize/cancel in genFinalizeFrame via new genSecondFramePtrIsProfitable.
  • Emit prolog lea to establish RBX from RBP/RSP, re-establish it in AMD64 filter funclets, and redirect eligible stack-var encodings (non-EVEX/SSE38/3A/crc32) in emitInsSizeSVCalcDisp / emitOutputSV.
  • Plumb instruction ins through emitDispFrameRef on all targets, and render [rbx+disp8] ; rbp-0xNN in xarch disassembly.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/coreclr/jit/targetamd64.h | Define REG_OPT_RSVD2 / RBM_OPT_RSVD2 as RBX. |
| src/coreclr/jit/jitconfigvalues.h | Add JitSecondFramePtr config (default 256). |
| src/coreclr/jit/codegeninterface.h | Add genSecondFramePtrReg/Offset/FPbased state (AMD64). |
| src/coreclr/jit/codegen.h | Declare genSecondFramePtrIsProfitable; fix #endif comment for TARGET_XARCH. |
| src/coreclr/jit/lsra.cpp | Reserve RBX candidate when conditions hold (MinOpts, fixed base, large frame, EH/OSR ok). |
| src/coreclr/jit/codegencommon.cpp | Profitability check after FINAL layout; if profitable, mark RBX modified, redo layout, emit prolog lea. |
| src/coreclr/jit/codegenxarch.cpp | Re-establish RBX in FILTER funclet prologs. |
| src/coreclr/jit/emitxarch.h | Declare emitIsSecondFramePtrCandidate and display state for trailing comment. |
| src/coreclr/jit/emitxarch.cpp | Redirect candidate check in size calc/output; modrm `0x40 |
| src/coreclr/jit/emit.h | Add instruction ins = INS_none parameter to emitDispFrameRef. |
| src/coreclr/jit/emitarm.cpp / emitarm64.cpp / emitloongarch64.cpp / emitriscv64.cpp | Update emitDispFrameRef signature on other targets (parameter unused). |
