[Wasm RyuJIT] "Partial containment" for GT_LCL_ADDR#128680
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Pull request overview
This PR experiments with Wasm RyuJIT containment for GT_LCL_ADDR under indirect loads/stores, folding local frame offsets into Wasm memory operands while still emitting the frame pointer push.
Changes:
- Marks
GT_LCL_ADDRaddresses of indirections as contained in Wasm lowering. - Adjusts Wasm codegen so contained
GT_LCL_ADDRstill emits its base address but skips the explicit offset add. - Emits folded offsets for
GT_INDandGT_STOREINDmemory operands.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
src/coreclr/jit/lowerwasm.cpp |
Adds containment marking for GT_LCL_ADDR operands of indirections. |
src/coreclr/jit/codegenwasm.cpp |
Implements partial-contained GT_LCL_ADDR codegen and folded load/store offsets. |
I don't think we need to break the lowering-codegen contract to get this optimization. My thinking on it was transforming: into: there are some tweaks needed to make this work in liveness and things like BTW, this is only a problem for
"No unsigned wrap" (LLVM terminology). The basic problem is avoid transforming this: into a WASM address mode. |
The tweaks are because this would 'erase' the reference to the var at +8 right, by replacing it with +0 +8? So the reference to I'm not sure I like that...
|
It needs a bit of thinking around how to represent the frame base while still referencing the original local (it's not negotiable the original local needs to stay referenced, of course). The saving grace I think is that we only need this new contract for the very small window from RA to codegen. The easiest (but not the most efficient) way is to introduce a new LEA-like node that'd have a contained LCL_ADDR and another operand as just the frame base (e. g. PHYS_REG), like so: it looks wacky though, maybe there are better idea out there. |
e27c7b7 to
4a4b002
Compare
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| case GT_PARTIALLY_CONTAINED_LCL_ADDR: | ||
| // WASM-TODO: Implement removal for dead partially contained lcl_addrs and their containers | ||
| if (TLiveness::EliminateDeadCode && node->IsUnusedValue()) | ||
| { | ||
| JITDUMP("WASM-TODO: Not removing unused partially contained local address:"); | ||
| DISPNODE(node); | ||
| } | ||
| break; |
There was a problem hiding this comment.
This isn't hit at all during a CG2 of S.P.CoreLib for me so it may be a theoretical problem
| if (addr->OperIs(GT_LCL_ADDR) && IsContainableLclAddr(addr->AsLclFld(), indirNode->Size() DEBUGARG(true /* disableAssertion */))) | ||
| { | ||
| // An indir through a lcl_addr should never have a multiply-used address since it doesn't need a null check | ||
| // and can't fault. If it does, this transform is incorrect. | ||
| assert((addr->gtLIRFlags & LIR::Flags::MultiplyUsed) == LIR::Flags::None); |

Experimenting with an approach for folding offsets into GT_IND and GT_STOREIND. Seems to work so far and doesn't complicate the code much, but the need to change codegenwasm.cpp:702 makes me wonder if this approach is fundamentally doomed...
cc @AndyAyersMS @SingleAccretion thoughts? I found a comment somewhere suggesting we need a new "nuw" frontend node but I'm not sure what a nuw is and whether that is the right solution for these scenarios.
LEAs are the next candidate but they seem trickier to get right. I'm not sure how common 'partially containable' LEAs and LCL_ADDRs are in a larger codebase, might do some instrumented runs over SPC to find out.