[AMDGPU] comgr: rewrite s_sethalt to s_nop for GFX1250 A0 #3087
Open
jammm wants to merge 1 commit into
Rewrites every s_sethalt to s_nop 0 in-place (4-byte overwrite, no length change, no offset shifts).

Defensive: LLVM almost never emits s_sethalt in production code (only via the int_amdgcn_s_sethalt intrinsic, mostly in debug builds), so this pass is dormant on most inputs -- it exists for defense-in-depth against any input ELF, compiler-emitted or hand-written, that ships s_sethalt.

The replacement bytes come from LS.SNopBytes (pre-encoded at initLLVM() time, the same source the splitter uses for trampoline padding).

Wired into b0a0.cpp dispatcher AFTER applyVop3pxWrapPatch (the order doesn't matter functionally -- sethalt-fix and the wrap pass don't interact -- but keeping all the A0-only post-loop passes adjacent makes the dispatch order easy to read).

LIT test (hotswap-sethalt-fix.s): 4 cases:
- bare s_sethalt is neutralized
- s_sethalt before VOP3PX2 is neutralized
- multiple s_sethalt instances are all neutralized
- kernels without s_sethalt are not modified
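The in-place rewrite described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `isSopp` and `neutralizeSethalt` are hypothetical names, the SOPP opcode for s_sethalt is passed in as a parameter rather than hard-coded (it differs between AMDGPU generations and should be checked against the ISA documentation for the target), and the replacement bytes stand in for what the PR calls LS.SNopBytes. The sketch also scans every aligned 32-bit word, so unlike a decoder-driven pass it could in principle match a literal operand of a wider instruction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// SOPP layout: bits [31:23] = 0b101111111, [22:16] = opcode, [15:0] = SIMM16.
constexpr uint32_t kSoppPrefix = 0xBF800000u;
constexpr uint32_t kSoppPrefixMask = 0xFF800000u;
constexpr uint32_t kSoppOpcodeMask = 0x007F0000u;

// True if `word` is a SOPP instruction with the given 7-bit opcode,
// regardless of its SIMM16 operand.
inline bool isSopp(uint32_t word, uint32_t opcode) {
  return (word & (kSoppPrefixMask | kSoppOpcodeMask)) ==
         (kSoppPrefix | (opcode << 16));
}

// Hypothetical helper: overwrites every matching 4-byte instruction in
// `text` with `snopBytes` and returns the byte offsets that were patched.
// The buffer length never changes, so no other offsets shift.
std::vector<size_t> neutralizeSethalt(std::vector<uint8_t>& text,
                                      uint32_t sethaltOpcode,
                                      const std::array<uint8_t, 4>& snopBytes) {
  assert(sethaltOpcode < 0x80 && "SOPP opcode field is 7 bits wide");
  std::vector<size_t> patched;
  for (size_t off = 0; off + 4 <= text.size(); off += 4) {
    // Instructions are stored little-endian; assemble the word explicitly
    // so the scan does not depend on host byte order.
    const uint32_t word = static_cast<uint32_t>(text[off]) |
                          (static_cast<uint32_t>(text[off + 1]) << 8) |
                          (static_cast<uint32_t>(text[off + 2]) << 16) |
                          (static_cast<uint32_t>(text[off + 3]) << 24);
    if (isSopp(word, sethaltOpcode)) {
      std::memcpy(text.data() + off, snopBytes.data(), snopBytes.size());
      patched.push_back(off);
    }
  }
  return patched;
}
```

Returning the patched offsets keeps the overwrite separate from any logging, so a caller could report each neutralized location without the helper knowing how the real pass emits diagnostics.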
Summary

This PR splits out the `s_sethalt` hotswap fix from #2519 into its own branch/PR.

For the GFX1250 B0-to-A0 hotswap path, COMGR now detects in-shader `s_sethalt` instructions and rewrites them in-place to `s_nop 0`. This avoids the A0 LD_SCALE/WMMA clause-break hazard caused by a shader halt immediately before scale/WMMA execution.

Details

- `comgr-hotswap-patch-sethalt-fix.cpp`
- `s_sethalt` instructions
- `LS.SNopBytes` replacement bytes
- `.text` offset for each neutralized instruction

This only handles in-code-object `s_sethalt`. External halt mechanisms such as host/debugger SQ halt commands are outside COMGR’s code-object rewrite path and are not handled here.

Tests
Added `hotswap-sethalt-fix.s`, covering:

- bare `s_sethalt` rewrite
- `s_sethalt` immediately before a VOP3PX2 instruction
- multiple `s_sethalt` instructions in one kernel
- kernels without `s_sethalt`

Validation run locally: