[compiler-rt][AMDGPU] Build the device profile runtime #3023
80c9b37 to d54f3c8
jhuber6 left a comment
Didn't this file get deleted upstream? Why is it still here? You're probably looking for AMDGPU.cmake.
Pull request overview
Enables building the compiler-rt profile runtime for AMDGPU device (amdgcn-amd-amdhsa) builds by turning on the profile component and selecting the baremetal profile configuration in the GPU cache file. This provides the missing device-side libclang_rt.profile.a needed for HIP device PGO instrumentation/linking.
Changes:
- Turn on COMPILER_RT_BUILD_PROFILE in compiler-rt/cmake/caches/GPU.cmake.
- Enable COMPILER_RT_PROFILE_BAREMETAL to build the minimal/baremetal profile runtime appropriate for GPU device environments.
It's in TheRock build. It was removed upstream, but then reverted back in amd-staging because build scripts still reference it. As best I can tell:
Can we fix that instead of perpetuating its zombie existence? I unfortunately don't know how to make changes to The Rock.
Superseded by the consolidation approach: #3027 (remove redundant GPU.cmake) + ROCm/TheRock#6055 (point amdgcn runtimes cache at AMDGPU.cmake). Closing in favor of those.
What
Enable the compiler-rt profile component for the amdgcn-amd-amdhsa runtimes build in GPU.cmake by setting COMPILER_RT_BUILD_PROFILE and COMPILER_RT_PROFILE_BAREMETAL to ON.
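A rough sketch of what the two settings could look like in the cache file. It assumes GPU.cmake follows the usual CMake cache-file idiom of set(... CACHE BOOL ""); the exact lines in the patch may differ.

```cmake
# Sketch only: assumes the cache file sets options with the common
# set(<var> <value> CACHE BOOL "") idiom; not copied from the actual patch.

# Build the profile runtime for the device target (previously OFF).
set(COMPILER_RT_BUILD_PROFILE ON CACHE BOOL "")

# Select the minimal/baremetal profile runtime configuration for GPU devices.
set(COMPILER_RT_PROFILE_BAREMETAL ON CACHE BOOL "")
```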
Why
GPU.cmake had COMPILER_RT_BUILD_PROFILE OFF, so the device profile runtime (amdgcn-amd-amdhsa/libclang_rt.profile.a) was never built, and HIP device PGO had no device runtime to link against or collect profiles with. This change matches upstream AMDGPU.cmake.
Test
Built TheRock; confirmed amdgcn-amd-amdhsa/libclang_rt.profile.a is produced and defines __llvm_profile_instrument_gpu and __llvm_profile_sections.