[ROCm] Avoid unsupported BF16 llm_int8 registration by austin1997 · Pull Request #55 · ROCm/Paddle

austin1997 · 2026-06-23T05:50:19Z

PR Category

Custom Device

PR Types

Bug fixes

Description

This PR avoids advertising BF16 support for the CUDA-only llm_int8_linear GPU kernel on ROCm.

The existing kernel implementation provides only a CUDA path; the non-CUDA path throws Unimplemented at runtime. On ROCm, the BF16 dtype registration let BF16 inputs dispatch into that unsupported path. This change keeps the kernel touch symbol available for the generated code but registers phi::bfloat16 only for CUDA builds. As a result, ROCm now reports BF16 llm_int8_linear as an unregistered kernel instead of entering the CUDA-only implementation.
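The behavior change can be pictured with a small toy model in Python. This is only a sketch of the dispatch logic described above, not Paddle's registration machinery: the names `registered_dtypes`, `dispatch`, `NotFoundError`, `UnimplementedError`, and the `gate_bf16` flag are all hypothetical, and `"float16"` is just a placeholder for whatever other dtypes the kernel registers.

```python
class NotFoundError(LookupError):
    """Toy stand-in: no kernel is registered for the requested dtype."""


class UnimplementedError(RuntimeError):
    """Toy stand-in: a kernel was found but has no path for this backend."""


def cuda_only_kernel(backend):
    # Models a kernel body that only implements the CUDA path.
    if backend != "cuda":
        raise UnimplementedError("llm_int8_linear is only implemented for CUDA")
    return "ok"


def registered_dtypes(backend, gate_bf16):
    # "float16" is a placeholder for the other registered dtypes (assumption).
    dtypes = {"float16"}
    # Before the change (gate_bf16=False) bfloat16 is registered everywhere.
    # After the change (gate_bf16=True) it is registered for CUDA builds only.
    if not gate_bf16 or backend == "cuda":
        dtypes.add("bfloat16")
    return dtypes


def dispatch(backend, dtype, gate_bf16):
    # Kernel lookup happens first; the kernel body runs only if lookup succeeds.
    if dtype not in registered_dtypes(backend, gate_bf16):
        raise NotFoundError(f"no {dtype} kernel registered on {backend}")
    return cuda_only_kernel(backend)
```

In this model, `dispatch("rocm", "bfloat16", gate_bf16=False)` reaches the kernel body and raises `UnimplementedError`, while the same call with `gate_bf16=True` fails earlier at lookup with `NotFoundError`. CUDA dispatch is unaffected by the flag.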

The Python test skip condition is also centralized and updated so llm_int8_linear tests are skipped on ROCm.
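A centralized skip condition of this kind might look roughly like the sketch below. It is an illustration, not the code in test_llm_int8_linear.py: `should_skip_llm_int8`, the two build flags, and the placeholder test class are hypothetical, and the flags are hard-coded to model a ROCm build so the snippet runs without Paddle installed. The real test may also check other conditions that are not modeled here.

```python
import unittest


def should_skip_llm_int8(is_cuda_build: bool, is_rocm_build: bool) -> bool:
    """Return True when llm_int8_linear tests should be skipped.

    Hypothetical helper: skip unless the build has the CUDA path,
    and always skip on ROCm.
    """
    return (not is_cuda_build) or is_rocm_build


# Hard-coded flags modelling a ROCm build (assumption for illustration).
# In a real test these would come from the framework's build-detection helpers.
IS_CUDA_BUILD = True
IS_ROCM_BUILD = True

# Compute the condition once so every test class shares the same rule.
SKIP_LLM_INT8 = should_skip_llm_int8(IS_CUDA_BUILD, IS_ROCM_BUILD)
SKIP_REASON = "llm_int8_linear requires a CUDA (non-ROCm) build"


@unittest.skipIf(SKIP_LLM_INT8, SKIP_REASON)
class TestLLMInt8LinearPlaceholder(unittest.TestCase):
    def test_placeholder(self):
        # Placeholder body; a real test would call the operator here.
        self.assertTrue(True)


def run_suite() -> unittest.TestResult:
    """Run the placeholder class and return the collected result."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TestLLMInt8LinearPlaceholder)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Keeping the condition in one module-level constant means a later change to the supported platforms only has to be made in one place, rather than in each test class decorator.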

Validation:

env TARGET=SKYLAKEX ninja -j 160 paddle_python
ROCm BF16 llm_int8_linear repro now fails with NotFound for the BF16 kernel instead of Unimplemented from the CUDA-only implementation
python3.12 -m unittest -v test_llm_int8_linear.py (9 skipped on ROCm)
prek run --files paddle/phi/kernels/gpu/llm_int8_linear_kernel.cu test/quantization/test_llm_int8_linear.py
git diff --check

Does this change affect numerical precision?

No

[ROCm] Avoid unsupported BF16 llm_int8 registration

66a48d8

austin1997 wants to merge 1 commit into ROCm:paddle_hackthon from austin1997:rocm-bf16-llm-int8-registration
