MoE AITER Triton Kernels Integration by amirumoAMD · Pull Request #1044 · ROCm/ATOM

amirumoAMD · 2026-06-02T14:56:00Z

Motivation

Replace triton_kernels module with aiter kernels. Add support for gpt-oss a8w4.
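For readers unfamiliar with the naming, `a8w4` and `a16w4` are read here as "8-bit (or 16-bit) activations with 4-bit weights", which is an assumption about the convention and is not spelled out in this PR. The sketch below shows how MXFP4 weights decode under the OCP Microscaling format: each 4-bit E2M1 code shares one E8M0 power-of-two scale per block of 32 elements. The function names are hypothetical and this is not the aiter kernel.

```python
# E2M1 magnitudes indexed by the low 3 bits of the code; bit 3 is the sign.
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK = 32  # MXFP4 elements that share one scale


def decode_fp4(code):
    """Decode a single 4-bit E2M1 code to a float."""
    magnitude = E2M1_VALUES[code & 0x7]
    return -magnitude if code & 0x8 else magnitude


def dequant_mxfp4(codes, scales):
    """Dequantize unpacked 4-bit codes using one E8M0 scale byte per block.

    An E8M0 byte e encodes the scale 2**(e - 127); the NaN encoding 0xFF
    is not handled in this sketch.
    """
    out = []
    for i, code in enumerate(codes):
        scale = 2.0 ** (scales[i // BLOCK] - 127)
        out.append(decode_fp4(code) * scale)
    return out
```

A real kernel keeps the weights packed two codes per byte and fuses this decode into the GEMM; the sketch takes unpacked codes to stay independent of any packing order.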

Technical Details

matmul_ogs is now replaced by the a16w4 MoE GEMM from aiter. Custom routing now redirects to the updated expanded/unified aiter routing function. These changes correspond to changes on the aiter branch amemoore/gfx950-moe-triton-integration.
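As a plain reference for what routing plus an expert GEMM computes, here is a minimal pure-Python sketch. It assumes softmax-over-top-k gating and one dense weight matrix per expert. The names `topk_softmax_routing` and `moe_forward` are hypothetical, and this is neither the aiter routing function nor the a16w4 kernel. The DeepSeek path referenced in this PR uses grouped top-k with sigmoid scoring, which this sketch does not cover.

```python
import math


def topk_softmax_routing(logits, k):
    """For each token, pick the k highest-scoring experts and softmax over them."""
    routed = []
    for row in logits:
        top = sorted(range(len(row)), key=lambda e: row[e], reverse=True)[:k]
        peak = max(row[e] for e in top)  # subtract the max for numerical stability
        exps = [math.exp(row[e] - peak) for e in top]
        total = sum(exps)
        routed.append([(e, w / total) for e, w in zip(top, exps)])
    return routed


def moe_forward(tokens, expert_weights, routed):
    """Gate-weighted sum of the selected experts' outputs for each token.

    expert_weights[e] is a [d_in][d_out] matrix for expert e.
    """
    outputs = []
    for x, choices in zip(tokens, routed):
        y = [0.0] * len(expert_weights[0][0])
        for expert, gate in choices:
            w = expert_weights[expert]
            for j in range(len(y)):
                y[j] += gate * sum(x[i] * w[i][j] for i in range(len(x)))
        outputs.append(y)
    return outputs
```

A fused kernel does not loop over tokens like this: it groups tokens by expert first so that each expert runs a single batched GEMM.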

Test Plan

lm_eval results match the expected results of the non-Triton run for the mxfp4 weight models (DSr1 mxfp4, gpt-oss a8w4, gpt-oss regular).
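The PR does not say how "matches" is judged. One way to make the comparison explicit is a small helper that takes the metric dictionaries from the two runs and reports anything outside a tolerance. The helper name `compare_metrics` and the default tolerance are assumptions for illustration, not values from this PR or part of the lm_eval API.

```python
def compare_metrics(baseline, candidate, tol=0.01):
    """Return metrics that are missing or differ from the baseline by more than tol.

    baseline and candidate map metric names to floats, for example scores
    copied by hand from the non-Triton and Triton lm_eval runs.
    """
    mismatches = {}
    for name, expected in baseline.items():
        actual = candidate.get(name)
        if actual is None or abs(actual - expected) > tol:
            mismatches[name] = (expected, actual)
    return mismatches
```

An empty result means every baseline metric was reproduced within the tolerance. In practice the tolerance should reflect the standard error that lm_eval reports for the task.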

Test Result

All three models match the non-Triton results.

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.


amirumoAMD force-pushed the amemoore/gfx950-moe-triton-integration branch from f18e6a3 to 1b9fd3b on June 2, 2026 17:30

amirumoAMD marked this pull request as ready for review June 2, 2026 17:47

amirumoAMD changed the title ~~Amemoore/gfx950 moe triton integration~~ MoE AITER Triton Kernels Integration Jun 3, 2026

valarLip reviewed Jun 3, 2026


Comment thread atom/model_ops/moe.py Outdated

amirumoAMD force-pushed the amemoore/gfx950-moe-triton-integration branch from 6087e58 to 9268382 on June 3, 2026 13:37

amirumoAMD requested a review from valarLip June 3, 2026 14:33

amirumoAMD mentioned this pull request Jun 3, 2026

Integrate DS R1 GroupedTopk + Sigmoid Routing Into DS Routing ROCm/aiter#3522

Merged

1 task

amirumoAMD added 23 commits June 5, 2026 21:26

decoupled and phased out oai triton, initial setup for moe a4w4 integration, compiles on profile_offline similar to without triton enabled

ded4415

Draft of a8w4 and a4w4 integration

a86d485

integrated model input scales for quant

625e2e3

moe a8w4 integration working and passing lm_eval for tp 1

b597333

patches to mxfp4 handling, triton lm_eval matches asm

c4df153

some cleanup

dc43b4a

Cleanup

da824d6

cleanup remaining comments

84129eb

minor integrations to replace defaulting gpt-oss to a4w4 to use new a16w4 + expt data setup changes

1b6a5fc

Cleanup after rebase

8e32b8b

patches after correctness errors from after rebase, phased out new imports from triton_kernels

b738dc5

patched gpt-oss, some patches to deepseek. narrowed down deepseek issue to routing from topk handling

9dab540

re-integrated fused routing from topk + narrowed down contributing factors to ds bug

e6d95b3

replaced routing from topk + triton softmax routing with deepseek routing

15863ad

ds lm_eval passing with num_concurrent=1

06c6131

envs

60fe27e

lm_eval passing

c230991

clean

44235dd

clean

e22b3da

reformat for black

cb6d42f

patch to include block_m

4866ad2

x_dtype/quant_dtype fix to include gfx942

b9be4c7

black formatting error

1d794ec

amirumoAMD added 11 commits June 5, 2026 21:26

name change for unified ds routing

d4b9e89

swiglu add residual

18b8938

remove stray env variable

e9a261c

change routing name to unified routing

d3d7a21

slight change to x_q_dtype + fix for residual rename on aiter kernel

bd0ef27

black

364a113

env var should be disabled for triton atom run, change fixed

8fd4cc8

black

64a78e6

review change

cb68b42

add regular gemm + remove fused shared experts routing handling

4446c8d

comment change

71d4458

amirumoAMD force-pushed the amemoore/gfx950-moe-triton-integration branch from 785a0bc to 71d4458 on June 5, 2026 21:27

added support for silu triton a4w4

d6d2728

amirumoAMD wants to merge 35 commits into main from amemoore/gfx950-moe-triton-integration
