[op_tests] Refactor MoE legacy UT into per-quant smoke sweep by zhiding512 · Pull Request #3585 · ROCm/aiter

zhiding512 · 2026-06-07T06:02:48Z

Replace the global CLI-default sweep in test_moe_2stage.py with a QUANT_DEFAULTS table that pins a representative production shape (dim/E/topk/pad/preshuffle/act/strict_accuracy) per quant triple. CLI flags (-dim/-e/-k/-hip/-p) still override the defaults globally when supplied.

- _iter_legacy_cases now drives a single itertools.product loop off the per-quant config instead of per-triple if/elif branches.
- Kernel-forced activations (a16w4 -> Swiglu, a16wi4 -> Silu) are encoded in the table and ignore -a; other quants honor -a.
- strict_accuracy is gated per quant (enabled for the fp4-weight a4w4 / a8w4-mxfp paths, warn-only elsewhere).
- test_fmoe now compares only the real (un-padded) model_dim region, since some kernels leave the padded tail uninitialized/NaN.
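
As a rough illustration of the table-driven sweep described above, here is a minimal sketch. The helper name `iter_cases`, the `overrides` argument, and every shape value in the table are placeholders invented for this example; they are not the contents of the real QUANT_DEFAULTS or _iter_legacy_cases in test_moe_2stage.py. Only the a16w4 -> Swiglu pinning, the a4w4 strict_accuracy setting, and the a4w4-only preshuffle variation are taken from the description.

```python
import itertools

# Hypothetical per-quant table. All numeric values are placeholders.
QUANT_DEFAULTS = {
    "a4w4": {
        "dim": [(1024, 256)],
        "E": [8],
        "topk": [2],
        "preshuffle": [False, True],  # only a4w4 varies preshuffle
        "act": None,                  # None: honor the CLI activation (-a)
        "strict_accuracy": True,
    },
    "a16w4": {
        "dim": [(1024, 256)],
        "E": [8],
        "topk": [2],
        "preshuffle": [True],
        "act": "Swiglu",              # kernel-forced, ignores -a
        "strict_accuracy": False,     # warn-only
    },
}


def iter_cases(quants, cli_act="Silu", overrides=None):
    """Yield one case dict per combination of the per-quant defaults."""
    overrides = overrides or {}
    for quant in quants:
        cfg = dict(QUANT_DEFAULTS[quant])
        # A value supplied on the CLI replaces the default for every quant.
        for key, value in overrides.items():
            if value is not None:
                cfg[key] = value
        act = cfg["act"] if cfg["act"] is not None else cli_act
        for dim, e, topk, preshuffle in itertools.product(
            cfg["dim"], cfg["E"], cfg["topk"], cfg["preshuffle"]
        ):
            yield {
                "quant": quant,
                "dim": dim,
                "E": e,
                "topk": topk,
                "preshuffle": preshuffle,
                "act": act,
                "strict_accuracy": cfg["strict_accuracy"],
            }
```

With this layout, adding a quant means adding one table entry rather than another if/elif branch, and a global override such as `overrides={"E": [8, 16]}` widens the sweep for every quant at once.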

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-07T06:03:27Z

🏷️ CI Guide

Runs automatically on every PR:

✅ Pre-checks (submodule verification, code formatting)
✅ Aiter op tests (gfx942 + gfx950)
✅ Triton tests on MI35X (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

| Label | Tests |
| --- | --- |
| `ci:triton-300x` | Run an additional Triton test job on MI300X in PRs; main branch always runs both MI35X and MI300X |
| `ci:sglang` | SGLang integration tests: DeepSeek-R1-MXFP4 accuracy, Qwen 3.5 accuracy |
| `ci:atom` | ATOM benchmark: DeepSeek-R1-0528, GPT-OSS-120B |
| `ci:atom_full` | ATOM accuracy suite for PR and main models from ATOM `models_accuracy.json` |
| `ci:vllm` | vLLM benchmark: GPT-OSS-120B, DeepSeek-R1-0528, Kimi-K2.5 |
| `ci:all` | All standard extended tests (excludes `ci:atom_full`) |

Only add `ci:atom_full` for FlyDSL or Triton upgrades.
Add labels via the sidebar or `gh pr edit 3585 --add-label <label>`.

Copilot

Pull request overview

This PR refactors the legacy MoE 2-stage op smoke sweep in op_tests/test_moe_2stage.py from a global CLI-default parameter grid into a per-quant configuration table (QUANT_DEFAULTS), aiming to exercise representative production-like shapes per quantization triple while still allowing CLI flags to override defaults.

Changes:

- Introduces QUANT_DEFAULTS and rewrites _iter_legacy_cases() to generate cases via a unified itertools.product loop driven by per-quant defaults.
- Encodes kernel-imposed activation constraints in the per-quant table (e.g., a16w4→Swiglu, a16wi4→Silu) and gates strict_accuracy per-quant.
- Updates test_fmoe accuracy checking to compare only the unpadded model_dim region (avoiding NaNs from uninitialized padded tails).


+    real_model_dim = model_dim - hidden_pad
+    out2_ref = out2_ref[:, :real_model_dim]
+    out2_ck = out2_ck[:, :real_model_dim]
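
To show why trimming matters, here is a small sketch that uses plain Python lists in place of tensors. The helper names `unpadded` and `max_abs_diff` are made up for this example; the actual test slices tensors directly, as in the diff lines above.

```python
import math


def unpadded(rows, model_dim, hidden_pad):
    """Keep only the real (un-padded) columns of each row."""
    real_model_dim = model_dim - hidden_pad
    return [row[:real_model_dim] for row in rows]


def max_abs_diff(ref, out):
    """Largest element-wise absolute difference between two 2-D lists."""
    return max(
        abs(r - o)
        for ref_row, out_row in zip(ref, out)
        for r, o in zip(ref_row, out_row)
    )


model_dim, hidden_pad = 6, 2
nan = float("nan")

# Reference output: padded tail is zero.
out2_ref = [[1.0, 2.0, 3.0, 4.0, 0.0, 0.0]]
# Kernel output: padded tail was never written, so it may hold NaN.
out2_ck = [[1.0, 2.0, 3.0, 4.0, nan, nan]]

# The full-width output contains NaN, so a full-width check would fail.
has_nan_full = any(math.isnan(v) for row in out2_ck for v in row)

# Restricting both sides to the real region gives a meaningful comparison.
diff_real = max_abs_diff(
    unpadded(out2_ref, model_dim, hidden_pad),
    unpadded(out2_ck, model_dim, hidden_pad),
)
```

When `hidden_pad` is 0 the slice covers the whole row, so un-padded cases are compared exactly as before.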


+    help="""Whether to use pre-shuffle weight mode. If unset, each quant uses
+    its per-quant default (only a4w4 varies preshuffle; others require shuffled
+    weights for correctness).
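
The "if unset, use the per-quant default" behaviour in this help text can be sketched with argparse by letting the flag default to None. The flag spelling `--preshuffle`, the `str2bool` and `resolve_preshuffle` helpers, and the `PRESHUFFLE_DEFAULTS` table (including the `a8w8` entry) are assumptions made for illustration, not the test's actual argument definition.

```python
import argparse

# Hypothetical per-quant defaults: only a4w4 sweeps both modes.
PRESHUFFLE_DEFAULTS = {
    "a4w4": [False, True],
    "a8w8": [True],
}


def str2bool(value):
    """Parse a boolean-like CLI string."""
    lowered = value.lower()
    if lowered in ("1", "true", "t", "yes", "y"):
        return True
    if lowered in ("0", "false", "f", "no", "n"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


parser = argparse.ArgumentParser()
parser.add_argument(
    "--preshuffle",
    type=str2bool,
    default=None,  # None means "not supplied on the command line"
    help="Override the per-quant pre-shuffle default for every quant.",
)


def resolve_preshuffle(quant, cli_value):
    """Return the list of preshuffle modes to sweep for one quant."""
    if cli_value is None:
        return PRESHUFFLE_DEFAULTS[quant]
    return [cli_value]
```

Using None as the default keeps three states distinct (unset, explicitly true, explicitly false), which a plain `store_true` flag cannot express.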



zhiding512 requested review from a team and Copilot June 7, 2026 06:02

Copilot started reviewing on behalf of zhiding512 June 7, 2026 06:02

Merge branch 'main' into zhimding/refactor_moe_legacy_ut

6a11050

Copilot AI reviewed Jun 7, 2026


zhiding512 added 2 commits June 8, 2026 03:10

format code

1f91d7e

Merge remote-tracking branch 'origin/main' into zhimding/refactor_moe_legacy_ut

92eddc2

zhiding512 wants to merge 4 commits into main from zhimding/refactor_moe_legacy_ut

2 participants
