[AMD] improve dsr1 fp4 disagg perf on mi355x #983
Conversation
Transformers v5 incorrectly rebuilds the pre_tokenizer/decoder components for models like DeepSeek-R1 that use LlamaTokenizerFast with a non-Llama tokenizer architecture. The sglang server fixes this at startup, but the benchmark client loads the tokenizer without these fixes, causing a ~5x token-count inflation (e.g. 7000 tokens -> 35000 tokens) and false performance regressions in TTFT and throughput benchmarks. This change applies the same tokenizer fixes (pre_tokenizer/decoder restoration and add_bos_token recovery) that the sglang server applies, so client and server tokenize identically. No-op on transformers v4. Made-with: Cursor
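The client-side fix described above can be sketched as follows. This is a hedged, duck-typed illustration, not the actual sglang helper: the function name `restore_tokenizer_components`, the attribute layout, and the stub classes are assumptions, and real code would rebuild the components via the `tokenizers` library rather than assigning raw dicts.

```python
import json
import tempfile

def restore_tokenizer_components(tokenizer, tokenizer_json_path):
    """Re-apply the pre_tokenizer/decoder serialized in tokenizer.json to the
    fast tokenizer's backend and recover add_bos_token from the post-processor
    template, so the benchmark client tokenizes the same way as the server.
    No-op when the file lacks those sections (transformers v4 never rebuilds
    them, so there is nothing to restore)."""
    with open(tokenizer_json_path) as f:
        spec = json.load(f)

    backend = tokenizer.backend_tokenizer
    # Restore the components that transformers v5 incorrectly rebuilt for
    # LlamaTokenizerFast models with a non-Llama tokenizer architecture.
    # (Real code would deserialize via the `tokenizers` library; plain dict
    # assignment stands in for that here.)
    if spec.get("pre_tokenizer") is not None:
        backend.pre_tokenizer = spec["pre_tokenizer"]
    if spec.get("decoder") is not None:
        backend.decoder = spec["decoder"]

    # Recover add_bos_token: if the single-sequence post-processing template
    # prepends the BOS special token, the flag should be True.
    bos = getattr(tokenizer, "bos_token", None)
    single = (spec.get("post_processor") or {}).get("single") or []
    tokenizer.add_bos_token = any(
        step.get("SpecialToken", {}).get("id") == bos for step in single
    )

# Minimal stand-ins for a fast tokenizer, used only to exercise the sketch.
class _Backend:
    pre_tokenizer = None
    decoder = None

class _Tokenizer:
    bos_token = "<BOS>"
    add_bos_token = False
    backend_tokenizer = _Backend()

spec = {
    "pre_tokenizer": {"type": "ByteLevel"},
    "decoder": {"type": "ByteLevel"},
    "post_processor": {
        "single": [
            {"SpecialToken": {"id": "<BOS>", "type_id": 0}},
            {"Sequence": {"id": "A", "type_id": 0}},
        ]
    },
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(spec, f)
    path = f.name

tok = _Tokenizer()
restore_tokenizer_components(tok, path)
print(tok.add_bos_token, tok.backend_tokenizer.pre_tokenizer["type"])
```

Because the restoration only fires when tokenizer.json actually serializes these sections, the same helper is safe to call unconditionally on both transformers v4 and v5.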
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook. If they are not, please create a PR there first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you
  dsr1-fp8-mi355x-sglang-disagg:
-   image: rocm/sgl-dev:sglang-0.5.9-rocm720-mi35x-mori-0227-2
+   image: rocm/sgl-dev:sglang-0.5.9-rocm720-mi35x-mori-0327
In early March, you said that after consulting with @HaiShaw and others in the org, you would be using upstream images by the end of March. Can you please update this to use upstream nightly images instead of second-class forks?
Let's ensure that we work towards AMD being a first-class platform on sglang instead of continuing to submit second-class forks.
The new patch adds the following optimization: