starter task: MVP port mi355 deepseek disagg recipe to mi300 #982
Open
Description
after porting mi355 to mi325, port to mi300
- (mi355 disagg fp8 deepseek for non-mtp & mtp) port over to mi325 (CDNA3)
InferenceX/.github/configs/amd-master.yaml
Lines 506 to 556 in 41147ad
    dsr1-fp8-mi355x-sglang-disagg:
      image: rocm/sgl-dev:sglang-0.5.9-rocm720-mi35x-mori-0227-2
      model: deepseek-ai/DeepSeek-R1-0528
      model-prefix: dsr1
      runner: mi355x-disagg
      precision: fp8
      framework: sglang-disagg
      multinode: true
      disagg: true
      seq-len-configs:
      - isl: 1024
        osl: 1024
        search-space:
        # non-MTP configurations
        # "Top of curve" (1 prefill workers each at DEP8 and 1 decode workers at DEP16)
        - spec-decoding: "none"
          conc-list: [ 1024, 2048 ]
          prefill:
            num-worker: 1
            tp: 8
            ep: 1
            dp-attn: false
            additional-settings:
            - "PREFILL_NODES=1"
          decode:
            num-worker: 1
            tp: 8
            ep: 8
            dp-attn: true
            additional-settings:
            - "DECODE_NODES=2"
            - "DECODE_MTP_SIZE=0"
        # "Middle of curve" (1 prefill workers each at TP8 and 2 decode workers at DEP8)
        - spec-decoding: "none"
          conc-list: [ 1536, 1024, 512 ]
          prefill:
            num-worker: 1
            tp: 8
            ep: 1
            dp-attn: false
            additional-settings:
            - "PREFILL_NODES=1"
          decode:
            num-worker: 2
            tp: 8
            ep: 8
            dp-attn: true
            additional-settings:
            - "DECODE_NODES=2"
            - "DECODE_MTP_SIZE=0"
  - https://github.com/SemiAnalysisAI/InferenceX/blob/main/benchmarks/multi_node/dsr1_fp8_mi355x_sglang-disagg.sh (that then calls the generate sweep .py, which calls this launcher script). This uses image: rocm/sgl-dev:sglang-0.5.9-rocm720-mi35x-mori-0227-2, but @JordanNanos you probably need to find the mi30x equivalent of this: check whether the upstream nightly images have MoRI included (https://hub.docker.com/r/lmsysorg/sglang-daily/tags); if not, build one using https://github.com/akao-amd/sglang/blob/main/docker/rocm.Dockerfile, and ensure that you build it with the correct NIC.
  - The launcher script in turn calls the files in https://github.com/SemiAnalysisAI/InferenceX/tree/main/benchmarks/multi_node/amd_utils, which is based on bill's repo. As a first attempt it might be easier to run bill's repo locally (https://github.com/billishyahao/sglang_disagg), without the abstractions of runners, the generate config .py, etc.
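One way to do the "does this image include MoRI" check is a small probe run with the Python interpreter inside a candidate container. This is only a sketch: it assumes MoRI is exposed as a Python module named `mori` and that SGLang is importable as `sglang`. Neither module name is confirmed in this issue, so adjust them to whatever the image actually ships.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the import system can locate `name` (without importing it)."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised for missing parent packages or modules with a broken __spec__.
        return False


def probe(modules=("sglang", "mori")) -> dict:
    # ASSUMPTION: "mori" is a guess at MoRI's Python package name.
    return {name: has_module(name) for name in modules}


if __name__ == "__main__":
    for name, found in probe().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If `mori` is reported missing in a nightly image, that would be the signal to fall back to building from the Dockerfile linked above.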
probably start with 1k/1k (ISL/OSL) on 1P1D first, since it gives a faster debugging loop
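To make the 1k/1k 1P1D starting point concrete, the sketch below derives a trimmed entry from the mi355x one quoted above. It is an illustration, not the repo's tooling: the helper `port_to_1p1d`, the runner name `mi300x-disagg`, the `<mi30x-image-with-mori>` placeholder, and the choice to keep only the smallest concurrency are all hypothetical.

```python
import copy

# The two mi355x search-space points quoted in this issue, as a Python dict.
_PREFILL = {"num-worker": 1, "tp": 8, "ep": 1, "dp-attn": False,
            "additional-settings": ["PREFILL_NODES=1"]}
_DECODE = {"tp": 8, "ep": 8, "dp-attn": True,
           "additional-settings": ["DECODE_NODES=2", "DECODE_MTP_SIZE=0"]}

MI355X_ENTRY = {
    "image": "rocm/sgl-dev:sglang-0.5.9-rocm720-mi35x-mori-0227-2",
    "model": "deepseek-ai/DeepSeek-R1-0528",
    "model-prefix": "dsr1",
    "runner": "mi355x-disagg",
    "precision": "fp8",
    "framework": "sglang-disagg",
    "multinode": True,
    "disagg": True,
    "seq-len-configs": [{
        "isl": 1024,
        "osl": 1024,
        "search-space": [
            {"spec-decoding": "none", "conc-list": [1024, 2048],
             "prefill": dict(_PREFILL), "decode": {"num-worker": 1, **_DECODE}},
            {"spec-decoding": "none", "conc-list": [1536, 1024, 512],
             "prefill": dict(_PREFILL), "decode": {"num-worker": 2, **_DECODE}},
        ],
    }],
}


def port_to_1p1d(entry: dict, image: str, runner: str) -> dict:
    """Copy `entry`, swap image/runner, and keep only 1k/1k points with 1 prefill + 1 decode worker."""
    out = copy.deepcopy(entry)
    out["image"] = image      # placeholder until an mi30x image with MoRI is identified
    out["runner"] = runner    # hypothetical runner name
    kept = []
    for cfg in out["seq-len-configs"]:
        if (cfg["isl"], cfg["osl"]) != (1024, 1024):
            continue
        cfg["search-space"] = [
            point for point in cfg["search-space"]
            if point["prefill"]["num-worker"] == 1 and point["decode"]["num-worker"] == 1
        ]
        for point in cfg["search-space"]:
            # Arbitrary choice for a quick loop: keep only the smallest concurrency.
            point["conc-list"] = [min(point["conc-list"])]
        kept.append(cfg)
    out["seq-len-configs"] = kept
    return out


MI300X_ENTRY = port_to_1p1d(
    MI355X_ENTRY, image="<mi30x-image-with-mori>", runner="mi300x-disagg"
)
```

Node counts and parallelism settings are carried over unchanged here; whether they fit mi300 hardware is something the first debugging runs would need to establish.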