starter task: MVP port mi355 deepseek disagg recipe to mi300 #982

@functionstackx

Description


After porting the mi355 recipe to mi325, port it to mi300.

  1. The config entry to port (mi355 disagg fp8 DeepSeek, non-MTP and MTP) over to mi325 (CDNA3):

     ```yaml
     dsr1-fp8-mi355x-sglang-disagg:
       image: rocm/sgl-dev:sglang-0.5.9-rocm720-mi35x-mori-0227-2
       model: deepseek-ai/DeepSeek-R1-0528
       model-prefix: dsr1
       runner: mi355x-disagg
       precision: fp8
       framework: sglang-disagg
       multinode: true
       disagg: true
       seq-len-configs:
       - isl: 1024
         osl: 1024
         search-space:
         # non-MTP configurations
         # "Top of curve" (1 prefill worker at DEP8, 1 decode worker at DEP16)
         - spec-decoding: "none"
           conc-list: [ 1024, 2048 ]
           prefill:
             num-worker: 1
             tp: 8
             ep: 1
             dp-attn: false
             additional-settings:
             - "PREFILL_NODES=1"
           decode:
             num-worker: 1
             tp: 8
             ep: 8
             dp-attn: true
             additional-settings:
             - "DECODE_NODES=2"
             - "DECODE_MTP_SIZE=0"
         # "Middle of curve" (1 prefill worker at TP8, 2 decode workers at DEP8)
         - spec-decoding: "none"
           conc-list: [ 1536, 1024, 512 ]
           prefill:
             num-worker: 1
             tp: 8
             ep: 1
             dp-attn: false
             additional-settings:
             - "PREFILL_NODES=1"
           decode:
             num-worker: 2
             tp: 8
             ep: 8
             dp-attn: true
             additional-settings:
             - "DECODE_NODES=2"
             - "DECODE_MTP_SIZE=0"
     ```
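     As a rough starting point, the ported entry might look like the sketch below. Only the hardware-specific fields should need to change; the image tag and runner name here are placeholders, not confirmed values (see step 2 for finding the real mi30x image):

     ```yaml
     # Hypothetical mi300 port of the entry above; image and runner are
     # placeholder assumptions, everything else carries over unchanged.
     dsr1-fp8-mi300x-sglang-disagg:
       image: <mi30x sglang image with MoRI, see step 2>   # assumption: TBD
       model: deepseek-ai/DeepSeek-R1-0528
       model-prefix: dsr1
       runner: mi300x-disagg        # assumption: runner label for the mi300 pool
       precision: fp8
       framework: sglang-disagg
       multinode: true
       disagg: true
       # seq-len-configs / search-space carried over from the mi355x entry,
       # starting with the 1k/1k case for the fast debugging loop.
     ```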
  2. The launcher script: https://github.com/SemiAnalysisAI/InferenceX/blob/main/benchmarks/multi_node/dsr1_fp8_mi355x_sglang-disagg.sh (the sweep-generation Python script calls this launcher). It uses the image rocm/sgl-dev:sglang-0.5.9-rocm720-mi35x-mori-0227-2, but @JordanNanos you will probably need to find the mi30x equivalent. Check whether the upstream nightly images (https://hub.docker.com/r/lmsysorg/sglang-daily/tags) include MoRI; if not, build one from https://github.com/akao-amd/sglang/blob/main/docker/rocm.Dockerfile, and make sure you build it for the correct NIC.
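     If a local build is needed, a minimal sketch follows, assuming the Dockerfile sits at docker/rocm.Dockerfile in that repo. Whether NIC selection happens via build args is not confirmed here; check the Dockerfile before building. The commands are echoed (dry-run) rather than executed so the sketch is safe to paste anywhere:

     ```shell
     # Dry-run sketch: prints the build commands instead of running them.
     # The local image tag is a hypothetical placeholder; inspect
     # docker/rocm.Dockerfile for any NIC-related build ARGs first.
     set -eu

     SGLANG_REPO="https://github.com/akao-amd/sglang"
     IMAGE_TAG="sgl-dev:rocm-mi30x-local"   # hypothetical local tag

     echo "git clone ${SGLANG_REPO} sglang"
     echo "docker build -f docker/rocm.Dockerfile -t ${IMAGE_TAG} sglang"
     ```

     Drop the `echo`s once the Dockerfile's build arguments (and the NIC they target) have been verified.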
  3. The launcher calls the files in https://github.com/SemiAnalysisAI/InferenceX/tree/main/benchmarks/multi_node/amd_utils, which are based on Bill's repo (https://github.com/billishyahao/sglang_disagg). For a first attempt it may be easier to run Bill's repo locally, without the runner / config-generation abstractions.

Probably start with 1k/1k on 1P1D first, since it is a faster debugging loop.
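For that 1P1D loop, the relevant knobs are the `PREFILL_NODES` / `DECODE_NODES` / `DECODE_MTP_SIZE` settings that appear in the config above. A minimal sketch of a 1P1D, non-MTP setup follows; whether the launcher consumes these exactly as environment variables like this is an assumption to verify against dsr1_fp8_mi355x_sglang-disagg.sh:

```shell
# 1P1D, 1k/1k, non-MTP sketch. The env var names come from the
# additional-settings in the config above; how the launcher actually
# reads them should be checked against the launcher script (step 2).
export PREFILL_NODES=1
export DECODE_NODES=1      # 1 decode worker instead of 2 for the debug loop
export DECODE_MTP_SIZE=0   # non-MTP
ISL=1024
OSL=1024

echo "launching with ${PREFILL_NODES}P${DECODE_NODES}D, isl=${ISL} osl=${OSL}"
# ./dsr1_fp8_mi355x_sglang-disagg.sh ...   # actual invocation/args: see step 2
```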
