[vLLM-ATOM] Enable DBO for vLLM plugin #1103
Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Signed-off-by: kliuae <kuanfu.liu@embeddedllm.com>
Signed-off-by: zejunchen-zejun <zejun.chen@amd.com>
"""Relax vLLM's DeepEP-only gate so ATOM plugin mode can run DBO over mori.

vLLM has a hard-assert that when DBO is enabled, the all2all backend is one
of the two DeepEP backends (deepep_low_latency or deepep_high_throughput)
Does this hard assert still exist in vLLM 0.22.0?
Yes, it still exists in vLLM v0.22.0.
    _orig_post_init(self)
finally:
    if spoofed:
        pc.all2all_backend = restore_backend
We restore the all2all backend so that vLLM does not create another mori all2all manager, which would leave the system with two mori all2all managers. That said, this may not be reachable in practice, because specifying --all2all-backend=mori --enable-dbo could raise an unsupported error: vLLM does not support this combination for DBO for now.
Could you add a recipe for atom-vllm DBO usage? Users may specify the mori all2all backend when launching the vLLM server and then hit an error. Which argument should atom-vllm frontend users specify?
Actually, restore_backend here is set to fall back to AgRs on the vLLM side in all cases. So when users explicitly specify mori as all2all_backend, the vLLM side does not construct another mori a2a manager. Effectively, any all2all_backend specified at the vLLM frontend is swapped out for the lightweight AgRs, but sure, I can add a recipe to spell this out.
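The spoof-and-restore pattern discussed in this thread can be illustrated with a small, self-contained sketch. It is not the code from this PR: `ToyParallelConfig`, `ToyConfig`, `relax_dbo_gate`, and the `"agrs"` backend string are hypothetical stand-ins, and the toy assert only mimics the DeepEP-only gate described in the docstring above. The idea is to wrap the original `__post_init__`, temporarily present a DeepEP backend so the gate passes, and always restore the fallback backend in a `finally` block.

```python
from dataclasses import dataclass, field

# Backend names taken from the docstring quoted in this PR.
DEEPEP_BACKENDS = ("deepep_low_latency", "deepep_high_throughput")


@dataclass
class ToyParallelConfig:
    """Hypothetical stand-in for a parallel config; not vLLM's real class."""
    all2all_backend: str = "mori"
    enable_dbo: bool = False


@dataclass
class ToyConfig:
    """Hypothetical top-level config whose __post_init__ holds the gate."""
    parallel_config: ToyParallelConfig = field(default_factory=ToyParallelConfig)

    def __post_init__(self) -> None:
        pc = self.parallel_config
        # Toy version of the hard assert: DBO is only allowed with DeepEP.
        if pc.enable_dbo:
            assert pc.all2all_backend in DEEPEP_BACKENDS, "DBO requires DeepEP"


def relax_dbo_gate(restore_backend: str) -> None:
    """Patch ToyConfig.__post_init__ so the gate passes for non-DeepEP backends."""
    _orig_post_init = ToyConfig.__post_init__

    def _patched_post_init(self) -> None:
        pc = self.parallel_config
        spoofed = pc.enable_dbo and pc.all2all_backend not in DEEPEP_BACKENDS
        if spoofed:
            # Temporarily present a DeepEP backend so the assert passes.
            pc.all2all_backend = DEEPEP_BACKENDS[0]
        try:
            _orig_post_init(self)
        finally:
            if spoofed:
                # Always swap in the fallback, whatever the user asked for.
                pc.all2all_backend = restore_backend

    ToyConfig.__post_init__ = _patched_post_init


# "agrs" is a placeholder label for the AgRs fallback, not a real vLLM value.
relax_dbo_gate(restore_backend="agrs")
dbo_cfg = ToyConfig(ToyParallelConfig(all2all_backend="mori", enable_dbo=True))
plain_cfg = ToyConfig(ToyParallelConfig(all2all_backend="mori", enable_dbo=False))
```

In this sketch a DBO-enabled config ends up with the fallback backend regardless of what was requested, while a config without DBO is left untouched, which matches the behavior described in the reply above.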
Thank you for helping to enable DBO.
Motivation
This PR enables DBO for vLLM-ATOM. It depends on DP+EP enablement and currently includes the changes from that enablement PR.
Technical Details
Test Plan
Test Result
deepseek-ai/DeepSeek-R1-0528 DP8+EP+DBO
openai/gpt-oss-120b DP2+EP+DBO
Submission Checklist