
[WIP] B200 Minimax FP8 vllm upgrade#947

Open
kedarpotdar-nv wants to merge 10 commits into main from nv/minimax-vllm018

Conversation

@kedarpotdar-nv (Collaborator)

No description provided.

@github-actions (Contributor)

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a recipe PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

@kedarpotdar-nv kedarpotdar-nv requested a review from a team March 26, 2026 01:19
@functionstackx (Contributor)

@claude solve the merge conflicts

@Klaud-Cold (Collaborator) commented Mar 28, 2026

Claude finished @functionstackx's task in 2m 0s.


Merge Conflicts Resolved

Pushed commit f12f56d to nv/minimax-vllm018.

Resolved merge conflicts in 3 files:
- .github/configs/nvidia-master.yaml: kept the expanded search space from this PR
- benchmarks/single_node/minimaxm2.5_fp8_b200.sh: kept `--kv-cache-dtype fp8` and `--stream-interval 20`, along with `--no-enable-prefix-caching` from main
- perf-changelog.yaml: kept both main's new entries and this PR's entry
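For context, a minimal sketch of how the retained flags could be assembled in the benchmark script. This is an illustration only: the actual contents of `minimaxm2.5_fp8_b200.sh` are not shown in this thread, and the model name `MiniMaxAI/MiniMax-M2` and the `vllm serve` launcher are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch only: the real minimaxm2.5_fp8_b200.sh is not shown in
# this thread. Model name and launcher are assumptions for illustration.
KV_FLAGS="--kv-cache-dtype fp8"                                 # FP8 KV cache, kept in the resolution
STREAM_FLAGS="--stream-interval 20 --no-enable-prefix-caching"  # flags kept from main
echo "vllm serve MiniMaxAI/MiniMax-M2 $KV_FLAGS $STREAM_FLAGS"
```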

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
@functionstackx (Contributor) left a comment


Hi @faradawn @kedarpotdar-nv, can you update and merge your existing recipe PR (vllm-project/recipes#272) first before we merge this inferencex PR?

We usually allow the inferencex PR to merge first, since the recipe PR typically lands as a fast follow. The reason the recipes PR needs to merge first this time is that the previous PR's recipe update still hasn't merged.

@functionstackx (Contributor)

@claude remove isl: 1024, osl: 8192

@Klaud-Cold (Collaborator) commented Mar 29, 2026

Claude finished @functionstackx's task in 57s.


Remove ISL 1024 OSL 8192 Configuration

- Identify where ISL 1024 / OSL 8192 is configured
- Remove the seq-len config block from nvidia-master.yaml (lines 3101-3105)
- Update perf-changelog.yaml with removal note
- Commit and push changes

Removed the ISL 1024 / OSL 8192 seq-len config (with tp:2 and tp:4 search-space entries) from minimaxm2.5-fp8-b200-vllm in nvidia-master.yaml. The 1k1k and 8k1k configs remain.
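For reference, a hypothetical sketch of the shape of the removed seq-len entry in nvidia-master.yaml. The exact keys and indentation are assumptions, since the file itself is not shown in this thread; only the ISL/OSL values and the tp:2 / tp:4 search-space entries come from the comment above.

```yaml
# Hypothetical shape of the removed block (assumed, not copied from the repo):
- isl: 1024
  osl: 8192
  search_space:
    - tp: 2
    - tp: 4
```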

Pushed commit 83706ae to nv/minimax-vllm018.

@functionstackx (Contributor)

Perf didn't change much.

[image attachment]

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>

@functionstackx (Contributor)

@kedarpotdar-nv can we avoid using a nightly image here? The MiniMax M2 arch has been around for multiple months now, and from looking at the master.yaml files, the other MiniMax configs all use a release image.


3 participants