
[WIP] B200 Minimax FP8 vllm upgrade#947

Open
kedarpotdar-nv wants to merge 10 commits into main from nv/minimax-vllm018

Conversation

@kedarpotdar-nv (Collaborator)

No description provided.

@github-actions (Contributor)

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a recipe PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

@kedarpotdar-nv kedarpotdar-nv requested a review from a team March 26, 2026 01:19
@functionstackx (Contributor)

@claude solve the merge conflicts

@Klaud-Cold (Collaborator) commented Mar 28, 2026

Claude finished @functionstackx's task in 2m 0s.


Merge Conflicts Resolved

Pushed commit f12f56d to nv/minimax-vllm018.

Resolved merge conflicts in 3 files:
- .github/configs/nvidia-master.yaml: kept the expanded search space from this PR
- benchmarks/single_node/minimaxm2.5_fp8_b200.sh: kept `--kv-cache-dtype fp8` and `--stream-interval 20`, along with `--no-enable-prefix-caching` from main
- perf-changelog.yaml: kept both main's new entries and this PR's entry
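For context, a minimal sketch of how the retained flags could be assembled in the benchmark script. This is an illustration only: the actual contents of `minimaxm2.5_fp8_b200.sh` are not shown in this thread, and the model name `MiniMaxAI/MiniMax-M2` and the `vllm serve` launcher are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch only: the real minimaxm2.5_fp8_b200.sh is not shown in
# this thread. Model name and launcher are assumptions for illustration.
KV_FLAGS="--kv-cache-dtype fp8"                                 # FP8 KV cache, kept in the resolution
STREAM_FLAGS="--stream-interval 20 --no-enable-prefix-caching"  # flags kept from main
echo "vllm serve MiniMaxAI/MiniMax-M2 $KV_FLAGS $STREAM_FLAGS"
```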

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
@functionstackx (Contributor) left a comment


Hi @faradawn @kedarpotdar-nv, can you update and merge your existing recipe PR (vllm-project/recipes#272) first before we merge this inferencex PR?

We usually allow the inferencex PR to merge first, since the recipe PR typically lands as a fast follow. The reason the recipes PR needs to merge first this time is that the previous PR's recipe update still hasn't merged.

@functionstackx (Contributor)

@claude remove isl: 1024, osl: 8192

@Klaud-Cold (Collaborator) commented Mar 29, 2026

Claude finished @functionstackx's task in 57s.


Remove ISL 1024 OSL 8192 Configuration

- Identify where ISL 1024 / OSL 8192 is configured
- Remove the seq-len config block from nvidia-master.yaml (lines 3101-3105)
- Update perf-changelog.yaml with removal note
- Commit and push changes

Removed the ISL 1024 / OSL 8192 seq-len config (with tp:2 and tp:4 search-space entries) from minimaxm2.5-fp8-b200-vllm in nvidia-master.yaml. The 1k1k and 8k1k configs remain.
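For reference, a hypothetical sketch of the shape of the removed seq-len entry in nvidia-master.yaml. The exact keys and indentation are assumptions, since the file itself is not shown in this thread; only the ISL/OSL values and the tp:2 / tp:4 search-space entries come from the comment above.

```yaml
# Hypothetical shape of the removed block (assumed, not copied from the repo):
- isl: 1024
  osl: 8192
  search_space:
    - tp: 2
    - tp: 4
```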

Pushed commit 83706ae to nv/minimax-vllm018.

@functionstackx (Contributor)

Perf didn't change much.

[image attachment]

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>

@functionstackx (Contributor)

@kedarpotdar-nv can we avoid using a nightly image here? The MiniMax M2 arch has been around for multiple months now, and from looking at the master.yaml files, the other MiniMax configs all use a release image.


3 participants