[CONTEXT PARALLEL] add CP for mamba2 by mayank31398 · Pull Request #482 · open-lm-engine/lm-engine

mayank31398 · 2026-06-26T00:57:22Z

No description provided.

Signed-off-by: Mayank Mishra <mayank31398@gmail.com>

gemini-code-assist

Code Review

This pull request introduces context parallel (CP) support for the Mamba2 sequence mixer block, implementing a serial prefix scan over CP ranks to compute correct initial SSM states in both the PyTorch-native and CUDA-based forward paths. It also adds corresponding integration tests. The review feedback highlights two key improvements: removing a redundant addition in the CUDA path that unnecessarily triggers a backward pass on unused scan outputs, and expanding the test parametrization to cover the PyTorch-native path in addition to the Triton kernel path.
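To make the "serial prefix scan over CP ranks" in the summary above concrete, here is a minimal pure-Python sketch. It assumes a scalar linear recurrence `h_t = a_t * h_{t-1} + bx_t` as a toy stand-in for the SSM state update. All function names are hypothetical, and nothing here is taken from the PR's code, where the state is a tensor and the per-rank summaries would be exchanged between devices.

```python
from math import isclose


def local_scan(a, bx, h0=0.0):
    """Run h_t = a_t * h_{t-1} + bx_t over one chunk and return every state."""
    states, h = [], h0
    for a_t, bx_t in zip(a, bx):
        h = a_t * h + bx_t
        states.append(h)
    return states


def chunk_summary(a, bx):
    """Per-rank summary: total decay and final state from a zero initial state."""
    decay = 1.0
    for a_t in a:
        decay *= a_t
    return decay, local_scan(a, bx)[-1]


def initial_states(summaries):
    """Serial prefix scan over ranks: rank r's initial state needs ranks < r."""
    inits, h = [], 0.0
    for decay, final in summaries:
        inits.append(h)
        # The recurrence is linear in h, so the true final state of a chunk is
        # decay * (true initial state) + (final state from a zero start).
        h = decay * h + final
    return inits


def cp_scan(a, bx, num_ranks):
    """Split the sequence into equal chunks (assumes divisibility) and rescan."""
    n = len(a) // num_ranks
    chunks = [(a[i * n:(i + 1) * n], bx[i * n:(i + 1) * n]) for i in range(num_ranks)]
    inits = initial_states([chunk_summary(ca, cb) for ca, cb in chunks])
    out = []
    for (ca, cb), h0 in zip(chunks, inits):
        out.extend(local_scan(ca, cb, h0))
    return out


a = [0.5, 0.25, 0.5, 1.0, 0.5, 0.25, 0.5, 0.5]
bx = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
reference = local_scan(a, bx)  # single-device scan over the full sequence
assert all(isclose(x, y) for x, y in zip(cp_scan(a, bx, 4), reference))
```

The serial loop in `initial_states` is the only step that crosses rank boundaries. Each rank's local scan is otherwise independent once it has its initial state.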

gemini-code-assist · 2026-06-26T00:59:46Z

+                        dt_softplus=True,
+                        **dt_limit_kwargs,
+                    )
+                    ssm_state_zero = ssm_state_zero + scan_output_zero.sum().to(ssm_state_zero.dtype) * 0


Adding scan_output_zero.sum() * 0 to ssm_state_zero forces PyTorch's autograd to execute the backward pass of mamba_chunk_scan_combined for scan_output_zero. Since scan_output_zero is not used elsewhere and its gradient contribution is zero, this redundant backward pass doubles the computation time of the scan kernel during training. You should remove this addition to avoid the performance bottleneck.

Suggested change

-                    ssm_state_zero = ssm_state_zero + scan_output_zero.sum().to(ssm_state_zero.dtype) * 0
+                    # ssm_state_zero is already tracked by autograd; no need to add scan_output_zero

gemini-code-assist · 2026-06-26T00:59:46Z

+from ..utils import skip_test_if_device_unavailable, slow_test
+
+
+@pytest.mark.parametrize("use_mamba2_ssm", [True])


The test currently only parametrizes use_mamba2_ssm with [True], which means only the Triton kernel path is tested. Since context parallel support was also added to the PyTorch-native path (_torch_forward), you should include False in the parameter list to ensure both paths are covered by the test suite.

Suggested change

-@pytest.mark.parametrize("use_mamba2_ssm", [True])
+@pytest.mark.parametrize("use_mamba2_ssm", [True, False])
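As a hedged illustration of what the wider parametrization achieves, the sketch below runs one test body over both flag values. `run_scan` and its two branches are hypothetical stand-ins for the kernel path and the `_torch_forward` path, not code from this PR. Only `pytest.mark.parametrize` is real API.

```python
import pytest


def run_scan(values, use_mamba2_ssm):
    """Hypothetical stand-in: two code paths that must agree (a running sum)."""
    if use_mamba2_ssm:
        # Stand-in for the fused kernel path.
        out, total = [], 0
        for v in values:
            total += v
            out.append(total)
        return out
    # Stand-in for the PyTorch-native path.
    return [sum(values[: i + 1]) for i in range(len(values))]


@pytest.mark.parametrize("use_mamba2_ssm", [True, False])
def test_scan_paths_match_reference(use_mamba2_ssm):
    # pytest invokes this once per listed value, so neither path goes untested.
    assert run_scan([1, 2, 3], use_mamba2_ssm) == [1, 3, 6]
```

With only `[True]` in the list, a regression in the second branch would pass unnoticed. Listing both values makes each path a separate test case in the report.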

mayank31398 added 6 commits June 25, 2026 11:44

7d1eb40 merge (Signed-off-by: Mayank Mishra <mayank31398@gmail.com>)
ac48ac7 Merge branch 'main' into mamba
3b09daf merge (Signed-off-by: Mayank Mishra <mayank31398@gmail.com>)
0a2f8f0 merge (Signed-off-by: Mayank Mishra <mayank31398@gmail.com>)
de09f09 merge (Signed-off-by: Mayank Mishra <mayank31398@gmail.com>)
b0f84fa merge (Signed-off-by: Mayank Mishra <mayank31398@gmail.com>)

gemini-code-assist Bot reviewed Jun 26, 2026

mayank31398 merged commit df3f94a into main Jun 26, 2026
2 checks passed

mayank31398 deleted the mamba branch June 26, 2026 01:00
