[CI] Add GLM 4.7 to Megatron Models CI

# Summary

SkyRL registers a custom Megatron-Bridge model bridge for GLM 4.7 (`GLM47FlashBridge`, a subclass of `DeepSeekV3Bridge`):

https://github.com/NovaSky-AI/SkyRL/blob/33f18badd5cbd0d5da7594d79dcadf399281b0b2/skyrl/backends/skyrl_train/workers/megatron/model_bridges.py#L1-L32

Dependencies are being upgraded quickly to support new models, so we should add GLM 4.7 to CI to catch regressions in this bridge. We already have a Megatron Models CI script:

https://github.com/NovaSky-AI/SkyRL/blob/main/ci/gpu_ci_run_skyrl_train_megatron_models.sh

It essentially runs this test file:

https://github.com/NovaSky-AI/SkyRL/blob/33f18badd5cbd0d5da7594d79dcadf399281b0b2/tests/backends/skyrl_train/gpu/gpu_ci/megatron/test_megatron_models.py#L1-L5

We should also add a tiny GLM 4.7 model to the same CI workflow.
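One behaviour such a test would guard is the rope-field normalization described in the bridge docstring quoted further down: GLM-4.7-Flash keeps `rope_theta` inside `rope_scaling`, while the DeepSeek-V3 mapping expects a top-level `rope_theta` plus `factor`/`mscale`/`mscale_all_dim`. The following is only a minimal sketch of that "temporarily normalize, then restore" idea on a plain config object. The helper name `deepseek_style_rope`, the neutral values of `1.0`, and the example `rope_theta` are assumptions made for illustration; the actual SkyRL implementation may differ.

```python
from contextlib import contextmanager
from types import SimpleNamespace

_MISSING = object()  # sentinel for "attribute was not set on the config"


@contextmanager
def deepseek_style_rope(hf_config):
    """Temporarily present GLM-style rope fields in a DeepSeek-style layout.

    Hypothetical helper: it only illustrates the pattern, not SkyRL's code.
    """
    saved_scaling = getattr(hf_config, "rope_scaling", _MISSING)
    saved_theta = getattr(hf_config, "rope_theta", _MISSING)
    needs_normalizing = (
        isinstance(saved_scaling, dict) and "rope_theta" in saved_scaling
    )
    try:
        if needs_normalizing:
            # Hoist rope_theta to the top level, where the DeepSeek mapping reads it.
            hf_config.rope_theta = saved_scaling["rope_theta"]
            # Assumed neutral placeholders for the fields GLM does not provide.
            hf_config.rope_scaling = {
                "factor": 1.0,
                "mscale": 1.0,
                "mscale_all_dim": 1.0,
            }
        yield hf_config
    finally:
        # Restore the original fields so the HF config is left unchanged.
        for name, value in (
            ("rope_scaling", saved_scaling),
            ("rope_theta", saved_theta),
        ):
            if value is _MISSING:
                if hasattr(hf_config, name):
                    delattr(hf_config, name)
            else:
                setattr(hf_config, name, value)


# Usage with a stand-in config; the rope_theta value is a placeholder.
glm_config = SimpleNamespace(
    rope_scaling={"rope_theta": 1_000_000.0, "rope_type": "default"}
)
with deepseek_style_rope(glm_config) as normalized:
    theta_seen_by_base_bridge = normalized.rope_theta
```

Restoring in a `finally` block keeps the config intact even if the wrapped call raises, which matters when the same HF config object is reused after conversion.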

	"""Register megatron-bridge implementations for model architectures not yet
	supported upstream.

	Import this module at the top of ``megatron_worker.py`` so that bridges are
	registered before any ``AutoBridge.from_hf_pretrained`` call.

	All registrations are guarded by a top-level ``try/except ImportError`` so that
	the rest of the codebase still works in CPU-only (no megatron-bridge) environments.
	"""

	try:
	    from megatron.bridge.models.conversion.model_bridge import MegatronModelBridge
	    from megatron.bridge.models.deepseek.deepseek_v3_bridge import DeepSeekV3Bridge
	    from megatron.bridge.models.hf_pretrained.causal_lm import PreTrainedCausalLM
	    from megatron.core.models.gpt.gpt_model import GPTModel

	    @MegatronModelBridge.register_bridge(
	        source="Glm4MoeLiteForCausalLM",
	        target=GPTModel,
	    )
	    class GLM47FlashBridge(DeepSeekV3Bridge):
	        """Bridge for GLM-4.7-Flash (Glm4MoeLiteForCausalLM).

	        GLM-4.7-Flash is architecturally identical to DeepSeek-V3 (MLA + MoE)
	        but its HF config differs in rope_scaling format:
	        - DeepSeek: rope_scaling has factor/mscale/mscale_all_dim, top-level rope_theta
	        - GLM-4.7-Flash: rope_scaling has rope_theta/rope_type, no mscale fields

	        We reuse DeepSeekV3Bridge.provider_bridge() (which sets all critical
	        TP/MoE/MLA provider attributes) by temporarily normalizing the HF config
	        rope fields so the base CONFIG_MAPPING can handle them.
	        """

	"""
	Run with:
	uv run --isolated --extra dev --extra megatron -- pytest -s tests/backends/skyrl_train/gpu/gpu_ci/megatron/test_megatron_models.py
	"""

[CI] Add GLM 4.7 to Megatron Models CI #1681

Summary

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[CI] Add GLM 4.7 to Megatron Models CI #1681

Description

Summary

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions