[CI] Add GLM 4.7 to Megatron Models CI #1681

@SumanthRH

Summary

SkyRL uses a custom MegatronBridge implementation for GLM 4.7:

"""Register megatron-bridge implementations for model architectures not yet
supported upstream.

Import this module at the top of ``megatron_worker.py`` so that bridges are
registered before any ``AutoBridge.from_hf_pretrained`` call.

All registrations are guarded by a top-level ``try/except ImportError`` so that
the rest of the codebase still works in CPU-only (no megatron-bridge) environments.
"""

try:
    from megatron.bridge.models.conversion.model_bridge import MegatronModelBridge
    from megatron.bridge.models.deepseek.deepseek_v3_bridge import DeepSeekV3Bridge
    from megatron.bridge.models.hf_pretrained.causal_lm import PreTrainedCausalLM
    from megatron.core.models.gpt.gpt_model import GPTModel

    @MegatronModelBridge.register_bridge(
        source="Glm4MoeLiteForCausalLM",
        target=GPTModel,
    )
    class GLM47FlashBridge(DeepSeekV3Bridge):
        """Bridge for GLM-4.7-Flash (Glm4MoeLiteForCausalLM).

        GLM-4.7-Flash is architecturally identical to DeepSeek-V3 (MLA + MoE)
        but its HF config differs in rope_scaling format:
          - DeepSeek: rope_scaling has factor/mscale/mscale_all_dim, top-level rope_theta
          - GLM-4.7-Flash: rope_scaling has rope_theta/rope_type, no mscale fields

        We reuse DeepSeekV3Bridge.provider_bridge() (which sets all critical
        TP/MoE/MLA provider attributes) by temporarily normalizing the HF config
        rope fields so the base CONFIG_MAPPING can handle them.
        """
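
For readers unfamiliar with the "temporarily normalize the config" idea mentioned in the docstring, here is a minimal sketch of that pattern. It is not the SkyRL implementation: `temporarily_patched` and `deepseek_style_rope_overrides` are hypothetical helpers, the config is a plain `SimpleNamespace` stand-in rather than a real HF config, and the `factor`/`mscale`/`mscale_all_dim` values are placeholders chosen only for illustration.

```python
from contextlib import contextmanager
from types import SimpleNamespace

_MISSING = object()  # sentinel for "attribute did not exist before patching"


@contextmanager
def temporarily_patched(config, **overrides):
    """Set attributes on `config`, then restore the originals on exit."""
    saved = {name: getattr(config, name, _MISSING) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(config, name, value)
        yield config
    finally:
        for name, old in saved.items():
            if old is _MISSING:
                delattr(config, name)  # attribute was added by the patch
            else:
                setattr(config, name, old)


def deepseek_style_rope_overrides(config):
    """Build DeepSeek-style rope fields from a GLM-style config (assumed layout)."""
    glm_rope = dict(getattr(config, "rope_scaling", None) or {})
    overrides = {}
    if "rope_theta" in glm_rope:
        # Lift rope_theta from rope_scaling to the top level.
        overrides["rope_theta"] = glm_rope["rope_theta"]
    # Assumption: placeholder values, NOT what the real bridge uses.
    overrides["rope_scaling"] = {"factor": 1.0, "mscale": 1.0, "mscale_all_dim": 1.0}
    return overrides


# Stand-in for a GLM-style HF config; the values are made up.
glm_config = SimpleNamespace(
    rope_scaling={"rope_theta": 1000000.0, "rope_type": "default"}
)

with temporarily_patched(glm_config, **deepseek_style_rope_overrides(glm_config)):
    # Inside the block the config looks DeepSeek-shaped; a base-class
    # method such as provider_bridge() would be called here.
    seen_theta = glm_config.rope_theta
    seen_scaling = glm_config.rope_scaling
```

Restoring in a `finally` block means the original config is left untouched even if the wrapped call raises, which matters when the same config object is reused elsewhere.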

Dependencies are upgraded frequently to support new models, so we should add GLM 4.7 to CI to catch regressions in this bridge. We already have a Megatron Models CI script:

https://github.com/NovaSky-AI/SkyRL/blob/main/ci/gpu_ci_run_skyrl_train_megatron_models.sh

It runs this test file:

"""
Run with:
uv run --isolated --extra dev --extra megatron -- pytest -s tests/backends/skyrl_train/gpu/gpu_ci/megatron/test_megatron_models.py
"""

We should also add a tiny GLM 4.7 model to the same CI workflow.
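
One possible way to obtain a "tiny" model is to shrink the size-related fields of the full model's config and initialize random weights from it. The sketch below shows only the config-shrinking step on a plain dict. It is an assumption, not how the existing test is set up: `make_tiny_config` and `TINY_OVERRIDES` are hypothetical, the key names follow common HF MoE config conventions and may not match the GLM 4.7 config exactly, and the "full" values are made up.

```python
import copy

# Hypothetical size-related keys and small replacement values.
TINY_OVERRIDES = {
    "num_hidden_layers": 2,
    "hidden_size": 64,
    "intermediate_size": 128,
    "moe_intermediate_size": 32,
    "n_routed_experts": 4,
    "num_experts_per_tok": 2,
    "num_attention_heads": 4,
}


def make_tiny_config(full_config, overrides=TINY_OVERRIDES):
    """Return a shrunken copy of `full_config`.

    Only keys already present in the config are overridden, so no
    field is invented for an architecture that does not define it.
    """
    tiny = copy.deepcopy(full_config)
    for key, value in overrides.items():
        if key in tiny:
            tiny[key] = value
    return tiny


# Made-up stand-in for a full-size config; not real GLM 4.7 values.
full_config = {
    "architectures": ["Glm4MoeLiteForCausalLM"],
    "num_hidden_layers": 48,
    "hidden_size": 2048,
    "n_routed_experts": 64,
    "rope_scaling": {"rope_theta": 1000000.0, "rope_type": "default"},
}

tiny_config = make_tiny_config(full_config)
```

Keeping `architectures` and `rope_scaling` unchanged is deliberate here: those are the fields the custom bridge dispatches on and normalizes, so a tiny model that preserves them would still exercise the GLM-specific code path.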
