fix: fix quant config read logic in model loading #1119
Signed-off-by: Phi-C <chenxjhit@163.com>
module_prefix = matching_name.split("shared_expert", 1)[0]
shared_expert_prefix = layer_prefix + matching_name.rstrip(".")
routed_expert_prefix = layer_prefix + f"{module_prefix}experts"
model_quant_config = getattr(getattr(model, "args", None), "quant_config", None)
Maybe we should require all models to have "quant_config".
It does not seem easy to unify the read logic into one path, since 1) for plugins, "model.quant_config" refers to vllm/sglang's quant_config, which is different from "model.atom_config.quant_config"; and 2) dsv4 uses "model.args.quant_config".
How about unifying on "model.atom_config.quant_config"? Feel free to change dsv4's code to achieve this.
> How about unifying on "model.atom_config.quant_config"? Feel free to change dsv4's code to achieve this.

Models in ATOM/atom/models use "self.config" for the atom config, so unifying on "model.atom_config.quant_config" would mean changing all of them to "self.atom_config". Maybe we can support both "model.atom_config.quant_config" and "model.quant_config" in this PR to avoid too many modifications, and rename every model's "config" to "atom_config" in a separate PR to unify the read path if necessary?
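
For illustration only, the multi-path read discussed in this thread could be sketched as below. This is not the code from this PR: the helper name `resolve_quant_config`, the `DEFAULT_QUANT_CONFIG_PATHS` tuple, and the lookup order are all assumptions. The three attribute paths themselves are the ones named in the comments above.

```python
from types import SimpleNamespace
from typing import Any, Optional, Sequence, Tuple

# Attribute paths mentioned in the review thread. The ORDER is an assumption:
# "atom_config" is tried first because, for plugins, model.quant_config is
# vllm/sglang's quant_config rather than ATOM's.
DEFAULT_QUANT_CONFIG_PATHS: Tuple[Tuple[str, ...], ...] = (
    ("atom_config", "quant_config"),
    ("args", "quant_config"),  # path used by dsv4 per the thread
    ("quant_config",),
)


def resolve_quant_config(
    model: Any,
    paths: Sequence[Sequence[str]] = DEFAULT_QUANT_CONFIG_PATHS,
) -> Optional[Any]:
    """Return the first non-None quant config found along `paths` (hypothetical helper)."""
    for path in paths:
        obj = model
        for name in path:
            # A missing attribute anywhere along the path yields None.
            obj = getattr(obj, name, None)
            if obj is None:
                break
        if obj is not None:
            return obj
    return None


# Stand-in objects; real models would carry real config objects.
dsv4_like = SimpleNamespace(args=SimpleNamespace(quant_config="from-args"))
plugin_like = SimpleNamespace(
    atom_config=SimpleNamespace(quant_config="from-atom-config"),
    quant_config="from-framework",
)

print(resolve_quant_config(dsv4_like))
print(resolve_quant_config(plugin_like))
```

Keeping the paths in one tuple means a later PR that renames "config" to "atom_config" everywhere could shrink the list to a single entry without touching callers.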
Signed-off-by: Phi-C <chenxjhit@163.com>
Motivation
Fix the quant config read procedure in #958. Without this change, the ATOM SGLang benchmark fails (e.g. https://github.com/ROCm/ATOM/actions/runs/27054539290/job/79856431418).
Technical Details
Test Plan
Test Result
Submission Checklist