Pull requests: NVIDIA/Model-Optimizer

Pull requests list

Quantize lm_head + embedding for Nemotron-H, add NVFP4 W4A16 recipe
#1327 opened Apr 22, 2026 by ajrasane Contributor
3 of 5 tasks
Update the DMD2 at the first stage
#1326 opened Apr 22, 2026 by jingyu-ml Contributor Draft
Add Nemotron-Nano-9B-v2 → Pruned 7B Minitron pruning and distillation results and steps to reproduce
Labels: cherry-pick-0.44.0 (After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc)
#1325 opened Apr 22, 2026 by kevalmorabia97 Collaborator
[NVBug 6102977] Add _disable_use_cache context manager to fix PTQ AttributeError on custom configs
Labels: bug (Something isn't working), cherry-pick-0.44.0
#1324 opened Apr 22, 2026 by meenchen Contributor
Fix lm_eval version checking
Labels: cherry-pick-0.44.0
#1321 opened Apr 22, 2026 by kevalmorabia97 Collaborator
Fix PTQ for VLMs with image calibration
Labels: cherry-pick-0.44.0
#1318 opened Apr 22, 2026 by LianaMikael Contributor
Update vLLM deployment docs for heterogeneous models
Labels: cherry-pick-0.44.0
#1317 opened Apr 22, 2026 by grzegorz-k-karch Contributor
Jingyux/vsa diffusion
#1315 opened Apr 22, 2026 by jingyu-ml Contributor Draft
Support NVFP4 W4A16 quantization
#1313 opened Apr 22, 2026 by hychiang-git Contributor
Add disable_sensitive_layers field to QuantizeConfig
#1312 opened Apr 22, 2026 by mxinO Contributor Draft
fix: bug hf_ptq.py max_length setting ignored for LLMs
Labels: cherry-pick-0.44.0
#1311 opened Apr 22, 2026 by michaelfeil Contributor
fix: load PTQ checkpoints from before use_sequential→layerwise rename
#1310 opened Apr 21, 2026 by realAsma Contributor
3 tasks done
fix: preserve q/k/v quantizer mapping in AST attention patching
#1307 opened Apr 21, 2026 by Brumbelow
4 tasks done
Reorg the sparse/quant/common kernel dir
#1303 opened Apr 20, 2026 by jingyu-ml Contributor
[1/3][Refactor]: File reorg; deprecate ParallelDraft
#1296 opened Apr 19, 2026 by h-guo18 Contributor
[2/3][Feat]: Offline DFlash training
#1295 opened Apr 19, 2026 by h-guo18 Contributor
1 task done
[OMNIML-3349] Add FP8 MHA quantization support for HuggingFace ViT
#1289 opened Apr 17, 2026 by ajrasane Contributor
5 tasks done
keep deploy cases and Eagle fixes for merge
#1287 opened Apr 17, 2026 by nvSiruiW
Update excluded modules for Qwen3.5 dense PTQ
#1284 opened Apr 17, 2026 by amukkara
Add qwen3 moe experts only test
#1274 opened Apr 16, 2026 by cjluo-nv Collaborator
SpecDec Bench: April Update
#1272 opened Apr 16, 2026 by IzzyPutterman Contributor