Pull requests: NVIDIA/Model-Optimizer
Quantize lm_head + embedding for Nemotron-H, add NVFP4 W4A16 recipe
#1327
opened Apr 22, 2026 by
ajrasane
Contributor
3 of 5 tasks
Add Nemotron-Nano-9B-v2 → Pruned 7B Minitron pruning and distillation results and steps to reproduce
cherry-pick-0.44.0
After code freeze, cherry-pick to release branch for next rc (bulk update). Only for bug fixes / doc
#1325
opened Apr 22, 2026 by
kevalmorabia97
Collaborator
[NVBug 6102977] Add _disable_use_cache context manager to fix PTQ AttributeError on custom configs
bug
Something isn't working
cherry-pick-0.44.0
#1324
opened Apr 22, 2026 by
meenchen
Contributor
Fix NVFP4 quantization for Qwen3.x MoE models (4 silent-failure bugs)
#1323
opened Apr 22, 2026 by
erictinkeredapps
Fix lm_eval version checking
cherry-pick-0.44.0
#1321
opened Apr 22, 2026 by
kevalmorabia97
Collaborator
Add demo (Puzzletron and Minitron guide) in Model-Optimizer/examples/pruning/ with README and notebooks
documentation
Improvements or additions to documentation
#1320
opened Apr 22, 2026 by
achidiac-nv
Fix PTQ for VLMs with image calibration
cherry-pick-0.44.0
#1318
opened Apr 22, 2026 by
LianaMikael
Contributor
Update vLLM deployment docs for heterogeneous models
cherry-pick-0.44.0
#1317
opened Apr 22, 2026 by
grzegorz-k-karch
Contributor
fix: bug hf_ptq.py max_length setting ignored for LLMs
cherry-pick-0.44.0
#1311
opened Apr 22, 2026 by
michaelfeil
Contributor
fix: load PTQ checkpoints from before use_sequential→layerwise rename
#1310
opened Apr 21, 2026 by
realAsma
Contributor
3 tasks done
fix: preserve q/k/v quantizer mapping in AST attention patching
#1307
opened Apr 21, 2026 by
Brumbelow
4 tasks done
Reorg the sparse/quant/common kernel dir
#1303
opened Apr 20, 2026 by
jingyu-ml
Contributor
[1/3][Refactor]: File reorg; deprecate ParallelDraft
#1296
opened Apr 19, 2026 by
h-guo18
Contributor
[2/3][Feat]: Offline DFlash training
#1295
opened Apr 19, 2026 by
h-guo18
Contributor
1 task done
[OMNIML-3349] Add FP8 MHA quantization support for HuggingFace ViT
#1289
opened Apr 17, 2026 by
ajrasane
Contributor
5 tasks done