COPY OF 2345 #2354

Draft
xiaoyu-work wants to merge 13 commits into main from xiaoyu/qwen3-vl

@xiaoyu-work
Collaborator

Describe your changes

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link

hanbitmyths and others added 13 commits February 26, 2026 11:19
- graph_surgeries.py: add QwenVL-specific graph surgery passes for
  vision embedding merge and positional encoding fixup
- rtn_quantization.py: extend RTN quantization for multimodal models,
  handle vision encoder exclusion patterns
- cast_chain_elimination.py: new pass to eliminate redundant Cast chains
  in Dynamo-exported models (fp32->fp16->fp32 patterns)
- olive_config.json: register new passes
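
For readers unfamiliar with the Cast-chain pattern, here is a minimal sketch of the idea on a toy graph. It is not the code in cast_chain_elimination.py: the `Node` class, the `eliminate_cast_round_trips` function, and the `input_dtypes` mapping are all hypothetical names invented for illustration, and the sketch ignores graph outputs and shape inference. Note that dropping an fp32->fp16->fp32 round trip also drops the rounding that the fp16 step performed, so results can differ slightly from the original graph.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for a graph node (hypothetical, not the ONNX NodeProto)."""
    op_type: str
    inputs: list
    outputs: list
    to: str = ""  # target dtype, only meaningful for Cast nodes


def eliminate_cast_round_trips(nodes, input_dtypes):
    """Remove Cast(A->B) + Cast(B->A) pairs when the middle tensor has one consumer.

    input_dtypes maps tensor name -> dtype string for tensors whose dtype is known.
    """
    # Map each tensor name to the nodes that read it.
    consumers = {}
    for node in nodes:
        for name in node.inputs:
            consumers.setdefault(name, []).append(node)

    removed = set()  # ids of nodes to drop
    rename = {}      # tensor name -> tensor that replaces it
    for first in nodes:
        if first.op_type != "Cast" or id(first) in removed:
            continue
        src = first.inputs[0]
        users = consumers.get(first.outputs[0], [])
        if len(users) != 1:
            continue  # the intermediate tensor is shared, keep the chain
        second = users[0]
        if id(second) in removed or second.op_type != "Cast":
            continue
        # Only a true round trip (back to the source dtype) is removed.
        if second.to == input_dtypes.get(src):
            removed.update({id(first), id(second)})
            rename[second.outputs[0]] = src

    result = []
    for node in nodes:
        if id(node) in removed:
            continue
        rewired = [rename.get(name, name) for name in node.inputs]
        result.append(Node(node.op_type, rewired, list(node.outputs), node.to))
    return result


graph = [
    Node("Cast", ["x"], ["x16"], "float16"),
    Node("Cast", ["x16"], ["x32"], "float32"),
    Node("Relu", ["x32"], ["y"]),
]
optimized = eliminate_cast_round_trips(graph, {"x": "float32"})
print([(n.op_type, n.inputs) for n in optimized])
```

The single-consumer check is the conservative choice here: if another node reads the fp16 tensor, the first Cast has to stay.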
…surgery passes

- rtn_quantization.py: Parameterize bits through quantization methods to support 8-bit Gather
- common.py: Fix ByteSize() crash for >2GB models, fix FOLDED_FROM_KEY import
- graph_surgeries.py: Add ReciprocalMulToDiv, DeduplicateSubgraphInitializers, DeduplicateNodes
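
Judging only from its name, ReciprocalMulToDiv rewrites `x * Reciprocal(y)` as `x / y`. The sketch below shows that rewrite on a toy list of dict nodes. It is an assumption about the intent, not the implementation in graph_surgeries.py: the function name `reciprocal_mul_to_div` and the dict layout (`op`, `inputs`, `output`) are hypothetical, and the sketch does not account for a Reciprocal result that is also a graph output.

```python
def reciprocal_mul_to_div(nodes):
    """Rewrite Mul(x, Reciprocal(y)) as Div(x, y) on a toy node list.

    Each node is a dict with keys 'op', 'inputs', 'output' (hypothetical layout).
    """
    producers = {n["output"]: n for n in nodes}
    use_count = {}
    for n in nodes:
        for name in n["inputs"]:
            use_count[name] = use_count.get(name, 0) + 1

    dead = set()  # outputs of Reciprocal nodes that became unused
    rewritten = []
    for n in nodes:
        if n["op"] == "Mul" and len(n["inputs"]) == 2:
            for i, name in enumerate(n["inputs"]):
                p = producers.get(name)
                # Fuse only when this Mul is the sole reader of the reciprocal.
                if p and p["op"] == "Reciprocal" and use_count[name] == 1:
                    other = n["inputs"][1 - i]
                    n = {"op": "Div", "inputs": [other, p["inputs"][0]],
                         "output": n["output"]}
                    dead.add(name)
                    break
        rewritten.append(n)
    return [n for n in rewritten if n["output"] not in dead]


graph = [
    {"op": "Reciprocal", "inputs": ["y"], "output": "r"},
    {"op": "Mul", "inputs": ["x", "r"], "output": "z"},
]
print(reciprocal_mul_to_div(graph))
```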
- Apply ruff format to 4 files (cast_chain_elimination.py,
  rtn_quantization.py, test_graph_surgeries.py, test_rtn_quantization.py)
- Fix _pack_int8_to_int4 reshape bug: replace global flatten+pack with
  axis-aware _pack_int4_along_axis that correctly packs zero_point when
  k_blocks is small (e.g. 1), avoiding ValueError on reshape
- Fix test_rtn_quantization_pass_gather assertion: GatherBlockQuantized
  always uses quantize_axis=data_rank-1, not pass_config['axis']
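
To illustrate the axis-aware packing described in the _pack_int8_to_int4 fix, here is a rough NumPy sketch. It is not the `_pack_int4_along_axis` helper from this PR: the function below, its zero padding of an odd-length axis, and the low-nibble-first order are assumptions made for the example. The point is that packing pairs along one axis keeps every other dimension intact, so a zero_point tensor with a single block along the packed axis still produces a valid shape instead of relying on a reshape of a globally flattened buffer.

```python
import numpy as np


def pack_int4_along_axis(values, axis=-1):
    """Pack pairs of 4-bit values (0..15) along `axis` into uint8 bytes.

    Assumptions for this sketch: low nibble holds the first element of each
    pair, and an odd-length axis is padded with a trailing zero.
    """
    values = np.asarray(values, dtype=np.uint8)
    values = np.moveaxis(values, axis, -1)  # work on the last axis
    if values.shape[-1] % 2:
        pad = [(0, 0)] * (values.ndim - 1) + [(0, 1)]
        values = np.pad(values, pad)  # constant zero padding
    low = values[..., 0::2] & 0x0F
    high = values[..., 1::2] & 0x0F
    packed = low | (high << 4)
    return np.moveaxis(packed, -1, axis)  # restore the original axis order


# zero_point with k_blocks == 1 along the packed axis: shape (4, 1) stays (4, 1).
zero_point = np.array([[1], [2], [3], [4]])
print(pack_int4_along_axis(zero_point, axis=1).shape)

# Four values along the last axis become two bytes.
print(pack_int4_along_axis(np.array([[1, 2, 3, 4]]), axis=-1))
```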
The upstream tuning_strategies.md page no longer exists, causing the
Sphinx linkcheck to fail with -W (warnings-as-errors).

2 participants