Fix smart_resize to use actual image dimensions instead of fixed config values #1032

Merged
apsonawane merged 2 commits into main from sunghcho/qwen3-vl on Mar 5, 2026
@hanbitmyths
Collaborator

Problem:
In Resize::Compute(), smart_resize was called with the fixed config dimensions (height_, width_) instead of the actual input image dimensions (h, w). This caused all images to be resized to the same target size regardless of their original aspect ratio, producing incorrect preprocessing results that diverged from HuggingFace's smart_resize behavior.

Fix:
Pass the original image dimensions to smart_resize() so the target size is computed based on the actual aspect ratio, snapping to the nearest patch grid boundary while respecting min_pixels / max_pixels constraints. This matches the HuggingFace Qwen2VLImageProcessor behavior exactly.
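To illustrate the difference, here is a minimal Python sketch modeled on the HuggingFace-style smart_resize logic described above. It is not the C++ code from this PR. The defaults (factor=28, min_pixels, max_pixels) and the 448x448 config size are assumptions chosen for illustration, and the actual Resize::Compute() implementation may differ in details such as rounding or aspect-ratio checks.

```python
import math


def smart_resize(height, width, factor=28,
                 min_pixels=56 * 56, max_pixels=14 * 14 * 4 * 1280):
    """Snap (height, width) to multiples of `factor` while keeping the
    total pixel count within [min_pixels, max_pixels].

    Default values are illustrative assumptions, not taken from this PR.
    """
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")

    # Round each side to the nearest multiple of the patch grid factor.
    h_bar = max(factor, round(height / factor) * factor)
    w_bar = max(factor, round(width / factor) * factor)

    if h_bar * w_bar > max_pixels:
        # Too large: scale both sides down by the same ratio, rounding down.
        beta = math.sqrt((height * width) / max_pixels)
        h_bar = max(factor, math.floor(height / beta / factor) * factor)
        w_bar = max(factor, math.floor(width / beta / factor) * factor)
    elif h_bar * w_bar < min_pixels:
        # Too small: scale both sides up by the same ratio, rounding up.
        beta = math.sqrt(min_pixels / (height * width))
        h_bar = math.ceil(height * beta / factor) * factor
        w_bar = math.ceil(width * beta / factor) * factor

    return h_bar, w_bar


# Hypothetical fixed config dimensions, standing in for (height_, width_).
CONFIG_H, CONFIG_W = 448, 448

for h, w in [(480, 640), (1080, 1920)]:
    before_fix = smart_resize(CONFIG_H, CONFIG_W)  # ignores the input image
    after_fix = smart_resize(h, w)                 # uses the actual (h, w)
    print(f"input {h}x{w}: before fix {before_fix}, after fix {after_fix}")
```

With the fixed config dimensions every image gets the same target size, whereas passing the actual dimensions yields a target that follows the input's aspect ratio. For example, a 480x640 input maps to (476, 644) under these assumed defaults.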

…nfig dimensions

Pass actual image (h, w) to smart_resize() instead of fixed config (height_, width_).
This preserves the aspect ratio and matches HuggingFace smart_resize behavior,
significantly improving accuracy on vision-language benchmarks.
@apsonawane apsonawane merged commit 8a41849 into main Mar 5, 2026
37 checks passed
@apsonawane apsonawane deleted the sunghcho/qwen3-vl branch March 5, 2026 00:21