[Model] Fix Qwen3VL and Qwen3Omni after torch.compile changes #27705
Conversation
Signed-off-by: Lukas Geiger <[email protected]>
-    max_seqlen, seqlens = (
-        torch.zeros(1, device=cu_seqlens.device),
-        torch.zeros(1, device=cu_seqlens.device),
-    )
+    max_seqlen = torch.zeros([], device=cu_seqlens.device)
+    seqlens = torch.zeros(1, device=cu_seqlens.device)
max_seqlen is actually a scalar value, so this updates the dummy values to match.
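For illustration, here is a standalone sketch (not the vLLM code) of why the 0-d shape is the right dummy: assuming the real max_seqlen comes out of a .max() reduction over per-window lengths, that reduction yields a 0-d tensor, which torch.zeros([]) matches and torch.zeros(1) does not.

```python
import torch

cu_seqlens = torch.tensor([0, 4, 9])

# Old dummy value: a 1-element tensor of shape (1,)
old_dummy = torch.zeros(1, device=cu_seqlens.device)
print(old_dummy.shape)  # torch.Size([1])

# New dummy value: a 0-d (scalar) tensor
new_dummy = torch.zeros([], device=cu_seqlens.device)
print(new_dummy.shape)  # torch.Size([])

# A .max() reduction over the real per-window lengths is also 0-d,
# so the new dummy matches the shape of the real value
real_max = (cu_seqlens[1:] - cu_seqlens[:-1]).max()
print(real_max.shape)  # torch.Size([])
```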
Code Review
This pull request addresses a runtime error with torch.compile in Qwen3VL and Qwen3Omni models. The error was caused by passing a Python int instead of a torch.Tensor for the max_seqlen argument in the vision attention mechanism.
The changes correctly modify the compute_attn_mask_seqlen methods in qwen2_5_vl.py, qwen3_omni_moe_thinker.py, and qwen3_vl.py to return torch.Tensor objects for max_seqlen and seqlens, instead of Python scalars or lists. This is achieved by removing .item() and .tolist() calls. The type hints for the forward methods are also updated accordingly.
These changes are consistent, targeted, and directly resolve the reported torch.compile issue. The code quality is good, and I have no further suggestions.
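A rough before/after sketch of the change described above (simplified standalone functions, not the actual vLLM methods, which also depend on the attention backend in use):

```python
import torch

def compute_attn_mask_seqlen_old(cu_seqlens: torch.Tensor):
    # Old behaviour (sketch): convert to a Python int and a list,
    # which the compiled attention path no longer accepts.
    seqlens = cu_seqlens[1:] - cu_seqlens[:-1]
    return seqlens.max().item(), seqlens.tolist()

def compute_attn_mask_seqlen_new(cu_seqlens: torch.Tensor):
    # New behaviour (sketch): keep tensors end to end, with a 0-d
    # tensor for max_seqlen and a 1-d tensor for seqlens.
    seqlens = cu_seqlens[1:] - cu_seqlens[:-1]
    return seqlens.max(), seqlens

cu_seqlens = torch.tensor([0, 3, 8, 12])
print(compute_attn_mask_seqlen_old(cu_seqlens))  # (5, [3, 5, 4])
print(compute_attn_mask_seqlen_new(cu_seqlens))  # (tensor(5), tensor([3, 5, 4]))
```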
Lucaskabela
left a comment
Thanks for the fix @lgeiger! I was planning on making my way through to Qwen3, but didn't realize this PR had (indirectly) impacted those files.
Given that this PR impacted those files but no test failed, do we have CI coverage for Qwen3 models? (cc @ywang96 @ProExpertProg @zou3519)
If not, it might be worth adding them to the basic initialization test suite.
No worries! The addition of torch.compile to the ViT part is amazing. Looking forward to your Qwen3 PRs 🎉
Also, I would definitely welcome adding torch.compile to the vision component of Qwen3, so let me know if you need review/support on that or otherwise want to own it :) It should be relatively straightforward. The only "gotcha" is that right now the torch.compile integration compiles each instantiation of the …
ywang96
left a comment
Thanks for fixing this! Stamping it now and I will test Qwen3-Omni on my end.
Given that this PR impacted those files but no test failed, do we have CI coverage for Qwen3 models?
Yeah, we didn't add them to CI back then because the model support PR was merged before the model release, but we should definitely add one now that smaller versions of the models are available. I will take that as an action item.
Signed-off-by: Roger Wang <[email protected]>
@lgeiger @Lucaskabela I added a fix for the missing attribute of …
Signed-off-by: Roger Wang <[email protected]>
…roject#27705)
Signed-off-by: Lukas Geiger <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Co-authored-by: Roger Wang <[email protected]>
Purpose
#23207 broke Qwen3VL since it relies on Qwen2_5_VisionAttention as well but still used the old signature. This PR updates Qwen3VL and Qwen3Omni to fix this.
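For context, a minimal standalone sketch (hypothetical function, not the real Qwen2_5_VisionAttention signature) of the kind of breakage: once the compiled attention expects max_seqlen as a tensor, a caller still passing a Python int per the old signature fails at runtime.

```python
import torch

@torch.compile
def vision_attn(x: torch.Tensor, max_seqlen: torch.Tensor) -> torch.Tensor:
    # Stand-in for the compiled attention body: it now treats
    # max_seqlen as a tensor, e.g. casting it alongside x.
    return x * max_seqlen.to(x.dtype)

x = torch.randn(4)
vision_attn(x, torch.tensor(7.0))  # ok: updated callers pass a 0-d tensor
vision_attn(x, 7)                  # breaks: a Python int has no .to(), like the stale call sites
```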
Test Plan
Note: I didn't test Qwen3OmniMoE since I didn't have a large enough GPU on hand today, but the changes are the same as for Qwen3VL, so I believe it should be fine.
/cc @Lucaskabela @ywang96
I'm happy to make a follow-up PR to add torch.compile support to the other modules in Qwen3VL tomorrow once I get a chance to run some benchmarks.