
Conversation

@lgeiger (Contributor) commented Oct 29, 2025

Purpose

#23207 broke Qwen3VL, since it also relies on Qwen2_5_VisionAttention but still used the old signature:

RuntimeError: vllm::flash_attn_maxseqlen_wrapper() Expected a value of type 'Tensor' for argument 'max_seqlen' but instead found type 'int'

This PR updates Qwen3VL and Qwen3Omni to fix this.
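
For context, a minimal sketch of the pattern involved (illustrative only, not the exact vLLM code): the compiled attention path now expects max_seqlen and seqlens as tensors, so the helper that derives them from cu_seqlens should stop converting them to a Python int or list.

```python
import torch

# Illustrative sketch only (not the actual vLLM implementation): keep the
# values derived from cu_seqlens as tensors instead of calling .item() or
# .tolist(), so the compiled flash-attention wrapper receives a Tensor.
def compute_attn_mask_seqlen(
    cu_seqlens: torch.Tensor,
) -> tuple[torch.Tensor, torch.Tensor]:
    seqlens = cu_seqlens[1:] - cu_seqlens[:-1]  # per-sequence lengths
    max_seqlen = seqlens.max()  # 0-d tensor instead of .item()
    return max_seqlen, seqlens  # previously returned a Python int / list
```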

Test Plan

vllm serve Qwen/Qwen3-VL-2B-Instruct-FP8

Note: I didn't test Qwen3OmniMoE since I didn't have a large enough GPU on hand today, but the code changes are the same as for Qwen3VL, so I believe it should be fine.

/cc @Lucaskabela @ywang96

I'm happy to make a follow-up PR to add torch.compile support to the other modules in Qwen3VL tomorrow once I get a chance to run some benchmarks.

@lgeiger lgeiger requested a review from sighingnow as a code owner October 29, 2025 02:01
Comment on lines -839 to +840
- max_seqlen, seqlens = (
-     torch.zeros(1, device=cu_seqlens.device),
-     torch.zeros(1, device=cu_seqlens.device),
- )
+ max_seqlen = torch.zeros([], device=cu_seqlens.device)
+ seqlens = torch.zeros(1, device=cu_seqlens.device)
@lgeiger (Contributor, Author)

max_seqlen is actually a scalar value, so this updates the dummy values to match.
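
A quick standalone check of the shape difference (plain PyTorch, not vLLM code): torch.zeros([]) creates a 0-d scalar tensor, while torch.zeros(1) creates a 1-element 1-d tensor, so the 0-d dummy matches the scalar shape of the real max_seqlen.

```python
import torch

scalar = torch.zeros([])  # 0-d tensor: shape torch.Size([]), a true scalar
vector = torch.zeros(1)   # 1-d tensor: shape torch.Size([1])

print(scalar.shape, scalar.dim())  # torch.Size([]) 0
print(vector.shape, vector.dim())  # torch.Size([1]) 1
```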

@mergify mergify bot added the qwen (Related to Qwen models) label Oct 29, 2025
@gemini-code-assist gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request addresses a runtime error with torch.compile in Qwen3VL and Qwen3Omni models. The error was caused by passing a Python int instead of a torch.Tensor for the max_seqlen argument in the vision attention mechanism.

The changes correctly modify the compute_attn_mask_seqlen methods in qwen2_5_vl.py, qwen3_omni_moe_thinker.py, and qwen3_vl.py to return torch.Tensor objects for max_seqlen and seqlens, instead of Python scalars or lists. This is achieved by removing .item() and .tolist() calls. The type hints for the forward methods are also updated accordingly.

These changes are consistent, targeted, and directly resolve the reported torch.compile issue. The code quality is good, and I have no further suggestions.

@Lucaskabela Lucaskabela (Contributor) left a comment

Thanks for the fix @lgeiger! I was planning on making my way through to Qwen3, but didn't realize this PR had (indirectly) impacted those files.

Given that no test failed, do we have CI coverage for Qwen3 models? (cc @ywang96 @ProExpertProg @zou3519)

If not, it might be worth adding it to the basic initialization test suite.

@lgeiger (Contributor, Author) commented Oct 29, 2025

> Thanks for the fix @lgeiger! I was planning on making my way through to Qwen3, but didn't realize this PR had (indirectly) impacted those files.

No worries! The addition of torch.compile to the ViT part is amazing. Looking forward to your Qwen3 PRs 🎉

@Lucaskabela (Contributor)

Also, I would definitely welcome adding torch.compile to the vision component of Qwen3, so let me know if you need review/support on that or if you otherwise want to own it :) It should be relatively straightforward.

The only "gotcha" is that right now the torch.compile integration compiles each instantiation of the VisionBlock separately. I am planning on putting out a fix for that shortly, but in the meantime don't be surprised to see 30+ compile logs when applying torch.compile to the VisionBlock (it shouldn't affect correctness or runtime, but it adds slightly to compile time and isn't needed, hence why I plan to fix this soon in a follow-up).
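
To make that gotcha concrete, here is a hypothetical, minimal reproduction of the pattern being described (generic PyTorch, not vLLM's actual integration; the Block module and its sizes are made up for illustration): wrapping each block instance in torch.compile gives every instance its own compiled module, so a stack of N identical blocks can trigger N separate compilations.

```python
import torch
import torch.nn as nn

# Hypothetical sketch (not vLLM code): every Block instance is wrapped in
# torch.compile on its own, so each one may be traced and compiled
# independently the first time it runs.
class Block(nn.Module):
    def __init__(self, dim: int) -> None:
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.proj(x))

blocks = nn.ModuleList(torch.compile(Block(64)) for _ in range(4))

x = torch.randn(2, 64)
for block in blocks:
    x = block(x)  # the first call on each wrapped block can compile separately
```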

@ywang96 ywang96 (Member) left a comment

Thanks for fixing this! Stamping it now and I will test Qwen3-Omni on my end.

> Given that no test failed, do we have CI coverage for Qwen3 models?

Yeah, we didn't add them to CI back then because the model support PR was merged before the model release, but we should definitely add one now that the smaller versions of the models are available. I will take that as an action item.

@ywang96 ywang96 added the ready (ONLY add when PR is ready to merge/full CI is needed) label Oct 29, 2025
@ywang96 (Member) commented Oct 29, 2025

@lgeiger @Lucaskabela I added a fix for the missing self.vllm_config attribute on Qwen3Omni, since this model's encoder execution in fact inherits it from Qwen2.5Omni. That's something to also keep in mind when we add support for Qwen3-VL, and I wonder if we should just make this an interface requirement for multimodal models.

Signed-off-by: Roger Wang <[email protected]>
@ywang96 ywang96 enabled auto-merge (squash) October 29, 2025 05:19
@ywang96 ywang96 merged commit 0d8161b into vllm-project:main Oct 29, 2025
54 checks passed
@lgeiger lgeiger deleted the fix-qwen3 branch October 29, 2025 08:53
@ywang96 ywang96 added this to the v0.11.1 milestone Oct 29, 2025
MatthewBonanni pushed a commit to MatthewBonanni/vllm that referenced this pull request Oct 30, 2025
ilmarkov pushed a commit to neuralmagic/vllm that referenced this pull request Nov 7, 2025
ZhengHongming888 pushed a commit to ZhengHongming888/vllm that referenced this pull request Nov 8, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025

Labels

qwen (Related to Qwen models), ready (ONLY add when PR is ready to merge/full CI is needed)

3 participants