Conversation

@Semmer2 Semmer2 commented Nov 6, 2025

Replace Qwen2_5_Omni_Thinker model's original vision tower Qwen2_5_VisionTransformer with AscendQwen2_5_VisionTransformer. It pads the attention QKV weights and bias for better performance.

What this PR does / why we need it?

This PR replaces the vision tower in the Qwen2.5-Omni-Thinker model, Qwen2_5_VisionTransformer, with AscendQwen2_5_VisionTransformer, which pads the attention QKV weights and bias for better performance on Ascend hardware. A sketch of the padding idea follows.
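For context, the trick is to zero-pad each attention head's slice of the fused QKV projection so the head dim matches a size the Ascend attention kernels handle efficiently; zero rows produce zero Q/K/V components, so the attention output is unchanged once the extra dims are sliced away. Below is a minimal PyTorch sketch of the idea; the helper names and the dims (80 padded to 128) are illustrative assumptions, not the PR's actual code.

```python
# Minimal sketch of QKV head-dim padding (illustrative; not the PR's code).
import torch

HEAD_DIM = 80           # assumed original head dim of the vision tower
PADDED_HEAD_DIM = 128   # assumed hardware-friendly head dim on Ascend

def pad_qkv_weight(weight: torch.Tensor, num_heads: int) -> torch.Tensor:
    """Zero-pad a fused QKV weight of shape (3 * num_heads * HEAD_DIM, hidden)
    to (3 * num_heads * PADDED_HEAD_DIM, hidden), padding each head separately."""
    hidden = weight.shape[-1]
    # Split into (3, num_heads, HEAD_DIM, hidden) so padding is applied per head.
    w = weight.view(3, num_heads, HEAD_DIM, hidden)
    pad = torch.zeros(3, num_heads, PADDED_HEAD_DIM - HEAD_DIM, hidden,
                      dtype=weight.dtype, device=weight.device)
    # Zero rows leave attention scores and outputs identical to the
    # unpadded computation.
    return torch.cat([w, pad], dim=2).view(-1, hidden)

def pad_qkv_bias(bias: torch.Tensor, num_heads: int) -> torch.Tensor:
    """Same padding for the fused QKV bias: one zero per padded row."""
    return pad_qkv_weight(bias.unsqueeze(-1), num_heads).squeeze(-1)
```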

Does this PR introduce any user-facing change?

No

How was this patch tested?

github-actions bot commented Nov 6, 2025

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by fulfilling the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

@gemini-code-assist gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces performance optimizations for the Qwen2.5-Omni-Thinker model on Ascend hardware by replacing the standard vision tower with an optimized version that uses QKV padding. The changes involve registering the new Ascend-specific model, implementing the model class which uses AscendQwen2_5_VisionTransformer, and updating the weight loading logic to support the new attn.qkv naming and apply padding.

My review has identified a critical issue in the new model's initialization that would cause a runtime error, and a high-severity issue regarding the robustness of the weight padding logic, which currently only handles one type of QKV layer naming. Please see the detailed comments for suggestions.
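To make the review's naming point concrete, here is a hedged sketch of the kind of weight-loading dispatch it describes: route fused attn.qkv tensors through the padding helpers before copying them into parameters that are already sized for the padded head dim. The function name and dispatch are assumptions for illustration, not vllm-ascend's actual loader.

```python
# Hedged sketch of a loader that pads fused "attn.qkv" tensors on load
# (illustrative only; reuses pad_qkv_weight/pad_qkv_bias from the sketch above).
def load_padded_vision_weights(model, weights, num_heads):
    params = dict(model.named_parameters())
    for name, tensor in weights:
        if name.endswith("attn.qkv.weight"):
            tensor = pad_qkv_weight(tensor, num_heads)
        elif name.endswith("attn.qkv.bias"):
            tensor = pad_qkv_bias(tensor, num_heads)
        # Other checkpoints may use a different QKV naming scheme (the
        # review's robustness concern); a real loader would remap those too.
        params[name].data.copy_(tensor)
```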

@Semmer2 Semmer2 force-pushed the 0.11.0-dev branch 3 times, most recently from 0cb33c6 to e8c0771, on November 7, 2025 at 09:56
Replace Qwen2_5_Omni_Thinker model's original vision tower
Qwen2_5_VisionTransformer with AscendQwen2_5_VisionTransformer.
It pads the attention qkv weights and bias for better performance.

Signed-off-by: Ting FU <[email protected]>
@MengqingCao MengqingCao (Collaborator) left a comment

I'm okay with merging this into 0.11.0-dev, lgtm

@MengqingCao MengqingCao merged commit f984256 into vllm-project:v0.11.0-dev Nov 8, 2025
16 checks passed