Skip to content

Conversation

@angelayi
Copy link
Contributor

@angelayi angelayi commented Aug 29, 2025

Purpose

Fixes #21101

Previously the fix_functionalization pass for rotary_embedding was written assuming the graph would have a specific pattern, where the rotary_embedding surrounded with slice scatters. This fails in the Qwen3-4B model where the graph no longer has this pattern. This PR adds a check for the slice_scatter pattern above, otherwise it will directly replace the auto_functionalized op with the op.

Test Plan

The issue runs successfully

Test Result

@angelayi
Copy link
Contributor Author

@ProExpertProg it seems like the existing test_functionalization does not run on master and is not tested on ci. Do you have any suggestions on how to verify this change? I've manually verified the graph for Qwen3-4B has been modified after the pass, and the test case's TinyLlama-1.1B-Chat-v1.0-FP8-e2e graph has also been modified the the same way as before.

@angelayi angelayi marked this pull request as ready for review August 29, 2025 21:20
@ProExpertProg
Copy link
Collaborator

Could you add the test to CI? I'll take a closer look later

@angelayi
Copy link
Contributor Author

The test is failing because "'LLMEngine' object has no attribute 'model_executor'" -- sorry, do you know what's the proper way to get the model from this config?

@angelayi
Copy link
Contributor Author

@ProExpertProg I updated this PR, also fixed the test_functionalization test case in (#24376)

Copy link
Collaborator

@ProExpertProg ProExpertProg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this! Just a few nits for the unit test

@ProExpertProg ProExpertProg added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 1, 2025
@angelayi angelayi force-pushed the fix_func branch 2 times, most recently from 734275f to d2c827c Compare October 2, 2025 17:21
@simon-mo simon-mo merged commit 7cfa4b2 into vllm-project:main Oct 3, 2025
45 of 47 checks passed
tomeras91 pushed a commit to tomeras91/vllm that referenced this pull request Oct 6, 2025
karan pushed a commit to karan/vllm that referenced this pull request Oct 6, 2025
southfreebird pushed a commit to southfreebird/vllm that referenced this pull request Oct 7, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci/build ready ONLY add when PR is ready to merge/full CI is needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Enable cutom_op of rotary_embedding goes error for Qwen3-4B

3 participants