[Bugfix] SharedStorage Connector for V1 PD multimodal #21611
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Code Review
This is an excellent pull request that addresses an important bug in multimodal KV caching. The changes to propagate the mm_hashes are well implemented.
My review focuses on a critical security vulnerability and a related correctness issue in the new cache key generation logic. I've provided a detailed explanation and a code suggestion to resolve these issues, ensuring the system is both secure and robust.
Once the feedback is addressed, this PR will be a great addition to the codebase.
vllm/distributed/kv_transfer/kv_connector/v1/shared_storage_connector.py
Force-pushed from 58a52ff to bc68bef
@DarkLight1337 @ywang96 could you please review this PR?
Thanks for fixing, can you add a unit test to verify?
We can use the tmp_path pytest fixture which automatically handles create/delete directory
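As a rough illustration of the suggestion, here is a minimal sketch of how pytest's built-in tmp_path fixture handles the directory lifecycle; the subfolder name and the test body are invented for illustration, not taken from the PR:

```python
# tmp_path is a built-in pytest fixture: each test receives a fresh
# temporary directory as a pathlib.Path, created before the test and
# cleaned up automatically afterwards -- no manual mkdir/rmtree needed.
def test_storage_dir(tmp_path):
    storage = tmp_path / "storage_1"   # illustrative subfolder name
    storage.mkdir()
    (storage / "dummy_hash").mkdir()   # simulate one KV cache entry
    # The test can then count entries without worrying about cleanup.
    assert len(list(storage.iterdir())) == 1
```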
Thanks for your suggestions. Fixed, please review.
Thanks, LGTM given tests pass
The pre-commit check failed, but it was triggered by the file fused_moe_modular_kernel.md, which is not involved in this PR. Is this normal?
OK, let's ignore it then.
Other than the GPU memory error in v1-test, the failed CI jobs are irrelevant to the files involved in this PR. For instance, I didn't see any other test script with Qwen2.5-VL-3B-Instruct adjust the
Signed-off-by: fake0fan <[email protected]> Signed-off-by: herotai214 <[email protected]>
Head branch was pushed to by a user without write access
Essential Elements of an Effective PR Description Checklist
Support V1 PD multimodal input without wrongly reusing KV (issue #21175)
It naively concatenates the mm_hashes to form a unique key; this can be optimized in the future if needed.
Purpose
It brings the mm_hashes from the Request into the connector to form a proper key, which helps distinguish KV for different multimodal inputs. It fixes the problem of wrongly reusing KV described in the issue.
Multiple multimodal inputs are supported as well.
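As a rough illustration of the idea (the helper name and signature below are hypothetical, not the connector's actual API), the fix amounts to folding the request's mm_hashes into the cache key instead of keying on token IDs alone:

```python
import hashlib

def make_cache_key(token_ids: list[int], mm_hashes: list[str]) -> str:
    """Build a cache key that distinguishes identical token sequences
    carrying different multimodal inputs (hypothetical helper)."""
    h = hashlib.sha256()
    h.update(repr(token_ids).encode())
    # Naively concatenate the multimodal hashes, as the PR describes;
    # without them, two prompts with different images but identical
    # placeholder tokens would collide on the same key and wrongly
    # reuse each other's KV cache.
    h.update("".join(mm_hashes).encode())
    return h.hexdigest()

# Same tokens, different images -> different keys, so KV is not reused.
key_a = make_cache_key([1, 2, 3], ["img_hash_a"])
key_b = make_cache_key([1, 2, 3], ["img_hash_b"])
assert key_a != key_b
```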
Test Plan
Running pytest test_shared_storage_connector.py executes the test via test_shared_storage_connector_hashes(); use pytest -s to see the full print output for better understanding and investigation.
The generated KV folders and files are stored in Path("storage_1/") during the test, located relative to where you execute the test file.
There are 11 designed input cases to test the feature. The expected length (number of folders in the storage path), i.e. the number of unique hashes formed, along with information about each input case, is explicitly printed by the script. There are 6 unique combinations among the 11 input cases, so there should be 6 unique hashes in the folder at the end. The code also checks that the number of hashes is correct after each llm generate call.
Note that the order is important for this test, so I did not make the cases separate pytest tests, but used a for loop to execute them one by one.
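The loop-based structure described above might look roughly like this; the case data, helper names, and generate callback are invented for illustration, not copied from the actual test script:

```python
# Order matters: later cases may hit KV written by earlier ones, so the
# cases run sequentially inside one test rather than as separate tests.
def run_cases(storage_path, cases, generate):
    """For each (prompt, expected_unique) pair, generate and then check
    that the storage path holds the expected number of unique KV folders."""
    for prompt, expected_unique in cases:
        generate(prompt)  # writes KV folders under storage_path
        n = len(list(storage_path.iterdir()))
        assert n == expected_unique, (
            f"after {prompt!r}: {n} folders, expected {expected_unique}")
```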
Test Result
All tests passed successfully!