-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
skip fusedmoe layer for start_load_kv #21378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request correctly addresses a crash that occurs when using FusedMoE layers with KV cache transfer by skipping layers that lack a kv_cache attribute. The fix is logical and directly solves the reported issue.
I've added one critical comment to enhance the robustness of the fix. The current implementation using hasattr could still lead to a TypeError if the kv_cache attribute exists but is None. My suggestion is to use getattr for a more defensive check that handles both missing attributes and None values, preventing potential runtime crashes.
vllm/distributed/kv_transfer/kv_connector/v1/p2p/p2p_nccl_connector.py
Outdated
Show resolved
Hide resolved
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
8bf97e4 to
a8fca79
Compare
|
@DarkLight1337 @KuntaiDu PTAL, thanks |
Signed-off-by: calvin chen <[email protected]>
a8fca79 to
afb09b0
Compare
|
Thanks, test this pr by model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR LGTM.
The code is overlapping with #21428 but it's for a different connector. Maybe we can deduplicate the code in the future.
Signed-off-by: calvin chen <[email protected]>
Signed-off-by: calvin chen <[email protected]> Signed-off-by: x22x22 <[email protected]>
Signed-off-by: calvin chen <[email protected]>
Signed-off-by: calvin chen <[email protected]>
Signed-off-by: calvin chen <[email protected]> Signed-off-by: Jinzhen Lin <[email protected]>
Signed-off-by: calvin chen <[email protected]> Signed-off-by: Noam Gat <[email protected]>
Signed-off-by: calvin chen <[email protected]> Signed-off-by: Paul Pak <[email protected]>
Signed-off-by: calvin chen <[email protected]> Signed-off-by: Diego-Castan <[email protected]>
Signed-off-by: calvin chen <[email protected]>
Signed-off-by: calvin chen <[email protected]>
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.Purpose
fix: #21359
Test Plan
Test Result
(Optional) Documentation Update