-
-
Notifications
You must be signed in to change notification settings - Fork 10.7k
Revert "[Feature] Integrate new deepgemm (#19820)" #20049
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert "[Feature] Integrate new deepgemm (#19820)" #20049
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @yewentao256, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request is titled as a revert of a previous deepgemm
integration, citing an unmerged upstream dependency. However, upon inspection of the changes, it appears to be an update and refinement of the existing deepgemm
integration rather than a full undo. The changes primarily involve adapting to new deepgemm
API conventions and reorganizing related utility functions within the codebase, particularly within benchmark and test files.
Highlights
- Stated Intent vs. Actual Changes: While this pull request is titled as a revert of a previous
deepgemm
integration, the actual changes implemented suggest an update and refinement of the existingdeepgemm
integration, adapting to new API conventions and internal organization, rather than undoing the feature. - DeepGEMM API Migration: Core
deepgemm
function calls (e.g.,fp8_gemm_nt
,fp8_m_grouped_gemm_nt_masked
,m_grouped_fp8_gemm_nt_contiguous
) have been updated to their newgemm_fp8_fp8_bf16_nt
andm_grouped_gemm_fp8_fp8_bf16_nt_contiguous
counterparts across various files, reflecting changes in thedeepgemm
library's API. - FP8 Utility Refactoring: Common FP8 quantization utility functions like
per_block_cast_to_fp8
,per_token_cast_to_fp8
,ceil_div
, andget_col_major_tma_aligned_tensor
have been relocated, re-implemented locally within test/benchmark files, or are now directly imported from thedeep_gemm
module, streamlining their usage and reducing redundancy. - Benchmark Script Improvements: The
benchmark_fp8_block_dense_gemm.py
script has been enhanced with expanded benchmark shapes, improved formatting, and additional comments, making the performance evaluation more comprehensive and readable.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
Oh sorry I didn't know that the PR was dependent on the upgrade. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This PR reverts the integration of the deepgemm feature. The changes appear correct, and I've suggested removing some commented-out code.
@WoosukKwon No worries, actually it is my fault, I should mark it for not merged. Let's see if DeepGemm will merge the pr quickly, if they don't and someone met the interface error in vllm, we can directly merge this pr to fix |
This reverts commit c6e3bba. Signed-off-by: yewentao256 <[email protected]>
Signed-off-by: yewentao256 <[email protected]>
ba4eb71
to
8f58030
Compare
@WoosukKwon They said they will merge the pr after 2~3 weeks, let's merge this pr first, thanks! |
This reverts commit c6e3bba.
Purpose
#19820
This pr should be merged after deepseek-ai/DeepGEMM#112
I have asked the author to merge quickly, but if they don't, we should temporily revert this pr.