Conversation

@bnellnm bnellnm commented Sep 8, 2025

Purpose

Enable DeepGEMM kernels by default.

Test Plan

Automated tests.

Test Result

cc @tlrmchlsmth , @yewentao256 , @mgoin , @simon-mo , @WoosukKwon , @youkaichao

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request aims to enable DeepGEMM by default. However, the current change only updates a type hint and does not alter the runtime default value. A critical change is needed in vllm/envs.py to actually enable DeepGEMM by default as intended.
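The bot's point can be illustrated with a minimal sketch (the variable name and helper below are assumptions for illustration, not vLLM's actual `envs.py` code): enabling a feature "by default" means changing the fallback value used when the environment variable is unset, which editing a type hint alone never does.

```python
# Hypothetical sketch of an env-var-driven default, mirroring the pattern
# the bot describes; the variable name and helper are assumptions, not
# vLLM's actual envs.py code.

def use_deep_gemm(env: dict) -> bool:
    # "Enabled by default" means the fallback returned when the variable
    # is unset must be truthy; a type-hint change leaves this runtime
    # default untouched.
    return env.get("VLLM_USE_DEEP_GEMM", "1") == "1"

# Unset -> enabled; explicit "0" -> disabled.
print(use_deep_gemm({}))                           # True
print(use_deep_gemm({"VLLM_USE_DEEP_GEMM": "0"}))  # False
```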

@tlrmchlsmth tlrmchlsmth left a comment

+1 on enabling deepgemm by default since we have fallbacks in place.

Do we need to check if DeepGEMM is installed? And should we add DeepGEMM as a requirement?

Signed-off-by: Bill Nell <[email protected]>
bnellnm commented Sep 8, 2025

> +1 on enabling deepgemm by default since we have fallbacks in place.
>
> Do we need to check if DeepGEMM is installed? And should we add DeepGEMM as a requirement?

DeepGEMM is in the Dockerfile and, afaik, everything checks `is_deep_gemm_supported` or `has_deep_gemm`. If the package is not present, vLLM should fall back to triton/cutlass.
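The fallback behavior described above can be sketched as a guard pattern (a minimal sketch: `has_deep_gemm` mirrors the helper named in the comment, while the stub matmul and kernel labels are invented for illustration; a real helper would likely also check hardware support):

```python
def has_deep_gemm() -> bool:
    # Probe for the optional package, as the comment describes.
    try:
        import deep_gemm  # noqa: F401
    except ImportError:
        return False
    return True

def _matmul(a, b):
    # Plain-Python matmul standing in for either kernel in this sketch.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def run_gemm(a, b):
    # Guard pattern: take the DeepGEMM path only when the package is
    # importable, otherwise fall back to the triton/cutlass path.
    kernel = "deep_gemm" if has_deep_gemm() else "triton_or_cutlass"
    return kernel, _matmul(a, b)

kernel, out = run_gemm([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(out)  # [[19, 22], [43, 50]] via whichever path was selected
```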

simon-mo commented Sep 9, 2025

@youkaichao can deepgemm make a pypi release?

@yewentao256 yewentao256 left a comment


+1, I think it is a good idea to enable DeepGEMM by default

@tlrmchlsmth tlrmchlsmth added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 17, 2025
@tlrmchlsmth tlrmchlsmth enabled auto-merge (squash) September 17, 2025 20:00
@mgoin mgoin added the deepseek Related to DeepSeek models label Sep 17, 2025
@vllm-bot vllm-bot merged commit 4ac510f into vllm-project:main Sep 18, 2025
46 of 48 checks passed
@mgoin mgoin deleted the deep-gemm-on branch September 18, 2025 03:19
845473182 pushed a commit to dsxsteven/vllm_splitPR that referenced this pull request Sep 18, 2025
…litPR into model_register

* 'model_register' of https://github.com/dsxsteven/vllm_splitPR: (138 commits)
  Retrieve `sliding_window` from text config in Gemma3 MM (vllm-project#25085)
  [Docs] Fix API Reference (vllm-project#25140)
  [Kernel] Better inf handling for grouped topk cu (vllm-project#24886)
  [CLI] Use streaming in CLI chat and completion commands (vllm-project#23769)
  [benchmark] add peak throughput metrics and plot (vllm-project#23867)
  [Spec Decode] Efficient padded speculation (vllm-project#24539)
  [V0 Deprecation] Remove more V0 tests (vllm-project#25117)
  [EPLB] Add EPLB support for hunyuan_v1 (vllm-project#23078)
  [XPU] Whisper model support on XPU Platform (vllm-project#25123)
  Mark prompt logprobs as incompatible with prompt embeds at API level (vllm-project#25077)
  [Model] enable data parallel for InternVL vision encoder (vllm-project#23909)
  [Kernels] Overlap shared experts with combine instead of dispatch (vllm-project#24254)
  [Bugfix][Qwen3-Next] add prefixes to shared_expert in qwen3-next and mlp in qwen2moe to successfully load ignored params in quantized models (vllm-project#24960)
  [Core][MM] Cleanup `MultiModalCache` (vllm-project#25006)
  [Docs] Clean up the contributing README (vllm-project#25099)
  [MM Encoder] Apply DP ViT for Qwen3-VL model series (vllm-project#24955)
  [Kernels] Enable DeepGEMM by default (vllm-project#24462)
  [V0 Deprecation] Skip PP test (vllm-project#25128)
  [V0 Deprecation] Remove misc V0 tests (vllm-project#25118)
  [V0 Deprecation] Remove V0 Tracing & Metrics tests (vllm-project#25115)
  ...
debroy-rh pushed a commit to debroy-rh/vllm that referenced this pull request Sep 19, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025