[Kernels] Enable DeepGEMM by default #24462

bnellnm · 2025-09-08T19:58:40Z

Purpose

Enable DeepGEMM kernels by default.

Test Plan

Automated tests.

Test Result

cc @tlrmchlsmth, @yewentao256, @mgoin, @simon-mo, @WoosukKwon, @youkaichao

Signed-off-by: Bill Nell <[email protected]>

gemini-code-assist

Code Review

This pull request aims to enable DeepGEMM by default. However, the current change only updates a type hint and does not alter the runtime default value. A critical change is needed in vllm/envs.py to actually enable DeepGEMM by default as intended.
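
To illustrate the distinction the review draws, here is a minimal sketch, not the actual contents of vllm/envs.py. It assumes the common pattern where a flag is declared once with a type hint and read separately through a getter with its own default. The flag name `EXAMPLE_USE_DEEP_GEMM` and the helper `read_flag` are hypothetical.

```python
import os

# Hypothetical flag name, used only for illustration.
FLAG = "EXAMPLE_USE_DEEP_GEMM"

# A type-hinted declaration documents the intended default for readers and
# type checkers, but nothing below consults it at runtime.
DECLARED_DEFAULT: bool = True


def read_flag(default: str) -> bool:
    """Parse the flag from the environment, using `default` when it is unset."""
    return bool(int(os.environ.get(FLAG, default)))


os.environ.pop(FLAG, None)        # start with the flag unset
off_by_default = read_flag("0")   # getter default "0": feature stays off
on_by_default = read_flag("1")    # getter default "1": feature is on

os.environ[FLAG] = "0"            # an explicit user setting still wins
user_override = read_flag("1")
os.environ.pop(FLAG, None)
```

In this sketch, changing `DECLARED_DEFAULT` alone has no effect on behavior. Only the default passed to the getter decides what happens when the variable is unset.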

vllm/envs.py

tlrmchlsmth

+1 on enabling deepgemm by default since we have fallbacks in place.

Do we need to check if DeepGEMM is installed? And should we add DeepGEMM as a requirement?

vllm/envs.py

Signed-off-by: Bill Nell <[email protected]>

bnellnm · 2025-09-08T20:28:59Z

> +1 on enabling deepgemm by default since we have fallbacks in place.
>
> Do we need to check if DeepGEMM is installed? And should we add DeepGEMM as a requirement?

DeepGEMM is in the Dockerfile and, afaik, everything checks is_deep_gemm_supported or has_deep_gemm. If the package is not present, vLLM should fall back to Triton/CUTLASS.
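
A rough sketch of the kind of guard described above, assuming the check amounts to "is the optional package importable?". The helpers `has_package` and `pick_backend` are hypothetical stand-ins, not vLLM's has_deep_gemm or is_deep_gemm_supported, and the import name `deep_gemm` is an assumption.

```python
import importlib.util


def has_package(name: str) -> bool:
    """Return True if a top-level package can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


def pick_backend(enabled: bool, package: str = "deep_gemm") -> str:
    """Choose the optional kernel only when it is enabled and installed.

    `package` is the assumed import name of the optional dependency.
    """
    if enabled and has_package(package):
        return "deep_gemm"
    # Otherwise use whatever default kernels are always available.
    return "fallback"
```

With a guard like this, turning the flag on by default is harmless on systems where the package is missing, because the selection quietly falls through to the default path.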

simon-mo · 2025-09-09T00:34:44Z

@youkaichao can DeepGEMM make a PyPI release?

yewentao256

+1, I think it is a good idea to enable DeepGEMM by default

[Kernels] Enable DeepGEMM by default

d6e7403

Signed-off-by: Bill Nell <[email protected]>

gemini-code-assist bot reviewed Sep 8, 2025

tlrmchlsmth reviewed Sep 8, 2025

obey ai masters

db1bc18

Signed-off-by: Bill Nell <[email protected]>

yewentao256 reviewed Sep 9, 2025

simon-mo approved these changes Sep 17, 2025

tlrmchlsmth approved these changes Sep 17, 2025

tlrmchlsmth added the ready label (ONLY add when PR is ready to merge/full CI is needed) Sep 17, 2025

tlrmchlsmth enabled auto-merge (squash) September 17, 2025 20:00

mgoin approved these changes Sep 17, 2025

mgoin added the deepseek label (Related to DeepSeek models) Sep 17, 2025

vllm-bot merged commit 4ac510f into vllm-project:main Sep 18, 2025
46 of 48 checks passed

mgoin deleted the deep-gemm-on branch September 18, 2025 03:19

debroy-rh pushed a commit to debroy-rh/vllm that referenced this pull request Sep 19, 2025

[Kernels] Enable DeepGEMM by default (vllm-project#24462)

bffe9dd

Signed-off-by: Bill Nell <[email protected]>

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

[Kernels] Enable DeepGEMM by default (vllm-project#24462)

fa4950f

Signed-off-by: Bill Nell <[email protected]>

charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025

[Kernels] Enable DeepGEMM by default (vllm-project#24462)

6beb93b

Signed-off-by: Bill Nell <[email protected]> Signed-off-by: charlifu <[email protected]>
