1 parent b56b733 commit 9f13e27
vllm/model_executor/layers/fused_moe/modular_kernel.py
@@ -510,7 +510,7 @@ def workspace_shapes(
     Inputs:
     - M_chunk: current number of tokens due to chunking, otherwise same as
-      M_full.
+      M_full, generally used for intermediate workspace shapes.
     - M_full: full number of tokens, generally used to compute output shape.
     - N: Row (or column) dimension of expert weights.
     - K: hidden dimension
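
The distinction the docstring draws between M_chunk and M_full can be illustrated with a minimal sketch. This is not the code in modular_kernel.py: the helper names `chunk_token_counts` and `sketch_workspace_shapes`, the `topk` parameter, and the concrete shapes are all hypothetical, chosen only to show intermediate buffers scaling with the chunk while the output scales with the full token count.

```python
from typing import Iterator


def chunk_token_counts(m_full: int, max_chunk: int) -> Iterator[int]:
    """Yield the token count of each chunk (hypothetical helper)."""
    for start in range(0, m_full, max_chunk):
        # The last chunk may be shorter than max_chunk.
        yield min(max_chunk, m_full - start)


def sketch_workspace_shapes(
    m_chunk: int, m_full: int, n: int, k: int, topk: int
) -> tuple[tuple[int, ...], tuple[int, ...]]:
    """Return (intermediate workspace shape, output shape).

    Assumption for illustration only: the intermediate buffer holds one
    row of width n per (token, selected expert) pair in the current chunk.
    """
    workspace = (m_chunk, topk, n)  # sized per chunk
    output = (m_full, k)            # sized for all tokens
    return workspace, output


if __name__ == "__main__":
    m_full, max_chunk = 10, 4
    for m_chunk in chunk_token_counts(m_full, max_chunk):
        print(m_chunk, sketch_workspace_shapes(m_chunk, m_full, n=8, k=16, topk=2))
```

When the input fits in a single chunk, the helper yields one count equal to M_full, which matches the "otherwise same as M_full" wording in the docstring.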