fix: MoE autotune fallback failed to query default heuristic #5520

rosenrodt · 2025-06-26T13:45:37Z

fix: MoE autotune fallback failed to query default heuristic

Description

Hot fix for #5207

#5207 intended to expose the algo_id (i.e., the different kernels) of the FC1/FC2 TrtllmGenBatchedGemmRunner to MoE::Runner in order to facilitate autotuning. Without autotuning, MoE::Runner should fall back to the default algo_id supplied by the FC1/FC2 TrtllmGenBatchedGemmRunner.

This PR fixes an issue where the fallback mechanism ignored the default algo_id supplied by FC1/FC2.
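
The intended fallback can be illustrated with a minimal Python sketch. This is not the TensorRT-LLM code: the class names `GemmRunner` and `MoeRunner`, the method `get_default_algo_id`, and the `tuned` argument are all hypothetical stand-ins chosen for illustration, and the real runners are C++ classes with different interfaces. The sketch only shows the behavior described above: use the autotuned algo_id pair when one exists, otherwise ask each of FC1 and FC2 for its own default.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class GemmRunner:
    """Hypothetical stand-in for an FC1 or FC2 batched GEMM runner."""
    valid_algo_ids: List[int]   # kernels this runner could execute
    default_algo_id: int        # the runner's own heuristic choice

    def get_default_algo_id(self) -> int:
        # Assumed accessor: returns the kernel picked by the default heuristic.
        return self.default_algo_id


@dataclass
class MoeRunner:
    """Hypothetical stand-in for the MoE runner that owns FC1 and FC2."""
    fc1: GemmRunner
    fc2: GemmRunner

    def select_algo_ids(
        self, tuned: Optional[Tuple[int, int]] = None
    ) -> Tuple[int, int]:
        # With an autotuned result, use it as-is.
        if tuned is not None:
            return tuned
        # Without autotuning, fall back to the defaults supplied by FC1/FC2
        # instead of ignoring them.
        return (self.fc1.get_default_algo_id(), self.fc2.get_default_algo_id())
```

In this sketch, calling `select_algo_ids()` with no argument defers entirely to the per-GEMM defaults, which is the path this PR repairs.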

Test Coverage

Correctness tests are already in place, but there is no test coverage yet for the performance issue.

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

Signed-off-by: Anthony Chang <[email protected]>

rosenrodt · 2025-06-26T13:45:52Z

/bot run

tensorrt-cicd · 2025-06-26T13:51:21Z

PR_Github #10039 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-26T16:26:17Z

PR_Github #10039 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7409 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

fix: autotune fallback failed to query default heuristic

b7a3952

Signed-off-by: Anthony Chang <[email protected]>

rosenrodt requested a review from DomBrown June 26, 2025 13:45

DomBrown approved these changes Jun 26, 2025

View reviewed changes

DomBrown merged commit de7cd0d into NVIDIA:main Jun 26, 2025
4 checks passed

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 9, 2025

fix: MoE autotune fallback failed to query default heuristic (NVIDIA#…

4f975b6

…5520) Signed-off-by: Anthony Chang <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025

fix: MoE autotune fallback failed to query default heuristic (NVIDIA#…

0ef6d57

…5520) Signed-off-by: Anthony Chang <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025

fix: MoE autotune fallback failed to query default heuristic (NVIDIA#…

78f9506

…5520) Signed-off-by: Anthony Chang <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025

fix: MoE autotune fallback failed to query default heuristic (NVIDIA#…

8ea373f

…5520) Signed-off-by: Anthony Chang <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025

fix: MoE autotune fallback failed to query default heuristic (NVIDIA#…

4a5944c

…5520) Signed-off-by: Anthony Chang <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 11, 2025

fix: MoE autotune fallback failed to query default heuristic (NVIDIA#…

39c7f15

…5520) Signed-off-by: Anthony Chang <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 11, 2025

fix: MoE autotune fallback failed to query default heuristic (NVIDIA#…

afd11cf

…5520) Signed-off-by: Anthony Chang <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 11, 2025

fix: MoE autotune fallback failed to query default heuristic (NVIDIA#…

1e38791

…5520) Signed-off-by: Anthony Chang <[email protected]>
