[perf] improve XQA-MLA perf #5468

lowsfer · 2025-06-25T07:47:07Z

Improve sm120 XQA-MLA performance with better latency hiding, use of test_wait, and simplified intra-CGA data transfer.

lowsfer · 2025-06-25T08:43:15Z

/bot run

tensorrt-cicd · 2025-06-25T08:48:19Z

PR_Github #9849 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-25T12:02:48Z

PR_Github #9849 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #7265 completed with status: 'FAILURE'

lowsfer · 2025-06-26T07:21:26Z

/bot run

tensorrt-cicd · 2025-06-26T07:26:34Z

PR_Github #9997 [ run ] triggered by Bot

tensorrt-cicd · 2025-06-26T09:03:49Z

PR_Github #9997 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7373 completed with status: 'SUCCESS'

Signed-off-by: Yao Yao <[email protected]>

lowsfer force-pushed the xqa-mla branch from d6ed160 to 280c569 on June 25, 2025 07:48

[perf] improve XQA-MLA perf

5fccc4a

Improve sm120 XQA-MLA perf with better latency hiding, test_wait and simplified intra-CGA data transfer

Signed-off-by: Yao Yao <[email protected]>

lowsfer force-pushed the xqa-mla branch from 5a43570 to 5fccc4a on June 25, 2025 08:42

NVIDIA deleted a comment from tensorrt-cicd Jun 25, 2025

lowsfer requested a review from ming-wei June 25, 2025 09:41

ming-wei approved these changes Jun 25, 2025

lowsfer merged commit 0788c5d into NVIDIA:main Jun 26, 2025
3 checks passed

lowsfer deleted the xqa-mla branch June 26, 2025 10:09

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 9, 2025

[perf] improve XQA-MLA perf (NVIDIA#5468)

c59fbe6

Signed-off-by: Yao Yao <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025

[perf] improve XQA-MLA perf (NVIDIA#5468)

085289e

Signed-off-by: Yao Yao <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025

[perf] improve XQA-MLA perf (NVIDIA#5468)

88db2d3

Signed-off-by: Yao Yao <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025

[perf] improve XQA-MLA perf (NVIDIA#5468)

2a8d668

Signed-off-by: Yao Yao <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025

[perf] improve XQA-MLA perf (NVIDIA#5468)

66d8424

Signed-off-by: Yao Yao <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 11, 2025

[perf] improve XQA-MLA perf (NVIDIA#5468)

27773a0

Signed-off-by: Yao Yao <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 11, 2025

[perf] improve XQA-MLA perf (NVIDIA#5468)

2538af2

Signed-off-by: Yao Yao <[email protected]>

dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 11, 2025

[perf] improve XQA-MLA perf (NVIDIA#5468)

335be03

Signed-off-by: Yao Yao <[email protected]>
