chore: Mass integration of release/0.20. #4732
Conversation
* Restore per-channel pre-quant. Signed-off-by: Barry Kang <[email protected]>
* Update TRT test script. Signed-off-by: Barry Kang <[email protected]>
* Fix pre-commit. Signed-off-by: Barry Kang <[email protected]>
* Signed-off-by: Ivy Zhang <[email protected]>
* Signed-off-by: Yiqing Yan <[email protected]>
* …e memory and log more memory information (NVIDIA#4660). Signed-off-by: Hui Gao <[email protected]>
* Signed-off-by: nv-guomingz <[email protected]>
* …d weight loading in fused moe. (NVIDIA#4699). Signed-off-by: Yuxian Qiu <[email protected]>
* Signed-off-by: Balaram Buddharaju <[email protected]>
/bot run
PR_Github #6768 [ run ] triggered by Bot
LGTM for my part.
PR_Github #6768 [ run ] completed with state
LGTM. Please remember to update the internal commit ID before merging this PR.
# expert_idx is the local slot index of current rank
expert_idx = local_slot_id
max_workers = min(
    (self.expert_end - self.expert_start) * 2,
Please use self.expert_size_per_partition instead of (self.expert_end - self.expert_start).
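For illustration only, here is a minimal sketch of the suggested change, assuming the loader already exposes an expert_size_per_partition attribute equal to expert_end - expert_start. The class name, the worker cap of 16, and the ThreadPoolExecutor placeholder work are assumptions, not code from this PR or from the fused MoE module.

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch only: _WeightLoaderSketch, the cap of 16 workers, and the submitted
# placeholder work are illustrative, not the actual fused MoE loader.
class _WeightLoaderSketch:

    def __init__(self, expert_start: int, expert_end: int):
        self.expert_start = expert_start
        self.expert_end = expert_end
        # Number of experts owned by this rank's partition.
        self.expert_size_per_partition = expert_end - expert_start

    def load_weights(self):
        # Before: max_workers = min((self.expert_end - self.expert_start) * 2, ...)
        # After: reuse the precomputed attribute instead of recomputing the width.
        max_workers = min(self.expert_size_per_partition * 2, 16)
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            for local_slot_id in range(self.expert_size_per_partition):
                # expert_idx is the local slot index of the current rank.
                expert_idx = local_slot_id
                pool.submit(lambda i=expert_idx: i)  # placeholder per-expert work


# Usage example with a partition that owns experts 8..11.
loader = _WeightLoaderSketch(expert_start=8, expert_end=12)
loader.load_weights()
```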
@hlu1 My PR #4699 may conflict with your PR #4790 in this mass integration.
@amirkl94 If Hao's PR is merged first, we should cherry-pick this change into tensorrt_llm/_torch/modules/fused_moe/quantization.py at https://github.com/NVIDIA/TensorRT-LLM/pull/4790/files#diff-19b05de4a4dd136814f3e04d4ed51c2e4f2389c7b0b2a6bca49195150ebadd66R87 instead.
Mass integration of release/0.20 to main.