@Barry-Delaney
Collaborator

This PR reverts the per-expert pre-quant scale kernel in MoE modules to the original per-channel kernel, which fixes the broken ModelOpt Mixtral-AWQ support.
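
For readers unfamiliar with the terms, here is a minimal sketch of the difference, assuming "per-channel" means one pre-quant scale vector over the hidden channels shared by all experts, and "per-expert" means a separate vector per expert selected by the token's routed expert. The function names, shapes, and values are hypothetical and for illustration only; this is not the TensorRT-LLM kernel.

```python
from typing import List

Matrix = List[List[float]]


def apply_per_channel_prequant(x: Matrix, scale: List[float]) -> Matrix:
    """Multiply every token's hidden channels by one shared scale vector."""
    return [[v * s for v, s in zip(row, scale)] for row in x]


def apply_per_expert_prequant(
    x: Matrix, expert_ids: List[int], scales: Matrix
) -> Matrix:
    """Multiply each token by the scale vector of the expert it is routed to."""
    return [
        [v * s for v, s in zip(row, scales[e])]
        for row, e in zip(x, expert_ids)
    ]


# Two tokens with a hidden size of 2 (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0]]

# Per-channel: a single vector of length hidden_size.
shared_scale = [0.5, 2.0]
per_channel_out = apply_per_channel_prequant(tokens, shared_scale)

# Per-expert: one vector per expert; token 0 goes to expert 0, token 1 to expert 1.
expert_scales = [[0.5, 2.0], [1.0, 1.0]]
per_expert_out = apply_per_expert_prequant(tokens, [0, 1], expert_scales)
```

Under this reading, a checkpoint that stores only one shared scale vector would match the per-channel form but not the per-expert one, which is one plausible way the mismatch could break loading; the PR itself does not spell out the failure mode.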

@Barry-Delaney Barry-Delaney self-assigned this May 21, 2025
@Barry-Delaney Barry-Delaney requested a review from a team as a code owner May 21, 2025 15:36
@Barry-Delaney
Collaborator Author

/bot run

@Barry-Delaney Barry-Delaney requested a review from Tracin May 21, 2025 15:36
@tensorrt-cicd
Collaborator

PR_Github #6041 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #6041 [ run ] completed with state SUCCESS
/LLM/release-0.20/L0_MergeRequest_PR pipeline #23 completed with status: 'FAILURE'

@Barry-Delaney Barry-Delaney force-pushed the user/barry/fix_prequant_release branch from 12aac5f to 8dee53e May 22, 2025 02:05
@Barry-Delaney
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #6082 [ run ] triggered by Bot

@Barry-Delaney
Collaborator Author

/bot kill

@tensorrt-cicd
Collaborator

PR_Github #6085 [ kill ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #6082 [ run ] completed with state ABORTED

@tensorrt-cicd
Collaborator

PR_Github #6085 [ kill ] completed with state SUCCESS
Successfully killed previous jobs for commit 8dee53e

@Barry-Delaney
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #6114 [ run ] triggered by Bot

@Barry-Delaney
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #6116 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #6114 [ run ] completed with state ABORTED

@Tracin
Collaborator

Tracin commented May 22, 2025

LGTM

@tensorrt-cicd
Collaborator

PR_Github #6116 [ run ] completed with state SUCCESS
/LLM/release-0.20/L0_MergeRequest_PR pipeline #35 completed with status: 'FAILURE'

@Barry-Delaney Barry-Delaney force-pushed the user/barry/fix_prequant_release branch from f37b3ca to dec4eb5 May 22, 2025 11:41
@Barry-Delaney
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #6145 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #6145 [ run ] completed with state SUCCESS
/LLM/release-0.20/L0_MergeRequest_PR pipeline #39 completed with status: 'FAILURE'

@Barry-Delaney Barry-Delaney force-pushed the user/barry/fix_prequant_release branch from dec4eb5 to 039f4be May 22, 2025 17:46
@Barry-Delaney
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #6171 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #6171 [ run ] completed with state SUCCESS
/LLM/release-0.20/L0_MergeRequest_PR pipeline #44 completed with status: 'FAILURE'

Signed-off-by: Barry Kang <[email protected]>
@Barry-Delaney Barry-Delaney force-pushed the user/barry/fix_prequant_release branch from 039f4be to a970958 May 23, 2025 01:20
@Barry-Delaney
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #6194 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #6194 [ run ] completed with state SUCCESS
/LLM/release-0.20/L0_MergeRequest_PR pipeline #50 completed with status: 'SUCCESS'

@Barry-Delaney Barry-Delaney requested a review from litaotju May 23, 2025 04:43
@litaotju litaotju merged commit 26793e3 into NVIDIA:release/0.20 May 23, 2025
3 checks passed
shaharmor98 pushed a commit to shaharmor98/tekit that referenced this pull request May 28, 2025
)

* Restore per-channel pre-quant

Signed-off-by: Barry Kang <[email protected]>

* Update TRT test script

Signed-off-by: Barry Kang <[email protected]>

* Fix pre-commit

Signed-off-by: Barry Kang <[email protected]>

---------

Signed-off-by: Barry Kang <[email protected]>
omera-nv pushed a commit to omera-nv/TensorRT-LLM that referenced this pull request Jun 3, 2025
)

* Restore per-channel pre-quant

Signed-off-by: Barry Kang <[email protected]>

* Update TRT test script

Signed-off-by: Barry Kang <[email protected]>

* Fix pre-commit

Signed-off-by: Barry Kang <[email protected]>

---------

Signed-off-by: Barry Kang <[email protected]>
4 participants