Conversation

hyukn
Collaborator

@hyukn hyukn commented Jun 4, 2025

The out-of-range batch size in the warmup phase occurs because the number of autotuning warmup requests exceeds the max_batch_size limit. This PR limits the warmup batch size according to multiple size constraints.
Autotuning is currently disabled because of a race condition that was triggered in the CI. The race condition can be fixed by #4565. This has been checked locally on main. The config will be enabled on main after merging.
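
A minimal sketch of what limiting the batch size by several constraints can look like. It is an illustration only, not the code in this PR: the function name `clamp_warmup_batch_size` and the parameters `requested`, `max_num_tokens`, and `tokens_per_request` are hypothetical, and only `max_batch_size` comes from the description above.

```python
def clamp_warmup_batch_size(requested: int,
                            max_batch_size: int,
                            max_num_tokens: int,
                            tokens_per_request: int) -> int:
    """Return the largest warmup batch size that respects every limit.

    All names except max_batch_size are assumptions made for illustration.
    A result of 0 means no warmup request fits and warmup should be skipped.
    """
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")

    # How many requests fit in the (assumed) per-batch token budget.
    token_bound = max_num_tokens // tokens_per_request

    # The effective batch size is the tightest of all constraints.
    return min(requested, max_batch_size, token_bound)


# Warmup asks for 64 requests, but the engine only allows 32 per batch.
print(clamp_warmup_batch_size(64, 32, 8192, 128))
```

Taking the minimum over all bounds means the warmup can never submit more requests than any single limit allows, which is the failure mode described above.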

@hyukn hyukn requested a review from litaotju June 4, 2025 11:15
@hyukn hyukn changed the title from "[5310329] fix Fix warmup phase batch size out of range." to "[5310329] fix: Fix warmup phase batch size out of range." Jun 4, 2025
@hyukn hyukn marked this pull request as ready for review June 4, 2025 11:15
@hyukn hyukn requested a review from a team as a code owner June 4, 2025 11:15
@hyukn
Collaborator Author

hyukn commented Jun 4, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #7511 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #7511 [ run ] completed with state FAILURE
/LLM/release-0.20/L0_MergeRequest_PR pipeline #167 completed with status: 'FAILURE'

@hyukn hyukn force-pushed the fix/5310329 branch 2 times, most recently from e620c09 to ff59276 Compare June 4, 2025 13:24
@hyukn
Collaborator Author

hyukn commented Jun 4, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #7524 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #7524 [ run ] completed with state SUCCESS
/LLM/release-0.20/L0_MergeRequest_PR pipeline #168 completed with status: 'FAILURE'

@hyukn
Collaborator Author

hyukn commented Jun 4, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #7548 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #7548 [ run ] completed with state SUCCESS
/LLM/release-0.20/L0_MergeRequest_PR pipeline #170 completed with status: 'FAILURE'

@hyukn
Collaborator Author

hyukn commented Jun 5, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #7591 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #7591 [ run ] completed with state SUCCESS
/LLM/release-0.20/L0_MergeRequest_PR pipeline #173 completed with status: 'FAILURE'

@hyukn hyukn force-pushed the fix/5310329 branch 2 times, most recently from 322b26d to 5f6bbbb Compare June 5, 2025 06:58
@hyukn
Collaborator Author

hyukn commented Jun 5, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #7652 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #7652 [ run ] completed with state FAILURE
/LLM/release-0.20/L0_MergeRequest_PR pipeline #174 completed with status: 'FAILURE'

@litaotju litaotju self-requested a review June 5, 2025 10:05
@hyukn
Collaborator Author

hyukn commented Jun 5, 2025

/bot run

@tensorrt-cicd
Collaborator

PR_Github #7708 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #7708 [ run ] completed with state SUCCESS
/LLM/release-0.20/L0_MergeRequest_PR pipeline #179 completed with status: 'SUCCESS'

@hyukn hyukn enabled auto-merge (squash) June 6, 2025 01:05
@hyukn
Collaborator Author

hyukn commented Jun 6, 2025

Since the run before the rebase succeeded, I will reuse that pipeline.

@hyukn
Collaborator Author

hyukn commented Jun 6, 2025

/bot reuse-pipeline

@tensorrt-cicd
Collaborator

PR_Github #7830 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #7830 [ reuse-pipeline ] completed with state SUCCESS
Release Check Pipeline #1134 failed
Reusing PR_Github #7708 for commit a25aa7e

@litaotju litaotju disabled auto-merge June 6, 2025 04:25
@litaotju litaotju merged commit fa20ffc into NVIDIA:release/0.20 Jun 6, 2025
2 of 3 checks passed
pcastonguay pushed a commit to pcastonguay/TensorRT-LLM that referenced this pull request Jun 9, 2025
3 participants