[nvbugs/5274894] fix: Moving finished context requests to generation #4576

Funatiq · 2025-05-22T12:17:38Z

Description

Unfinished chunked context requests appear at the end of the context requests vector.
Replaced std::find_if with std::partition to find the correct position from which finished context requests are moved to generation.
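
For illustration, here is a minimal sketch of the difference between the two approaches. It is not the code from this PR: `Request`, `moveFromFirstFinished`, and `moveFinishedToGeneration` are hypothetical names, and the assumption that the earlier code moved everything from the first finished request to the end of the vector is only an inference from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a context request; not the real TensorRT-LLM type.
struct Request
{
    int id;
    bool contextFinished; // true once all context chunks have been processed
};

// Assumed shape of the earlier approach: locate the first finished request and
// move everything from there to the end. If unfinished chunked requests sit at
// the end of the vector, they are moved as well.
void moveFromFirstFinished(std::vector<Request>& context, std::vector<Request>& generation)
{
    auto firstFinished = std::find_if(
        context.begin(), context.end(), [](Request const& r) { return r.contextFinished; });
    generation.insert(generation.end(), firstFinished, context.end());
    context.erase(firstFinished, context.end());
}

// Sketch of the std::partition approach: unfinished requests are gathered at the
// front, finished ones at the back, so the returned iterator marks a range that
// contains only finished requests. std::partition does not preserve relative order.
void moveFinishedToGeneration(std::vector<Request>& context, std::vector<Request>& generation)
{
    auto firstFinished = std::partition(
        context.begin(), context.end(), [](Request const& r) { return !r.contextFinished; });
    generation.insert(generation.end(), firstFinished, context.end());
    context.erase(firstFinished, context.end());
}
```

With an input such as finished, finished, unfinished, the first function moves all three requests, while the second moves only the two finished ones and leaves the unfinished chunked request in the context vector.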

Test Coverage

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provides a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]

Launch build/test pipelines. All previously running jobs will be killed.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

- Unfinished chunked context requests appear at end of context requests vector.
- Replaced std::find_if with std::partition to find the correct position to move finished context requests to generation.

Signed-off-by: Robin Kobus <[email protected]>

Funatiq · 2025-05-22T12:17:44Z

/bot run

tensorrt-cicd · 2025-05-22T12:23:20Z

PR_Github #6150 [ run ] triggered by Bot

tensorrt-cicd · 2025-05-22T15:05:38Z

PR_Github #6150 [ run ] completed with state SUCCESS
/LLM/release-0.20/L0_MergeRequest_PR pipeline #40 completed with status: 'SUCCESS'

…rformance (#4608)

* Revert "[nvbugs/5274894] fix: Moving finished context requests to generation (#4576)"

  This reverts commit d39bcb6.

  Signed-off-by: Robin Kobus <[email protected]>

* fix: Sort requests for functional correctness and performance

  - Moved sorting related logic to a dedicated function for better clarity and maintainability.
  - Enhanced sorting logic to separate finished context requests from ongoing ones before sorting by Lora task ID.
  - Updated function documentation to reflect the sorting behavior and its purpose.

  Signed-off-by: Robin Kobus <[email protected]>

---------

Signed-off-by: Robin Kobus <[email protected]>
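
The sorting behavior described in the follow-up commit can be sketched roughly as below. This is an illustration under stated assumptions, not the code from #4608: `SketchRequest` and `sortContextRequests` are hypothetical names, the LoRA task ID is assumed to be an optional integer, and placing ongoing requests before finished ones is an arbitrary choice made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical request type for illustration only.
struct SketchRequest
{
    int id;
    bool contextFinished;                    // true once the context phase is complete
    std::optional<std::uint64_t> loraTaskId; // assumed representation of the LoRA task ID
};

// Separates ongoing from finished context requests, then sorts each group by
// LoRA task ID so that requests sharing a task end up adjacent within a group.
void sortContextRequests(std::vector<SketchRequest>& requests)
{
    // std::optional orders std::nullopt before any contained value.
    auto byLoraTaskId
        = [](SketchRequest const& a, SketchRequest const& b) { return a.loraTaskId < b.loraTaskId; };

    // Assumption: ongoing requests first, finished requests last.
    auto firstFinished = std::partition(
        requests.begin(), requests.end(), [](SketchRequest const& r) { return !r.contextFinished; });

    std::sort(requests.begin(), firstFinished, byLoraTaskId);
    std::sort(firstFinished, requests.end(), byLoraTaskId);
}
```

Sorting each group separately keeps the boundary produced by the partition intact, so the finished requests remain a contiguous range that could be moved to generation in one step.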

Funatiq requested a review from a team as a code owner May 22, 2025 12:17

Funatiq changed the title ~~fix: Moving finished context requests to generation~~ [nvbugs/5274894] fix: Moving finished context requests to generation May 22, 2025

MartinMarciniszyn approved these changes May 22, 2025

Funatiq merged commit d39bcb6 into NVIDIA:release/0.20 May 22, 2025
3 checks passed

Funatiq deleted the dev/fix_moving_chunked branch May 22, 2025 15:49

Funatiq mentioned this pull request May 23, 2025

[nvbugs/5274894] fix: Sort requests for functional correctness and performance #4608

Merged

Funatiq added a commit to Funatiq/TensorRT-LLM that referenced this pull request May 23, 2025

Revert "[nvbugs/5274894] fix: Moving finished context requests to gen…

8f3b2e0

…eration (NVIDIA#4576)" This reverts commit d39bcb6. Signed-off-by: Robin Kobus <[email protected]>
