Tabrizian
Member

@Tabrizian Tabrizian commented Jun 24, 2025

Add support for input and output token counts.

In the streaming case, each response includes the token count for that response.
In the non-streaming case, the response includes the total token count.
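
For illustration only, the difference between the two cases can be sketched as below. This is a minimal sketch and not the code from this PR: `Response`, `stream_responses`, and `full_response` are hypothetical names, and reporting the input token count on every streamed response is an assumption.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Response:
    """Hypothetical response carrying tokens plus their counts."""
    output_tokens: List[int]
    input_token_count: int
    output_token_count: int


def stream_responses(prompt_tokens: List[int],
                     chunks: List[List[int]]) -> Iterator[Response]:
    """Streaming: each response reports only the tokens it carries."""
    for chunk in chunks:
        yield Response(
            output_tokens=list(chunk),
            # Assumption: the input count is repeated on every response.
            input_token_count=len(prompt_tokens),
            output_token_count=len(chunk),
        )


def full_response(prompt_tokens: List[int],
                  chunks: List[List[int]]) -> Response:
    """Non-streaming: a single response reports the total token count."""
    all_tokens = [token for chunk in chunks for token in chunk]
    return Response(
        output_tokens=all_tokens,
        input_token_count=len(prompt_tokens),
        output_token_count=len(all_tokens),
    )
```

Under these assumptions, a client that sums the per-response output counts of a streamed request gets the same number that the non-streaming response reports as its total.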

@Tabrizian Tabrizian force-pushed the user/imant/inputoutputcount branch 3 times, most recently from 6584afe to 740498f Compare June 24, 2025 18:31
@Tabrizian
Member Author

/bot run --stage-list "A30-Triton-[Post-Merge]-1, A30-Triton-[Post-Merge]-2, A100X-Triton-[Post-Merge]-1, A100X-Triton-[Post-Merge]-2, B200_PCIe-Triton-[Post-Merge]-1"

@Tabrizian Tabrizian changed the title Add support for input output token counts [nvbugs/5309940] Add support for input output token counts Jun 24, 2025
@tensorrt-cicd
Collaborator

PR_Github #9745 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9745 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7179 (Partly Tested) completed with status: 'FAILURE'

@Tabrizian
Member Author

/bot run --stage-list "A100X-Triton-[Post-Merge]-1,A100X-Triton-[Post-Merge]-2"

@tensorrt-cicd
Collaborator

PR_Github #9750 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9750 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7184 (Partly Tested) completed with status: 'FAILURE'

@tensorrt-cicd
Collaborator

PR_Github #9760 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9760 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7193 completed with status: 'FAILURE'

@Tabrizian
Member Author

/bot run --stage-list "A100X-Triton-[Post-Merge]-1,A100X-Triton-[Post-Merge]-2,A30-Triton-[Post-Merge]-1, A30-Triton-[Post-Merge]-2"

@tensorrt-cicd
Collaborator

PR_Github #9904 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9904 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7311 (Partly Tested) completed with status: 'FAILURE'

@Tabrizian Tabrizian force-pushed the user/imant/inputoutputcount branch from 740498f to 6670f0e Compare June 25, 2025 20:41
@Tabrizian
Member Author

/bot run --stage-list "A100X-Triton-[Post-Merge]-1,A100X-Triton-[Post-Merge]-2,A30-Triton-[Post-Merge]-1, A30-Triton-[Post-Merge]-2" --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #9915 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #9915 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7318 (Partly Tested) completed with status: 'FAILURE'

@Tabrizian
Member Author

/bot run --stage-list "A100X-Triton-[Post-Merge]-1,A100X-Triton-[Post-Merge]-2,A30-Triton-[Post-Merge]-1, A30-Triton-[Post-Merge]-2" --disable-fail-fast

Signed-off-by: Iman Tabrizian <[email protected]>
Signed-off-by: Iman Tabrizian <[email protected]>
@Tabrizian Tabrizian force-pushed the user/imant/inputoutputcount branch from 00047df to 52dc623 Compare June 26, 2025 23:36
@tensorrt-cicd
Collaborator

PR_Github #10075 [ run ] triggered by Bot

@Tabrizian Tabrizian enabled auto-merge (squash) June 27, 2025 00:46
@tensorrt-cicd
Collaborator

PR_Github #10075 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7439 (Partly Tested) completed with status: 'FAILURE'

@Tabrizian
Member Author

/bot run --stage-list "A30-Triton-[Post-Merge]-2"

@tensorrt-cicd
Collaborator

PR_Github #10171 [ run ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10171 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #7509 (Partly Tested) completed with status: 'SUCCESS'

@Tabrizian
Member Author

/bot reuse-pipeline

@tensorrt-cicd
Collaborator

PR_Github #10176 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd
Collaborator

PR_Github #10176 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #10171 (Partly Tested) for commit 5989ab5

@Tabrizian Tabrizian merged commit 26b953e into NVIDIA:main Jun 27, 2025
3 checks passed
@Tabrizian Tabrizian deleted the user/imant/inputoutputcount branch June 27, 2025 20:41
Shunkangz pushed a commit to Shunkangz/TensorRT-LLM that referenced this pull request Jul 2, 2025
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 9, 2025
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 10, 2025
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 11, 2025
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 11, 2025
dominicshanshan pushed a commit to dominicshanshan/TensorRT-LLM that referenced this pull request Jul 11, 2025
4 participants