[nvbugs/5309940] Add support for input output token counts #5445
6584afe to 740498f
/bot run --stage-list "A30-Triton-[Post-Merge]-1, A30-Triton-[Post-Merge]-2, A100X-Triton-[Post-Merge]-1, A100X-Triton-[Post-Merge]-2, B200_PCIe-Triton-[Post-Merge]-1"
PR_Github #9745 [ run ] triggered by Bot
PR_Github #9745 [ run ] completed with state
/bot run --stage-list "A100X-Triton-[Post-Merge]-1,A100X-Triton-[Post-Merge]-2"
PR_Github #9750 [ run ] triggered by Bot
PR_Github #9750 [ run ] completed with state
PR_Github #9760 [ run ] triggered by Bot
PR_Github #9760 [ run ] completed with state
/bot run --stage-list "A100X-Triton-[Post-Merge]-1,A100X-Triton-[Post-Merge]-2,A30-Triton-[Post-Merge]-1, A30-Triton-[Post-Merge]-2"
PR_Github #9904 [ run ] triggered by Bot
PR_Github #9904 [ run ] completed with state
740498f to 6670f0e
/bot run --stage-list "A100X-Triton-[Post-Merge]-1,A100X-Triton-[Post-Merge]-2,A30-Triton-[Post-Merge]-1, A30-Triton-[Post-Merge]-2" --disable-fail-fast
PR_Github #9915 [ run ] triggered by Bot
PR_Github #9915 [ run ] completed with state
6670f0e to 00047df
/bot run --stage-list "A100X-Triton-[Post-Merge]-1,A100X-Triton-[Post-Merge]-2,A30-Triton-[Post-Merge]-1, A30-Triton-[Post-Merge]-2" --disable-fail-fast
Signed-off-by: Iman Tabrizian <[email protected]>
00047df to 52dc623
PR_Github #10075 [ run ] triggered by Bot
PR_Github #10075 [ run ] completed with state
/bot run --stage-list "A30-Triton-[Post-Merge]-2"
PR_Github #10171 [ run ] triggered by Bot
PR_Github #10171 [ run ] completed with state
/bot reuse-pipeline
PR_Github #10176 [ reuse-pipeline ] triggered by Bot
PR_Github #10176 [ reuse-pipeline ] completed with state
Add support for input and output token counts.
In the streaming case, each response reports the token counts for that response.
In the non-streaming case, the response reports the total token counts.
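
The difference between the two modes can be illustrated with a small standalone sketch. This is not the code from this PR: the function names, the `num_input_tokens` / `num_output_tokens` keys, and the choice to repeat the input count on every streamed response are all assumptions made for illustration only.

```python
from typing import Dict, Iterator, List


def streaming_counts(input_ids: List[int],
                     chunks: List[List[int]]) -> Iterator[Dict[str, int]]:
    """Yield one count record per streamed response (hypothetical helper).

    Each record counts only the output tokens carried by that response.
    Repeating the input count on every response is an assumption.
    """
    for chunk in chunks:
        yield {
            "num_input_tokens": len(input_ids),
            "num_output_tokens": len(chunk),
        }


def non_streaming_counts(input_ids: List[int],
                         chunks: List[List[int]]) -> Dict[str, int]:
    """Return a single record with the total output tokens (hypothetical helper)."""
    return {
        "num_input_tokens": len(input_ids),
        "num_output_tokens": sum(len(chunk) for chunk in chunks),
    }


if __name__ == "__main__":
    prompt = [101, 102, 103]          # made-up token ids
    generated = [[7], [8, 9], [10]]   # made-up tokens, grouped per streamed response

    for record in streaming_counts(prompt, generated):
        print("streaming:", record)
    print("non-streaming:", non_streaming_counts(prompt, generated))
```

Under these assumptions, summing the per-response output counts from the streaming case gives the same total that the non-streaming case reports in its single response.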