forked from NVIDIA/TensorRT-LLM
rebase #1
Merged
* Add pytorch backend team Signed-off-by: Kevin Chen * Update .github/CODEOWNERS Co-authored-by: Yanchao Lu Signed-off-by: juney-nvidia <[email protected]> --------- Signed-off-by: Kevin Chen Signed-off-by: juney-nvidia <[email protected]> Co-authored-by: juney-nvidia <[email protected]> Co-authored-by: Yanchao Lu
…rf-tests (cpp) (#4499) add low concurrency perf tests Signed-off-by: Venky <[email protected]>
* Adding two-shot allreduce kernel and mnnvl multicasting buffergit gffe Signed-off-by: Shiyu Li <[email protected]> Adding comments Signed-off-by: Shiyu Li <[email protected]> Add unittest of the twoshot kernel. Signed-off-by: Shiyu Li <[email protected]> Update dispatch logic Signed-off-by: Shiyu Li <[email protected]> Use cpu barrier instead of GPU at init Signed-off-by: Shiyu Li <[email protected]> Merge dispatch logic fix Signed-off-by: Shiyu Li <[email protected]> Update the kernel to use GPU-managed buffer Signed-off-by: Shiyu Li <[email protected]> * Refine Signed-off-by: Zongfei Jing <[email protected]> * Clean code Signed-off-by: Zongfei Jing <[email protected]> * Fix compile error Signed-off-by: Zongfei Jing <[email protected]> * Fix issue Signed-off-by: Zongfei Jing <[email protected]> * Clean up Signed-off-by: Zongfei Jing <[email protected]> * Simplify AllReduce interface Signed-off-by: Zongfei Jing <[email protected]> * Rename Signed-off-by: Zongfei Jing <[email protected]> * Fix warning Signed-off-by: Zongfei Jing <[email protected]> * Tidy code Signed-off-by: Zongfei Jing <[email protected]> * Rename Signed-off-by: Zongfei Jing <[email protected]> * Fix compile error Signed-off-by: Zongfei Jing <[email protected]> * Refine Signed-off-by: Zongfei Jing <[email protected]> * Skip ut for no_fusion Signed-off-by: Zongfei Jing <[email protected]> * Refine Signed-off-by: Zongfei Jing <[email protected]> --------- Signed-off-by: Shiyu Li <[email protected]> Signed-off-by: Zongfei Jing <[email protected]> Co-authored-by: Shiyu Li <[email protected]>
…r DGX (#4451) Signed-off-by: Dom Brown <[email protected]>
Signed-off-by: Nikita Korobov <[email protected]>
Signed-off-by: Aurelien Chartier <[email protected]>
* agentConnection Signed-off-by: Chuang Zhu <[email protected]> recv Signed-off-by: Chuang Zhu <[email protected]> agentState Signed-off-by: Chuang Zhu <[email protected]> NIXL interfaces Signed-off-by: Shixiaowei02 <[email protected]> update cmakelists Signed-off-by: Shixiaowei02 <[email protected]> nixl improve Signed-off-by: Chuang Zhu <[email protected]> remove cppzmq Signed-off-by: Chuang Zhu <[email protected]> fix Signed-off-by: Chuang Zhu <[email protected]> transferAgent remove register Signed-off-by: Chuang Zhu <[email protected]> work for cache Test Signed-off-by: Chuang Zhu <[email protected]> reduce sleep time Signed-off-by: Chuang Zhu <[email protected]> fix test Signed-off-by: Chuang Zhu <[email protected]> intergarte Signed-off-by: Chuang Zhu <[email protected]> nixl env Signed-off-by: Chuang Zhu <[email protected]> fix rebase error Signed-off-by: Chuang Zhu <[email protected]> cpp test Signed-off-by: Chuang Zhu <[email protected]> stash for send metaData Signed-off-by: Chuang Zhu <[email protected]> loadRemoteMD after fetchRemoteMD Signed-off-by: Chuang Zhu <[email protected]> workaround for mixed gen and context Signed-off-by: Chuang Zhu <[email protected]> test_env Signed-off-by: Chuang Zhu <[email protected]> avoid port conflict in test Signed-off-by: Chuang Zhu <[email protected]> * format Signed-off-by: Chuang Zhu <[email protected]> * use std::string Signed-off-by: Chuang Zhu <[email protected]> * typo Signed-off-by: Chuang Zhu <[email protected]> * fix transferAgentTest Signed-off-by: Chuang Zhu <[email protected]> --------- Signed-off-by: Chuang Zhu <[email protected]>
* partition LlmArgs Signed-off-by: Superjomn <[email protected]> * update backend Signed-off-by: Superjomn <[email protected]> --------- Signed-off-by: Superjomn <[email protected]>
Add phi-4-mini CLI acc test Signed-off-by: moraxu <[email protected]>
Add all_reduce.py script to test Signed-off-by: Kaiyu Xie <[email protected]>
* feat: add dataset support for benchmark_core_model with LLMAPI Signed-off-by: Aurelien Chartier <[email protected]>
#3972) * Remove waived cases * Remove test cases of not supported feature Signed-off-by: Hui Gao <[email protected]>
…#4349) Cherry-pick #3856 Signed-off-by: Kaiyu Xie <[email protected]> Co-authored-by: Dhruv Singal <[email protected]>
* Add tritonrelease container Signed-off-by: Iman Tabrizian <[email protected]> * Review comments Signed-off-by: Iman Tabrizian <[email protected]> * Update docker/Makefile Co-authored-by: Martin Marciniszyn Mehringer <[email protected]> Signed-off-by: Iman Tabrizian <[email protected]> --------- Signed-off-by: Iman Tabrizian <[email protected]> Signed-off-by: Iman Tabrizian <[email protected]> Co-authored-by: Martin Marciniszyn Mehringer <[email protected]>
Signed-off-by: Chuang Zhu <[email protected]>
waive hanging cases Signed-off-by: Ruodi <[email protected]>
* update waive list Signed-off-by: xinhe-nv <[email protected]> * fix test issues Signed-off-by: xinhe-nv <[email protected]> --------- Signed-off-by: xinhe-nv <[email protected]>
* clean up _merge_dummy_request method of PyExecutor Signed-off-by: junq <[email protected]> * fix ci Signed-off-by: junq <[email protected]> * clean Signed-off-by: junq <[email protected]> * update comment Signed-off-by: junq <[email protected]> --------- Signed-off-by: junq <[email protected]>
stash for debug broken promise Signed-off-by: Chuang Zhu <[email protected]>
Signed-off-by: Robin Kobus <[email protected]>
…4446) ultra Signed-off-by: Venky Ganesh <[email protected]>
[fix] Fix chunked prefill + overlap scheduler Signed-off-by: Mike Iovine <[email protected]>
* Integrate chunked attention kernels Signed-off-by: Mike Iovine <[email protected]> * Fix cache key Signed-off-by: Mike Iovine <[email protected]> * Fix lint Signed-off-by: Mike Iovine <[email protected]> --------- Signed-off-by: Mike Iovine <[email protected]>
Signed-off-by: nv-guomingz <[email protected]>
clean up _gather_dp_requests_num method of PyExecutor Signed-off-by: junq <[email protected]>
…er (#4573) fix moe possible race cond and add bypass worker thread for no updates Signed-off-by: Dongxu Yang <[email protected]>
* support mcp # Conflicts: # tensorrt_llm/scaffolding/worker.py Signed-off-by: wu1du2 <[email protected]> * move all into contrib/mcp # Conflicts: # examples/scaffolding/contrib/mcp/mcptest.py # tensorrt_llm/scaffolding/__init__.py # tensorrt_llm/scaffolding/contrib/__init__.py # tensorrt_llm/scaffolding/contrib/mcp/__init__.py # tensorrt_llm/scaffolding/contrib/mcp/mcp_controller.py # tensorrt_llm/scaffolding/task.py # tensorrt_llm/scaffolding/worker.py Signed-off-by: wu1du2 <[email protected]> * support sandbox, websearch # Conflicts: # examples/scaffolding/contrib/mcp/mcptest.py # examples/scaffolding/contrib/mcp/weather/weather.py # tensorrt_llm/scaffolding/contrib/mcp/mcp_controller.py # tensorrt_llm/scaffolding/contrib/mcp/mcp_utils.py # tensorrt_llm/scaffolding/contrib/mcp/mcp_worker.py # tensorrt_llm/scaffolding/worker.py Signed-off-by: wu1du2 <[email protected]> * remove pics Signed-off-by: wu1du2 <[email protected]> * pre-commit fix # Conflicts: # tensorrt_llm/scaffolding/contrib/mcp/__init__.py # tensorrt_llm/scaffolding/contrib/mcp/mcp_utils.py # tensorrt_llm/scaffolding/contrib/mcp/mcp_worker.py Signed-off-by: wu1du2 <[email protected]> * fix spell Signed-off-by: wu1du2 <[email protected]> * rebase Signed-off-by: wu1du2 <[email protected]> --------- Signed-off-by: wu1du2 <[email protected]>
* feat: Enabling dis serving with TRT backend with Python runtime Signed-off-by: Patrice Castonguay <[email protected]> * Fixing formatting Signed-off-by: Patrice Castonguay <[email protected]> * Fixing disagg mtp test Signed-off-by: Patrice Castonguay <[email protected]> --------- Signed-off-by: Patrice Castonguay <[email protected]>
Signed-off-by: nv-guomingz <[email protected]>
…o test (#4515) unwaive Signed-off-by: Enwei Zhu <[email protected]>
Signed-off-by: Arthur Rasmusson <[email protected].> Co-authored-by: Robin Kobus <[email protected]> Co-authored-by: Aurelien Chartier <[email protected]>
Signed-off-by: Yuanjing Xue <[email protected]>
Signed-off-by: Superjomn <[email protected]>
Signed-off-by: Yiqing Yan <[email protected]>
Signed-off-by: QI JUN <[email protected]> Co-authored-by: Yanchao Lu <[email protected]>
…image groovy and support NGC images (#4294) Signed-off-by: ZhanruiSunCh <[email protected]> Signed-off-by: Zhanrui Sun <[email protected]> Co-authored-by: Yanchao Lu <[email protected]>
Signed-off-by: ruodil <[email protected]>
Signed-off-by: Superjomn <[email protected]>
Signed-off-by: Yuanjing Xue <[email protected]>
…4689) Signed-off-by: QI JUN <[email protected]>
Signed-off-by: Jhao-Ting Chen <[email protected]> Co-authored-by: Haohang Huang <[email protected]>
Signed-off-by: Mike Iovine <[email protected]>
Signed-off-by: Robin Kobus <[email protected]>
Signed-off-by: Chenfei Zhang <[email protected]> Signed-off-by: Yilin Fan <[email protected]> Co-authored-by: Chenfei Zhang <[email protected]>
Signed-off-by: Hao Lu <[email protected]@users.noreply.github.com> Co-authored-by: Hao Lu <[email protected]@users.noreply.github.com>
) Signed-off-by: Jinyang Yuan <[email protected]>
Signed-off-by: Jun Yang <[email protected]>
Signed-off-by: Aurelien Chartier <[email protected]>
Signed-off-by: Thor Johnsen <[email protected]>
Signed-off-by: xinhe-nv <[email protected]>
Signed-off-by: Enwei Zhu <[email protected]>
Signed-off-by: Chuang Zhu <[email protected]>
Signed-off-by: Zheng Duan <[email protected]>
Signed-off-by: ixlmar <[email protected]>
Signed-off-by: ixlmar <[email protected]>
…4729) Signed-off-by: QI JUN <[email protected]>
Signed-off-by: Pengyun Lin <[email protected]>
…lama model. (#4758) Signed-off-by: Yuxian Qiu <[email protected]>
Signed-off-by: Tao Li
PR title
Please write the PR title using the following template:
[JIRA ticket link/nvbug link/github issue link][fix/feat/doc/infra/...] <summary of this PR>
For example, a PR that adds a new cache manager feature tracked by Jira ticket TRTLLM-1000 would be titled:
[TRTLLM-1000][feat] Support a new feature about cache manager
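The title template above can be checked mechanically. The following is a minimal sketch, not part of the repository's tooling: the names `PRTitle` and `parse_pr_title` are hypothetical, and because the template's type list ends in "...", the sketch assumes any alphabetic word is an acceptable type.

```python
import re
from typing import NamedTuple, Optional


class PRTitle(NamedTuple):
    """Parsed pieces of a PR title (hypothetical helper type)."""
    ticket: str   # JIRA ticket, nvbug, or GitHub issue reference
    kind: str     # fix, feat, doc, infra, ...
    summary: str  # free-text summary of the PR


# [ticket][type] summary -- the ticket may be any text without brackets.
# Assumption: the type is a single alphabetic word, since the template's
# list of types is open-ended.
_TITLE_RE = re.compile(
    r"^\[(?P<ticket>[^\[\]]+)\]\[(?P<kind>[A-Za-z]+)\]\s+(?P<summary>\S.*)$"
)


def parse_pr_title(title: str) -> Optional[PRTitle]:
    """Return the parsed title, or None if it does not match the template."""
    match = _TITLE_RE.match(title.strip())
    if match is None:
        return None
    return PRTitle(match["ticket"], match["kind"], match["summary"])
```

With this sketch, the example title from the template parses into its three parts, while a bare title such as "rebase" is rejected because it carries no ticket or type prefix.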
Description
Please explain the issue and the solution in short.
Test Coverage
GitHub Bot Help
/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...
Provide a user friendly way for developers to interact with a Jenkins server.
Run /bot [-h|--help] to print this help message. See details below for each supported subcommand.
run [--disable-fail-fast --skip-test --stage-list "A10-1, xxx" --gpu-type "A30, H100_PCIe" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-[Post-Merge]-1, xxx"]
Launch build/test pipelines. All previously running jobs will be killed.
--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.
--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.
--stage-list "A10-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-1, xxx". Note: Does NOT update GitHub check status.
--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.
--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.
--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.
--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests. Will also run L0 pre-merge pipeline.
--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.
--extra-stage "H100_PCIe-[Post-Merge]-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-[Post-Merge]-1, xxx".
kill
Kill all running builds associated with pull request.
skip --comment COMMENT
Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
reuse-pipeline
Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
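To make the command grammar above concrete, here is a rough sketch of how a `/bot` comment could be tokenized and validated with Python's standard `argparse` and `shlex` modules. It is not the actual bot implementation, which this page does not show: the function names `build_parser`, `split_csv`, and `parse_bot_comment` are hypothetical, and treating the quoted list options as comma-separated values is an assumption based on the examples in the help text.

```python
import argparse
import shlex
from typing import List


def split_csv(value: str) -> List[str]:
    """Split a quoted list such as "A10-1, xxx" into trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    """Mirror the subcommands and options listed in the help text."""
    parser = argparse.ArgumentParser(prog="/bot")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run")
    # Boolean switches documented for `run`.
    for flag in ("--disable-fail-fast", "--skip-test", "--add-multi-gpu-test",
                 "--only-multi-gpu-test", "--disable-multi-gpu-test",
                 "--post-merge"):
        run.add_argument(flag, action="store_true")
    # Options that take a quoted list (assumed to be comma-separated).
    for option in ("--stage-list", "--gpu-type", "--extra-stage"):
        run.add_argument(option, type=split_csv, default=[])

    sub.add_parser("kill")

    skip = sub.add_parser("skip")
    skip.add_argument("--comment", required=True)  # documented as required

    sub.add_parser("reuse-pipeline")
    return parser


def parse_bot_comment(comment: str) -> argparse.Namespace:
    """Parse one comment line; raise ValueError if it is not a /bot command."""
    tokens = shlex.split(comment)
    if not tokens or tokens[0] != "/bot":
        raise ValueError("not a /bot command")
    return build_parser().parse_args(tokens[1:])
```

For instance, `parse_bot_comment('/bot run --stage-list "A10-1, xxx" --disable-fail-fast')` yields a namespace whose `stage_list` holds the two stage names and whose `disable_fail_fast` is set, while `/bot skip` without `--comment` is rejected by `argparse`.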