fix: TRTLLM container -copy NVIDIA torch lib into uv venv location #2597
Conversation
/ok to test 918326d
Walkthrough
Refactors container/Dockerfile.trtllm to rebase Python packages into a venv, adjust TensorRT-LLM installation with custom indices and conditional Triton, add uninstall steps for conflicting packages, install local wheelhouse artifacts, and insert uv pip list checkpoints. The runtime stage mirrors the venv packaging and relocations.
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Builder as Docker Build
participant SysPy as System Python (dist-packages)
participant Venv as /opt/dynamo/venv (site-packages)
participant Index as TRT-LLM Index / Extra Wheel
participant Wheelhouse as Local Wheelhouse
participant Runtime as Runtime Image
Builder->>SysPy: Install base deps (no-cache)
Builder->>SysPy: Uninstall networkx, packaging, torch, triton
note right of Builder: uv pip list (debug)
Builder->>Venv: Copy torch/torchvision/triton/... from dist-packages
Builder->>/usr/local/lib: Copy lib*
note over Venv: Consolidate runtime to venv
Builder->>Index: Install cuda-python (>=12,<13)
Builder->>Index: Install TRT-LLM via extra-index/wheel var
alt amd64
Builder->>Index: Install triton==3.3.1
end
note right of Builder: uv pip list (debug)
Builder->>Wheelhouse: Install ai_dynamo*, ai_dynamo_runtime*, nixl* (no-cache)
note right of Builder: uv pip list (debug)
Builder-->>Runtime: Copy venv and packaged libs
note over Runtime: Use venv-based dependencies
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
container/Dockerfile.trtllm (1)
135-151: Missing trtllm_wheel build stage causes COPY to fail

The COPY --from=trtllm_wheel . /trtllm_wheel/ instruction in container/Dockerfile.trtllm refers to a stage alias that isn't defined anywhere, so the build will always error on that line. You must either:

- add a multi-stage build stage named trtllm_wheel before line 135, for example FROM python:3.10-slim AS trtllm_wheel followed by the steps that install or build the TRT-LLM wheel, or
- guard the COPY behind your existing HAS_TRTLLM_CONTEXT check so that it's only executed if the wheel stage was actually built.

Please update container/Dockerfile.trtllm accordingly to prevent build failures; a sketch combining both ideas follows below.
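A minimal sketch combining both ideas, assuming HAS_TRTLLM_CONTEXT is passed as a 0/1 build arg declared before the first FROM (so it can be interpolated into a FROM line) and that trtllm_wheelhouse/ is a hypothetical directory in the build context holding the pre-built wheel:

```dockerfile
# Declared near the top of the Dockerfile, before the first FROM, so it is
# usable in FROM lines below; 0/1 is an assumption about how it is passed in.
ARG HAS_TRTLLM_CONTEXT=0

# Empty placeholder so the trtllm_wheel alias always resolves
FROM scratch AS trtllm_wheel_0

# Real wheel stage; trtllm_wheelhouse/ is an assumed build-context path
FROM scratch AS trtllm_wheel_1
COPY trtllm_wheelhouse/ .

# Pick which stage backs the alias; BuildKit skips the unreferenced one
FROM trtllm_wheel_${HAS_TRTLLM_CONTEXT} AS trtllm_wheel

# The existing instruction at line 135 then resolves as-is:
# COPY --from=trtllm_wheel . /trtllm_wheel/
```

Note that this relies on BuildKit pruning unused stages; with the legacy builder the trtllm_wheel_1 stage would still fail whenever the wheel directory is absent.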
🧹 Nitpick comments (2)
container/Dockerfile.trtllm (2)
411-414: Avoid uninstalling widely used pure-Python deps; make the uninstall idempotent and non-interactive

Uninstalling packaging and networkx is unnecessary and can create a mixed “copied” vs “installed” state that confuses tooling. If the goal is only to prefer NVIDIA's torch/triton, restrict the uninstall to those packages and make it non-interactive and idempotent.
Apply this diff:
-    uv pip install --no-cache --requirement /tmp/requirements.txt && \
-    echo "uninstall (networkx packaging torch triton) as we will use NVIDIA's versions later" && \
-    uv pip uninstall networkx packaging torch triton
+    uv pip install --no-cache --requirement /tmp/requirements.txt && \
+    echo "uninstalling (torch triton) to prefer NGC-provided builds" && \
+    uv pip uninstall -y torch triton || true
440-440: Drop or gate uv pip list debug hooks to keep layers lean

These no-op runs add build time and layers. Either remove them or guard them behind a DEBUG arg.
Apply this diff at each listed line:
-RUN uv pip list # TODO remove this line
+RUN if [ "${DEBUG_PIP_LIST:-0}" = "1" ]; then uv pip list; fi

Add this near the top of the runtime stage (outside this hunk):

# at the start of the Runtime stage
ARG DEBUG_PIP_LIST=0

Also applies to: 464-464, 485-485, 492-492, 500-501
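Putting the two halves of that suggestion together, a minimal sketch of the gated checkpoint; DEBUG_PIP_LIST is the hypothetical arg name used in the diff above, defaulted off so release builds stay lean:

```dockerfile
# Opt-in debug switch for pip listings in this stage
ARG DEBUG_PIP_LIST=0

# Each former `RUN uv pip list` checkpoint becomes a no-op unless enabled, e.g.:
#   docker build -f container/Dockerfile.trtllm --build-arg DEBUG_PIP_LIST=1 .
RUN if [ "${DEBUG_PIP_LIST:-0}" = "1" ]; then uv pip list; fi
```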
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
container/Dockerfile.trtllm (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build and Test - dynamo
COPY --from=build /usr/local/lib/lib* /usr/local/lib/
COPY --from=build /usr/local/lib/python3.12/dist-packages/torch /usr/local/lib/python3.12/dist-packages/torch
COPY --from=build /usr/local/lib/python3.12/dist-packages/torch-${TORCH_VER}.dist-info /usr/local/lib/python3.12/dist-packages/torch-${TORCH_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchgen /usr/local/lib/python3.12/dist-packages/torchgen
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchvision /usr/local/lib/python3.12/dist-packages/torchvision
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchvision-${TORCHVISION_VER}.dist-info /usr/local/lib/python3.12/dist-packages/torchvision-${TORCHVISION_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchvision.libs /usr/local/lib/python3.12/dist-packages/torchvision.libs
COPY --from=build /usr/local/lib/python3.12/dist-packages/setuptools /usr/local/lib/python3.12/dist-packages/setuptools
COPY --from=build /usr/local/lib/python3.12/dist-packages/setuptools-${SETUPTOOLS_VER}.dist-info /usr/local/lib/python3.12/dist-packages/setuptools-${SETUPTOOLS_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/functorch /usr/local/lib/python3.12/dist-packages/functorch
COPY --from=build /usr/local/lib/python3.12/dist-packages/triton /usr/local/lib/python3.12/dist-packages/triton
COPY --from=build /usr/local/lib/python3.12/dist-packages/pytorch_triton-${PYTORCH_TRITON_VER}.dist-info /usr/local/lib/python3.12/dist-packages/pytorch_triton-${PYTORCH_TRITON_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/jinja2 /usr/local/lib/python3.12/dist-packages/jinja2
COPY --from=build /usr/local/lib/python3.12/dist-packages/jinja2-${JINJA2_VER}.dist-info /usr/local/lib/python3.12/dist-packages/jinja2-${JINJA2_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/networkx /usr/local/lib/python3.12/dist-packages/networkx
COPY --from=build /usr/local/lib/python3.12/dist-packages/networkx-${NETWORKX_VER}.dist-info /usr/local/lib/python3.12/dist-packages/networkx-${NETWORKX_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/sympy /usr/local/lib/python3.12/dist-packages/sympy
COPY --from=build /usr/local/lib/python3.12/dist-packages/sympy-${SYMPY_VER}.dist-info /usr/local/lib/python3.12/dist-packages/sympy-${SYMPY_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/packaging /usr/local/lib/python3.12/dist-packages/packaging
COPY --from=build /usr/local/lib/python3.12/dist-packages/packaging-${PACKAGING_VER}.dist-info /usr/local/lib/python3.12/dist-packages/packaging-${PACKAGING_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/flash_attn /usr/local/lib/python3.12/dist-packages/flash_attn
COPY --from=build /usr/local/lib/python3.12/dist-packages/flash_attn-${FLASH_ATTN_VER}.dist-info /usr/local/lib/python3.12/dist-packages/flash_attn-${FLASH_ATTN_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/flash_attn_2_cuda.cpython-312-*-linux-gnu.so /usr/local/lib/python3.12/dist-packages/
COPY --from=build /usr/local/lib/python3.12/dist-packages/torch /opt/dynamo/venv/lib/python3.12/site-packages/torch
COPY --from=build /usr/local/lib/python3.12/dist-packages/torch-${TORCH_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/torch-${TORCH_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchgen /opt/dynamo/venv/lib/python3.12/site-packages/torchgen
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchvision /opt/dynamo/venv/lib/python3.12/site-packages/torchvision
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchvision-${TORCHVISION_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/torchvision-${TORCHVISION_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchvision.libs /opt/dynamo/venv/lib/python3.12/site-packages/torchvision.libs
COPY --from=build /usr/local/lib/python3.12/dist-packages/setuptools /opt/dynamo/venv/lib/python3.12/site-packages/setuptools
COPY --from=build /usr/local/lib/python3.12/dist-packages/setuptools-${SETUPTOOLS_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/setuptools-${SETUPTOOLS_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/functorch /opt/dynamo/venv/lib/python3.12/site-packages/functorch
COPY --from=build /usr/local/lib/python3.12/dist-packages/triton /opt/dynamo/venv/lib/python3.12/site-packages/triton
COPY --from=build /usr/local/lib/python3.12/dist-packages/pytorch_triton-${PYTORCH_TRITON_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/pytorch_triton-${PYTORCH_TRITON_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/jinja2 /opt/dynamo/venv/lib/python3.12/site-packages/jinja2
COPY --from=build /usr/local/lib/python3.12/dist-packages/jinja2-${JINJA2_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/jinja2-${JINJA2_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/networkx /opt/dynamo/venv/lib/python3.12/site-packages/networkx
COPY --from=build /usr/local/lib/python3.12/dist-packages/networkx-${NETWORKX_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/networkx-${NETWORKX_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/sympy /opt/dynamo/venv/lib/python3.12/site-packages/sympy
COPY --from=build /usr/local/lib/python3.12/dist-packages/sympy-${SYMPY_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/sympy-${SYMPY_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/packaging /opt/dynamo/venv/lib/python3.12/site-packages/packaging
COPY --from=build /usr/local/lib/python3.12/dist-packages/packaging-${PACKAGING_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/packaging-${PACKAGING_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/flash_attn /opt/dynamo/venv/lib/python3.12/site-packages/flash_attn
COPY --from=build /usr/local/lib/python3.12/dist-packages/flash_attn-${FLASH_ATTN_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/flash_attn-${FLASH_ATTN_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/flash_attn_2_cuda.cpython-312-*-linux-gnu.so /opt/dynamo/venv/lib/python3.12/site-packages/
RUN uv pip list # TODO remove this line
🛠️ Refactor suggestion
Brittle wholesale copy from system dist-packages into venv; reduce surface to essentials and avoid ABI drift
Copying system-wide libs and many third-party packages (setuptools, packaging, networkx, sympy, jinja2, triton) into the venv is fragile:
- Hard-coded dist-info versions will break when the base image updates.
- /usr/local/lib/lib* can clobber runtime CUDA/OS libs and cause subtle ABI issues.
- Mixing copied files with uv-managed packages makes upgrades, uninstalls, and debugging harder.
Prefer copying only the PyTorch stack you explicitly need (torch/torchgen/torchvision/functorch and flash_attn artifacts) and let uv manage generic Python deps. Also avoid copying triton from NGC if you intend to pin a specific triton for TRT-LLM later.
Slim the COPY set like this:
-RUN uv pip list # TODO remove this line
-COPY --from=build /usr/local/lib/lib* /usr/local/lib/
COPY --from=build /usr/local/lib/python3.12/dist-packages/torch /opt/dynamo/venv/lib/python3.12/site-packages/torch
COPY --from=build /usr/local/lib/python3.12/dist-packages/torch-${TORCH_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/torch-${TORCH_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchgen /opt/dynamo/venv/lib/python3.12/site-packages/torchgen
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchvision /opt/dynamo/venv/lib/python3.12/site-packages/torchvision
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchvision-${TORCHVISION_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/torchvision-${TORCHVISION_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/torchvision.libs /opt/dynamo/venv/lib/python3.12/site-packages/torchvision.libs
-COPY --from=build /usr/local/lib/python3.12/dist-packages/setuptools /opt/dynamo/venv/lib/python3.12/site-packages/setuptools
-COPY --from=build /usr/local/lib/python3.12/dist-packages/setuptools-${SETUPTOOLS_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/setuptools-${SETUPTOOLS_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/functorch /opt/dynamo/venv/lib/python3.12/site-packages/functorch
-COPY --from=build /usr/local/lib/python3.12/dist-packages/triton /opt/dynamo/venv/lib/python3.12/site-packages/triton
-COPY --from=build /usr/local/lib/python3.12/dist-packages/pytorch_triton-${PYTORCH_TRITON_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/pytorch_triton-${PYTORCH_TRITON_VER}.dist-info
-COPY --from=build /usr/local/lib/python3.12/dist-packages/jinja2 /opt/dynamo/venv/lib/python3.12/site-packages/jinja2
-COPY --from=build /usr/local/lib/python3.12/dist-packages/jinja2-${JINJA2_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/jinja2-${JINJA2_VER}.dist-info
-COPY --from=build /usr/local/lib/python3.12/dist-packages/networkx /opt/dynamo/venv/lib/python3.12/site-packages/networkx
-COPY --from=build /usr/local/lib/python3.12/dist-packages/networkx-${NETWORKX_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/networkx-${NETWORKX_VER}.dist-info
-COPY --from=build /usr/local/lib/python3.12/dist-packages/sympy /opt/dynamo/venv/lib/python3.12/site-packages/sympy
-COPY --from=build /usr/local/lib/python3.12/dist-packages/sympy-${SYMPY_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/sympy-${SYMPY_VER}.dist-info
-COPY --from=build /usr/local/lib/python3.12/dist-packages/packaging /opt/dynamo/venv/lib/python3.12/site-packages/packaging
-COPY --from=build /usr/local/lib/python3.12/dist-packages/packaging-${PACKAGING_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/packaging-${PACKAGING_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/flash_attn /opt/dynamo/venv/lib/python3.12/site-packages/flash_attn
COPY --from=build /usr/local/lib/python3.12/dist-packages/flash_attn-${FLASH_ATTN_VER}.dist-info /opt/dynamo/venv/lib/python3.12/site-packages/flash_attn-${FLASH_ATTN_VER}.dist-info
COPY --from=build /usr/local/lib/python3.12/dist-packages/flash_attn_2_cuda.cpython-312-*-linux-gnu.so /opt/dynamo/venv/lib/python3.12/site-packages/
-RUN uv pip list # TODO remove this line

If you want to future-proof the dist-info names, switch to a single RUN cp -a inside the runtime stage that uses wildcard globs (e.g., torch*, torchvision*, flash_attn*) rather than hard-coded versions.
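A hedged sketch of that glob-based variant; the /tmp/dist-packages staging path is illustrative, and the extra COPY still materializes the full dist-packages tree in an intermediate layer:

```dockerfile
# Stage the dist-packages tree once, then keep only the PyTorch stack; globs
# absorb whatever dist-info versions the base image happens to ship.
COPY --from=build /usr/local/lib/python3.12/dist-packages /tmp/dist-packages
RUN cd /tmp/dist-packages && \
    cp -a torch* functorch flash_attn* \
        /opt/dynamo/venv/lib/python3.12/site-packages/ && \
    rm -rf /tmp/dist-packages
```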
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In container/Dockerfile.trtllm around lines 441-464, the PR currently copies a
large set of system dist-packages and /usr/local/lib/lib* into the venv using
hard-coded dist-info names, which is brittle and risks ABI drift; replace those
COPY lines so you only bring the PyTorch runtime artifacts the image actually
needs (torch, torchgen, torchvision, functorch, flash_attn and the compiled .so
artifacts) and stop copying generic packages (setuptools, networkx, sympy,
packaging, jinja2, triton, etc.) or system libs. Implement this by removing the
/usr/local/lib/lib* COPY and the hard-coded dist-info COPYs, and instead either
(A) use wildcarded COPY/RUN cp -a inside the runtime stage to copy torch*
torchvision* functorch flash_attn and their associated .so files and dist-info
globs, or (B) only copy explicit package directories for the minimal PyTorch set
and let uv handle generic Python deps; also avoid copying triton from NGC unless
you intend to pin it, and remove the temporary "uv pip list" RUN line.
RUN uv pip list # TODO remove this line
RUN uv pip install --no-cache "cuda-python>=12,<13" && \
    uv pip install --no-cache --extra-index-url "${TENSORRTLLM_INDEX_URL}" "${TENSORRTLLM_PIP_WHEEL}" && \
    uv pip install --no-cache \
        /workspace/wheelhouse/ai_dynamo_runtime*cp312*.whl \
        /workspace/wheelhouse/ai_dynamo*any.whl \
        /workspace/wheelhouse/nixl*.whl
RUN uv pip list # TODO remove this line
Parity with build stage: ensure compatible triton for TRT-LLM on amd64
In build you pin triton==3.3.1 for amd64, but in runtime you don’t. Given TRT-LLM’s sensitivity to triton versions, this can lead to import/runtime failures if the copied PyTorch triton doesn’t match. Align runtime with build by installing triton==3.3.1 on amd64 after TRT-LLM.
Apply this diff:
-RUN uv pip list # TODO remove this line
-RUN uv pip install --no-cache "cuda-python>=12,<13" && \
- uv pip install --no-cache --extra-index-url "${TENSORRTLLM_INDEX_URL}" "${TENSORRTLLM_PIP_WHEEL}" && \
- uv pip install --no-cache \
+RUN uv pip install --no-cache "cuda-python>=12,<13" && \
+ uv pip install --no-cache --extra-index-url "${TENSORRTLLM_INDEX_URL}" "${TENSORRTLLM_PIP_WHEEL}" && \
+ if [ "$ARCH" = "amd64" ]; then uv pip install --no-cache "triton==3.3.1"; fi && \
+ uv pip install --no-cache \
/workspace/wheelhouse/ai_dynamo_runtime*cp312*.whl \
/workspace/wheelhouse/ai_dynamo*any.whl \
/workspace/wheelhouse/nixl*.whl
-RUN uv pip list # TODO remove this line

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
RUN uv pip install --no-cache "cuda-python>=12,<13" && \
    uv pip install --no-cache --extra-index-url "${TENSORRTLLM_INDEX_URL}" "${TENSORRTLLM_PIP_WHEEL}" && \
    if [ "$ARCH" = "amd64" ]; then uv pip install --no-cache "triton==3.3.1"; fi && \
    uv pip install --no-cache \
        /workspace/wheelhouse/ai_dynamo_runtime*cp312*.whl \
        /workspace/wheelhouse/ai_dynamo*any.whl \
        /workspace/wheelhouse/nixl*.whl
🤖 Prompt for AI Agents
In container/Dockerfile.trtllm around lines 485 to 493, the runtime stage does
not pin triton while the build stage pins triton==3.3.1 for amd64, which can
cause import/runtime failures; add a conditional install right after the TRT-LLM
pip installs that, when building for amd64, runs pip install --no-cache
triton==3.3.1 (using the same uv pip invocation pattern and environment) so the
runtime triton version matches the build; implement the condition using the
existing architecture/build-arg mechanism in the Dockerfile (e.g., check
TARGETARCH or similar) and keep the command order so triton is installed after
the TRT-LLM wheels are installed.
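As one way to express that condition, a sketch using BuildKit's automatically populated TARGETARCH arg (an assumption; the Dockerfile may instead expose its own ARCH variable, as in the committable suggestion above):

```dockerfile
# TARGETARCH is set by BuildKit per platform (e.g. amd64, arm64)
ARG TARGETARCH
# Re-pin triton on amd64 after the TRT-LLM wheels to mirror the build stage
RUN if [ "${TARGETARCH}" = "amd64" ]; then \
        uv pip install --no-cache "triton==3.3.1"; \
    fi
```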
This PR is stale because it has been open 30 days with no activity. Remove the stale label or comment, or this will be closed in 5 days.
Overview:
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit
New Features
Refactor
Chores
Bug Fixes