-
Couldn't load subscription status.
- Fork 660
build: Add TensorRT-LLM to optional dependency and corresponding instructions #2113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- `dynamo-run` is now a dev tool, replaced by python-only UX. - `mock_worker` always was a dev tool and should not be in the wheel. - `metrics` will be in the container but not the wheel. - `libdynamo_llm_capi.so` is no longer used, since vllm merged KV events upstream. This is a nice change because packaging binaries in a Python wheel is a pain. You have to worry about glibc versions and any other dylib dependencies. And CUDA. And it bloats the wheel.
This reverts commit c32cdb6.
This reverts commit a989a8f.
Signed-off-by: Anant Sharma <[email protected]>
WalkthroughThis change set restructures optional dependency groups in Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Dockerfile
participant WheelBuilder
participant RuntimeImage
User->>Dockerfile: Build container
Dockerfile->>WheelBuilder: Build ai-dynamo wheel (no hatch_build.py)
WheelBuilder-->>Dockerfile: Wheel with only Python packages
Dockerfile->>RuntimeImage: Install ai-dynamo wheel, copy metrics binary
RuntimeImage-->>User: Ready-to-use runtime (no C API bindings, simplified deps)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
container/Dockerfile.sglang (1)
119-120: Same duplication of-Ddisable_gds_backend=trueas TRT-LLM imageSee comment on TRT-LLM Dockerfile – consider a shared ARG or script.
Also applies to: 158-159
🧹 Nitpick comments (8)
pyproject.toml (2)
83-86: Duplicateduvloopentries across extrasEvery optional group re-declares
uvloop. When two extras are installed together, pip may pick incompatible versions if the dependency ever becomes range-bounded. Add a single root-level entry or use an internal table to centralise the requirement.
93-102: Static include list will miss future back-endsHard-coding every package path means adding a new backend requires touching
pyproject.toml.
Hatchling supports globbing:- "components/backends/trtllm/src/dynamo", - "components/backends/sglang/src/dynamo", - "components/backends/vllm/src/dynamo" + "components/backends/*/src/dynamo"Simplifies maintenance and avoids accidental omissions.
container/Dockerfile.tensorrt_llm (2)
109-120:disable_gds_backendflag duplicated – extract onceThe
if [ "$ARCH" = "arm64" ]block with-Ddisable_gds_backend=trueappears twice (source build and wheel build).
Factor this into a build-arg or environment variable to avoid divergence.Also applies to: 221-226
318-318:metricsbinary copied manually – package it insteadManually sprinkling
cp target/release/metrics /usr/local/binin multiple stages is brittle; any rename or new binary requires N Dockerfile edits.Proposed fix (single wheel already produced for Rust bindings):
- cp target/release/metrics /usr/local/bin + # Include metrics in the ai-dynamo wheel + maturin build --release -m lib/metrics/Cargo.toml --out /workspace/dist + uv pip install /workspace/dist/metrics*.whlKeeps runtime images in sync and removes duplicate COPY commands.
Also applies to: 478-483
README.md (1)
187-196: Spelling & fenced-code nitpicks
- “prerequites” / “prequisites” → prerequisites
- Add language identifiers to fenced code blocks to satisfy markdown-lint.
-## Install prerequites -``` +# ## Install prerequisites +```bashMinor, but improves docs polish.
container/Dockerfile.sglang (1)
368-369: Repeat manual copy ofmetricsbinarySame maintainability issue as in TRT-LLM Dockerfile – please package the binary or create a single COPY point.
Also applies to: 433-439
container/Dockerfile.vllm (2)
198-206: Consider hoisting the vLLM install into its own build stage for better layer cachingRe-building vLLM (or running
pip install) each time unrelated Dockerfile lines change invalidates the cache and lengthens the build. A small two-stage pattern keeps layers reusable:
- Build/install vLLM in an
as vllmstage.COPY --from=vllm /opt/vllm …(arm64) orCOPY --from=vllm /venv /venv(x86).This isolates the expensive step and improves CI turnaround time.
21-23: Version arg drift: keep single source of truth
ARG VLLM_VERSION="0.9.2"is hard-coded here whilepyproject.tomlnow pinsvllm==0.9.2in thevllmoptional group. Whenever one is bumped the other must be updated manually – an easy place for drift.A small helper script or build-arg passthrough from the
pyprojectmetadata (e.g.,grep -Po '(?<=vllm==)[^"]+') would ensure they stay in sync.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
Earthfile(1 hunks)README.md(2 hunks)components/backends/sglang/requirements.txt(0 hunks)components/backends/vllm/requirements.txt(0 hunks)container/Dockerfile.sglang(5 hunks)container/Dockerfile.tensorrt_llm(4 hunks)container/Dockerfile.vllm(8 hunks)deploy/sdk/src/dynamo/sdk/tests/test_e2e_args.py(0 hunks)deploy/sdk/src/dynamo/sdk/tests/test_e2e_config.py(0 hunks)deploy/sdk/src/dynamo/sdk/tests/test_link.py(0 hunks)deploy/sdk/src/dynamo/sdk/tests/test_resources.py(0 hunks)hatch_build.py(0 hunks)pyproject.toml(1 hunks)
💤 Files with no reviewable changes (7)
- deploy/sdk/src/dynamo/sdk/tests/test_e2e_config.py
- deploy/sdk/src/dynamo/sdk/tests/test_e2e_args.py
- components/backends/sglang/requirements.txt
- deploy/sdk/src/dynamo/sdk/tests/test_resources.py
- deploy/sdk/src/dynamo/sdk/tests/test_link.py
- hatch_build.py
- components/backends/vllm/requirements.txt
🧰 Additional context used
🪛 LanguageTool
README.md
[grammar] ~187-~187: Ensure spelling is correct
Context: ...ize=1g --ulimit memlock=-1` ## Install prerequites # Optional step: Only required for Blackwell and Grace Hopper pip3 install torch==2.7.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128 sudo apt-get -y install libopenmpi-dev > [!Tip] > You can learn more about these ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[grammar] ~196-~196: Ensure spelling is correct
Context: ...[!Tip] > You can learn more about these prequisites and known issues with TensorRT-LLM pip ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.17.2)
README.md
183-183: Blank line inside blockquote
(MD028, no-blanks-blockquote)
188-188: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
199-199: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
204-204: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (2)
pyproject.toml (1)
61-66: Pinning an RC build oftensorrt-llmis risky
1.0.0rc4is a release-candidate. Down-stream users who installai-dynamo[trtllm]will silently get a pre-release that can disappear or change APIs. Consider either:- "tensorrt-llm==1.0.0rc4" + # Prefer a stable official release; fall back to the rc only if no GA is available + "tensorrt-llm>=1.0.0,<2.0.0"or publish explicit guidance that only the RC is supported.
Earthfile (1)
108-108: LGTM – removal ofhatch_build.pyreferencesDeleting the copy step keeps the build context minimal and matches the project’s move away from the custom build hook.
Signed-off-by: Anant Sharma <[email protected]>
Signed-off-by: Anant Sharma <[email protected]>
Overview:
Follow-up PR to address TRTLLM support in dynamo package.
Summary by CodeRabbit
New Features
Bug Fixes
Chores