[TRTLLM-4721][test] Add qa test for llm-api #6727
📝 Walkthrough
The changes implement explicit backend selection and logging for the LLM API, defaulting to PyTorch. The logger level is temporarily set to "info" during backend detection. Integration tests and a test list are added to verify backend selection, argument types, and log outputs. Command-line backend selection is introduced for integration scripts.
Sequence Diagram(s)
sequenceDiagram
participant User
participant TestScript (_run_llmapi_llm.py)
participant LLM API (llmapi/llm.py)
participant Logger
User->>TestScript: Run with --backend (default: tensorrt)
TestScript->>LLM API: Instantiate LLM (backend param)
LLM API->>Logger: Save current log level
LLM API->>Logger: Set log level to info
alt Detect backend
LLM API->>Logger: Log backend info message
end
LLM API->>Logger: Restore original log level
LLM API-->>TestScript: LLM instance
TestScript->>LLM API: Generate output
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~18 minutes
Actionable comments posted: 1
🧹 Nitpick comments (3)
tensorrt_llm/llmapi/llm.py (1)
130-141: Backend selection logic is correct with clear logging.
The explicit backend selection and logging implementation looks good. Each backend case is properly handled with appropriate argument class selection and informative log messages.
For future maintainability, consider using a dictionary-based approach:
+ backend_config = {
+     "pytorch": (TorchLlmArgs, "Using LLM with PyTorch backend"),
+     "_autodeploy": (lambda: (AutoDeployLlmArgs if 'AutoDeployLlmArgs' in locals() else TorchLlmArgs), "Using LLM with AutoDeploy backend"),
+ }
+
+ if backend in backend_config:
+     llm_args_cls, message = backend_config[backend]
+     logger.info(message)
+     if callable(llm_args_cls):
+         llm_args_cls = llm_args_cls()
+ else:
+     logger.info("Using LLM with TensorRT backend")
+     llm_args_cls = TrtLlmArgs
tests/integration/defs/llmapi/test_llm_qa.py (2)
11-15: Fix docstring formatting.
The docstring should either be a single line or properly formatted as a multi-line docstring.
Apply this diff to fix the formatting:
- """
- Check that the default backend is PyTorch for v1.0 breaking change
- """
+ """Check that the default backend is PyTorch for v1.0 breaking change."""
46-70: Excellent logging verification test with minor formatting issue.
The test effectively validates backend logging by:
- Running the external script with different backend options
- Capturing and verifying specific log messages
- Testing both PyTorch and TensorRT backend logging
Fix the line length issue on line 69:
- assert "Using LLM with TensorRT backend" in tensorrt_output, f"Expected 'tensorrt' in logs, got: {tensorrt_output}"
+ assert "Using LLM with TensorRT backend" in tensorrt_output, \
+     f"Expected 'tensorrt' in logs, got: {tensorrt_output}"
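The run-and-capture pattern this test relies on can be sketched as follows. This is a minimal illustration, not the PR's test: a one-line inline child process stands in for `_run_llmapi_llm.py`, and the helper name `run_and_capture` is hypothetical. Only the log message text is taken from the review.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a child Python process and return stdout and stderr combined."""
    result = subprocess.run(
        [sys.executable, *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # log lines often go to stderr
        universal_newlines=True,  # decode the captured bytes as text
        check=True,  # raise if the child exits with a non-zero status
    )
    return result.stdout


# Stand-in for the real script: it only emits the expected log line.
CHILD_CODE = 'import sys; print("Using LLM with PyTorch backend", file=sys.stderr)'

pytorch_output = run_and_capture(["-c", CHILD_CODE])
assert "Using LLM with PyTorch backend" in pytorch_output, (
    f"Expected PyTorch backend log, got: {pytorch_output}")
```

Merging stderr into stdout keeps the assertion independent of which stream the logger writes to.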
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- tensorrt_llm/llmapi/llm.py (2 hunks)
- tests/integration/defs/llmapi/_run_llmapi_llm.py (1 hunks)
- tests/integration/defs/llmapi/test_llm_qa.py (1 hunks)
- tests/integration/test_lists/qa/llm_function_full.txt (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: Python code should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL).
Python constants should use upper snake_case (e.g., MY_CONSTANT).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a Python file, prefer docstrings over comments.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the class docstring.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.
Files:
- tensorrt_llm/llmapi/llm.py
- tests/integration/defs/llmapi/test_llm_qa.py
- tests/integration/defs/llmapi/_run_llmapi_llm.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.
Files:
- tensorrt_llm/llmapi/llm.py
- tests/integration/defs/llmapi/test_llm_qa.py
- tests/integration/defs/llmapi/_run_llmapi_llm.py
🧠 Learnings (5)
📓 Common learnings
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
Learnt from: galagam
PR: NVIDIA/TensorRT-LLM#6487
File: tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py:1-12
Timestamp: 2025-08-06T13:58:07.506Z
Learning: In TensorRT-LLM, test files (files under tests/ directories) do not require NVIDIA copyright headers, unlike production source code files. Test files typically start directly with imports, docstrings, or code.
📚 Learning: 2025-07-28T17:06:08.621Z
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
Applied to files:
- tensorrt_llm/llmapi/llm.py
- tests/integration/defs/llmapi/test_llm_qa.py
- tests/integration/test_lists/qa/llm_function_full.txt
- tests/integration/defs/llmapi/_run_llmapi_llm.py
📚 Learning: 2025-08-06T13:58:07.506Z
Learnt from: galagam
PR: NVIDIA/TensorRT-LLM#6487
File: tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py:1-12
Timestamp: 2025-08-06T13:58:07.506Z
Learning: In TensorRT-LLM, test files (files under tests/ directories) do not require NVIDIA copyright headers, unlike production source code files. Test files typically start directly with imports, docstrings, or code.
Applied to files:
- tensorrt_llm/llmapi/llm.py
- tests/integration/defs/llmapi/test_llm_qa.py
- tests/integration/test_lists/qa/llm_function_full.txt
- tests/integration/defs/llmapi/_run_llmapi_llm.py
📚 Learning: 2025-08-01T15:14:45.673Z
Learnt from: yibinl-nvidia
PR: NVIDIA/TensorRT-LLM#6506
File: examples/models/core/mixtral/requirements.txt:3-3
Timestamp: 2025-08-01T15:14:45.673Z
Learning: In TensorRT-LLM, examples directory can have different dependency versions than the root requirements.txt file. Version conflicts between root and examples dependencies are acceptable because examples are designed to be standalone and self-contained.
Applied to files:
- tests/integration/defs/llmapi/test_llm_qa.py
- tests/integration/test_lists/qa/llm_function_full.txt
- tests/integration/defs/llmapi/_run_llmapi_llm.py
📚 Learning: 2025-07-22T08:33:49.109Z
Learnt from: yiqingy0
PR: NVIDIA/TensorRT-LLM#5198
File: jenkins/mergeWaiveList.py:0-0
Timestamp: 2025-07-22T08:33:49.109Z
Learning: In the TensorRT-LLM waive list merging system, removed lines are always located at the end of the merge waive lists, which is why the mergeWaiveList.py script uses reverse traversal - it's an optimization for this specific domain constraint.
Applied to files:
tests/integration/test_lists/qa/llm_function_full.txt
🪛 Ruff (0.12.2)
tests/integration/defs/llmapi/test_llm_qa.py
12-13: One-line docstring should fit on one line
Reformat to one line
(D200)
12-13: First line should end with a period, question mark, or exclamation point
Add closing punctuation
(D415)
69-69: Line too long (123 > 120)
(E501)
🔇 Additional comments (8)
tensorrt_llm/llmapi/llm.py (1)
170-172: Proper cleanup in finally block.
The logger level restoration in the finally block is implemented correctly, ensuring the original level is restored even if exceptions occur during backend detection.
Note: This cleanup is still subject to the thread safety concerns mentioned earlier regarding global logger state modification.
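One way to keep the save/set/restore pattern in a single place is a small context manager. The sketch below uses the standard `logging` module as a stand-in; the TensorRT-LLM logger has its own interface, so `temporary_level` and the logger name here are hypothetical. It still mutates shared logger state, so it does not address the thread-safety concern.

```python
import contextlib
import logging


@contextlib.contextmanager
def temporary_level(logger, level):
    """Set `logger` to `level` inside the block, then restore the old level."""
    original_level = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Runs on normal exit and when the block raises.
        logger.setLevel(original_level)


demo_logger = logging.getLogger("backend_selection_demo")
demo_logger.setLevel(logging.WARNING)

with temporary_level(demo_logger, logging.INFO):
    demo_logger.info("Using LLM with PyTorch backend")

assert demo_logger.level == logging.WARNING
```

The `finally` clause is what guarantees restoration when backend detection raises part-way through.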
tests/integration/test_lists/qa/llm_function_full.txt (1)
674-679: Potential duplication in test list.
The test test_llm_args_logging appears twice (lines 676 and 679). Please verify if this duplication is intentional or if one of these entries should reference a different test method.
tests/integration/defs/llmapi/_run_llmapi_llm.py (4)
3-3: LGTM!
Proper import of Optional for type annotation.
7-8: LGTM!
Clean import of LLM classes from both backend modules to enable dynamic backend selection.
15-16: LGTM!
Proper addition of backend selection CLI option and function signature update with correct typing.
22-29: Excellent backend selection implementation.
The logic cleanly handles:
- Default fallback to "pytorch" for backward compatibility
- Proper validation of supported backends
- Dynamic class and argument selection based on backend type
- Correct handling of backend-specific requirements (BuildConfig for TensorRT)
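The selection flow listed above might look roughly like the sketch below. Everything in it is illustrative: `FakeTorchLLM`, `FakeTrtLLM`, and `select_backend` are hypothetical stand-ins for the real LLM classes and script logic, `argparse` is used only to keep the example self-contained, and the `build_config` keyword is a placeholder for the TensorRT-specific requirement. The "pytorch" fallback follows the comment above.

```python
import argparse
from typing import Optional


class _FakeLLM:
    """Hypothetical base class that just records its arguments."""

    def __init__(self, model, **kwargs):
        self.model = model
        self.kwargs = kwargs


class FakeTorchLLM(_FakeLLM):
    """Hypothetical stand-in for the PyTorch-backend LLM class."""


class FakeTrtLLM(_FakeLLM):
    """Hypothetical stand-in for the TensorRT-backend LLM class."""


def select_backend(backend: Optional[str]):
    """Return (llm_class, extra_kwargs) for a backend name."""
    backend = backend or "pytorch"  # fall back to the default backend
    table = {
        "pytorch": (FakeTorchLLM, {}),
        # Placeholder for the TensorRT-only build configuration.
        "tensorrt": (FakeTrtLLM, {"build_config": "placeholder"}),
    }
    if backend not in table:
        raise ValueError(f"Unsupported backend: {backend!r}")
    return table[backend]


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_dir", required=True)
    parser.add_argument("--backend", default=None)
    args = parser.parse_args(argv)
    llm_cls, extra_kwargs = select_backend(args.backend)
    return llm_cls(args.model_dir, **extra_kwargs)


llm = main(["--model_dir", "/tmp/model", "--backend", "tensorrt"])
print(type(llm).__name__)
```

Keeping the class and its extra arguments together in one table means adding a backend touches a single place, and an unknown name fails before any model is loaded.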
tests/integration/defs/llmapi/test_llm_qa.py (2)
1-9: LGTM!
Clean test file setup with appropriate imports and model path configuration.
16-44: Excellent test coverage for backend selection.
Both test methods provide comprehensive validation:
- Verify correct default backend selection (PyTorch)
- Validate backend-specific argument types (TorchLlmArgs vs TrtLlmArgs)
- Include functional testing with generation calls
- Use appropriately scoped imports within test methods
syuoni
left a comment
A bit concerned with the logging level intervention in llm.py, please take a look, thanks!
|
/bot run |
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/integration/defs/llmapi/test_llm_api_qa.py (2)
12-14: Fix docstring formatting.
The docstring has formatting issues that should be addressed.
- """
- Check that the default backend is PyTorch for v1.0 breaking change
- """
+ """Check that the default backend is PyTorch for v1.0 breaking change."""
46-70: Fix line length violation and improve formatting.
The test logic is sound, but there's a line length violation that needs to be addressed.
- # Check that tensorrt backend keyword appears in logs
- assert "Using LLM with TensorRT backend" in tensorrt_output, f"Expected 'tensorrt' in logs, got: {tensorrt_output}"
+ # Check that tensorrt backend keyword appears in logs
+ assert "Using LLM with TensorRT backend" in tensorrt_output, (
+     f"Expected 'tensorrt' in logs, got: {tensorrt_output}")
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- tensorrt_llm/llmapi/llm.py (2 hunks)
- tests/integration/defs/llmapi/_run_llmapi_llm.py (1 hunks)
- tests/integration/defs/llmapi/test_llm_api_qa.py (1 hunks)
- tests/integration/test_lists/qa/llm_function_full.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- tensorrt_llm/llmapi/llm.py
- tests/integration/test_lists/qa/llm_function_full.txt
- tests/integration/defs/llmapi/_run_llmapi_llm.py
🪛 Ruff (0.12.2)
tests/integration/defs/llmapi/test_llm_api_qa.py
12-13: One-line docstring should fit on one line
Reformat to one line
(D200)
12-13: First line should end with a period, question mark, or exclamation point
Add closing punctuation
(D415)
69-69: Line too long (123 > 120)
(E501)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (3)
tests/integration/defs/llmapi/test_llm_api_qa.py (3)
1-9: LGTM on imports and setup!
The imports are appropriate for an integration test, and the model path construction follows the expected pattern for accessing test models.
16-28: LGTM on default backend test!
The test correctly verifies that the default LLM instantiation uses PyTorch backend with appropriate argument types. The generation test ensures functional verification beyond just type checking.
30-44: LGTM on TensorRT backend test!
The test appropriately verifies TensorRT backend behavior with correct imports and type assertions. The flexible backend assertion (allowing both "tensorrt" and None) suggests proper handling of different TensorRT engine states.
|
PR_Github #14564 [ run ] triggered by Bot |
|
PR_Github #14564 [ run ] completed with state |
|
/bot run --disable-fail-fast |
|
PR_Github #14594 [ run ] triggered by Bot |
Actionable comments posted: 2
🧹 Nitpick comments (5)
tests/integration/defs/llmapi/test_llm_api_qa.py (5)
11-14: Docstring style nits (D200 / D415)
First line should be a single-line sentence ending with a period:
-"""
-Check that the default backend is PyTorch for v1.0 breaking change
-"""
+"""Verify that the default LLM backend is PyTorch (v1.0 breaking change)."""
24-26: Missing assertion on .generate() output
The loop only prints tokens; if generation silently fails the test still passes.
Capture a token or the full string and assert it is non-empty.
outputs = list(llm.generate(["Hello, world!"]))
assert outputs and outputs[0], "Generation returned empty output"
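To make the failure mode concrete, here is a small sketch with a hypothetical `FakeLLM` whose `generate` returns plain strings. The real API returns richer output objects, so an equivalent check there would need to inspect the generated text rather than the object itself.

```python
class FakeLLM:
    """Hypothetical stand-in; returns one string per prompt."""

    def __init__(self, broken=False):
        self.broken = broken

    def generate(self, prompts):
        # A broken model yields empty strings instead of raising.
        return ["" if self.broken else f"echo: {p}" for p in prompts]


def check_generation(llm):
    """Fail loudly when generation yields nothing or only empty text."""
    outputs = list(llm.generate(["Hello, world!"]))
    assert outputs and all(outputs), "Generation returned empty output"
    return outputs


print(check_generation(FakeLLM()))
```

A print-only loop would pass for `FakeLLM(broken=True)`, while `check_generation` raises an `AssertionError`.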
27-29: Remove print calls
Printing tokens clutters CI logs; rely on assertions instead.
Also applies to: 43-44
48-49: Prefer pathlib.Path for script path construction
Readability & platform safety.
script_path = Path(__file__).with_name("_run_llmapi_llm.py")
60-60: Line exceeds 120 chars
Break the long assertion line for compliance with style guides.
-assert "Using LLM with PyTorch backend" in pytorch_output, f"Expected 'pytorch' in logs, got: {pytorch_output}"
+msg = f"Expected 'pytorch' backend log, got: {pytorch_output}"
+assert "Using LLM with PyTorch backend" in pytorch_output, msg
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- tensorrt_llm/llmapi/llm.py (2 hunks)
- tests/integration/defs/llmapi/_run_llmapi_llm.py (1 hunks)
- tests/integration/defs/llmapi/test_llm_api_qa.py (1 hunks)
- tests/integration/test_lists/qa/llm_function_full.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/integration/defs/llmapi/_run_llmapi_llm.py
- tensorrt_llm/llmapi/llm.py
- tests/integration/test_lists/qa/llm_function_full.txt
🔇 Additional comments (1)
tests/integration/defs/llmapi/test_llm_api_qa.py (1)
40-41: Avoid accepting None for backend
Allowing None masks mis-configuration; the API should always expose the resolved backend string.
-assert llm.args.backend in ("tensorrt", None)
+assert llm.args.backend == "tensorrt"
|
PR_Github #14594 [ run ] completed with state |
|
/bot run |
Actionable comments posted: 3
♻️ Duplicate comments (1)
tests/integration/defs/llmapi/test_llm_api_qa.py (1)
6-9: Fix: don’t import conftest; derive model_path from a fixture (likely cause of CI failures).
Importing from conftest and calling a fixture-like helper at module import time breaks pytest collection and is fragile. Build the model_path inside tests via a real fixture and pathlib for cross-platform safety.
Apply this removal here (see method-level diffs below for additions):
-from ..conftest import llm_models_root
-
-model_path = llm_models_root() + "/llama-models-v3/llama-v3-8b-instruct-hf"
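A sketch of the lazy alternative: build the path inside a function with `pathlib`, so nothing is resolved at import time. The helper name `resolve_model_path` and the `LLM_MODELS_ROOT` environment lookup are assumptions made for illustration; in the real suite the root would come from the existing `llm_models_root` helper or a fixture wrapping it.

```python
import os
from pathlib import Path

# Relative location of the model under the models root (from the review).
MODEL_SUBDIR = Path("llama-models-v3") / "llama-v3-8b-instruct-hf"


def resolve_model_path(models_root=None):
    """Return the model directory, resolving the root only when called."""
    if models_root is None:
        # Assumed environment variable; read at call time, not import time.
        models_root = os.environ["LLM_MODELS_ROOT"]
    return Path(models_root) / MODEL_SUBDIR


print(resolve_model_path("/data/models").as_posix())
```

In pytest, the body of `resolve_model_path` could sit inside a fixture so each test requests the path as an argument instead of reading a module-level global.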
🧹 Nitpick comments (5)
tests/integration/defs/llmapi/test_llm_api_qa.py (5)
12-14: Docstring formatting: one line with punctuation.
Comply with D200/D415: keep it on one line and end with a period.
-class TestLlmDefaultBackend:
-    """
-    Check that the default backend is PyTorch for v1.0 breaking change
-    """
+class TestLlmDefaultBackend:
+    """Check that the default backend is PyTorch for the v1.0 breaking change."""
27-29: Avoid noisy test logs.
Consuming the generator is enough; printing in tests makes CI logs noisy.
- for output in llm.generate(["Hello, world!"]):
-     print(output)
+ for _ in llm.generate(["Hello, world!"]):
+     pass  # consume generator; avoid noisy test logs
43-45: Avoid noisy test logs.
Same as above; don’t print from tests.
- for output in llm.generate(["Hello, world!"]):
-     print(output)
+ for _ in llm.generate(["Hello, world!"]):
+     pass  # consume generator; avoid noisy test logs
60-61: Wrap the long assert message to satisfy E501 (<=120 chars).
Keeps the message readable and within the line-length limit.
- assert "Using LLM with PyTorch backend" in pytorch_output, f"Expected 'pytorch' in logs, got: {pytorch_output}"
+ assert "Using LLM with PyTorch backend" in pytorch_output, (
+     f"Expected 'pytorch' in logs, got: {pytorch_output}"
+ )
70-71: Wrap the long assert message to satisfy E501 (<=120 chars).
Same fix for the TensorRT case.
- assert "Using LLM with TensorRT backend" in tensorrt_output, f"Expected 'tensorrt' in logs, got: {tensorrt_output}"
+ assert "Using LLM with TensorRT backend" in tensorrt_output, (
+     f"Expected 'tensorrt' in logs, got: {tensorrt_output}"
+ )
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- tensorrt_llm/llmapi/llm.py (2 hunks)
- tests/integration/defs/llmapi/_run_llmapi_llm.py (1 hunks)
- tests/integration/defs/llmapi/test_llm_api_qa.py (1 hunks)
- tests/integration/test_lists/qa/llm_function_full.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- tensorrt_llm/llmapi/llm.py
- tests/integration/defs/llmapi/_run_llmapi_llm.py
- tests/integration/test_lists/qa/llm_function_full.txt
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: Python code should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL).
Python constants should use upper snake_case (e.g., MY_CONSTANT).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a Python file, prefer docstrings over comments.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the class docstring.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.
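The last two try-except guidelines can be illustrated with a minimal sketch. The function name `parse_port` and the port-range check are hypothetical and chosen only for illustration; they do not come from the TensorRT-LLM codebase.

```python
def parse_port(value):
    """Return value as a valid TCP port number, or None if it cannot be one."""
    try:
        # Only the call that can raise lives inside the try body.
        port = int(value)
    except (TypeError, ValueError):
        # Catch the narrowest set of errors int() is expected to raise here.
        return None
    else:
        # Logic that does not need guarding goes in the else block.
        return port if 0 < port < 65536 else None
```

Keeping the range check in the else block means an unexpected error raised there is not silently swallowed by the except clause.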
Files:
tests/integration/defs/llmapi/test_llm_api_qa.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.
Files:
tests/integration/defs/llmapi/test_llm_api_qa.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
Learnt from: galagam
PR: NVIDIA/TensorRT-LLM#6487
File: tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py:1-12
Timestamp: 2025-08-06T13:58:07.506Z
Learning: In TensorRT-LLM, test files (files under tests/ directories) do not require NVIDIA copyright headers, unlike production source code files. Test files typically start directly with imports, docstrings, or code.
📚 Learning: 2025-07-28T17:06:08.621Z
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
Applied to files:
tests/integration/defs/llmapi/test_llm_api_qa.py
📚 Learning: 2025-08-06T13:58:07.506Z
Learnt from: galagam
PR: NVIDIA/TensorRT-LLM#6487
File: tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py:1-12
Timestamp: 2025-08-06T13:58:07.506Z
Learning: In TensorRT-LLM, test files (files under tests/ directories) do not require NVIDIA copyright headers, unlike production source code files. Test files typically start directly with imports, docstrings, or code.
Applied to files:
tests/integration/defs/llmapi/test_llm_api_qa.py
📚 Learning: 2025-08-01T15:14:45.673Z
Learnt from: yibinl-nvidia
PR: NVIDIA/TensorRT-LLM#6506
File: examples/models/core/mixtral/requirements.txt:3-3
Timestamp: 2025-08-01T15:14:45.673Z
Learning: In TensorRT-LLM, examples directory can have different dependency versions than the root requirements.txt file. Version conflicts between root and examples dependencies are acceptable because examples are designed to be standalone and self-contained.
Applied to files:
tests/integration/defs/llmapi/test_llm_api_qa.py
🪛 Ruff (0.12.2)
tests/integration/defs/llmapi/test_llm_api_qa.py
12-13: One-line docstring should fit on one line
Reformat to one line
(D200)
12-13: First line should end with a period, question mark, or exclamation point
Add closing punctuation
(D415)
69-69: Line too long (123 > 120)
(E501)
🔇 Additional comments (1)
tests/integration/defs/llmapi/test_llm_api_qa.py (1)
16-16: Confirm whether llm_root is required.
Each test includes llm_root but doesn’t use it directly. If it’s only for side effects (env setup), keep it; otherwise remove it to reduce noise.
Also applies to: 30-30, 46-46
PR_Github #14669 [ run ] triggered by Bot
PR_Github #14669 [ run ] completed with state
Actionable comments posted: 0
🔭 Outside diff range comments (1)
tests/integration/defs/llmapi/test_llm_api_qa.py (1)
46-71: Logging test: remove debug print, use fixture + pathlib, and wrap long asserts
- Build paths with pathlib.
- Drop noisy print.
- Fix long-line E501 and make TRT assertion robust (accepts TensorRT or AutoDeploy message).
-    def test_llm_args_logging(self, llm_root, llm_venv):
+    def test_llm_args_logging(self, llm_root, llm_venv, llm_models_root):
@@
-        script_path = os.path.join(os.path.dirname(__file__),
-                                   "_run_llmapi_llm.py")
-        print(f"script_path: {script_path}")
+        import pathlib
+        script_path = str(pathlib.Path(__file__).parent / "_run_llmapi_llm.py")
+        model_path = str(
+            pathlib.Path(llm_models_root) / "llama-models-v3" / "llama-v3-8b-instruct-hf"
+        )
@@
-        pytorch_output = venv_check_output(llm_venv, pytorch_cmd)
+        pytorch_output = venv_check_output(llm_venv, pytorch_cmd)
@@
-        assert "Using LLM with PyTorch backend" in pytorch_output, f"Expected 'pytorch' in logs, got: {pytorch_output}"
+        expected_pt = "Using LLM with PyTorch backend"
+        assert expected_pt in pytorch_output, f"Missing '{expected_pt}' in logs"
@@
-        tensorrt_output = venv_check_output(llm_venv, tensorrt_cmd)
+        tensorrt_output = venv_check_output(llm_venv, tensorrt_cmd)
@@
-        assert "Using LLM with TensorRT backend" in tensorrt_output, f"Expected 'tensorrt' in logs, got: {tensorrt_output}"
+        expected_trt = ("Using LLM with TensorRT backend",
+                        "Using LLM with AutoDeploy backend")
+        assert any(s in tensorrt_output for s in expected_trt), \
+            f"Missing any of {expected_trt} in logs"
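The "accept any of several messages" check suggested above can be tried in isolation. This is a minimal sketch: the helper name `contains_any` and the sample string are made up for illustration, and only the two backend message strings are taken from the suggestion itself.

```python
# Message strings copied from the review suggestion above.
EXPECTED_TRT_MESSAGES = (
    "Using LLM with TensorRT backend",
    "Using LLM with AutoDeploy backend",
)


def contains_any(output, candidates):
    """Return True if at least one candidate substring occurs in output."""
    return any(candidate in output for candidate in candidates)


# Stand-in for captured subprocess output; not a real TensorRT-LLM log line.
sample_output = "... Using LLM with TensorRT backend ..."
assert contains_any(sample_output, EXPECTED_TRT_MESSAGES), (
    f"Missing any of {EXPECTED_TRT_MESSAGES} in logs"
)
```

Wrapping the failure message in parentheses keeps the assertion within the line-length limit without a backslash continuation.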
♻️ Duplicate comments (3)
tests/integration/defs/llmapi/test_llm_api_qa.py (3)
6-8: Critical: avoid importing conftest and building globals from it
Importing conftest as a module is brittle and can break test collection. Also, computing model_path at module import time ties tests to environment state. Use the llm_models_root fixture and build the path inside each test with pathlib.
-from ..conftest import llm_models_root
-
-model_path = llm_models_root() + "/llama-models-v3/llama-v3-8b-instruct-hf"
+# model_path is derived inside each test from the llm_models_root fixture.
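As a rough sketch of the path construction proposed here, the helper below joins the two directory names from the diff onto a root directory. The function name `build_model_path` is hypothetical; in the actual test the root would come from the llm_models_root fixture rather than a literal string.

```python
import pathlib


def build_model_path(models_root):
    """Join the model sub-directories onto models_root using pathlib."""
    return str(
        pathlib.Path(models_root) / "llama-models-v3" / "llama-v3-8b-instruct-hf"
    )


# The root below is a placeholder, not a real models directory.
example_path = build_model_path("models")
assert pathlib.Path(example_path).name == "llama-v3-8b-instruct-hf"
```

Calling the helper inside each test, rather than at module import time, means test collection does not depend on the models directory being configured.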
16-26: Default backend test: use fixture + pathlib and module-namespace imports
Aligns with repo guidelines and fixes cross-platform pathing.
-    def test_llm_args_type_default(self, llm_root, llm_venv):
-        # Keep the complete example code here
-        from tensorrt_llm.llmapi import LLM, KvCacheConfig, TorchLlmArgs
+    def test_llm_args_type_default(self, llm_root, llm_venv, llm_models_root):
+        # Keep the complete example code here
+        import pathlib
+        import tensorrt_llm.llmapi as llmapi
@@
-        kv_cache_config = KvCacheConfig(free_gpu_memory_fraction=0.4)
-        llm = LLM(model=model_path, kv_cache_config=kv_cache_config)
+        model_path = str(
+            pathlib.Path(llm_models_root) / "llama-models-v3" / "llama-v3-8b-instruct-hf"
+        )
+        kv_cache_config = llmapi.KvCacheConfig(free_gpu_memory_fraction=0.4)
+        llm = llmapi.LLM(model=model_path, kv_cache_config=kv_cache_config)
@@
-        assert isinstance(llm.args, TorchLlmArgs)
+        assert isinstance(llm.args, llmapi.TorchLlmArgs)
30-41: TensorRT test: mirror fixture + pathlib and module-namespace patterns
Keep import namespaces and build model_path from fixture.
-    def test_llm_args_type_tensorrt(self, llm_root, llm_venv):
-        # Keep the complete example code here
-        from tensorrt_llm._tensorrt_engine import LLM
-        from tensorrt_llm.llmapi import KvCacheConfig, TrtLlmArgs
+    def test_llm_args_type_tensorrt(self, llm_root, llm_venv, llm_models_root):
+        # Keep the complete example code here
+        import pathlib
+        import tensorrt_llm._tensorrt_engine as trt_engine
+        import tensorrt_llm.llmapi as llmapi
@@
-        kv_cache_config = KvCacheConfig(free_gpu_memory_fraction=0.4)
+        model_path = str(
+            pathlib.Path(llm_models_root) / "llama-models-v3" / "llama-v3-8b-instruct-hf"
+        )
+        kv_cache_config = llmapi.KvCacheConfig(free_gpu_memory_fraction=0.4)
@@
-        llm = LLM(model=model_path, kv_cache_config=kv_cache_config)
+        llm = trt_engine.LLM(model=model_path, kv_cache_config=kv_cache_config)
@@
-        assert isinstance(llm.args, TrtLlmArgs)
+        assert isinstance(llm.args, llmapi.TrtLlmArgs)
🧹 Nitpick comments (3)
tests/integration/defs/llmapi/test_llm_api_qa.py (3)
4-4: Follow guideline: keep module namespace when importing
Prefer importing the module namespace and referencing attributes to comply with project Python import style.
-from defs.common import venv_check_output
+import defs.common as common
Then update usages:
-        pytorch_output = venv_check_output(llm_venv, pytorch_cmd)
+        pytorch_output = common.venv_check_output(llm_venv, pytorch_cmd)
@@
-        tensorrt_output = venv_check_output(llm_venv, tensorrt_cmd)
+        tensorrt_output = common.venv_check_output(llm_venv, tensorrt_cmd)
Please verify the import path resolution in your test harness (sys.path) still supports import defs.common as common.
12-14: Docstring style: single-line with ending punctuation (D200, D415)
-    """
-    Check that the default backend is PyTorch for v1.0 breaking change
-    """
+    """Verify default backend is PyTorch for v1.0 breaking change."""
27-29: Drop prints in tests
Printing generation outputs creates noisy CI logs without assertions. Iterate to exercise the path without emitting output.
-        for output in llm.generate(["Hello, world!"]):
-            print(output)
+        for _ in llm.generate(["Hello, world!"]):
+            pass
Also applies to: 43-45
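One way to exercise a generation path silently while still asserting something is to count the results instead of printing them. This is only a sketch: `fake_generate` is a hypothetical stand-in for llm.generate so the snippet runs without a model, and `count_results` is an illustrative helper name.

```python
def fake_generate(prompts):
    """Hypothetical stand-in for llm.generate: yields one result per prompt."""
    for prompt in prompts:
        yield f"echo: {prompt}"


def count_results(results):
    """Consume every result without printing and return how many were produced."""
    count = 0
    for _ in results:
        count += 1
    return count


prompts = ["Hello, world!"]
# One result per prompt is an assumption of the stand-in, not of the real API.
assert count_results(fake_generate(prompts)) == len(prompts)
```

Compared with a bare loop containing `pass`, this keeps CI logs quiet and still fails if nothing is generated.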
🧠 Learnings (8)
📓 Common learnings
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
Learnt from: galagam
PR: NVIDIA/TensorRT-LLM#6487
File: tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py:1-12
Timestamp: 2025-08-06T13:58:07.506Z
Learning: In TensorRT-LLM, test files (files under tests/ directories) do not require NVIDIA copyright headers, unlike production source code files. Test files typically start directly with imports, docstrings, or code.
📚 Learning: 2025-07-28T17:06:08.621Z
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
Applied to files:
tests/integration/defs/llmapi/test_llm_api_qa.py
📚 Learning: 2025-08-06T13:58:07.506Z
Learnt from: galagam
PR: NVIDIA/TensorRT-LLM#6487
File: tests/unittest/_torch/auto_deploy/unit/singlegpu/test_ad_trtllm_bench.py:1-12
Timestamp: 2025-08-06T13:58:07.506Z
Learning: In TensorRT-LLM, test files (files under tests/ directories) do not require NVIDIA copyright headers, unlike production source code files. Test files typically start directly with imports, docstrings, or code.
Applied to files:
tests/integration/defs/llmapi/test_llm_api_qa.py
📚 Learning: 2025-08-01T15:14:45.673Z
Learnt from: yibinl-nvidia
PR: NVIDIA/TensorRT-LLM#6506
File: examples/models/core/mixtral/requirements.txt:3-3
Timestamp: 2025-08-01T15:14:45.673Z
Learning: In TensorRT-LLM, examples directory can have different dependency versions than the root requirements.txt file. Version conflicts between root and examples dependencies are acceptable because examples are designed to be standalone and self-contained.
Applied to files:
tests/integration/defs/llmapi/test_llm_api_qa.py
📚 Learning: 2025-07-22T09:22:14.726Z
Learnt from: yechank-nvidia
PR: NVIDIA/TensorRT-LLM#6254
File: tensorrt_llm/_torch/pyexecutor/model_engine.py:1201-1204
Timestamp: 2025-07-22T09:22:14.726Z
Learning: In TensorRT-LLM's multimodal processing pipeline, shared tensor recovery using `from_shared_tensor()` is only needed during the context phase. Generation requests reuse the already-recovered tensor data and only need to call `strip_for_generation()` to remove unnecessary multimodal data while preserving the recovered tensors. This avoids redundant tensor recovery operations during generation.
Applied to files:
tests/integration/defs/llmapi/test_llm_api_qa.py
📚 Learning: 2025-08-08T04:10:18.987Z
Learnt from: djns99
PR: NVIDIA/TensorRT-LLM#6728
File: cpp/tensorrt_llm/plugins/mixtureOfExperts/mixtureOfExpertsPlugin.cpp:966-966
Timestamp: 2025-08-08T04:10:18.987Z
Learning: TensorRT plugins currently don't support padding functionality, and TensorRT is not getting new features (in maintenance mode). This means that duplicating parameters like mExpertHiddenSize in function calls, even with TODO comments, can be acceptable as pragmatic solutions within these constraints.
Applied to files:
tests/integration/defs/llmapi/test_llm_api_qa.py
📚 Learning: 2025-08-06T03:47:16.802Z
Learnt from: venkywonka
PR: NVIDIA/TensorRT-LLM#6650
File: tests/integration/test_lists/qa/llm_perf_cluster.yml:33-37
Timestamp: 2025-08-06T03:47:16.802Z
Learning: Ministral is a valid model name from Mistral AI, distinct from the regular Mistral models. In TensorRT-LLM test configurations, "ministral_8b" and "ministral_8b_fp8" are correct model identifiers and should not be changed to "mistral_8b".
Applied to files:
tests/integration/defs/llmapi/test_llm_api_qa.py
📚 Learning: 2025-08-08T05:06:31.537Z
Learnt from: sklevtsov-nvidia
PR: NVIDIA/TensorRT-LLM#3294
File: cpp/tensorrt_llm/cutlass_extensions/include/cutlass_extensions/epilogue/fusion/sm90_visitor_scatter.hpp:36-36
Timestamp: 2025-08-08T05:06:31.537Z
Learning: CUTLASS extension files (under cpp/tensorrt_llm/cutlass_extensions/) follow CUTLASS coding style conventions, including using #pragma once instead of TRTLLM_ prefixed header guards, even though they are .hpp files.
Applied to files:
tests/integration/defs/llmapi/test_llm_api_qa.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
/bot run
PR_Github #14701 [ run ] triggered by Bot
PR_Github #14701 [ run ] completed with state
Signed-off-by: Superjomn <[email protected]>
/bot run --disable-fail-fast
PR_Github #14707 [ run ] triggered by Bot
Actionable comments posted: 4
♻️ Duplicate comments (3)
tests/integration/defs/llmapi/test_llm_api_qa.py (3)
6-8: Do not import from conftest; compute model_path via fixture inside tests.
Importing conftest breaks pytest collection, and computing model_path at module import time couples discovery to the environment. Pass llm_models_root as a fixture argument and build the path with pathlib inside each test.
Apply this removal locally; follow-up diffs below adjust each test:
-from ..conftest import llm_models_root
-
-model_path = llm_models_root() + "/llama-models-v3/llama-v3-8b-instruct-hf"
46-55: Logging test: inject fixture, drop debug print, and use pathlib for model_path.
This aligns with fixtures-first and cross-platform paths.
-    def test_llm_args_logging(self, llm_root, llm_venv):
+    def test_llm_args_logging(self, llm_root, llm_venv, llm_models_root):
         # It should print the backend in the log
-        script_path = os.path.join(os.path.dirname(__file__),
-                                   "_run_llmapi_llm.py")
-        print(f"script_path: {script_path}")
+        import pathlib
+        script_path = os.path.join(os.path.dirname(__file__), "_run_llmapi_llm.py")
+        model_path = str(
+            pathlib.Path(llm_models_root) / "llama-models-v3" / "llama-v3-8b-instruct-hf"
+        )
16-25: Use fixtures, pathlib, and module-namespace imports per guidelines.
- Inject llm_models_root fixture.
- Use pathlib for cross-platform paths.
- Keep module namespace on imports and update type references.
-    def test_llm_args_type_default(self, llm_root, llm_venv):
+    def test_llm_args_type_default(self, llm_root, llm_venv, llm_models_root):
         # Keep the complete example code here
-        from tensorrt_llm.llmapi import LLM, KvCacheConfig, TorchLlmArgs
+        import pathlib
+        import tensorrt_llm.llmapi as llmapi
-
-        kv_cache_config = KvCacheConfig(free_gpu_memory_fraction=0.4)
-        llm = LLM(model=model_path, kv_cache_config=kv_cache_config)
+        model_path = str(
+            pathlib.Path(llm_models_root) / "llama-models-v3" / "llama-v3-8b-instruct-hf"
+        )
+        kv_cache_config = llmapi.KvCacheConfig(free_gpu_memory_fraction=0.4)
+        llm = llmapi.LLM(model=model_path, kv_cache_config=kv_cache_config)
@@
-        assert llm.args.backend == "pytorch"
-        assert isinstance(llm.args, TorchLlmArgs)
+        assert llm.args.backend == "pytorch"
+        assert isinstance(llm.args, llmapi.TorchLlmArgs)
🧹 Nitpick comments (4)
tests/integration/defs/llmapi/test_llm_api_qa.py (4)
12-14: Docstring style: one line and end with punctuation (D200, D415).
Make the class docstring a single line ending with a period.
-    """
-    Check that the default backend is PyTorch for v1.0 breaking change
-    """
+    """Check that the default backend is PyTorch for the v1.0 breaking change."""
27-29: Avoid printing in tests; keep execution minimal.
Reduce runtime and log noise; just touch generate once.
-        for output in llm.generate(["Hello, world!"]):
-            print(output)
+        for _ in llm.generate(["Hello, world!"]):
+            break  # exercise the path without spamming CI logs
43-45: Remove prints to keep tests clean.
Same rationale as the PyTorch test.
-        for output in llm.generate(["Hello, world!"]):
-            print(output)
+        for _ in llm.generate(["Hello, world!"]):
+            break
59-61: Break long assertion and relax message match to be less brittle (E501).
Keep under 120 chars and avoid hard-coding the full sentence; the exact wording can drift.
-        assert "Using LLM with PyTorch backend" in pytorch_output, f"Expected 'pytorch' in logs, got: {pytorch_output}"
+        assert (
+            "PyTorch backend" in pytorch_output
+        ), f"Expected 'pytorch' in logs, got: {pytorch_output}"
PR_Github #14707 [ run ] completed with state
Signed-off-by: Superjomn <[email protected]>
Signed-off-by: Superjomn <[email protected]> Signed-off-by: Wangshanshan <[email protected]>
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Description
Test Coverage
GitHub Bot Help
/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...
Provide a user friendly way for developers to interact with a Jenkins server.
Run /bot [-h|--help] to print this help message.
See details below for each supported subcommand.
run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]
Launch build/test pipelines. All previously running jobs will be killed.
- --reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.
- --disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.
- --disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.
- --skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.
- --stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.
- --gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.
- --test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.
- --only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.
- --disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.
- --add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.
- --post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.
- --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".
- --detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.
- --debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.
For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md and the scripts/test_to_stage_mapping.py helper.
kill
kill
Kill all running builds associated with pull request.
skip
skip --comment COMMENT
Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
reuse-pipeline
reuse-pipeline
Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.