Skip to content

Conversation

@crazydemo
Copy link
Collaborator

@crazydemo crazydemo commented Jul 24, 2025

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Introduced structured timeout management for integration tests, allowing time limits to be set and enforced for major test phases.
    • Added a utility class and fixtures for managing and tracking timeouts during test execution.
  • Improvements

    • Enhanced test functions to support timeout management, improving control over test durations.
    • Allowed passing timeout parameters to subprocess calls within tests for more robust execution.
    • Wrapped key subprocess operations in timeout contexts across multiple test suites to enforce time limits.
  • Chores

    • Updated test list annotations to adjust and increase timeout durations for certain tests.

Description

This PR adds timeout management system to handle TensorRT flow test cases with proper resource cleanup.

  1. Added TimeoutManager Class
  • Manages TensorRT flow test cases with proper timeout handling
  • Ensures resource cleanup for test operations
  1. Added timeout_manager Fixture
  • Easy integration with test cases via fixture parameter
  • Provides granular timeout control for different test phases

Test Coverage

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md
and the scripts/test_to_stage_mapping.py helper.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jul 24, 2025

📝 Walkthrough

Walkthrough

Timeout management was integrated into the integration test framework. A TimeoutManager utility was introduced, and test harnesses, test functions, and subprocess-invoking utilities were updated to support structured timeout handling. Pytest fixtures were added to supply timeout values from markers or command-line options, and test list timeout annotations were adjusted.

Changes

File(s) Change Summary
tests/integration/defs/accuracy/accuracy_core.py CliFlowAccuracyTestHarness.run updated to accept and use an optional timeout_manager for phase-level timeouts.
tests/integration/defs/accuracy/test_cli_flow.py Test methods updated to accept and forward timeout_manager.
tests/integration/defs/common.py MPI wrapper and conversion/quantization utilities updated to accept and forward timeout and arbitrary kwargs.
tests/integration/defs/conftest.py Added fixtures: timeout_from_marker, timeout_from_command_line, and timeout_manager for timeout injection.
tests/integration/defs/examples/test_commandr.py
tests/integration/defs/examples/test_gpt.py
tests/integration/defs/examples/test_exaone.py
tests/integration/defs/examples/test_llama.py
Test functions updated to accept timeout_manager and wrap subprocess calls in timed contexts with timeouts enforced.
tests/integration/defs/utils/init.py New module for test utilities; exports timeout management API.
tests/integration/defs/utils/timeout_manager.py New TimeoutManager class, factory, and decorator for managing and injecting timeouts in tests.
tests/integration/test_lists/qa/examples_test_list.txt Timeout annotations and durations updated for several tests.

Sequence Diagram(s)

sequenceDiagram
    participant pytest
    participant Fixture as timeout_manager fixture
    participant Test as Test Function
    participant Timeout as TimeoutManager
    participant Subprocess as Subprocess Call

    pytest->>Fixture: Request timeout_manager
    Fixture->>Timeout: Instantiate TimeoutManager (from marker/CLI)
    pytest->>Test: Inject timeout_manager
    loop For each major test phase
        Test->>Timeout: Enter timed_operation("phase")
        Timeout->>Subprocess: Call with timeout=remaining_timeout
        Subprocess-->>Timeout: Return result or raise on timeout
        Timeout->>Test: Exit timed_operation (update remaining timeout)
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • litaotju
  • yilin-void
  • syuoni
  • yuxianq

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai coderabbitai bot requested review from EmmaQiaoCh and tburt-nv July 24, 2025 18:25
@crazydemo
Copy link
Collaborator Author

/bot run --stage-list "A10-PyTorch-1, H100_PCIe-PyTorch-2"

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
tests/integration/defs/conftest.py (1)

371-384: Fix the priority comment to match the actual implementation.

The comment states "command line > marker > config" but the implementation prioritizes marker over command line (lines 379-382). This inconsistency could confuse future maintainers.

Apply this diff to correct the comment:

-    """Create a TimeoutManager instance with priority: command line > marker > config."""
+    """Create a TimeoutManager instance with priority: marker > command line."""

Alternatively, if the intended priority is command line over marker, update the implementation:

-    if timeout_from_marker is not None:
+    if timeout_from_command_line is not None:
+        timeout_value = timeout_from_command_line
+    elif timeout_from_marker is not None:
         timeout_value = timeout_from_marker
-    elif timeout_from_command_line is not None:
-        timeout_value = timeout_from_command_line
🧹 Nitpick comments (2)
tests/integration/defs/examples/test_gpt.py (1)

640-641: Good integration of timeout management parameter.

The addition of timeout_manager parameter aligns well with the PR's timeout management objectives.

Note: The docstring on line 642 should end with a period per static analysis hint D415.

-    "Build & Run GPT-3 175B: 96 layer w/ plugins"
+    "Build & Run GPT-3 175B: 96 layer w/ plugins."
tests/integration/defs/accuracy/accuracy_core.py (1)

704-756: Excellent timeout management integration with backward compatibility.

The implementation effectively integrates timeout management while maintaining backward compatibility. The conditional approach cleanly separates timeout-managed execution from the original behavior, and each phase is properly wrapped with descriptive operation names.

The parameter forwarding is consistent across both execution paths, ensuring no functionality is lost when timeout management is disabled.

Fix the docstring formatting per static analysis hint D205:

 def run(self,
         tasks: Optional[List[AccuracyTask]] = None,
         dtype: str = 'auto',
         quant_algo: Optional[str] = None,
         kv_cache_quant_algo: Optional[str] = None,
         spec_dec_algo: Optional[str] = None,
         extra_acc_spec: Optional[str] = None,
         tp_size: int = 1,
         pp_size: int = 1,
         cp_size: int = 1,
         extra_convert_args: Optional[list] = None,
         extra_build_args: Optional[list] = None,
         extra_summarize_args: Optional[list] = None,
         extra_eval_long_context_args: Optional[list] = None,
         env: Optional[Dict[str, str]] = None,
         timeout_manager=None):
     """
     Run all accuracy test phases with timeout management.
+
     If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
     """
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0df758e and f9cfd2f.

📒 Files selected for processing (11)
  • tests/integration/defs/accuracy/accuracy_core.py (1 hunks)
  • tests/integration/defs/accuracy/test_cli_flow.py (3 hunks)
  • tests/integration/defs/common.py (7 hunks)
  • tests/integration/defs/conftest.py (1 hunks)
  • tests/integration/defs/examples/test_commandr.py (3 hunks)
  • tests/integration/defs/examples/test_exaone.py (4 hunks)
  • tests/integration/defs/examples/test_gpt.py (1 hunks)
  • tests/integration/defs/examples/test_llama.py (4 hunks)
  • tests/integration/defs/utils/__init__.py (1 hunks)
  • tests/integration/defs/utils/timeout_manager.py (1 hunks)
  • tests/integration/test_lists/qa/examples_test_list.txt (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: The code developed for TensorRT-LLM should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile = ...).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL = ...).
Python constants should use upper snake_case (e.g., MY_CONSTANT = ...).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a file, prefer docstrings over comments in Python.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the docstring for the class.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.

Files:

  • tests/integration/defs/utils/__init__.py
  • tests/integration/defs/common.py
  • tests/integration/defs/examples/test_commandr.py
  • tests/integration/defs/examples/test_exaone.py
  • tests/integration/defs/accuracy/test_cli_flow.py
  • tests/integration/defs/conftest.py
  • tests/integration/defs/examples/test_llama.py
  • tests/integration/defs/examples/test_gpt.py
  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/utils/timeout_manager.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. The block should be prepended to the top of all files, including .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Files:

  • tests/integration/defs/utils/__init__.py
  • tests/integration/defs/common.py
  • tests/integration/defs/examples/test_commandr.py
  • tests/integration/defs/examples/test_exaone.py
  • tests/integration/defs/accuracy/test_cli_flow.py
  • tests/integration/defs/conftest.py
  • tests/integration/defs/examples/test_llama.py
  • tests/integration/defs/examples/test_gpt.py
  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/utils/timeout_manager.py
🪛 Ruff (0.12.2)
tests/integration/defs/examples/test_gpt.py

640-640: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

tests/integration/defs/accuracy/accuracy_core.py

706-708: 1 blank line required between summary line and description

(D205)

🔇 Additional comments (38)
tests/integration/defs/utils/__init__.py (1)

1-28: LGTM! Clean package initialization following Python best practices.

The package initialization is well-structured with proper copyright header, clear documentation, and explicit API definition via __all__. The imports from the timeout_manager submodule are clean and the module serves its intended purpose as a utility package for the timeout management system.

tests/integration/test_lists/qa/examples_test_list.txt (2)

18-18: Timeout annotation added for commandr test.

The 120-second timeout annotation aligns with the broader timeout management system being introduced to prevent test hangs during intensive workloads.


22-31: Consistent timeout annotations for enc_dec tests.

All enc_dec tests now have uniform 90-second timeout annotations, providing structured timeout handling that aligns with the new timeout management framework. The timeout duration appears appropriate for these test types.

tests/integration/defs/accuracy/test_cli_flow.py (2)

1158-1166: Timeout management integration for fp8_tp2pp2 test.

The addition of the timeout_manager parameter and its forwarding to self.run() properly integrates structured timeout handling without affecting the core test logic. This aligns with the broader timeout management framework being introduced.


1176-1189: Timeout management integration for int8_plugin_tp8 test.

The timeout_manager parameter addition follows the same pattern as other tests, enabling structured timeout handling for this resource-intensive test method. The implementation is clean and maintains the existing test functionality.

tests/integration/defs/examples/test_exaone.py (8)

36-36: LGTM! Consistent timeout management integration.

The timeout_manager parameter has been properly added to both test functions, enabling structured timeout handling for TensorRT flow test cases as intended by the PR objectives.


42-53: Effective timeout management for weight conversion.

The weight conversion operation is properly wrapped with timeout_manager.timed_operation("convert") and the remaining timeout is correctly passed to the convert_weights function. This ensures proper timeout tracking and enforcement during the conversion phase.


56-66: Consistent timeout handling for engine building.

The engine building phase is correctly wrapped with the timeout manager using the "build" label, and timeout_manager.remaining_timeout is properly passed to the check_call function. This maintains consistency with the overall timeout management pattern.


88-91: Proper timeout management for summary execution.

The summary execution is appropriately wrapped with timeout_manager.timed_operation("summary") and the timeout is correctly forwarded to venv_check_call. This completes the timeout management for all three critical phases of the test.


103-104: Consistent parameter handling across test functions.

The second test function test_llm_exaone_2gpu correctly includes the timeout_manager parameter, maintaining consistency with the first function. The parameter positioning is appropriate.


111-123: Proper timeout management implementation for 2GPU test.

The weight conversion in the 2GPU test follows the same timeout management pattern as the 1GPU test, with proper context manager usage and timeout parameter forwarding. The implementation is consistent and correct.


126-134: Consistent engine building timeout handling.

The engine building phase in the 2GPU test properly implements the same timeout management pattern, using the "build" label and forwarding the remaining timeout to the subprocess call.


150-154: Appropriate timeout management for MPI execution.

The summary execution for the 2GPU test correctly uses venv_mpi_check_call with proper timeout forwarding, maintaining consistency with the timeout management pattern while using the appropriate MPI-enabled function for multi-GPU execution.

tests/integration/defs/common.py (8)

46-46: **Proper kwargs forwarding for MPI functions.

The addition of **kwargs parameter to venv_mpi_check_call enables passing additional parameters (including timeout) through the MPI wrapper function. This change aligns with the timeout management requirements.


63-63: Consistent kwargs forwarding implementation.

The venv.run_cmd call now properly forwards the **kwargs, ensuring that timeout and other parameters can be passed through to the underlying command execution.


66-66: Matching kwargs support for MPI output function.

The venv_mpi_check_output function has been updated with **kwargs parameter, maintaining consistency with the venv_mpi_check_call function and enabling timeout support for MPI output operations.


83-83: Proper kwargs forwarding in MPI output function.

The venv.run_cmd call in venv_mpi_check_output correctly forwards the **kwargs, ensuring consistent parameter handling across both MPI wrapper functions.


508-508: Clean timeout parameter extraction.

The timeout parameter is properly extracted from kwargs using pop(), which removes it from the kwargs dictionary before processing other parameters. This prevents the timeout from being passed as a command-line argument.


518-518: Correct timeout parameter forwarding.

The timeout parameter is properly passed to venv_check_call as a keyword argument, enabling timeout enforcement during weight conversion operations. The implementation correctly handles the case where timeout might be None.


610-610: Consistent timeout handling in quantize_data.

The quantize_data function follows the same timeout extraction pattern as convert_weights, maintaining consistency across the codebase for timeout parameter handling.


621-621: Proper timeout forwarding in quantization.

The timeout parameter is correctly passed to venv_check_call during quantization operations, ensuring that timeout constraints are enforced consistently across different operations.

tests/integration/defs/conftest.py (2)

352-358: LGTM! Clean implementation of timeout marker extraction.

The fixture correctly extracts timeout values from pytest markers with proper error handling.


361-368: LGTM! Proper command line timeout extraction.

The fixture correctly retrieves and converts timeout values from command line arguments.

tests/integration/defs/examples/test_commandr.py (5)

88-88: LGTM! Proper timeout manager integration.

The function signature correctly adds the timeout_manager fixture parameter for structured timeout handling.


94-106: LGTM! Proper timeout management for checkpoint conversion.

The timeout context manager is correctly implemented with appropriate timeout passing to the subprocess call.


129-133: LGTM! Consistent timeout management for engine building.

The timeout context manager is properly applied to the engine build operation with correct timeout parameter passing.


137-142: LGTM! Proper timeout management for engine execution.

The timeout context manager correctly wraps the MPI execution call with appropriate timeout handling.


154-159: LGTM! Complete timeout management for summary execution.

The final timeout context manager properly handles the summary operation, completing the comprehensive timeout management for all test phases.

tests/integration/defs/examples/test_gpt.py (2)

645-656: Excellent timeout management implementation for checkpoint conversion.

The code correctly wraps the conversion phase with timeout_manager.timed_operation and passes the remaining timeout to the subprocess call. This ensures proper resource cleanup and timeout enforcement during the conversion phase.


658-702: Well-structured timeout management for build and inference phases.

The implementation properly wraps both the build and inference phases with timeout management. The explicit handling of context_fmha logic (lines 670-675) is a good improvement that makes the conditional behavior clearer when use_attention_plugin is true but context_fmha is false.

All subprocess calls correctly receive the remaining_timeout parameter, ensuring consistent timeout enforcement across the test workflow.

tests/integration/defs/examples/test_llama.py (2)

3028-3100: LGTM! Excellent timeout management implementation.

The timeout management integration is well-structured and comprehensive:

  • Function signature properly updated to accept timeout_manager parameter
  • All major operations (dataset generation, checkpoint conversion, engine building, evaluation) are wrapped in timeout_manager.timed_operation() context managers
  • Subprocess calls consistently receive timeout=timeout_manager.remaining_timeout to enforce time limits
  • The structured approach ensures proper resource cleanup and prevents test hangs during intensive workloads

The implementation follows the established pattern and aligns with the PR objectives for improving test stability.


3400-3506: LGTM! Consistent and thorough timeout management implementation.

The timeout management integration is well-executed:

  • Function signature correctly updated to include timeout_manager parameter
  • All subprocess operations (convert weights, build engines, dataset generation, evaluation, MMLU testing) are properly wrapped in timeout_manager.timed_operation() contexts
  • Consistent application of timeout=timeout_manager.remaining_timeout across all subprocess calls
  • The implementation maintains the same high-quality pattern established in the codebase

The changes align perfectly with the PR objectives to prevent test hangs and improve overall test completion reliability.

tests/integration/defs/utils/timeout_manager.py (6)

1-19: LGTM! Proper copyright header and imports.

The copyright header follows the required NVIDIA format and the imports are appropriate for the timeout management functionality.


21-40: Well-designed class with proper initialization.

The TimeoutManager class follows Python naming conventions and properly initializes all attributes. The constructor design with optional timeout is flexible for different use cases.


41-68: Core timeout management methods are well-implemented.

The property getter, reset method, and timeout checking logic are correctly implemented with appropriate error handling and clear documentation.


69-95: Excellent context manager implementation.

The timed_operation context manager is well-designed with proper error handling using try/finally. It correctly handles both timeout and no-timeout scenarios, and ensures accurate time tracking.


96-139: Useful convenience methods for timeout execution.

Both execute_with_timeout and call_with_timeout provide clean interfaces for wrapping function calls with timeout tracking. The implementation correctly delegates to the timed_operation context manager.


141-152: Simple and effective factory function.

The create_timeout_manager factory function provides a clean interface for creating TimeoutManager instances from pytest marker values.

@tensorrt-cicd
Copy link
Collaborator

PR_Github #12889 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #12889 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #9607 (Partly Tested) completed with status: 'SUCCESS'

@crazydemo crazydemo force-pushed the resubmit_timeout_refactor branch from f9cfd2f to 6b525d6 Compare July 24, 2025 23:16
@crazydemo
Copy link
Collaborator Author

/bot run

@coderabbitai coderabbitai bot requested review from chzblych and niukuo July 24, 2025 23:17
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/integration/defs/accuracy/accuracy_core.py (1)

706-709: Fix docstring formatting.

The docstring needs a blank line between the summary and description according to Python docstring conventions.

-        """
-        Run all accuracy test phases with timeout management.
-        If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
-        """
+        """
+        Run all accuracy test phases with timeout management.
+
+        If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
+        """
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9cfd2f and 6b525d6.

📒 Files selected for processing (11)
  • tests/integration/defs/accuracy/accuracy_core.py (1 hunks)
  • tests/integration/defs/accuracy/test_cli_flow.py (3 hunks)
  • tests/integration/defs/common.py (7 hunks)
  • tests/integration/defs/conftest.py (1 hunks)
  • tests/integration/defs/examples/test_commandr.py (3 hunks)
  • tests/integration/defs/examples/test_exaone.py (4 hunks)
  • tests/integration/defs/examples/test_gpt.py (1 hunks)
  • tests/integration/defs/examples/test_llama.py (4 hunks)
  • tests/integration/defs/utils/__init__.py (1 hunks)
  • tests/integration/defs/utils/timeout_manager.py (1 hunks)
  • tests/integration/test_lists/qa/examples_test_list.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • tests/integration/test_lists/qa/examples_test_list.txt
  • tests/integration/defs/accuracy/test_cli_flow.py
  • tests/integration/defs/common.py
  • tests/integration/defs/conftest.py
  • tests/integration/defs/utils/init.py
  • tests/integration/defs/examples/test_exaone.py
  • tests/integration/defs/examples/test_llama.py
  • tests/integration/defs/utils/timeout_manager.py
  • tests/integration/defs/examples/test_commandr.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: The code developed for TensorRT-LLM should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile = ...).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL = ...).
Python constants should use upper snake_case (e.g., MY_CONSTANT = ...).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a file, prefer docstrings over comments in Python.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the docstring for the class.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.

Files:

  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/examples/test_gpt.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. The block should be prepended to the top of all files, including .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Files:

  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/examples/test_gpt.py
🪛 Ruff (0.12.2)
tests/integration/defs/accuracy/accuracy_core.py

706-708: 1 blank line required between summary line and description

(D205)

tests/integration/defs/examples/test_gpt.py

640-640: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (6)
tests/integration/defs/examples/test_gpt.py (4)

640-641: LGTM: Clean timeout management integration.

The function signature appropriately adds the timeout_manager parameter, enabling structured timeout handling for this test case.


645-656: LGTM: Proper timeout management for checkpoint conversion.

The convert phase is correctly wrapped with timeout management, using a descriptive operation name and properly passing the remaining timeout to the subprocess call.


660-691: LGTM: Enhanced build phase with timeout management and improved logic.

The build phase correctly integrates timeout management and improves the context_fmha flag handling logic. The explicit conditional handling ensures proper flag values are passed based on the parameter combinations.


695-702: LGTM: Proper timeout management for inference execution.

The inference run phase is correctly wrapped with timeout management, using a descriptive operation name and properly passing the remaining timeout to the MPI subprocess call.

tests/integration/defs/accuracy/accuracy_core.py (2)

704-705: LGTM: Clean method signature extension.

The addition of the optional timeout_manager parameter with a default value maintains backward compatibility while enabling timeout management functionality.


710-756: LGTM: Correct timeout management implementation.

The conditional timeout management logic correctly wraps each phase when a timeout manager is provided, while maintaining the original behavior as a fallback. The phase names are descriptive and match the actual operations.

Note: There's code duplication between the two branches, but this explicit approach provides clarity about the timeout management behavior.

@tensorrt-cicd
Copy link
Collaborator

PR_Github #12905 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #12905 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #9622 completed with status: 'FAILURE'

@crazydemo crazydemo force-pushed the resubmit_timeout_refactor branch from 6b525d6 to 11d7b3d Compare July 28, 2025 02:08
@crazydemo
Copy link
Collaborator Author

/bot run

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/integration/defs/accuracy/accuracy_core.py (1)

706-708: Fix docstring formatting to comply with PEP 257.

The docstring needs a blank line between the summary line and the description.

-        """
-        Run all accuracy test phases with timeout management.
-        If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
-        """
+        """
+        Run all accuracy test phases with timeout management.
+        
+        If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
+        """
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6b525d6 and 11d7b3d.

📒 Files selected for processing (11)
  • tests/integration/defs/accuracy/accuracy_core.py (1 hunks)
  • tests/integration/defs/accuracy/test_cli_flow.py (3 hunks)
  • tests/integration/defs/common.py (7 hunks)
  • tests/integration/defs/conftest.py (1 hunks)
  • tests/integration/defs/examples/test_commandr.py (3 hunks)
  • tests/integration/defs/examples/test_exaone.py (4 hunks)
  • tests/integration/defs/examples/test_gpt.py (1 hunks)
  • tests/integration/defs/examples/test_llama.py (4 hunks)
  • tests/integration/defs/utils/__init__.py (1 hunks)
  • tests/integration/defs/utils/timeout_manager.py (1 hunks)
  • tests/integration/test_lists/qa/examples_test_list.txt (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/integration/defs/utils/init.py
🚧 Files skipped from review as they are similar to previous changes (8)
  • tests/integration/defs/accuracy/test_cli_flow.py
  • tests/integration/defs/utils/timeout_manager.py
  • tests/integration/defs/examples/test_exaone.py
  • tests/integration/defs/common.py
  • tests/integration/defs/examples/test_commandr.py
  • tests/integration/test_lists/qa/examples_test_list.txt
  • tests/integration/defs/examples/test_llama.py
  • tests/integration/defs/conftest.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: Python code should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL).
Python constants should use upper snake_case (e.g., MY_CONSTANT).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a Python file, prefer docstrings over comments.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the docstring for the class.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.

Files:

  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/examples/test_gpt.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Files:

  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/examples/test_gpt.py
🪛 Ruff (0.12.2)
tests/integration/defs/accuracy/accuracy_core.py

706-708: 1 blank line required between summary line and description

(D205)

tests/integration/defs/examples/test_gpt.py

640-640: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (5)
tests/integration/defs/examples/test_gpt.py (4)

640-641: LGTM! Function signature correctly updated for timeout management.

The addition of the timeout_manager parameter enables structured timeout handling for this comprehensive GPT-3 175B test.


645-657: Excellent timeout management implementation for checkpoint conversion.

The conversion phase is properly wrapped with timeout tracking, and the remaining timeout is correctly passed to the subprocess call.


660-691: Well-structured timeout management and build logic.

The build phase correctly:

  • Wraps the operation with timeout tracking
  • Explicitly handles context_fmha configuration based on the use_attention_plugin flag
  • Passes the remaining timeout to the subprocess call

The conditional logic properly disables context_fmha when use_attention_plugin is false, and explicitly sets it when the plugin is enabled.


695-702: Proper timeout management for MPI-based inference execution.

The inference phase correctly wraps the MPI execution with timeout tracking and passes the remaining timeout to the multi-process subprocess call.

tests/integration/defs/accuracy/accuracy_core.py (1)

704-757: Excellent timeout management implementation with proper fallback.

The conditional timeout management design is well-structured:

  • Maintains backward compatibility with optional timeout_manager parameter
  • Consistently wraps all major phases (install_requirements, initialize_case, convert, build, evaluate) with timeout tracking
  • Provides clear fallback behavior when timeout management isn't needed
  • Uses descriptive operation names for better debugging

This implementation enables granular timeout control across the entire accuracy test workflow.

@tensorrt-cicd
Copy link
Collaborator

PR_Github #13129 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #13129 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #9826 completed with status: 'SUCCESS'

@crazydemo crazydemo force-pushed the resubmit_timeout_refactor branch from 11d7b3d to 2e8ee03 Compare July 28, 2025 04:30
@crazydemo
Copy link
Collaborator Author

/bot reuse-pipeline

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/integration/defs/accuracy/accuracy_core.py (1)

704-709: Good timeout management integration with minor docstring formatting issue.

The method signature correctly adds optional timeout management support while maintaining backward compatibility. The implementation approach is sound.

Fix the docstring formatting to comply with PEP 257:

-        """
-        Run all accuracy test phases with timeout management.
-        If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
-        """
+        """Run all accuracy test phases with timeout management.
+        
+        If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
+        """
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11d7b3d and 2e8ee03.

📒 Files selected for processing (11)
  • tests/integration/defs/accuracy/accuracy_core.py (1 hunks)
  • tests/integration/defs/accuracy/test_cli_flow.py (3 hunks)
  • tests/integration/defs/common.py (7 hunks)
  • tests/integration/defs/conftest.py (1 hunks)
  • tests/integration/defs/examples/test_commandr.py (3 hunks)
  • tests/integration/defs/examples/test_exaone.py (4 hunks)
  • tests/integration/defs/examples/test_gpt.py (1 hunks)
  • tests/integration/defs/examples/test_llama.py (4 hunks)
  • tests/integration/defs/utils/__init__.py (1 hunks)
  • tests/integration/defs/utils/timeout_manager.py (1 hunks)
  • tests/integration/test_lists/qa/examples_test_list.txt (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/integration/defs/utils/init.py
🚧 Files skipped from review as they are similar to previous changes (8)
  • tests/integration/defs/accuracy/test_cli_flow.py
  • tests/integration/defs/common.py
  • tests/integration/defs/conftest.py
  • tests/integration/test_lists/qa/examples_test_list.txt
  • tests/integration/defs/examples/test_commandr.py
  • tests/integration/defs/examples/test_exaone.py
  • tests/integration/defs/utils/timeout_manager.py
  • tests/integration/defs/examples/test_llama.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: Python code should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL).
Python constants should use upper snake_case (e.g., MY_CONSTANT).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a Python file, prefer docstrings over comments.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for Python classes and functions, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the docstring for the class.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.

Files:

  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/examples/test_gpt.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Files:

  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/examples/test_gpt.py
🪛 Ruff (0.12.2)
tests/integration/defs/accuracy/accuracy_core.py

706-708: 1 blank line required between summary line and description

(D205)

tests/integration/defs/examples/test_gpt.py

640-640: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

🔇 Additional comments (5)
tests/integration/defs/examples/test_gpt.py (4)

640-641: LGTM: Function signature updated correctly.

The timeout_manager parameter has been properly added to integrate structured timeout management into this test function.


645-657: Well-implemented timeout management for checkpoint conversion.

The convert phase is properly wrapped with timeout_manager.timed_operation("convert") context manager, and the subprocess call correctly receives the remaining timeout parameter.


660-692: Excellent timeout management and logic improvement.

The build phase is properly wrapped with timeout management, and the --context_fmha logic has been improved for better clarity. The explicit handling of the case where use_attention_plugin is true but context_fmha is false makes the build argument logic more readable and maintainable.


695-703: Proper timeout management for inference run phase.

The run phase is correctly wrapped with timeout_manager.timed_operation("run"), and the MPI subprocess call properly receives the remaining timeout parameter while preserving the original execution structure.

tests/integration/defs/accuracy/accuracy_core.py (1)

710-757: Excellent backward-compatible timeout management implementation.

The conditional approach maintains backward compatibility while enabling structured timeout management. Each phase is properly wrapped with descriptive operation names, and the fallback path preserves existing behavior. The code duplication between the two paths is justified for clarity and maintainability.

@crazydemo crazydemo force-pushed the resubmit_timeout_refactor branch from 2e8ee03 to 65243ea Compare July 28, 2025 08:33
@crazydemo
Copy link
Collaborator Author

/bot reuse-pipeline

@coderabbitai coderabbitai bot requested a review from venkywonka July 28, 2025 08:33
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
tests/integration/defs/utils/timeout_manager.py (1)

155-184: The previous review suggestions still apply.

The decorator implementation still has the same issues identified in the previous review: functools import should be at module level, missing error handling for timeout_from_marker, and the decorator assumes the decorated function accepts a timeout_manager parameter without verification.

🧹 Nitpick comments (3)
tests/integration/defs/examples/test_gpt.py (1)

638-641: Fix docstring punctuation and approve timeout parameter addition.

The addition of the timeout_manager parameter is well-integrated with the timeout management system. However, the docstring should end with proper punctuation.

-    "Build & Run GPT-3 175B: 96 layer w/ plugins"
+    "Build & Run GPT-3 175B: 96 layer w/ plugins."
tests/integration/defs/accuracy/accuracy_core.py (1)

690-709: Fix docstring formatting and approve timeout parameter addition.

The addition of the timeout_manager parameter with proper default value maintains backward compatibility while enabling timeout management. However, the docstring formatting needs improvement.

     def run(self,
             tasks: Optional[List[AccuracyTask]] = None,
             dtype: str = 'auto',
             quant_algo: Optional[str] = None,
             kv_cache_quant_algo: Optional[str] = None,
             spec_dec_algo: Optional[str] = None,
             extra_acc_spec: Optional[str] = None,
             tp_size: int = 1,
             pp_size: int = 1,
             cp_size: int = 1,
             extra_convert_args: Optional[list] = None,
             extra_build_args: Optional[list] = None,
             extra_summarize_args: Optional[list] = None,
             extra_eval_long_context_args: Optional[list] = None,
             env: Optional[Dict[str, str]] = None,
             timeout_manager=None):
         """
         Run all accuracy test phases with timeout management.
+
         If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
         """
tests/integration/defs/utils/timeout_manager.py (1)

56-67: Improve timeout validation logic.

The current check only validates <= 0, but this could be confusing if the remaining timeout becomes significantly negative due to long operations. Consider checking for negative values and providing more informative error messages.

 def check_timeout(self, phase_name: str = "operation") -> None:
     """
     Check if timeout has been exceeded and raise TimeoutError if so.

     Args:
         phase_name: Name of the current phase for error message.

     Raises:
         TimeoutError: If timeout has been exceeded.
     """
-    if self._remaining_timeout is not None and self._remaining_timeout <= 0:
-        raise TimeoutError(f"Timeout exceeded after {phase_name} phase!")
+    if self._remaining_timeout is not None and self._remaining_timeout <= 0:
+        exceeded_by = abs(self._remaining_timeout)
+        raise TimeoutError(f"Timeout exceeded by {exceeded_by:.2f}s after {phase_name} phase!")
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e8ee03 and 65243ea.

📒 Files selected for processing (11)
  • tests/integration/defs/accuracy/accuracy_core.py (1 hunks)
  • tests/integration/defs/accuracy/test_cli_flow.py (3 hunks)
  • tests/integration/defs/common.py (7 hunks)
  • tests/integration/defs/conftest.py (1 hunks)
  • tests/integration/defs/examples/test_commandr.py (3 hunks)
  • tests/integration/defs/examples/test_exaone.py (4 hunks)
  • tests/integration/defs/examples/test_gpt.py (1 hunks)
  • tests/integration/defs/examples/test_llama.py (4 hunks)
  • tests/integration/defs/utils/__init__.py (1 hunks)
  • tests/integration/defs/utils/timeout_manager.py (1 hunks)
  • tests/integration/test_lists/qa/examples_test_list.txt (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/integration/defs/utils/init.py
🚧 Files skipped from review as they are similar to previous changes (7)
  • tests/integration/defs/accuracy/test_cli_flow.py
  • tests/integration/defs/examples/test_commandr.py
  • tests/integration/defs/examples/test_exaone.py
  • tests/integration/test_lists/qa/examples_test_list.txt
  • tests/integration/defs/examples/test_llama.py
  • tests/integration/defs/conftest.py
  • tests/integration/defs/common.py
🧰 Additional context used
🪛 Ruff (0.12.2)
tests/integration/defs/accuracy/accuracy_core.py

706-708: 1 blank line required between summary line and description

(D205)

tests/integration/defs/examples/test_gpt.py

640-640: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (7)
tests/integration/defs/examples/test_gpt.py (3)

645-657: LGTM: Proper timeout management for checkpoint conversion.

The checkpoint conversion phase is correctly wrapped with timeout management, ensuring that the operation respects the remaining timeout and updates the manager's state accordingly.


659-691: LGTM: Improved build phase with timeout management and better context_fmha logic.

The build phase correctly implements timeout management, and the explicit handling of context_fmha=disable when use_attention_plugin=True but context_fmha=False improves the logic clarity.


693-702: LGTM: Proper timeout management for inference run phase.

The inference run phase is correctly wrapped with timeout management, ensuring that the MPI-based execution respects the remaining timeout while maintaining the original command structure.

tests/integration/defs/accuracy/accuracy_core.py (2)

710-736: LGTM: Well-structured timeout management implementation.

The timeout-managed execution path properly wraps each phase in timeout_manager.timed_operation contexts, enabling precise timeout tracking and enforcement across all accuracy test phases. The parameter passing is complete and correct.


737-756: LGTM: Proper fallback implementation for backward compatibility.

The fallback execution path maintains the original sequential behavior when no timeout manager is provided, ensuring backward compatibility while keeping the implementation clean and straightforward.

tests/integration/defs/utils/timeout_manager.py (2)

69-94: LGTM! Well-designed context manager.

The timed_operation context manager properly handles both timeout and no-timeout scenarios, correctly updates the remaining timeout in the finally block, and provides clear error reporting. The design ensures that timeout tracking occurs regardless of whether the operation succeeds or fails.


141-152: LGTM! Clean factory function.

The factory function provides a clean abstraction for creating TimeoutManager instances and follows the expected pattern for pytest marker integration.

@tensorrt-cicd
Copy link
Collaborator

PR_Github #13172 [ reuse-pipeline ] triggered by Bot

@crazydemo crazydemo changed the title [nvbug/5410279] tests: resubmit timeout refactor [https://nvbugs/5410279][test]: resubmit timeout refactor Aug 4, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
tests/integration/defs/utils/timeout_manager.py (2)

30-39: Add validation for non-negative timeout values.

The constructor should validate that initial_timeout is non-negative when provided to prevent unexpected behavior.

 def __init__(self, initial_timeout: Optional[float] = None):
     """
     Initialize the timeout manager.

     Args:
         initial_timeout: Initial timeout value in seconds. If None, no timeout is enforced.
     """
+    if initial_timeout is not None and initial_timeout < 0:
+        raise ValueError("initial_timeout must be non-negative")
     self._initial_timeout = initial_timeout
     self._remaining_timeout = initial_timeout
     self._start_time = None

155-184: Improve the decorator implementation for better robustness.

The decorator has several areas for improvement:

  1. The functools import should be at module level, not inside the function
  2. Missing error handling when timeout_from_marker is not in kwargs
  3. The decorator assumes the decorated function accepts a timeout_manager parameter

Consider this improved implementation:

+import functools
+
 # Convenience decorator for test functions
 def with_timeout_management(func: Callable) -> Callable:
     """
     Decorator to automatically inject timeout management into test functions.
 
-    This decorator expects the test function to have a 'timeout_from_marker' parameter
+    This decorator expects the test function to have 'timeout_from_marker' and 'timeout_manager' parameters
     and automatically creates a TimeoutManager instance.
 
     Args:
         func: The test function to decorate.
 
     Returns:
         The decorated function.
     """
-    import functools
 
     @functools.wraps(func)
     def wrapper(*args, **kwargs):
         # Extract timeout_from_marker from kwargs
-        timeout_from_marker = kwargs.get('timeout_from_marker')
+        timeout_from_marker = kwargs.get('timeout_from_marker')
+        
+        # Only inject timeout_manager if the function expects it
+        import inspect
+        sig = inspect.signature(func)
+        if 'timeout_manager' not in sig.parameters:
+            return func(*args, **kwargs)
 
         # Create timeout manager
         timeout_manager = create_timeout_manager(timeout_from_marker)
 
         # Add timeout_manager to kwargs
         kwargs['timeout_manager'] = timeout_manager
 
         return func(*args, **kwargs)
 
     return wrapper
🧹 Nitpick comments (1)
tests/integration/defs/accuracy/accuracy_core.py (1)

704-756: LGTM! Well-structured timeout management implementation with backward compatibility.

The method correctly implements conditional timeout management:

  • Each phase (install_requirements, initialize_case, convert, build, evaluate) is wrapped in timed_operation contexts when timeout_manager is provided
  • Maintains backward compatibility with a fallback to original sequential execution
  • Clean separation between timeout-managed and non-timeout-managed execution paths

Consider adding a blank line between the docstring summary and description to address the static analysis hint:

-        """
-        Run all accuracy test phases with timeout management.
-        If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
-        """
+        """
+        Run all accuracy test phases with timeout management.
+        
+        If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
+        """
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5bb2cd7 and 0c8707f.

📒 Files selected for processing (11)
  • tests/integration/defs/accuracy/accuracy_core.py (1 hunks)
  • tests/integration/defs/accuracy/test_cli_flow.py (3 hunks)
  • tests/integration/defs/common.py (7 hunks)
  • tests/integration/defs/conftest.py (1 hunks)
  • tests/integration/defs/examples/test_commandr.py (3 hunks)
  • tests/integration/defs/examples/test_exaone.py (4 hunks)
  • tests/integration/defs/examples/test_gpt.py (1 hunks)
  • tests/integration/defs/examples/test_llama.py (4 hunks)
  • tests/integration/defs/utils/__init__.py (1 hunks)
  • tests/integration/defs/utils/timeout_manager.py (1 hunks)
  • tests/integration/test_lists/qa/examples_test_list.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • tests/integration/defs/examples/test_commandr.py
  • tests/integration/test_lists/qa/examples_test_list.txt
  • tests/integration/defs/examples/test_exaone.py
  • tests/integration/defs/conftest.py
  • tests/integration/defs/utils/init.py
  • tests/integration/defs/examples/test_llama.py
  • tests/integration/defs/common.py
  • tests/integration/defs/accuracy/test_cli_flow.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: The code developed for TensorRT-LLM should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile = ...).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL = ...).
Python constants should use upper snake_case (e.g., MY_CONSTANT = ...).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a file, prefer docstrings over comments in Python.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for classes and functions in Python, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the docstring for the class.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.

Files:

  • tests/integration/defs/utils/timeout_manager.py
  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/examples/test_gpt.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Files:

  • tests/integration/defs/utils/timeout_manager.py
  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/examples/test_gpt.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
📚 Learning: in tensorrt-llm testing, it's common to have both cli flow tests (test_cli_flow.py) and pytorch api ...
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.

Applied to files:

  • tests/integration/defs/examples/test_gpt.py
🪛 Ruff (0.12.2)
tests/integration/defs/accuracy/accuracy_core.py

706-708: 1 blank line required between summary line and description

(D205)

tests/integration/defs/examples/test_gpt.py

647-647: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (8)
tests/integration/defs/examples/test_gpt.py (4)

647-648: LGTM! Function signature updated for timeout management integration.

The addition of the timeout_manager parameter is consistent with the broader timeout management implementation across the test suite.


652-663: LGTM! Proper timeout management for checkpoint conversion.

The conversion phase is correctly wrapped with timed_operation("convert") and the subprocess call receives the appropriate timeout value from timeout_manager.remaining_timeout.


665-698: LGTM! Enhanced build phase with timeout management and improved logic.

The build phase correctly implements timeout management with timed_operation("build"). The improved context_fmha flag handling explicitly disables it when use_attention_plugin is true but context_fmha is false, making the command construction more explicit and maintainable.


700-709: LGTM! Complete timeout management implementation for inference phase.

The inference phase is properly wrapped with timed_operation("run") and the MPI subprocess call receives the appropriate timeout value. This completes the comprehensive timeout management for all major test phases.

tests/integration/defs/utils/timeout_manager.py (4)

1-19: LGTM - Copyright header and imports are correct.

The NVIDIA copyright header includes the current year range (2022-2024) as required, and the imports are appropriate for the functionality provided.


41-94: LGTM - Core timeout management methods are well implemented.

The property, reset method, check_timeout method, and timed_operation context manager provide a clean and robust interface for timeout management. The logic correctly handles None timeout values for no enforcement scenarios.


96-138: LGTM - Execution wrapper methods provide clean abstractions.

Both execute_with_timeout and call_with_timeout methods provide convenient wrappers around the core timed_operation context manager, with appropriate parameter handling and documentation.


141-152: LGTM - Factory function is simple and appropriate.

The create_timeout_manager factory function provides a clean interface for creating TimeoutManager instances from pytest marker values.

@crazydemo crazydemo changed the title [https://nvbugs/5410279][test]: resubmit timeout refactor [https://nvbugs/5410279][test] resubmit timeout refactor Aug 4, 2025
@StanleySun639 StanleySun639 enabled auto-merge (squash) August 4, 2025 07:06
auto-merge was automatically disabled August 4, 2025 08:02

Head branch was pushed to by a user without write access

@crazydemo crazydemo force-pushed the resubmit_timeout_refactor branch from 0c8707f to 336439a Compare August 4, 2025 08:02
@crazydemo
Copy link
Collaborator Author

/bot reuse-pipeline

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/integration/defs/accuracy/accuracy_core.py (1)

706-709: Fix docstring formatting

The docstring needs a blank line between the summary and description to comply with PEP 257.

-        """
-        Run all accuracy test phases with timeout management.
-        If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
-        """
+        """
+        Run all accuracy test phases with timeout management.
+
+        If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
+        """
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0c8707f and 336439a.

📒 Files selected for processing (11)
  • tests/integration/defs/accuracy/accuracy_core.py (1 hunks)
  • tests/integration/defs/accuracy/test_cli_flow.py (3 hunks)
  • tests/integration/defs/common.py (7 hunks)
  • tests/integration/defs/conftest.py (1 hunks)
  • tests/integration/defs/examples/test_commandr.py (3 hunks)
  • tests/integration/defs/examples/test_exaone.py (4 hunks)
  • tests/integration/defs/examples/test_gpt.py (1 hunks)
  • tests/integration/defs/examples/test_llama.py (4 hunks)
  • tests/integration/defs/utils/__init__.py (1 hunks)
  • tests/integration/defs/utils/timeout_manager.py (1 hunks)
  • tests/integration/test_lists/qa/examples_test_list.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (9)
  • tests/integration/test_lists/qa/examples_test_list.txt
  • tests/integration/defs/examples/test_exaone.py
  • tests/integration/defs/accuracy/test_cli_flow.py
  • tests/integration/defs/utils/init.py
  • tests/integration/defs/examples/test_commandr.py
  • tests/integration/defs/common.py
  • tests/integration/defs/conftest.py
  • tests/integration/defs/examples/test_llama.py
  • tests/integration/defs/utils/timeout_manager.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

**/*.py: The code developed for TensorRT-LLM should conform to Python 3.8+.
Indent Python code with 4 spaces. Do not use tabs.
Always maintain the namespace when importing in Python, even if only one class or function from a module is used.
Python filenames should use snake_case (e.g., some_file.py).
Python classes should use PascalCase (e.g., class SomeClass).
Python functions and methods should use snake_case (e.g., def my_awesome_function():).
Python local variables should use snake_case. Prefix k for variable names that start with a number (e.g., k_99th_percentile = ...).
Python global variables should use upper snake_case and prefix G (e.g., G_MY_GLOBAL = ...).
Python constants should use upper snake_case (e.g., MY_CONSTANT = ...).
Avoid shadowing variables declared in an outer scope in Python.
Initialize all externally visible members of a Python class in the constructor.
For interfaces that may be used outside a file, prefer docstrings over comments in Python.
Comments in Python should be reserved for code within a function, or interfaces that are local to a file.
Use Google style docstrings for classes and functions in Python, which can be parsed by Sphinx.
Attributes and variables in Python can be documented inline; attribute docstrings will be rendered under the docstring for the class.
Avoid using reflection in Python when functionality can be easily achieved without it.
When using try-except blocks in Python, limit the except to the smallest set of errors possible.
When using try-except blocks to handle multiple possible variable types in Python, keep the body of the try as small as possible, using the else block to implement the logic.

Files:

  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/examples/test_gpt.py
**/*.{cpp,h,hpp,cc,cxx,cu,py}

📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)

All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the current year. This includes .cpp, .h, .cu, .py, and any other source files which are compiled or interpreted.

Files:

  • tests/integration/defs/accuracy/accuracy_core.py
  • tests/integration/defs/examples/test_gpt.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
📚 Learning: in tensorrt-llm testing, it's common to have both cli flow tests (test_cli_flow.py) and pytorch api ...
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.

Applied to files:

  • tests/integration/defs/examples/test_gpt.py
🪛 Ruff (0.12.2)
tests/integration/defs/accuracy/accuracy_core.py

706-708: 1 blank line required between summary line and description

(D205)

tests/integration/defs/examples/test_gpt.py

647-647: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (5)
tests/integration/defs/examples/test_gpt.py (4)

647-648: LGTM: Clean timeout management integration

The addition of the timeout_manager parameter follows the established pattern for integrating timeout management across the test suite.


652-663: LGTM: Proper timeout management for conversion phase

The checkpoint conversion is correctly wrapped with timeout management, ensuring proper resource cleanup and timeout enforcement during the conversion process.


665-698: LGTM: Enhanced build phase with improved context_fmha logic

The build phase correctly integrates timeout management while also improving the context_fmha logic clarity. The explicit conditional handling (lines 679-682) makes it clear when context_fmha should be enabled versus disabled when use_attention_plugin is true, which is an improvement over the previous implicit behavior.


700-709: LGTM: Consistent timeout management for inference phase

The inference run phase correctly implements timeout management consistent with the convert and build phases, ensuring comprehensive timeout coverage across all test phases.

tests/integration/defs/accuracy/accuracy_core.py (1)

704-756: LGTM: Excellent timeout management integration with backward compatibility

The implementation excellently integrates timeout management while maintaining backward compatibility. The dual-path approach (lines 711-735 for timeout management, lines 737-756 for fallback) ensures existing functionality remains intact while providing comprehensive timeout coverage for all test phases when the timeout manager is available.

@tensorrt-cicd
Copy link
Collaborator

PR_Github #13964 [ reuse-pipeline ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #13964 [ reuse-pipeline ] completed with state SUCCESS
Can't reuse PR_Github #0 with status: UNKNOWN

@crazydemo
Copy link
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Collaborator

PR_Github #13967 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #13967 [ run ] completed with state FAILURE
/LLM/main/L0_MergeRequest_PR pipeline #10520 completed with status: 'FAILURE'

@crazydemo crazydemo force-pushed the resubmit_timeout_refactor branch from 336439a to be850fb Compare August 5, 2025 05:38
@crazydemo
Copy link
Collaborator Author

/bot run

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
tests/integration/defs/utils/timeout_manager.py (2)

30-39: Add validation for non-negative timeout values.

The constructor should validate that initial_timeout is non-negative to prevent unexpected behavior.


155-184: Improve the decorator implementation for better robustness.

The decorator needs several improvements including moving the functools import to module level, adding error handling, and parameter inspection to avoid unexpected argument errors.

🧹 Nitpick comments (4)
tests/integration/defs/examples/test_gpt.py (1)

647-648: Good timeout management integration with minor docstring fix needed.

The addition of the timeout_manager parameter correctly integrates with the new timeout management framework. However, the docstring should end with a period for proper formatting.

Apply this diff to fix the docstring formatting:

-    "Build & Run GPT-3 175B: 96 layer w/ plugins"
+    "Build & Run GPT-3 175B: 96 layer w/ plugins."
tests/integration/defs/accuracy/accuracy_core.py (1)

704-709: Good timeout management integration with docstring formatting fix needed.

The addition of the optional timeout_manager parameter properly integrates with the timeout management framework. The docstring clearly explains the new functionality.

Apply this diff to fix the docstring formatting per PEP 257:

     def run(self,
             tasks: Optional[List[AccuracyTask]] = None,
             dtype: str = 'auto',
             quant_algo: Optional[str] = None,
             kv_cache_quant_algo: Optional[str] = None,
             spec_dec_algo: Optional[str] = None,
             extra_acc_spec: Optional[str] = None,
             tp_size: int = 1,
             pp_size: int = 1,
             cp_size: int = 1,
             extra_convert_args: Optional[list] = None,
             extra_build_args: Optional[list] = None,
             extra_summarize_args: Optional[list] = None,
             extra_eval_long_context_args: Optional[list] = None,
             env: Optional[Dict[str, str]] = None,
             timeout_manager=None):
         """
         Run all accuracy test phases with timeout management.
+
         If timeout_manager is provided, each phase will be wrapped to track and deduct remaining timeout.
         """
tests/integration/defs/utils/timeout_manager.py (2)

37-39: Remove unused _start_time attribute.

The _start_time attribute is initialized and reset but never used elsewhere in the class. Consider removing it to simplify the implementation.

 def __init__(self, initial_timeout: Optional[float] = None):
     """
     Initialize the timeout manager.

     Args:
         initial_timeout: Initial timeout value in seconds. If None, no timeout is enforced.
     """
     self._initial_timeout = initial_timeout
     self._remaining_timeout = initial_timeout
-    self._start_time = None

 def reset(self, timeout: Optional[float] = None) -> None:
     """
     Reset the timeout manager with a new timeout value.

     Args:
         timeout: New timeout value. If None, uses the initial timeout.
     """
     self._remaining_timeout = timeout if timeout is not None else self._initial_timeout
-    self._start_time = None

Also applies to: 53-54


96-115: Consider consolidating similar methods.

execute_with_timeout and call_with_timeout provide similar functionality. The more flexible call_with_timeout could potentially replace execute_with_timeout to reduce code duplication.

If you decide to consolidate, you could remove execute_with_timeout and update callers to use call_with_timeout:

-def execute_with_timeout(self,
-                         operation: Callable[[], Any],
-                         phase_name: str = "operation",
-                         **kwargs) -> Any:
-    """
-    Execute an operation with timeout tracking.
-
-    Args:
-        operation: The operation to execute.
-        phase_name: Name of the phase for timeout checking.
-        **kwargs: Additional arguments to pass to the operation.
-
-    Returns:
-        The result of the operation.
-
-    Raises:
-        TimeoutError: If timeout is exceeded after the operation.
-    """
-    with self.timed_operation(phase_name):
-        return operation(**kwargs)
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 336439a and be850fb.

📒 Files selected for processing (11)
  • tests/integration/defs/accuracy/accuracy_core.py (1 hunks)
  • tests/integration/defs/accuracy/test_cli_flow.py (3 hunks)
  • tests/integration/defs/common.py (7 hunks)
  • tests/integration/defs/conftest.py (1 hunks)
  • tests/integration/defs/examples/test_commandr.py (3 hunks)
  • tests/integration/defs/examples/test_exaone.py (4 hunks)
  • tests/integration/defs/examples/test_gpt.py (1 hunks)
  • tests/integration/defs/examples/test_llama.py (4 hunks)
  • tests/integration/defs/utils/__init__.py (1 hunks)
  • tests/integration/defs/utils/timeout_manager.py (1 hunks)
  • tests/integration/test_lists/qa/examples_test_list.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • tests/integration/defs/conftest.py
  • tests/integration/defs/common.py
  • tests/integration/defs/examples/test_exaone.py
  • tests/integration/defs/utils/init.py
  • tests/integration/test_lists/qa/examples_test_list.txt
  • tests/integration/defs/examples/test_commandr.py
  • tests/integration/defs/accuracy/test_cli_flow.py
  • tests/integration/defs/examples/test_llama.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.
📚 Learning: in tensorrt-llm testing, it's common to have both cli flow tests (test_cli_flow.py) and pytorch api ...
Learnt from: moraxu
PR: NVIDIA/TensorRT-LLM#6303
File: tests/integration/test_lists/qa/examples_test_list.txt:494-494
Timestamp: 2025-07-28T17:06:08.621Z
Learning: In TensorRT-LLM testing, it's common to have both CLI flow tests (test_cli_flow.py) and PyTorch API tests (test_llm_api_pytorch.py) for the same model. These serve different purposes: CLI flow tests validate the traditional command-line workflow, while PyTorch API tests validate the newer LLM API backend. Both are legitimate and should coexist.

Applied to files:

  • tests/integration/defs/examples/test_gpt.py
🪛 Ruff (0.12.2)
tests/integration/defs/accuracy/accuracy_core.py

706-708: 1 blank line required between summary line and description

(D205)

tests/integration/defs/examples/test_gpt.py

647-647: First line should end with a period, question mark, or exclamation point

Add closing punctuation

(D415)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Pre-commit Check
🔇 Additional comments (5)
tests/integration/defs/examples/test_gpt.py (3)

652-663: LGTM! Proper timeout management for convert phase.

The convert checkpoint phase is correctly wrapped with timeout_manager.timed_operation("convert") and passes the remaining timeout to the subprocess call. This ensures proper timeout tracking and enforcement during the conversion phase.


665-698: Excellent timeout management and improved context_fmha logic.

The build phase is properly wrapped with timeout management and the subprocess call receives the remaining timeout. Additionally, the context_fmha logic has been improved to be more explicit - when use_attention_plugin is true, it now explicitly enables or disables context_fmha based on the parameter value, making the behavior clearer.


700-709: LGTM! Complete timeout management implementation.

The run inference phase properly completes the timeout management integration by wrapping the operation and passing the remaining timeout to the MPI subprocess call. All three major phases (convert, build, run) now have consistent timeout handling.

tests/integration/defs/accuracy/accuracy_core.py (1)

710-756: Excellent timeout management implementation with backward compatibility.

The implementation demonstrates good design principles:

  1. Backward compatibility: The fallback path ensures existing code continues to work when timeout_manager is None
  2. Logical phase separation: Each major phase (install_requirements, initialize_case, convert, build, evaluate) is properly wrapped with timeout_manager.timed_operation
  3. Consistent pattern: The timeout management follows the same pattern established in other test files

This approach allows for gradual adoption of timeout management across the test suite while maintaining existing functionality.

tests/integration/defs/utils/timeout_manager.py (1)

141-152: LGTM!

The factory function provides a clean interface for creating TimeoutManager instances from pytest markers.

@tensorrt-cicd
Copy link
Collaborator

PR_Github #14086 [ run ] triggered by Bot

@tensorrt-cicd
Copy link
Collaborator

PR_Github #14086 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #10629 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

@StanleySun639 StanleySun639 merged commit d101a6c into NVIDIA:main Aug 5, 2025
4 checks passed
zhanghaotong pushed a commit to zhanghaotong/TensorRT-LLM that referenced this pull request Aug 5, 2025
lancelly pushed a commit to lancelly/TensorRT-LLM that referenced this pull request Aug 6, 2025
jain-ria pushed a commit to jain-ria/TensorRT-LLM that referenced this pull request Aug 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants