
Conversation

@narendasan (Collaborator)

Description

Allows engines to be set up not immediately after compilation, but all at once just before the compiled module is returned to the user.

Fixes #2673
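
For context, a minimal sketch of the deferred-setup pattern described above. Everything below is a hypothetical illustration (the class, `setup_engine`, and `_deserialize` are stand-ins, not this PR's actual implementation); only the idea of gating engine setup behind a flag comes from this PR.

    from typing import Any, Optional

    class LazyEngineModule:
        """Hypothetical stand-in for a TRT runtime module with deferred setup."""

        def __init__(self, serialized_engine: bytes, defer_engine_setup: bool = False):
            self.serialized_engine = serialized_engine
            self.engine: Optional[Any] = None
            if not defer_engine_setup:
                self.setup_engine()

        def setup_engine(self) -> None:
            # Idempotent: deserialization is the expensive step that deferral
            # postpones. With deferral on, the compiler would call this once
            # per engine just before handing the compiled module back.
            if self.engine is None:
                self.engine = self._deserialize(self.serialized_engine)

        def _deserialize(self, blob: bytes) -> Any:
            # Stand-in for TensorRT engine deserialization.
            return ("deserialized", len(blob))

    # Deferred: nothing happens until setup_engine() is called explicitly.
    m = LazyEngineModule(b"\x00\x01", defer_engine_setup=True)
    m.setup_engine()

With this shape, compilation can build every subgraph first and then set up all the resulting engines together, rather than one at a time as each subgraph finishes compiling.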

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified

@github-actions bot added labels component: tests, component: conversion, component: api [Python], component: runtime, component: dynamo on Jul 10, 2024
@narendasan narendasan requested review from peri044 and zewenli98 July 10, 2024 23:46
@github-actions github-actions bot requested a review from gs-olive July 10, 2024 23:46
@narendasan (Collaborator, Author) commented on the diff:

        output_binding_names (List[str]): List of output TensorRT engine binding names in the order they should be returned
        """

        defer_engine_setup = False

Remove

@narendasan narendasan force-pushed the lazy_engine_loading branch from fe2cd6a to f2bf073 on July 11, 2024 00:14
@peri044 (Collaborator) left a comment:

LGTM. Could you add a test case with a model that has fallback ops?
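
(For illustration, a sketch of what such a test might look like, using the public torch_tensorrt.compile entry point with torch_executed_ops to force fallback. The model, op choice, and settings are assumptions; this is not the test that was actually added.)

    import torch
    import torch_tensorrt

    class FallbackModel(torch.nn.Module):
        def forward(self, x):
            x = torch.relu(x)            # expected to run in TensorRT
            x = torch.cumsum(x, dim=0)   # forced to fall back to PyTorch below
            return torch.sigmoid(x)

    model = FallbackModel().eval().cuda()
    inputs = [torch.randn(4, 8).cuda()]

    # Forcing one op back to PyTorch splits the graph into multiple TRT
    # engines, which exercises deferred setup across more than one engine.
    trt_model = torch_tensorrt.compile(
        model,
        ir="dynamo",
        inputs=inputs,
        min_block_size=1,
        torch_executed_ops={"torch.ops.aten.cumsum.default"},
    )
    torch.testing.assert_close(trt_model(*inputs), model(*inputs))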

@narendasan narendasan force-pushed the lazy_engine_loading branch 3 times, most recently from 366ecc6 to 37adede on July 12, 2024 19:58
@zewenli98 (Collaborator) left a comment:

LGTM

@narendasan narendasan force-pushed the lazy_engine_loading branch 2 times, most recently from 7c77ffe to f976de9 on July 31, 2024 15:42
@github-actions bot left a comment:

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/tests/py/dynamo/conversion/harness.py	2024-07-31 15:42:55.193957+00:00
+++ /home/runner/work/TensorRT/TensorRT/tests/py/dynamo/conversion/harness.py	2024-07-31 15:44:51.304037+00:00
@@ -62,11 +62,11 @@
        interpreter,
        rtol,
        atol,
        check_dtype=True,
        pyt_inputs=None,
-        rt_cls=PythonTorchTensorRTModule
+        rt_cls=PythonTorchTensorRTModule,
    ):
        with torch.no_grad():
            cuda_inputs = []
            for i in inputs:
                cuda_inputs.append(i.cuda())
@@ -132,11 +132,11 @@
        inputs,
        expected_ops,
        interpreter,
        comparators: List[Tuple[Callable, List]],
        fp16_mode=False,
-        rt_cls=PythonTorchTensorRTModule
+        rt_cls=PythonTorchTensorRTModule,
    ):
        """
        Runs the test and compares the result using the provided comparators.
        The size of comparators must be equal to the number of outputs from 'mod'.
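
Beyond the trailing commas the formatter wants, the diff above shows the test harness gaining an rt_cls parameter that defaults to PythonTorchTensorRTModule. A self-contained sketch of that design choice, with stand-in names (only rt_cls and PythonTorchTensorRTModule come from the diff itself):

    from typing import Callable, Sequence

    class FakeRuntimeModule:
        """Stand-in for a runtime wrapper such as PythonTorchTensorRTModule."""

        def __init__(self, engine: bytes):
            self.engine = engine

        def __call__(self, *inputs):
            return inputs  # placeholder for actual engine execution

    def run_test(engine: bytes, inputs: Sequence, rt_cls: Callable = FakeRuntimeModule):
        # Parameterizing the harness by runtime class lets the same test run
        # against a different runtime wrapper without duplicating the harness.
        mod = rt_cls(engine)
        return mod(*inputs)

    print(run_test(b"engine-bytes", [1, 2, 3]))

Presumably, passing a different rt_cls lets the conversion tests exercise deferred setup in more than one runtime implementation.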

@narendasan narendasan force-pushed the lazy_engine_loading branch 3 times, most recently from 5cf73b4 to 29b5e72 on July 31, 2024 22:41
@zewenli98 (Collaborator) left a comment:

LGTM, just a few minor comments

@narendasan narendasan force-pushed the lazy_engine_loading branch 3 times, most recently from db1cc6a to 9bf1e2b on August 2, 2024 19:04
@narendasan narendasan force-pushed the lazy_engine_loading branch 2 times, most recently from 2323fe2 to 05543ec on August 2, 2024 21:29
Commit message:

Allows engines to be set up not immediately after compilation, but all at
once just before the compiled module is returned to the user.

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
@narendasan narendasan force-pushed the lazy_engine_loading branch from 05543ec to ac71002 on August 2, 2024 23:54
@narendasan narendasan merged commit 1d5dd56 into main Aug 5, 2024
@narendasan narendasan deleted the lazy_engine_loading branch August 5, 2024 20:36

Labels

  • cla signed
  • component: api [Python] (Issues re: Python API)
  • component: conversion (Issues re: Conversion stage)
  • component: dynamo (Issues relating to the `torch.compile` or `torch._dynamo.export` paths)
  • component: runtime
  • component: tests (Issues re: Tests)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

✨[Feature] Delayed Initialization for TRTModule Classes

5 participants