Conversation

@peri044 peri044 commented Jul 11, 2024

Description

  1. Converter additions for LLM models.
  2. Fixed memory allocations on GPU - models can now be exported on CPU and use the GPU only for TRT compilation (see the sketch after this list).

     Sample partitioning summary (dryrun tracker output) showing the dynamic input shapes:

     Inputs: List[Tensor: (1, (min=1, max=64))@int64]
       ...
       TRT Engine #1 - Submodule name: _run_on_acc_0
        Engine Inputs: List[Tensor: (1, (min=1, max=64))@int64]
        Number of Operators in Engine: 143
        Engine Outputs: List[Tensor: (1, (min=1, max=64), 32000)@float32]
       ...
      Outputs: List[Tensor: (1, (min=1, max=64), 32000)@float32]

  3. Modifications to the dryrun tracker to handle dynamic shapes.
  4. LLM examples.
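
A minimal sketch of the CPU-export / GPU-compile workflow from item 2, assuming a Hugging Face causal-LM and the (min=1, max=64) sequence-length bounds shown above; the model name and exact bounds are illustrative placeholders, not taken from this PR:

    import torch
    import torch_tensorrt
    from transformers import AutoModelForCausalLM

    # Illustrative sketch: "gpt2" and the shape bounds are placeholders.
    # Load and export entirely on CPU; no GPU memory is touched here.
    model = AutoModelForCausalLM.from_pretrained(
        "gpt2", use_cache=False, attn_implementation="eager"
    ).eval()
    seq_len = torch.export.Dim("seq_len", min=1, max=64)
    example_ids = torch.randint(0, 1000, (1, 8), dtype=torch.int64)
    exported = torch.export.export(
        model, (example_ids,), dynamic_shapes={"input_ids": {1: seq_len}}
    )

    # Only the TensorRT compilation step uses the GPU.
    trt_model = torch_tensorrt.dynamo.compile(
        exported,
        inputs=[
            torch_tensorrt.Input(
                min_shape=(1, 1),
                opt_shape=(1, 32),
                max_shape=(1, 64),
                dtype=torch.int64,
            )
        ],
        device=torch.device("cuda:0"),
    )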

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that the relevant reviewers are notified

@github-actions github-actions bot added component: lowering Issues re: The lowering / preprocessing passes component: conversion Issues re: Conversion stage component: converters Issues re: Specific op converters component: api [Python] Issues re: Python API component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths labels Jul 11, 2024

@github-actions github-actions bot left a comment

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/_DryRunTracker.py	2024-08-19 21:00:09.967336+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/_DryRunTracker.py	2024-08-19 21:00:32.451960+00:00
@@ -224,20 +224,22 @@
    """Format shapes and dtypes of input Tensors into a readable string"""

    def input_formatter_helper(shapes: Any, dtypes: Any) -> str:
        """Helper for input formatter"""
        # Base case 1 - single static/dynamic shape, single dtype
-        if isinstance(shapes, tuple) and all(isinstance(elt, (int, tuple)) for elt in shapes):
+        if isinstance(shapes, tuple) and all(
+            isinstance(elt, (int, tuple)) for elt in shapes
+        ):
            input_shape_string = "Tensor: ("
            for elt in shapes:
                if isinstance(elt, tuple):
-                    input_shape_string+= f"(min={elt[0]}, max={elt[1]}), "
+                    input_shape_string += f"(min={elt[0]}, max={elt[1]}), "
                else:
-                    input_shape_string+= f"{elt}, "
+                    input_shape_string += f"{elt}, "
            input_shape_string = input_shape_string[:-2] + ")" + f"@{str(dtypes)[6:]}, "
            return input_shape_string
-        
+
        # Base case 2 - dynamic shape, single dtype
        elif (
            isinstance(shapes, dict)
            and len(shapes) == 3
            and all(
--- /home/runner/work/TensorRT/TensorRT/tools/perf/utils.py	2024-08-19 21:00:10.003336+00:00
+++ /home/runner/work/TensorRT/TensorRT/tools/perf/utils.py	2024-08-19 21:00:37.999905+00:00
@@ -28,19 +28,16 @@
}


def load_hf_model(model_name_hf):
    print("Loading user-specified HF model: ", model_name_hf)
-    model_hf = (
-        AutoModelForCausalLM.from_pretrained(
-            model_name_hf,
-            trust_remote_code=True,
-            use_cache=False,
-            attn_implementation="eager",
-        )
-        .eval()
-    )
+    model_hf = AutoModelForCausalLM.from_pretrained(
+        model_name_hf,
+        trust_remote_code=True,
+        use_cache=False,
+        attn_implementation="eager",
+    ).eval()

    return {"model": model_hf}


class ModelStorage:
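
For reference, the helper being reformatted above is what produces the shape strings quoted in the PR description (e.g. "Tensor: (1, (min=1, max=64))@int64"). A standalone sketch of that base case, separate from the actual _DryRunTracker code:

    from typing import Any, Sequence

    import torch

    def format_input(shapes: Sequence[Any], dtype: torch.dtype) -> str:
        # A dynamic dimension arrives as a (min, max) tuple, a static one as an int.
        parts = []
        for elt in shapes:
            if isinstance(elt, tuple):
                parts.append(f"(min={elt[0]}, max={elt[1]})")
            else:
                parts.append(str(elt))
        # str(torch.int64) == "torch.int64"; dropping the first 6 chars leaves "int64".
        return "Tensor: (" + ", ".join(parts) + f")@{str(dtype)[6:]}"

    print(format_input((1, (1, 64)), torch.int64))  # Tensor: (1, (min=1, max=64))@int64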

@peri044 peri044 requested a review from narendasan August 20, 2024 22:08
@github-actions github-actions bot removed the component: tests Issues re: Tests label Aug 21, 2024
@peri044 peri044 requested a review from zewenli98 August 21, 2024 00:41

@narendasan narendasan left a comment

LGTM

@github-actions github-actions bot added the component: tests Issues re: Tests label Aug 28, 2024