Conversation

danielpatrickhug
Contributor

This PR migrates several quantization classes from GPTQ.py to the new quantization.quantize_linear module, ahead of deleting the legacy GPTQ module following PR #914. It also updates the imports and the README to mirror the new structure, and updates or removes tests in both test_qat and test_quant_api. @HDCharles @andrewor14 @jerryzh168 assisted and advised on how to execute this migration.

pytorch-bot bot commented Oct 18, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1115

Note: Links to docs will display an error until the docs builds have been completed.

❌ 8 New Failures

As of commit aa105d6 with merge base 629e142:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Oct 18, 2024
-from torchao.quantization.GPTQ import (
+from torchao.quantization.quantize_linear import (
Contributor

This will be BC-breaking, since these imports are used by other repos like torchchat, so we'd probably want to maintain backward compatibility by keeping the old import path as well.

@andrewor14
Contributor

Hi Daniel, thanks for cleaning this up. It looks great to me overall! I just had some comments about maintaining BC to give us time to migrate some known use cases.

 def test_8da4w_quantizer(self):
     from torchao.quantization.quant_api import Int8DynActInt4WeightQuantizer
-    from torchao.quantization.GPTQ import Int8DynActInt4WeightLinear
+    from torchao.quantization.quantize_linear import Int8DynActInt4WeightLinear
Contributor

nit: can we call this quantized_linear instead?

Contributor

or maybe even _quantized_linear to discourage people from using it


# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

Contributor

Can we keep this file and keep the legacy imports for now? It will help us migrate some known use cases. E.g. something like

from torchao.quantization.quantized_linear import (
    Int4WeightOnlyQuantizer,
    Int8DynActInt4WeightQuantizer,
    ...
)

group_quantize_tensor_symmetric,
)


Contributor

maybe add a comment to say everything in this file is intended to be deprecated and eventually removed? Then we should also add the new alternative of running these using the quantize_ API, e.g.

from torchao.quantization import (
    quantize_,
    int4_weight_only,
    int8_dynamic_activation_int4_weight,
)
quantize_(model, int4_weight_only())
quantize_(model, int8_dynamic_activation_int4_weight())

Contributor Author

Hey, can you elaborate on this step a little bit more? I'm a bit confused.

Contributor

Yeah sure, I mean we can express everything in this file using the new quantize_ API:

# Before:
quantizer = Int4WeightOnlyQuantizer()  # or Int8DynActInt4WeightQuantizer
model = quantizer.quantize(model)

# After:
quantize_(model, int4_weight_only())  # or int8_dynamic_activation_int4_weight

We want to deprecate the Quantizer APIs eventually in favor of the quantize_ API, so we want to discourage people from using the quantizers in this file. I think adding a comment here explaining this context will help with that.
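Beyond a comment, a module can actively warn when the legacy classes are used. A minimal sketch of that idea, using PEP 562's module-level `__getattr__` (module and class names here are stand-ins, not torchao's actual layout):

```python
import sys
import types
import warnings

# Placeholder for the real class; in torchao this would be the actual quantizer.
class Int4WeightOnlyQuantizer:
    pass

# Stand-in "legacy" module; in torchao this pattern could live in the legacy file.
legacy = types.ModuleType("legacy_gptq")
_real = {"Int4WeightOnlyQuantizer": Int4WeightOnlyQuantizer}

def _legacy_getattr(name):
    # PEP 562 module-level __getattr__: only called when normal attribute
    # lookup fails, so it can emit a warning and then delegate to the real class.
    if name in _real:
        warnings.warn(
            f"legacy_gptq.{name} is deprecated; use the quantize_ API instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return _real[name]
    raise AttributeError(name)

legacy.__getattr__ = _legacy_getattr
sys.modules["legacy_gptq"] = legacy

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    cls = legacy.Int4WeightOnlyQuantizer  # old access path still works, but warns

print(cls is Int4WeightOnlyQuantizer)  # True
print(len(caught) == 1)                # True
```

The old import path keeps working, but every lookup of a deprecated name surfaces a `DeprecationWarning` pointing users at the `quantize_` API.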

Contributor Author

Thank you, I added the comments to the top of the file noting the deprecation.

__all__ += [
"Int8DynActInt4WeightQuantizer",
"Int8DynActInt4WeightGPTQQuantizer",

Contributor

nit: remove this empty line?

"Int4WeightOnlyQuantizer",
"WeightOnlyInt4Linear",
"Int8DynActInt4WeightLinear",
"Int8DynActInt4WeightQuantizer",
Contributor

Maybe we shouldn't expose these if they weren't exposed before, since these are meant to be legacy APIs?

)

@unittest.skip("skipping until we get checkpoints for gpt-fast")
def test_quantizer_int4_weight_only(self):
Contributor

Does this PR just move the location of these classes? If so, this test should still pass, right? Or am I missing something? @danielpatrickhug @HDCharles Is there a reason we want to remove this test?

Contributor

We don't want two int4 implementations.

@andrewor14 left a comment (Contributor)

Looks great. Thanks for the refactor!

get_groupwise_affine_qparams,
groupwise_affine_quantize_tensor,
)
from torchao.quantization.prototype.qat.utils import (
Contributor

Sorry, this was recently moved to torchao.quantization.qat.utils.

Contributor

Oh, actually you'll face some conflicts with this file. Can you make the changes in torchao/quantization/qat/linear.py instead? Sorry for the merge conflicts.

@andrewor14
Contributor

Oh, I just realized: the latest changes don't actually remove the classes from GPTQ.py. We should still do that, so we don't end up with duplicate copies of the same classes. What I meant was: remove those classes from GPTQ.py, then add the following imports to GPTQ.py to maintain the legacy import paths:

from torchao.quantization._quantized_linear import (
    Int4WeightOnlyQuantizer,
    Int8DynActInt4WeightLinear,
    Int8DynActInt4WeightQuantizer,
    WeightOnlyInt4Linear,
)
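This re-export pattern can be sketched in plain Python (module names here are illustrative stand-ins, not torchao's actual layout): the legacy module only imports from the new home, so both import paths resolve to a single implementation.

```python
import sys
import types

# "New" module owning the single implementation.
new_mod = types.ModuleType("quantized_linear")

class Int8DynActInt4WeightLinear:  # placeholder for the real class
    pass

new_mod.Int8DynActInt4WeightLinear = Int8DynActInt4WeightLinear
sys.modules["quantized_linear"] = new_mod

# "Legacy" module that only re-exports, so old imports keep resolving.
legacy = types.ModuleType("GPTQ")
legacy.Int8DynActInt4WeightLinear = new_mod.Int8DynActInt4WeightLinear
legacy.__all__ = ["Int8DynActInt4WeightLinear"]
sys.modules["GPTQ"] = legacy

# Both paths resolve to the very same class object: one implementation,
# two import locations.
same = (
    sys.modules["GPTQ"].Int8DynActInt4WeightLinear
    is sys.modules["quantized_linear"].Int8DynActInt4WeightLinear
)
print(same)  # True
```

Because the re-export is an `is`-identical object rather than a copy, isinstance checks and patches in downstream repos like torchchat keep working regardless of which path they import from.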

