
Conversation

kylesayrs (Contributor) commented Aug 7, 2024

The reused classes are:

  1. CompressionFormat
  2. QuantizationArgs
  3. QuantizationStrategy
  4. QuantizationType
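
For context, a minimal sketch (not taken from this PR's diff) of what reusing these definitions looks like, assuming the import paths exposed by the compressed-tensors package; the actual vLLM call sites may differ:

```python
# Illustrative sketch only: import the shared definitions from the
# compressed-tensors package instead of keeping duplicate copies inside vLLM.
# Module paths are assumptions about the library layout, not quoted from the PR.
from compressed_tensors.config import CompressionFormat
from compressed_tensors.quantization import (
    QuantizationArgs,
    QuantizationStrategy,
    QuantizationType,
)

# Example: build the arguments for a symmetric 8-bit integer, per-tensor scheme.
weight_args = QuantizationArgs(
    num_bits=8,
    type=QuantizationType.INT,
    strategy=QuantizationStrategy.TENSOR,
    symmetric=True,
)
print(weight_args)
print(list(CompressionFormat))  # available formats, e.g. dense, int-quantized, ...
```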

github-actions bot commented Aug 7, 2024

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run the fastcheck CI, which consists of a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of the default ones by unblocking the steps in your fast-check build on the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge).

To run full CI, you can do one of these:

  • Comment /ready on the PR
  • Add the ready label to the PR
  • Enable auto-merge

🚀

kylesayrs marked this pull request as draft August 7, 2024 17:59
kylesayrs force-pushed the compressed-tensors-reuse branch from aaa041e to 8960860 on August 8, 2024 17:58
kylesayrs changed the title from "[Misc] DO NOT MERGE compressed-tensors code reuse" to "[Misc] compressed-tensors code reuse" Aug 8, 2024
kylesayrs marked this pull request as ready for review August 8, 2024 18:32
dsikka (Contributor) commented Aug 8, 2024

/ready

github-actions bot added the ready label Aug 8, 2024
dsikka (Contributor) commented Aug 8, 2024

@kylesayrs you're missing

compressed-tensors==0.4.0 # required for compressed-tensors

kylesayrs requested a review from dsikka August 9, 2024 13:13
robertgshaw2-redhat (Collaborator) commented

Current state looks good so far.

The biggest piece of feedback is that we are still rewriting the logic associated with parsing the config. Specifically, the get_scheme function in compressed_tensors.py contains this duplicated code.

It will be tricky to fix this (because the vLLM state_dict is not a 1:1 map with the transformers state_dict), so feel free to reach out if you need any pointers.
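
To make the duplication concrete, here is a hypothetical, heavily simplified sketch of the kind of config parsing get_scheme performs: matching a layer against the targets of a config group and turning that group's weight settings into QuantizationArgs. The helper name and the suffix-only matching are illustrative, not vLLM's actual implementation:

```python
# Hypothetical, simplified illustration of the config-parsing step under
# discussion; NOT vLLM's actual get_scheme. Real target matching also handles
# module class names and "re:" regex targets, and must account for vLLM's
# fused layers not mapping 1:1 onto the transformers state_dict.
from typing import Optional

from compressed_tensors.quantization import QuantizationArgs


def match_layer_to_weight_args(
    layer_name: str, config_groups: dict
) -> Optional[QuantizationArgs]:
    """Return the weight QuantizationArgs of the first group targeting this layer."""
    for group in config_groups.values():
        # Only plain suffix matches are handled here, for brevity.
        if any(layer_name.endswith(target) for target in group.get("targets", [])):
            return QuantizationArgs(**group["weights"])
    return None


# Toy config_groups section resembling a compressed-tensors quantization config.
example_groups = {
    "group_0": {
        "targets": ["q_proj", "k_proj", "v_proj", "o_proj"],
        "weights": {"num_bits": 8, "type": "int", "symmetric": True, "strategy": "tensor"},
    }
}

print(match_layer_to_weight_args("model.layers.0.self_attn.q_proj", example_groups))
```

The feedback above is that this kind of target-matching logic already lives in the compressed-tensors library, so ideally vLLM would call into it rather than re-implement it.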

dsikka (Contributor) commented Aug 9, 2024

@robertgshaw2-neuralmagic I think updating the get_scheme function is beyond the scope of this PR. I'd like to first land the use of compressed-tensors without any dependency conflicts. Refactoring get_scheme should be a follow-up.

kylesayrs (Contributor, Author) commented

These test failures seem unrelated to this PR? A few seem to be CUDA errors and one is complaining about bad LLM metrics measurements.

robertgshaw2-redhat (Collaborator) commented

> @robertgshaw2-neuralmagic I think updating the get_scheme function is beyond the scope of this PR. I'd like to first land the use of compressed-tensors without any dependency conflicts. Refactoring get_scheme should be a follow-up.

Sounds good.

@kylesayrs I'm just running this by Simon, but we should be good to go.

kylesayrs force-pushed the compressed-tensors-reuse branch from 049dc9c to ce29b08 on August 13, 2024 18:36
mgoin merged commit 373538f into vllm-project:main Aug 13, 2024
kylesayrs deleted the compressed-tensors-reuse branch August 13, 2024 23:05
kylesayrs added a commit to neuralmagic/vllm that referenced this pull request Aug 14, 2024
Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024
LeiWang1999 pushed a commit to LeiWang1999/vllm-bitblas that referenced this pull request Mar 26, 2025