
Conversation

liangan1 (Collaborator) commented Aug 22, 2025

This PR enables the Int4PlainInt32Tensor. The packing format name is "plain_int32".
Test case:
bash test/quantization/quantize_/workflows/int4/test_int4_plain_int32_tensor.py

pytorch-bot bot commented Aug 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2845

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit 78f6bb2 with merge base 568c193:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Aug 22, 2025
@liangan1 liangan1 added the topic: new feature Use this tag if this PR adds a new feature label Aug 25, 2025
@liangan1 liangan1 changed the title [WIP]Add Int4XPUTensorIntZP Add Int4XPUTensorIntZP Aug 25, 2025
Comment on lines 44 to 45
"int4_xpu_int_zp is referring to the format used by int4 weight-only quantization on XPU with int zero point, which is a groupwise quantization format."
INT4_XPU_INT_ZP = "int4_xpu_int_zp"
Contributor

please don't include int4 and xpu in the name, can you name this in terms of how the quantized data is packed?

liangan1 (Collaborator Author) commented Aug 26, 2025

The int4 weight on XPU is a plain-format tensor according to this doc: it just packs 2 int4 weight elements into a byte and then stores 4 such bytes as one int32. So I changed it to "plain".
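For illustration, a minimal sketch of the packing described above (illustrative only, assuming unsigned 4-bit values and low-nibble-first ordering; the actual kernel layout may differ):

import torch

def pack_int4_as_int32(q: torch.Tensor) -> torch.Tensor:
    # q holds 4-bit values in [0, 15], one per uint8 element; the last
    # dimension must be divisible by 8 (8 x int4 per int32 element).
    assert q.dtype == torch.uint8 and q.shape[-1] % 8 == 0
    low, high = q[..., ::2], q[..., 1::2]
    packed_bytes = low | (high << 4)  # 2 x int4 per byte
    # The existing "plain" format keeps the int8/byte view; "plain_int32"
    # reinterprets every 4 consecutive bytes as one int32.
    return packed_bytes.contiguous().view(torch.int32)

q = torch.randint(0, 16, (4, 16), dtype=torch.uint8)
packed = pack_int4_as_int32(q)  # shape (4, 2), dtype torch.int32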

Contributor

I see, we have "plain" which stores 2*int4 as int8. Can you reuse it, or would you need a new one? https://github.com/pytorch/ao/blob/main/torchao/quantization/quantize_/workflows/int4/int4_tensor.py

Contributor

@liangan1 can you use PLAIN_INT32 for packing_format, and rename things accordingly (tensor subclass, files etc.)

liangan1 (Collaborator Author)

Thanks @jerryzh168. I have added PLAIN_INT32 to be used by the XPU int4 path. Per my understanding, the packing format should be a dispatch policy that selects the right tensor subclass, and a tensor subclass should cover a specific quantization recipe, so I suppose I should keep the current tensor name for int4 XPU.
In this PR, we just want to enable int4 on XPU with the int zero-point domain. The current oneDNN backend cannot support the float zero point the way the CUDA/CPU backends do; that feature is WIP. I plan to reuse this packing format in the future and dispatch the tensor using the zero-point domain information.

Contributor

you can reuse the packing format and the tensor for a float32 zero_point as well in the future, I think, but today we structure tensor subclasses by dtype + packing_format, so Int4PlainInt32 might be better

liangan1 (Collaborator Author)

Done, changed it to Int4PlainInt32. Please help review again.

return Int4WeightOnlyConfig(
group_size=group_size,
packing_format="plain_int32",
zero_point_domain=ZeroPointDomain.INT,
Contributor

nit: I think we don't need this anymore; we also want to remove ZeroPointDomain in the future

liangan1 (Collaborator Author)

Removed. But I have a question: how does a user select the int zero-point domain if this param no longer exists?

Contributor

we'll know how to quantize based on the type of tensor, so the user just needs to choose the packing_format
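A usage sketch under that design (the config and packing format name are from this PR; the toy model, group size, and device placement are illustrative assumptions):

import torch
from torchao.quantization import quantize_, Int4WeightOnlyConfig

model = torch.nn.Sequential(torch.nn.Linear(256, 256)).to("xpu", torch.bfloat16)
# The packing format alone selects the tensor subclass (here Int4PlainInt32Tensor),
# which implies the int zero-point domain; no ZeroPointDomain argument is needed.
quantize_(model, Int4WeightOnlyConfig(group_size=128, packing_format="plain_int32"))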

@liangan1 liangan1 requested a review from jerryzh168 August 29, 2025 02:05
@liangan1 liangan1 changed the title Add Int4XPUTensorIntZP Add Int4PlainInt32 Tensor Aug 29, 2025
@liangan1 liangan1 requested a review from jerryzh168 August 29, 2025 06:09
jerryzh168 (Contributor) commented Aug 29, 2025

please rebase, and also fix the CI error; I think we need to skip the test when there is no XPU

maybe also update the Summary to make sure the naming is correct

liangan1 (Collaborator Author) commented Sep 1, 2025

cc @xiaowangintel

liangan1 (Collaborator Author) commented Sep 1, 2025

> please rebase, and also fix the CI error; I think we need to skip the test when there is no XPU
>
> maybe also update the Summary to make sure the naming is correct

Done. @jerryzh168 please help review again.

@liangan1 liangan1 requested a review from jerryzh168 September 1, 2025 05:10
@liangan1 liangan1 changed the title Add Int4PlainInt32 Tensor Add Int4PlainInt32Tensor Sep 1, 2025

@unittest.skipIf(not torch_version_at_least("2.8.0"), "Need pytorch 2.8+")
@unittest.skipIf(not torch.xpu.is_available(), "XPU not available")
class Int4PlainInt32Tensor(TestCase):
Contributor

we probably need more tests, like serialization etc., but we can add these later
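One possible shape for such a test (a sketch only, not the tests in this PR; the load flags may need adjusting for torchao tensor subclasses):

import tempfile
import torch

def _check_serialization_roundtrip(model, example_input):
    # Quantize the model first, then round-trip its state_dict through
    # torch.save/torch.load and check that outputs are unchanged.
    expected = model(example_input)
    with tempfile.NamedTemporaryFile() as f:
        torch.save(model.state_dict(), f.name)
        state_dict = torch.load(f.name, weights_only=False)
    model.load_state_dict(state_dict, assign=True)
    torch.testing.assert_close(model(example_input), expected)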

liangan1 (Collaborator Author)

OK, we are working on enabling XPU CI in other PRs. Please refer to #2917.

@jerryzh168 jerryzh168 merged commit 9d01b43 into pytorch:main Sep 4, 2025
18 checks passed