
Conversation

lisjin
Contributor

@lisjin lisjin commented Sep 5, 2025

The original plan was to manually replace each weight tensor with an instance of IntxUnpackedToInt8Tensor. However, I found that this already happens in IntxWeightOnlyConfig and Int8DynamicActivationIntxWeightConfig when they are initialized with version=2.

  • I leverage this code and call quantize_ once per regularized param_group in QuantOptimizer.torchao_convert
  • For params quantized with StretchedUnifTorchaoQuantizer, I fetch the qparams (scale, zero_point) to initialize IntxUnpackedToInt8Tensor (as suggested by Scott)

The PR also adds the num_steps property to QuantOptimizer to mirror D81526700.
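The conversion described above boils down to representing each weight as a (qdata, scale, zero_point) triple, with qdata kept unpacked in int8. The following is a hedged, pure-Python sketch of per-group affine quantization under that assumption; the function names are hypothetical illustrations, not the actual torchao or PARQ code.

```python
# Illustrative sketch (NOT the torchao implementation): per-group affine
# quantization producing the kind of (qdata, scale, zero_point) triple that
# an unpacked int8 tensor subclass like IntxUnpackedToInt8Tensor stores.

def quantize_per_group(weight, group_size, bits=4):
    """Quantize a flat list of floats group by group; return b-bit values
    stored in int8 range plus a per-group scale and zero_point."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    qdata, scales, zeros = [], [], []
    for start in range(0, len(weight), group_size):
        group = weight[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = max((hi - lo) / (qmax - qmin), 1e-12)  # avoid divide-by-zero
        zero_point = round(qmin - lo / scale)
        scales.append(scale)
        zeros.append(zero_point)
        for w in group:
            q = round(w / scale) + zero_point
            qdata.append(max(qmin, min(qmax, q)))  # clamp to the b-bit range
    return qdata, scales, zeros

def dequantize_per_group(qdata, scales, zeros, group_size):
    """Invert the mapping above: (q - zero_point) * scale, per group."""
    return [(q - zeros[i // group_size]) * scales[i // group_size]
            for i, q in enumerate(qdata)]
```

Keeping qdata unpacked in int8 (rather than bit-packed) trades a little memory for simpler downstream kernels and lowering, which is the representation the version=2 configs converge on.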

@lisjin lisjin requested a review from metascroy September 5, 2025 23:25
@meta-cla meta-cla bot added the "CLA Signed" label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Sep 5, 2025

pytorch-bot bot commented Sep 5, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2947

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 4cfa628 with merge base b99904b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@lisjin lisjin added the "topic: new feature" label Sep 5, 2025
@lisjin lisjin force-pushed the lvj/parq-convert branch 4 times, most recently from d593e74 to 8e39be5, September 8, 2025 14:58
@lisjin lisjin changed the title from "First attempt at QuantOptimizer.torchao_convert" to "Add torchao_convert to PARQ's QuantOptimizer" Sep 8, 2025
@lisjin lisjin marked this pull request as ready for review September 8, 2025 16:41
@lisjin lisjin requested a review from metascroy September 8, 2025 17:19
@lisjin lisjin force-pushed the lvj/parq-convert branch 2 times, most recently from a3c4e94 to e608880, September 8, 2025 17:37
@lisjin lisjin requested a review from andrewor14 September 9, 2025 12:44
@metascroy
Contributor

Thanks! Overall looks great! Left some comments.

@lisjin lisjin force-pushed the lvj/parq-convert branch 3 times, most recently from 1beacfa to f352a85, September 10, 2025 16:52
Contributor

@andrewor14 andrewor14 left a comment


Looks good from my side, will let @metascroy do a final pass/stamp

@metascroy
Contributor

Overall, I think it looks great! Approving PR, but address the comments above before landing.

@lisjin
Contributor Author

lisjin commented Sep 11, 2025

Thanks for the comments @andrewor14 @metascroy :) I think I've addressed them all now, but let me know if I missed anything!

@lisjin lisjin force-pushed the lvj/parq-convert branch 3 times, most recently from e33264a to a854d79, September 11, 2025 01:01
block_size = (1, group_size)
target_dtype = torch.int8
q_args = (weight, mapping_type, block_size, target_dtype, config.b)
if config.version == 2:
Contributor


for this it's probably fine to break BC soon since it's a prototype feature?

Contributor Author


Got it, thanks. I kept the version=1 convention for this config to preserve the old functionality (StretchedAffineQuantizedTensor). The new version=2 will convert tensors to IntxUnpackedToInt8Tensor, which is consistent with other config classes like IntxWeightOnlyConfig.

@metascroy Should we remove StretchedAffineQuantizedTensor entirely?

Contributor


I have no objections to removing it. I assumed you wanted it for Int4CPULayout, but if you don't need that, I think it's fine to remove.

On the ExecuTorch side, IntxUnpackedToInt8Tensor will be sufficient.

Contributor Author


I think you're right about CPU support. I'll leave it for now but consider removing it in the future.

Contributor


Int4CPULayout has also been moved to v2 already (#2845); you can set int4_packing_format to plain_int32 to get it.
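The version gating discussed in this thread can be sketched as a small dispatch on the config. This is a toy, pure-Python stand-in whose class and field names are illustrative only, not the actual torchao transform code.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the real config class; names are illustrative.
@dataclass
class StretchedIntxWeightConfig:
    b: int = 4        # bit width
    version: int = 2  # 1 -> legacy path, 2 -> new unpacked-int8 path

def choose_tensor_subclass(config):
    # version=2 converts weights to IntxUnpackedToInt8Tensor, matching other
    # configs like IntxWeightOnlyConfig; version=1 preserves the old
    # StretchedAffineQuantizedTensor behavior for backward compatibility.
    if config.version == 2:
        return "IntxUnpackedToInt8Tensor"
    return "StretchedAffineQuantizedTensor"
```

Defaulting version to 2 while keeping the version=1 branch alive is what lets existing callers opt back into the legacy tensor subclass until it is removed.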

@lisjin lisjin merged commit 481be64 into main Sep 11, 2025
18 checks passed
@lisjin lisjin deleted the lvj/parq-convert branch September 11, 2025 17:10