Commit 80cc501
Update on "Add NVFP4 QAT"
**Summary:** This commit adds a QAT flow for NVFP4, following the
numerics in `NVFP4Tensor` closely but without the dtype casting,
swizzling, and packing/unpacking. Users can call this flow as follows:
```
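from torchao.quantization import quantize_
from torchao.quantization.qat import NVFP4FakeQuantizeConfig, QATConfig

# Fake-quantize both activations and weights to NVFP4 during training.
qat_config = QATConfig(
    activation_config=NVFP4FakeQuantizeConfig(),
    weight_config=NVFP4FakeQuantizeConfig(),
    step="prepare",
)
# `model` is the user's existing torch.nn.Module; it is modified in place.
quantize_(model, qat_config)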
```
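To make "the numerics without casting or packing" concrete, here is a rough, dependency-free sketch of block-wise FP4 fake quantization. It is not torchao's implementation. It assumes the usual NVFP4 layout of 16-element blocks and the FP4 E2M1 value grid, and it simplifies in three ways: the per-block scale stays an ordinary Python float (the real format stores it in FP8 E4M3), no per-tensor scale is applied, and ties are broken toward the smaller magnitude. The names `fake_quantize_block` and `fake_quantize` are hypothetical.

```python
# Illustrative sketch only; not torchao's NVFP4 code path.
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # FP4 E2M1 magnitudes
BLOCK_SIZE = 16  # assumed NVFP4 block size


def fake_quantize_block(block):
    """Snap one block to the FP4 grid, then return it in the original range."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0 for _ in block]
    # Simplification: the scale is kept in full precision here.
    scale = amax / E2M1_GRID[-1]  # largest magnitude maps to 6.0
    out = []
    for v in block:
        # Nearest grid point; min() breaks ties toward the smaller magnitude.
        mag = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        out.append((mag if v >= 0 else -mag) * scale)
    return out


def fake_quantize(values):
    """Apply block-wise fake quantization; output stays high precision."""
    out = []
    for i in range(0, len(values), BLOCK_SIZE):
        out.extend(fake_quantize_block(values[i:i + BLOCK_SIZE]))
    return out
```

The output has the same length and float type as the input, which is the point of fake quantization: training sees the rounding error, but nothing is cast to a 4-bit dtype or packed.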
**Test Plan:**
```
python test/quantization/test_qat.py -k test_qat_nvfp4
```
Initial benchmarks from fine-tuning Qwen3-1.7B on Alpaca for 3 epochs, evaluated on wikitext (lower is better):
```
# Without QAT
| Tasks |Version|Filter|n-shot| Metric | | Value | |Stderr|
|--------|------:|------|------|---------------|---|------:|---|------|
|wikitext| 2|none |None |bits_per_byte |↓ | 0.8322|± | N/A|
| | |none |None |byte_perplexity|↓ | 1.7804|± | N/A|
| | |none |None |word_perplexity|↓ |21.8611|± | N/A|
# With QAT
| Tasks |Version|Filter|n-shot| Metric | | Value | |Stderr|
|--------|------:|------|------|---------------|---|------:|---|------|
|wikitext| 2|none |None |bits_per_byte |↓ | 0.8271|± | N/A|
| | |none |None |byte_perplexity|↓ | 1.7741|± | N/A|
| | |none |None |word_perplexity|↓ |21.4467|± | N/A|
```
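The size of the gap between the two tables can be checked with a few lines of arithmetic. The sketch below uses only the values reported above. The consistency check assumes the common definition in which `bits_per_byte` is the base-2 logarithm of `byte_perplexity`; that relation is an assumption, not something the commit states. `relative_drop` is a hypothetical helper.

```python
# Values copied from the two tables above.
baseline = {"bits_per_byte": 0.8322, "byte_perplexity": 1.7804, "word_perplexity": 21.8611}
qat = {"bits_per_byte": 0.8271, "byte_perplexity": 1.7741, "word_perplexity": 21.4467}


def relative_drop(before, after):
    """Fractional reduction from `before` to `after` (positive means lower)."""
    return (before - after) / before


for name in baseline:
    print(f"{name}: {relative_drop(baseline[name], qat[name]):.2%} lower with QAT")

# Assumed relation: byte_perplexity is approximately 2 ** bits_per_byte.
for run in (baseline, qat):
    assert abs(2 ** run["bits_per_byte"] - run["byte_perplexity"]) < 1e-3
```

All three metrics move in the same direction, as expected, since they are different normalizations of the same log-likelihood.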
[ghstack-poisoned]

3 files changed (+4, -3 lines) under docs/source and torchao/quantization/qat