Commit fbe3df9
Fix Float8Tensor quantize op kernel preference dispatch (#2883)
Summary:
Previously, even if the user specified kernel_preference == "fbgemm", we would still use torch ops like `_choose_scale_float8` and `_quantize_affine_float8` to quantize the high precision Tensor into a float8 Tensor.
This PR makes sure we use fbgemm kernels when kernel_preference is "fbgemm": `torch.ops.triton.quantize_fp8_row` for per row quantization and `torch.ops.fbgemm.quantize_fp8_per_tensor` for per tensor quantization (the latter has some issues right now, so it will be enabled later once it's fixed).
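A minimal sketch of the dispatch described above, under stated assumptions: the helper name and the plain string/bool interface are hypothetical (the actual torchao code uses its `KernelPreference` enum and granularity objects), and the torch branch inlines a simple amax-based scale computation standing in for `_choose_scale_float8` / `_quantize_affine_float8`:

```python
import torch

# Hypothetical sketch of the dispatch this PR fixes; not the exact torchao code.
FP8_DTYPE = torch.float8_e4m3fn
FP8_MAX = torch.finfo(FP8_DTYPE).max

def quantize_to_float8(hp_tensor: torch.Tensor, per_row: bool, kernel_preference: str):
    """Return (float8_data, scale) such that hp_tensor ~= float8_data.float() * scale."""
    if kernel_preference == "fbgemm" and per_row:
        # fbgemm path: row-wise fp8 quantization via fbgemm_gpu's triton kernel
        # (requires fbgemm_gpu GenAI to be installed so the op is registered)
        data, scale = torch.ops.triton.quantize_fp8_row(hp_tensor)
    else:
        # torch path, standing in for _choose_scale_float8 / _quantize_affine_float8;
        # per the summary it also covers the fbgemm per tensor case until
        # torch.ops.fbgemm.quantize_fp8_per_tensor is fixed and enabled
        if per_row:
            amax = hp_tensor.abs().amax(dim=-1, keepdim=True)
        else:
            amax = hp_tensor.abs().amax()
        scale = amax.float().clamp(min=1e-12) / FP8_MAX
        data = (hp_tensor.float() / scale).clamp(-FP8_MAX, FP8_MAX).to(FP8_DTYPE)
    return data, scale
```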
This has no impact on BC: old serialized models can still be loaded and run. The only change is the kernel choice for the fbgemm kernel preference, meaning users who requested the FBGEMM KernelPreference now actually run the fbgemm quantize op instead of the torch op.
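For reference, a hedged usage sketch of how a user requests the FBGEMM kernel preference; the import paths (in particular for `KernelPreference`) and the config arguments are assumptions based on torchao's `quantize_` API, not taken from this diff:

```python
import torch
from torchao.quantization import (
    Float8DynamicActivationFloat8WeightConfig,
    PerRow,
    quantize_,
)
# Assumed import path for the KernelPreference enum
from torchao.quantization.quantize_.common import KernelPreference

m = torch.nn.Sequential(
    torch.nn.Linear(128, 256, dtype=torch.bfloat16, device="cuda")
)
quantize_(
    m,
    Float8DynamicActivationFloat8WeightConfig(
        granularity=PerRow(),
        kernel_preference=KernelPreference.FBGEMM,
    ),
)
# With this fix, the float8 quantize step dispatches to
# torch.ops.triton.quantize_fp8_row instead of the torch reference ops.
```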
Test Plan:
python test/quantization/quantize_/workflows/float8/test_float8_tensor.py -k test_expected_gpu_kernel_fbgemm
Reviewers:
Subscribers:
Tasks:
Tags:
File tree
3 files changed: +86 −17 lines changed
- test/quantization/quantize_/workflows/float8
- torchao/quantization/quantize_
  - common
  - workflows/float8
Lines changed: 41 additions & 1 deletion
[Diff body not captured in this page extraction; per the file tree this is the change under test/quantization/quantize_/workflows/float8.]
Lines changed: 1 addition & 1 deletion
[Diff body not captured; per the file tree this is the change under torchao/quantization/quantize_/common.]
Lines changed: 44 additions & 15 deletions
[Diff body not captured; per the file tree this is the change under torchao/quantization/quantize_/workflows/float8.]