1 parent 62c239c commit 599319f
torchao/quantization/README.md
@@ -13,7 +13,6 @@ Using the lm_eval. The models used were meta-llama/Llama-2-7b-chat-hf and meta-l
| | int4wo-64 | 12.843 | 201.14 | 751.42 | 4.87 | 3.74 |
| | int4wo-64-GPTQ | 12.527 | 201.14 | 751.42 | 4.87 | 3.74 |
| | autoquant-int4hqq | 12.825 | 209.19 | 804.32 | 4.89 | 3.84 |
-
| Llama-3-8B | Base (bfloat16) | 7.441 | 95.64 | 1435.54 | 16.43 | 15.01 |
| | int8dq | 7.581 | 8.61 | 64.75 | 9.24 | 7.52 |
| | int8wo | 7.447 | 153.03 | 1150.80 | 10.42 | 7.52 |