Commit c7159c8

Add Pruning performance/accuracy data (#295)
* Add Pruning performance/accuracy data
* Update doc to add supported MLPerf models and validated frameworks
1 parent f790e37 commit c7159c8

File tree: 4 files changed (+592, -186 lines)


README.md

Lines changed: 263 additions & 7 deletions
@@ -33,12 +33,14 @@ Click the image to enlarge it.
 #### Supported Frameworks

 Supported Intel-optimized DL frameworks are:
-* [TensorFlow\*](https://github.com/Intel-tensorflow/tensorflow), including [1.15.0 UP3](https://github.com/Intel-tensorflow/tensorflow/tree/v1.15.0up3), [1.15.0 UP2](https://github.com/Intel-tensorflow/tensorflow/tree/v1.15.0up2), [1.15.0 UP1](https://github.com/Intel-tensorflow/tensorflow/tree/v1.15.0up1), [2.1.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.1.0), [2.2.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.2.0), [2.3.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.3.0), [2.4.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.4.0), [2.5.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.5.0)
+* [TensorFlow\*](https://github.com/Intel-tensorflow/tensorflow), including [1.15.0 UP3](https://github.com/Intel-tensorflow/tensorflow/tree/v1.15.0up3), [1.15.0 UP2](https://github.com/Intel-tensorflow/tensorflow/tree/v1.15.0up2), [1.15.0 UP1](https://github.com/Intel-tensorflow/tensorflow/tree/v1.15.0up1), [2.1.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.1.0), [2.2.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.2.0), [2.3.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.3.0), [2.4.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.4.0), [2.5.0](https://github.com/Intel-tensorflow/tensorflow/tree/v2.5.0), [Official TensorFlow 2.6.0](https://github.com/tensorflow/tensorflow/tree/v2.6.0)

 > **Note**: Intel Optimized TensorFlow 2.5.0 requires setting the environment variable TF_ENABLE_MKL_NATIVE_FORMAT=0 before running LPOT quantization or deploying the quantized model.

+> **Note**: Starting with official TensorFlow 2.6.0, oneDNN support has been upstreamed. Users only need to download the official TensorFlow binary for the CPU device and set the environment variable TF_ENABLE_ONEDNN_OPTS=1 before running LPOT quantization or deploying the quantized model.
+
 * [PyTorch\*](https://pytorch.org/), including [1.5.0+cpu](https://download.pytorch.org/whl/torch_stable.html), [1.6.0+cpu](https://download.pytorch.org/whl/torch_stable.html), [1.8.0+cpu](https://download.pytorch.org/whl/torch_stable.html)
-* [Apache\* MXNet](https://mxnet.apache.org), including [1.6.0](https://github.com/apache/incubator-mxnet/tree/1.6.0), [1.7.0](https://github.com/apache/incubator-mxnet/tree/1.7.0)
+* [Apache\* MXNet](https://mxnet.apache.org), including [1.6.0](https://github.com/apache/incubator-mxnet/tree/1.6.0), [1.7.0](https://github.com/apache/incubator-mxnet/tree/1.7.0), [1.8.0](https://github.com/apache/incubator-mxnet/tree/1.8.0)
 * [ONNX\* Runtime](https://github.com/microsoft/onnxruntime), including [1.6.0](https://github.com/microsoft/onnxruntime/tree/v1.6.0), [1.7.0](https://github.com/microsoft/onnxruntime/tree/v1.7.0), [1.8.0](https://github.com/microsoft/onnxruntime/tree/v1.8.0)

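Both TensorFlow notes amount to exporting a single environment variable before TensorFlow is loaded. A minimal sketch (the flag values come from the notes above; the script structure is illustrative — pick the variable that matches your TensorFlow build):

```python
import os

# These flags are read when TensorFlow is first loaded, so they must be
# set at the very top of the quantization script, before any TF import.
os.environ["TF_ENABLE_MKL_NATIVE_FORMAT"] = "0"  # Intel Optimized TensorFlow 2.5.0
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"        # official TensorFlow 2.6.0

# import tensorflow as tf  # import TensorFlow only after the flags are set
```

The same effect can be had with `export TF_ENABLE_ONEDNN_OPTS=1` in the shell before launching Python.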
@@ -152,7 +154,7 @@ python setup.py install
 **Deep Dive**

-* [Quantization](docs/Quantization.md) are processes that enable inference and training by performing computations at low-precision data types, such as fixed-point integers. LPOT supports Post-Training Quantization ([PTQ](docs/PTQ.md)) and Quantization-Aware Training ([QAT](docs/QAT.md)). Note that ([Dynamic Quantization](docs/dynamic_quantization.md)) currently has limited support.
+* [Quantization](docs/Quantization.md) is a process that enables inference and training by performing computations at low-precision data types, such as fixed-point integers. LPOT supports Post-Training Quantization ([PTQ](docs/PTQ.md)) with [different quantization capabilities](docs/backend_quant.md) and Quantization-Aware Training ([QAT](docs/QAT.md)). Note that [Dynamic Quantization](docs/dynamic_quantization.md) currently has limited support.
 * [Pruning](docs/pruning.md) provides a common method for introducing sparsity in weights and activations.
 * [Benchmarking](docs/benchmark.md) introduces how to utilize the benchmark interface of LPOT.
 * [Mixed precision](docs/mixed_precision.md) introduces how to enable mixed precision, including BF16, int8, and FP32, on Intel platforms during tuning.
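The PTQ flow listed above is driven by a user-supplied yaml configuration. A representative sketch, hedged: the field layout follows the conf.yaml templates shipped in LPOT's examples, but the model name, sampling size, and accuracy criterion here are illustrative values, not defaults:

```yaml
model:                           # the model to quantize
  name: resnet50_v1_5            # illustrative name
  framework: tensorflow          # one of the supported frameworks above

quantization:
  calibration:
    sampling_size: 100           # number of calibration samples (assumed value)

tuning:
  accuracy_criterion:
    relative: 0.01               # tolerate at most 1% relative accuracy drop
  exit_policy:
    timeout: 0                   # 0 = run until the criterion is met
```

The tuning section is what lets LPOT trade quantization aggressiveness against the stated accuracy budget.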
@@ -198,10 +200,13 @@ Intel® Low Precision Optimization Tool requires installing the pertinent Intel-
 </thead>
 <tbody>
 <tr>
-<td class="tg-nrix" rowspan="18">Cascade Lake<br><br>Cooper Lake<br><br>Skylake<br><br>Ice Lake</td>
-<td class="tg-nrix" rowspan="18">CentOS 8.3<br><br>Ubuntu 18.04</td>
-<td class="tg-nrix" rowspan="18">3.6<br><br>3.7<br><br>3.8<br><br>3.9</td>
-<td class="tg-cly1" rowspan="9">TensorFlow</td>
+<td class="tg-nrix" rowspan="20">Cascade Lake<br><br>Cooper Lake<br><br>Skylake<br><br>Ice Lake</td>
+<td class="tg-nrix" rowspan="20">CentOS 8.3<br><br>Ubuntu 18.04</td>
+<td class="tg-nrix" rowspan="20">3.6<br><br>3.7<br><br>3.8<br><br>3.9</td>
+<td class="tg-cly1" rowspan="10">TensorFlow</td>
+<td class="tg-7zrl">2.6.0</td>
+</tr>
+<tr>
 <td class="tg-7zrl">2.5.0</td>
 </tr>
 <tr>
@@ -243,6 +248,9 @@ Intel® Low Precision Optimization Tool requires installing the pertinent Intel-
 </tr>
 <tr>
 <td class="tg-cly1" rowspan="2">MXNet</td>
+<td class="tg-7zrl">1.8.0</td>
+</tr>
+<tr>
 <td class="tg-7zrl">1.7.0</td>
 </tr>
 <tr>
@@ -265,6 +273,79 @@ Intel® Low Precision Optimization Tool requires installing the pertinent Intel-
 
 Intel® Low Precision Optimization Tool provides numerous examples that demonstrate minimal accuracy loss with the best performance gain. A full quantized model list on various frameworks is available in the [Model List](docs/full_model_list.md).
 
+#### Validated MLPerf Models
+
+<table>
+<thead>
+<tr>
+<th>Model</th>
+<th>Framework</th>
+<th>Support</th>
+<th>Example</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td rowspan="2">ResNet50 v1.5</td>
+<td>TensorFlow</td>
+<td>Yes</td>
+<td><a href="https://github.com/intel/lpot/tree/master/examples/tensorflow/image_recognition">Link</a></td>
+</tr>
+<tr>
+<td>PyTorch</td>
+<td>Yes</td>
+<td><a href="https://github.com/intel/lpot/tree/master/examples/pytorch/ipex/image_recognition/imagenet/cpu/ptq">Link</a></td>
+</tr>
+<tr>
+<td>DLRM</td>
+<td>PyTorch</td>
+<td>Yes</td>
+<td><a href="https://github.com/intel/lpot/tree/master/examples/pytorch/fx/recommendation">Link</a></td>
+</tr>
+<tr>
+<td rowspan="2">BERT-large</td>
+<td>TensorFlow</td>
+<td>Yes</td>
+<td><a href="https://github.com/intel/lpot/tree/master/examples/tensorflow/nlp/bert_large_squad">Link</a></td>
+</tr>
+<tr>
+<td>PyTorch</td>
+<td>Yes</td>
+<td><a href="https://github.com/intel/lpot/tree/master/examples/pytorch/eager/language_translation/ptq">Link</a></td>
+</tr>
+<tr>
+<td rowspan="2">SSD-ResNet34</td>
+<td>TensorFlow</td>
+<td>WIP</td>
+<td></td>
+</tr>
+<tr>
+<td>PyTorch</td>
+<td>Yes</td>
+<td><a href="https://github.com/intel/lpot/tree/master/examples/pytorch/fx/object_detection/ssd_resnet34/ptq">Link</a></td>
+</tr>
+<tr>
+<td>RNN-T</td>
+<td>PyTorch</td>
+<td>WIP</td>
+<td></td>
+</tr>
+<tr>
+<td rowspan="2">3D-UNet</td>
+<td>TensorFlow</td>
+<td>WIP</td>
+<td></td>
+</tr>
+<tr>
+<td>PyTorch</td>
+<td>Yes</td>
+<td><a href="https://github.com/intel/lpot/tree/master/examples/pytorch/eager/medical_imaging/3d-unet">Link</a></td>
+</tr>
+</tbody>
+</table>
+
+#### Validated Quantized Models
+
 <table class="docutils">
 <thead>
 <tr>
@@ -531,6 +612,181 @@ Intel® Low Precision Optimization Tool provides numerous examples to show promi
 </tbody>
 </table>
 
+#### Validated Pruning Models
+
+<table>
+<thead>
+<tr>
+<th rowspan="2">Tasks</th>
+<th rowspan="2">FWK</th>
+<th rowspan="2">Model</th>
+<th rowspan="2">fp32 baseline</th>
+<th colspan="3">gradient sensitivity with 20% sparsity</th>
+<th colspan="3">+onnx dynamic quantization on pruned model</th>
+</tr>
+<tr>
+<td>accuracy%</td>
+<td>drop%</td>
+<td>perf gain (ratio)</td>
+<td>accuracy%</td>
+<td>drop%</td>
+<td>perf gain (ratio)</td>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>SST-2</td>
+<td>PyTorch</td>
+<td>bert-base</td>
+<td>accuracy = 92.32</td>
+<td>accuracy = 91.97</td>
+<td>-0.38</td>
+<td>1.30x</td>
+<td>accuracy = 92.20</td>
+<td>-0.13</td>
+<td>1.86x</td>
+</tr>
+<tr>
+<td>QQP</td>
+<td>PyTorch</td>
+<td>bert-base</td>
+<td>[accuracy, f1] = [91.10, 88.05]</td>
+<td>[accuracy, f1] = [89.97, 86.54]</td>
+<td>[-1.24, -1.71]</td>
+<td>1.32x</td>
+<td>[accuracy, f1] = [89.75, 86.60]</td>
+<td>[-1.48, -1.65]</td>
+<td>1.81x</td>
+</tr>
+</tbody>
+</table>
+
+<table>
+<thead>
+<tr>
+<th rowspan="2">Tasks</th>
+<th rowspan="2">FWK</th>
+<th rowspan="2">Model</th>
+<th rowspan="2">fp32 baseline</th>
+<th colspan="2">Pattern Lock on 70% Unstructured Sparsity</th>
+<th colspan="2">Pattern Lock on 50% 1:2 Structured Sparsity</th>
+</tr>
+<tr>
+<td>accuracy%</td>
+<td>drop%</td>
+<td>accuracy%</td>
+<td>drop%</td>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>MNLI</td>
+<td>PyTorch</td>
+<td>bert-base</td>
+<td>[m, mm] = [84.57, 84.79]</td>
+<td>[m, mm] = [82.45, 83.27]</td>
+<td>[-2.51, -1.80]</td>
+<td>[m, mm] = [83.20, 84.11]</td>
+<td>[-1.62, -0.80]</td>
+</tr>
+<tr>
+<td>SST-2</td>
+<td>PyTorch</td>
+<td>bert-base</td>
+<td>accuracy = 92.32</td>
+<td>accuracy = 91.51</td>
+<td>-0.88</td>
+<td>accuracy = 92.20</td>
+<td>-0.13</td>
+</tr>
+<tr>
+<td>QQP</td>
+<td>PyTorch</td>
+<td>bert-base</td>
+<td>[accuracy, f1] = [91.10, 88.05]</td>
+<td>[accuracy, f1] = [90.48, 87.06]</td>
+<td>[-0.68, -1.12]</td>
+<td>[accuracy, f1] = [90.92, 87.78]</td>
+<td>[-0.20, -0.31]</td>
+</tr>
+<tr>
+<td>QNLI</td>
+<td>PyTorch</td>
+<td>bert-base</td>
+<td>accuracy = 91.54</td>
+<td>accuracy = 90.39</td>
+<td>-1.26</td>
+<td>accuracy = 90.87</td>
+<td>-0.73</td>
+</tr>
+<tr>
+<td>QnA</td>
+<td>PyTorch</td>
+<td>bert-base</td>
+<td>[em, f1] = [79.34, 87.10]</td>
+<td>[em, f1] = [77.27, 85.75]</td>
+<td>[-2.61, -1.54]</td>
+<td>[em, f1] = [78.03, 86.50]</td>
+<td>[-1.65, -0.69]</td>
+</tr>
+</tbody>
+</table>
+
+<table>
+<thead>
+<tr>
+<th>Framework</th>
+<th>Model</th>
+<th>fp32 baseline</th>
+<th>Compression</th>
+<th>dataset</th>
+<th>acc(drop)%</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>PyTorch</td>
+<td>resnet18</td>
+<td>69.76</td>
+<td>30% sparsity on magnitude</td>
+<td>ImageNet</td>
+<td>69.47(-0.42)</td>
+</tr>
+<tr>
+<td>PyTorch</td>
+<td>resnet18</td>
+<td>69.76</td>
+<td>30% sparsity on gradient sensitivity</td>
+<td>ImageNet</td>
+<td>68.85(-1.30)</td>
+</tr>
+<tr>
+<td>PyTorch</td>
+<td>resnet50</td>
+<td>76.13</td>
+<td>30% sparsity on magnitude</td>
+<td>ImageNet</td>
+<td>76.11(-0.03)</td>
+</tr>
+<tr>
+<td>PyTorch</td>
+<td>resnet50</td>
+<td>76.13</td>
+<td>30% sparsity on magnitude and post training quantization</td>
+<td>ImageNet</td>
+<td>76.01(-0.16)</td>
+</tr>
+<tr>
+<td>PyTorch</td>
+<td>resnet50</td>
+<td>76.13</td>
+<td>30% sparsity on magnitude and quantization aware training</td>
+<td>ImageNet</td>
+<td>75.90(-0.30)</td>
+</tr>
+</tbody>
+</table>
+
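The "30% sparsity on magnitude" rows in the tables above refer to unstructured magnitude pruning: the smallest-magnitude weights are zeroed until the target sparsity is reached. A minimal sketch of the idea — not LPOT's implementation; the helper name and values are illustrative:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(len(weights) * sparsity)  # number of weights to zero
    if k == 0:
        return list(weights)
    # Threshold is the k-th smallest absolute value; ties at the threshold
    # may prune slightly more than k weights.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

pruned = magnitude_prune([0.1, -0.5, 0.05, 2.0, -0.01, 0.3, 0.7, -0.2, 0.9, 1.1], 0.3)
# the three smallest-magnitude weights (0.1, 0.05, -0.01) are now zero
```

Gradient-sensitivity pruning, also shown in the tables, ranks weights by an importance estimate from gradients instead of raw magnitude, but the zero-out step is the same.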
## Additional Content
* [Release Information](releases_info.md)

docs/backend_quant.md

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ This document provides a quantization support matrix for the following framework
 + Symmetric Quantization
   + int8: scale = 2 * max(abs(rmin), abs(rmax)) / (max(int8) - min(int8) - 1)
 + Asymmetric Quantization
-  + uint8: scale = (rmax - rmin) / (max(uint8) - min(uint8)); zero_point = min(uint8) - round(rmin / scale)
+  + uint8: scale = (rmax - rmin) / (max(uint8) - min(uint8)); zero_point = min(uint8) - round(rmin / scale)

 ### Reference
 + oneDNN: [Lower Numerical Precision Deep Learning Inference and Training](https://software.intel.com/content/www/us/en/develop/articles/lower-numerical-precision-deep-learning-inference-and-training.html)
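The scale and zero-point formulas in this hunk can be checked numerically. A small sketch (function names are illustrative; the arithmetic follows the formulas above, with int8 in [-128, 127] and uint8 in [0, 255]):

```python
def symmetric_int8_scale(rmin, rmax):
    # scale = 2 * max(abs(rmin), abs(rmax)) / (max(int8) - min(int8) - 1)
    qmax, qmin = 127, -128  # int8 range
    return 2.0 * max(abs(rmin), abs(rmax)) / (qmax - qmin - 1)

def asymmetric_uint8_params(rmin, rmax):
    # scale = (rmax - rmin) / (max(uint8) - min(uint8))
    # zero_point = min(uint8) - round(rmin / scale)
    qmax, qmin = 255, 0  # uint8 range
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = qmin - round(rmin / scale)
    return scale, zero_point

# A tensor spanning [-1.0, 1.5] gets scale = 2.5/255 and zero_point = 102:
scale, zp = asymmetric_uint8_params(-1.0, 1.5)
```

The symmetric variant keeps zero_point at 0, which is why only a scale is computed for int8.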
