diff --git a/CHANGELOG.md b/CHANGELOG.md
index 252d4483..21b61710 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1 +1,7 @@
-# Changes
\ No newline at end of file
+# Changes
+
+## [Master]
+
+### Added
+
+- Added reduced precision documentation page
\ No newline at end of file
diff --git a/README.md b/README.md
index 4c5d0fc6..7f73c56a 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,7 @@
# torch2trt
+
+
torch2trt is a PyTorch to TensorRT converter which utilizes the
TensorRT Python API. The converter is
diff --git a/docs/images/check.svg b/docs/images/check.svg
new file mode 100644
index 00000000..cf59f02c
--- /dev/null
+++ b/docs/images/check.svg
@@ -0,0 +1 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="16" height="16"><path fill="#4caf50" d="M9 16.17 4.83 12l-1.42 1.41L9 19 21 7l-1.41-1.41z"/></svg>
\ No newline at end of file
diff --git a/docs/usage/reduced_precision.md b/docs/usage/reduced_precision.md
new file mode 100644
index 00000000..02b4bd35
--- /dev/null
+++ b/docs/usage/reduced_precision.md
@@ -0,0 +1,152 @@
+# Reduced Precision
+
+For certain platforms, reduced precision can result in substantial improvements in throughput,
+often with little impact on model accuracy.
+
+## Support Matrix
+
+Below is a table of layer precision support for various NVIDIA platforms.
+
+| Platform | FP16 | INT8 |
+|----------|------|------|
+| Jetson Nano | ![](../images/check.svg) |  |
+| Jetson TX2 | ![](../images/check.svg) |  |
+| Jetson Xavier NX | ![](../images/check.svg) | ![](../images/check.svg) |
+| Jetson AGX Xavier | ![](../images/check.svg) | ![](../images/check.svg) |
+
+!!! note
+
+    If the platform you're using is missing from this table, or if you spot anything incorrect,
+    please [let us know](https://github.com/NVIDIA-AI-IOT/torch2trt).
+
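+You can also query TensorRT directly to check for reduced precision support on your platform.
+Below is a minimal sketch, assuming a working TensorRT Python installation; the builder exposes
+flags that report whether the device has fast fp16 and int8 kernels.
+
+```python
+import tensorrt as trt
+
+logger = trt.Logger(trt.Logger.WARNING)
+builder = trt.Builder(logger)
+
+print(builder.platform_has_fast_fp16)  # True if the device supports fast fp16 kernels
+print(builder.platform_has_fast_int8)  # True if the device supports fast int8 kernels
+```
+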
+## FP16 Precision
+
+To enable fp16 precision with TensorRT, torch2trt exposes the ``fp16_mode`` parameter.
+Converting a model with ``fp16_mode=True`` allows the TensorRT optimizer to run individual layers in fp16
+precision.
+
+
+```python
+model_trt = torch2trt(model, [data], fp16_mode=True)
+```
+
+!!! note
+
+    Setting ``fp16_mode=True`` does not guarantee that TensorRT will select fp16 layers.
+    The optimizer attempts to automatically select the tactics which result in the best overall performance.
+
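+Because tactic selection is automatic, it is worth verifying accuracy after conversion. As a quick
+sketch, reusing ``model``, ``data``, and ``model_trt`` from above, you can compare the converted
+model's output against PyTorch.
+
+```python
+import torch
+
+output = model(data)
+output_trt = model_trt(data)
+
+# maximum absolute difference between the PyTorch and TensorRT outputs
+print(torch.max(torch.abs(output - output_trt)))
+```
+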
+## INT8 Precision
+
+torch2trt also supports int8 precision with TensorRT through the ``int8_mode`` parameter. Unlike fp16 and fp32 precision, switching
+to int8 precision often requires calibration to avoid a significant drop in accuracy.
+
+### Input Data Calibration
+
+By default,
+torch2trt will calibrate using the input data provided. For example, to
+calibrate on a set of 64 random normal images you could do the following.
+
+```python
+data = torch.randn(64, 3, 224, 224).cuda()
+
+model_trt = torch2trt(model, [data], int8_mode=True)
+```
+
+### Dataset Calibration
+
+In many instances, you may want to calibrate on more data than fits in memory. For this reason,
+torch2trt exposes the ``int8_calib_dataset`` parameter. This parameter takes a
+dataset that is used for calibration. If this parameter is specified, the input data is
+ignored during calibration. You create a calibration dataset by defining
+a class which implements the ``__len__`` and ``__getitem__`` methods.
+
+* The ``__len__`` method should return the number of calibration samples
+* The ``__getitem__`` method must return a single calibration sample: a list of input tensors to the model. Each tensor should match the shape
+you provide to the ``inputs`` parameter when calling ``torch2trt``, as in the sketches below.
+
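+As a bare-bones sketch of this interface, a hypothetical dataset that serves random tensors
+(useful only for testing the calibration plumbing, not for meaningful calibration) might look like this.
+
+```python
+import torch
+
+
+class RandomCalibDataset():
+
+    def __init__(self, length=64, shape=(1, 3, 224, 224)):
+        self.length = length
+        self.shape = shape
+
+    def __len__(self):
+        # number of calibration samples
+        return self.length
+
+    def __getitem__(self, idx):
+        # one sample: a list with one tensor per model input
+        return [torch.randn(self.shape).cuda()]
+```
+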
+For example, say you trained an image classification network using the PyTorch [``ImageFolder``](https://pytorch.org/docs/stable/torchvision/datasets.html#imagefolder) dataset.
+You could wrap this dataset for calibration by defining a new dataset which returns only the images, without labels, in list format.
+
+```python
+from torchvision.datasets import ImageFolder
+from torchvision.transforms import Resize, ToTensor, Compose, Normalize
+
+
+class ImageFolderCalibDataset():
+
+    def __init__(self, root):
+        self.dataset = ImageFolder(
+            root=root,
+            transform=Compose([
+                Resize((224, 224)),
+                ToTensor(),
+                Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
+            ])
+        )
+
+    def __len__(self):
+        return len(self.dataset)
+
+    def __getitem__(self, idx):
+        image, _ = self.dataset[idx]
+        image = image[None, ...]  # add batch dimension
+        return [image]
+```
+
+You would then provide this calibration dataset to torch2trt as follows.
+
+```python
+dataset = ImageFolderCalibDataset('images')
+
+model_trt = torch2trt(model, [data], int8_mode=True, int8_calib_dataset=dataset)
+```
+
+### Calibration Algorithm
+
+To override the default calibration algorithm that torch2trt uses, you can set the ``int8_calib_algorithm``
+parameter to the [``tensorrt.CalibrationAlgoType``](https://docs.nvidia.com/deeplearning/tensorrt/api/python_api/infer/Int8/Calibrator.html#iint8calibrator)
+that you wish to use. For example, to use the minmax calibration algorithm you would do the following.
+
+```python
+import tensorrt as trt
+
+model_trt = torch2trt(model, [data], int8_mode=True, int8_calib_algorithm=trt.CalibrationAlgoType.MINMAX_CALIBRATION)
+```
+
+### Calibration Batch Size
+
+During calibration, torch2trt pulls data in batches for the TensorRT calibrator. In some instances
+[developers have found](https://github.com/NVIDIA-AI-IOT/torch2trt/pull/398) that the calibration batch size can impact the accuracy of the calibrated model. To control the batch size, use the ``int8_calib_batch_size``
+parameter. For example, to calibrate with a batch size of 32 you could do the following.
+
+```python
+model_trt = torch2trt(model, [data], int8_mode=True, int8_calib_batch_size=32)
+```
+
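+The int8 options above can be combined. As a sketch, assuming the ``ImageFolderCalibDataset``
+defined earlier and an ``images`` directory of calibration images, a full int8 conversion might
+look like this.
+
+```python
+import tensorrt as trt
+
+dataset = ImageFolderCalibDataset('images')
+
+model_trt = torch2trt(
+    model,
+    [data],
+    int8_mode=True,
+    int8_calib_dataset=dataset,
+    # ENTROPY_CALIBRATION_2 is one of the available TensorRT algorithms
+    int8_calib_algorithm=trt.CalibrationAlgoType.ENTROPY_CALIBRATION_2,
+    int8_calib_batch_size=32
+)
+```
+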
+## Binding Data Types
+
+The data types of the input and output bindings in TensorRT are determined by the original
+PyTorch module input and output data types.
+This does not directly impact whether the TensorRT optimizer will internally use fp16 or int8 precision.
+
+For example, to create a model with full precision (fp32) bindings, you would do the following
+
+```python
+model = model.float()
+data = data.float()
+
+model_trt = torch2trt(model, [data], fp16_mode=True)
+```
+
+In this instance, the optimizer may choose to use fp16 precision layers internally, but the
+input and output data types are fp32. To use fp16 precision input and output bindings, you would do the following
+
+```python
+model = model.half()
+data = data.half()
+
+model_trt = torch2trt(model, [data], fp16_mode=True)
+```
+
+Now, the input and output bindings of the model are half precision, and internally the optimizer may
+choose to select fp16 layers as well.
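+
+As a quick sanity check (a sketch reusing ``model_trt`` and ``data`` from above), you can confirm
+the binding data type by inspecting the output tensor.
+
+```python
+output_trt = model_trt(data)
+
+print(output_trt.dtype)  # expect torch.float16 for half precision output bindings
+```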
\ No newline at end of file
diff --git a/mkdocs.yml b/mkdocs.yml
index 6c49574c..fcc8c127 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -35,6 +35,7 @@ nav:
- Getting Started: getting_started.md
- Usage:
- Basic Usage: usage/basic_usage.md
+ - Reduced Precision: usage/reduced_precision.md
- Custom Converter: usage/custom_converter.md
- Converters: converters.md
- Benchmarks: