Introduction

Quantization is a technique for performing computations and storing tensors at bit widths lower than floating-point precision. A quantized model performs some or all of its tensor operations with integers rather than floating-point values. horizon_plugin_pytorch supports INT8 quantization, which, compared with a typical FP32 model, reduces model size by 4x and memory bandwidth requirements by 4x. Hardware INT8 computation is typically 2 to 4 times faster than FP32 computation. Quantization is primarily a technique for accelerating inference, and quantized operations support forward computation only.
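
To make the 4x size reduction concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is not the horizon_plugin_pytorch API: the helper names `quantize_int8` and `dequantize_int8` and the max-abs scale choice are assumptions made purely for illustration.

```python
from array import array


def quantize_int8(values):
    """Map floats to INT8 codes with one symmetric per-tensor scale (illustrative only)."""
    max_abs = max(abs(v) for v in values)
    # Assumed scale choice: the largest magnitude maps to code 127.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = array("b", (max(-128, min(127, round(v / scale))) for v in values))
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from the INT8 codes."""
    return [c * scale for c in codes]


# "f" stores 4-byte floats, "b" stores 1-byte signed integers.
weights_fp32 = array("f", [0.5, -1.27, 0.031, 1.0])
weights_int8, scale = quantize_int8(weights_fp32)

fp32_bytes = weights_fp32.itemsize * len(weights_fp32)
int8_bytes = weights_int8.itemsize * len(weights_int8)
restored = dequantize_int8(weights_int8, scale)
```

Comparing `fp32_bytes` with `int8_bytes` shows the storage ratio, and comparing `restored` with `weights_fp32` shows the rounding error that quantization introduces.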

horizon_plugin_pytorch provides BPU-adapted quantization operations and supports quantization-aware training (QAT). QAT uses fake-quantization modules to model quantization errors during forward computation and backpropagation. Note that all QAT computation is performed with floating-point operations. After QAT finishes, horizon_plugin_pytorch provides conversion functions that turn the trained model into a fixed-point model, which has a more compact representation and enables high-performance vectorized execution on the BPU.
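
The fake-quantization idea can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the tool's implementation: `fake_quantize` is a hypothetical helper, the scale is fixed by hand rather than learned or calibrated, and the backward pass is omitted.

```python
def fake_quantize(x, scale, qmin=-128, qmax=127):
    """Quantize, then immediately dequantize, so the result stays a float."""
    # Integer code that a fixed-point model would store, clamped to the INT8 range.
    q = max(qmin, min(qmax, round(x / scale)))
    # Float value that carries the rounding and clamping error.
    return q * scale


scale = 0.05  # hypothetical scale, chosen by hand for this example
activations = [0.12, -0.49, 3.3, 10.0]
simulated = [fake_quantize(a, scale) for a in activations]
errors = [abs(a - s) for a, s in zip(activations, simulated)]
```

Values inside the representable range differ from their inputs by at most half a scale step, while the out-of-range input `10.0` saturates at `qmax * scale`. Every result is still a float, which is why training can proceed with ordinary floating-point operations.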

This section gives a detailed introduction to horizon_plugin_pytorch, a quantization training tool built on PyTorch.

horizon_plugin_pytorch is built on PyTorch. To reduce the learning cost, its design follows PyTorch's approach to quantization-aware training. This document does not repeat what the PyTorch documentation already covers; to understand the details of the tool, we recommend reading the official source code or the Python source code of this tool. For a smooth experience, we recommend first reading the PyTorch documentation and becoming familiar with the quantization-aware training and deployment tools that PyTorch provides.

For brevity, the code in this documentation uses the following alias by default:

import horizon_plugin_pytorch as horizon