As with PyTorch, we recommend fx as your preferred quantization mode. Quantization in eager mode is still supported by horizon_plugin_pytorch. The overall flow of eager mode follows PyTorch's own quantization interface and concepts, so we recommend reading the eager mode section of the official PyTorch documentation first.
The main difference between eager mode and fx mode in horizon_plugin_pytorch is that, in eager mode, you must manually replace floating-point operators that need to be quantized with the corresponding operators provided by the plugin, for example:
| Original floating-point operator | Replacement operator |
|---|---|
| torch.nn.functional.relu | torch.nn.ReLU() |
| a + b / torch.add | horizon.nn.quantized.FloatFunctional().add |
| Tensor.exp | horizon.nn.Exp() |
| torch.nn.functional.interpolate | horizon.nn.Interpolate() |
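The replacement pattern can be sketched as follows. Since horizon_plugin_pytorch may not be installed, this sketch uses PyTorch's own `torch.nn.quantized.FloatFunctional`, which exposes the same `add` interface as the plugin's `horizon.nn.quantized.FloatFunctional`; the module names here are illustrative, not from the plugin's API.

```python
import torch
import torch.nn as nn
# Stand-in for horizon.nn.quantized.FloatFunctional, which has the same interface.
from torch.nn.quantized import FloatFunctional


class EagerReadyBlock(nn.Module):
    """A block rewritten for eager-mode quantization:
    - torch.nn.functional.relu  -> an nn.ReLU() module instance
    - `out + x` / torch.add     -> FloatFunctional().add
    """

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=1)
        self.relu = nn.ReLU()              # instead of F.relu in forward
        self.skip_add = FloatFunctional()  # instead of the `+` operator

    def forward(self, x):
        out = self.relu(self.conv(x))
        return self.skip_add.add(out, x)   # instead of `out + x`


m = EagerReadyBlock()
y = m(torch.randn(1, 3, 8, 8))
```

Using module instances rather than functionals or Python operators is what allows the quantization machinery to attach observers and swap in quantized implementations per call site.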
In eager mode you must also fuse operators manually, by calling the fuser_func provided in horizon_plugin_pytorch. The other steps are the same as in FX mode; please refer to the "Quick Start" chapter.
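The exact signature of the plugin's fuser_func is not shown here, so the sketch below demonstrates the same manual-fusion step with stock PyTorch's `torch.quantization.fuse_modules`, which takes a model and lists of submodule names to fold together; treat it as an analogue, not the plugin's API.

```python
import torch
import torch.nn as nn


class ConvReLU(nn.Module):
    # A conv followed by relu: a classic fusable pair in eager mode.
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))


m = ConvReLU().eval()  # fuse_modules expects eval mode for inference fusion
# Fold conv+relu into a single fused module; the plugin's fuser_func
# plays the same role in the horizon eager-mode flow.
fused = torch.quantization.fuse_modules(m, [["conv", "relu"]])
```

After fusion, the `relu` attribute becomes an `nn.Identity` and the conv carries the fused implementation, so the fused model computes the same result with fewer distinct quantization boundaries.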