Simulate the quantize and dequantize operations at training time.
The output of this module is given by
fake_quant_x = clamp(floor(x / scale + 0.5), quant_min, quant_max) * scale
scale defines the scale factor used for quantization.
zero_point specifies the quantized value to which 0 in floating
point maps.
quant_min specifies the minimum allowable quantized value.
quant_max specifies the maximum allowable quantized value.
fake_quant_enabled controls the application of fake quantization
on tensors. Note that statistics can still be updated while it is disabled.
observer_enabled controls statistics collection on tensors.
dtype specifies the quantized dtype that is being emulated with
fake quantization. The allowable values are qint8 and qint16. The values
of quant_min and quant_max should be chosen to be consistent with the
dtype.
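As an illustration, the formula above can be sketched in plain Python for a single value. This is a minimal sketch, not the module's implementation: the function name fake_quantize and the qint8-style defaults of -128 and 127 are assumptions, and zero_point is left out because the formula above does not include it.

```python
import math


def fake_quantize(x, scale, quant_min=-128, quant_max=127):
    """Quantize then dequantize one float, following the formula above.

    Hypothetical helper: the defaults assume a signed 8-bit range.
    """
    q = math.floor(x / scale + 0.5)         # round half up to an integer level
    q = max(quant_min, min(quant_max, q))   # clamp to the allowed range
    return q * scale                        # map back to floating point


# In-range values snap to the nearest multiple of scale;
# out-of-range values saturate at quant_min * scale or quant_max * scale.
for value in (-100.0, 0.3, 100.0):
    print(value, fake_quantize(value, scale=0.5))
```

Because the result is a float again, downstream layers see the rounding and saturation error of quantization while still operating in floating point.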
Parameters:
User-provided module that collects statistics on the input tensor and provides a method to calculate scale and zero-point.
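To make the role of such a module concrete, here is a minimal sketch of an observer that tracks the running minimum and maximum and derives scale and zero-point from them. The class name MinMaxObserver, the method names observe and calculate_qparams, and the min/max strategy are all assumptions for illustration; the actual observer interface may differ.

```python
class MinMaxObserver:
    """Hypothetical observer: records min/max and derives scale and zero_point."""

    def __init__(self, quant_min=-128, quant_max=127):
        self.quant_min = quant_min
        self.quant_max = quant_max
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, values):
        # Update the running statistics from one batch of floats.
        self.min_val = min(self.min_val, min(values))
        self.max_val = max(self.max_val, max(values))

    def calculate_qparams(self):
        # Widen the range to include 0 so that 0.0 maps to an exact level.
        lo = min(self.min_val, 0.0)
        hi = max(self.max_val, 0.0)
        scale = (hi - lo) / (self.quant_max - self.quant_min)
        if scale == 0.0:
            scale = 1.0  # degenerate range: fall back to a neutral scale
        zero_point = round(self.quant_min - lo / scale)
        zero_point = max(self.quant_min, min(self.quant_max, zero_point))
        return scale, zero_point


observer = MinMaxObserver(quant_min=0, quant_max=255)
observer.observe([-1.0, 0.5])
observer.observe([2.0])
scale, zero_point = observer.calculate_qparams()
print(scale, zero_point)
```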
Set the extra representation of the module.
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
Defines the computation performed at every call.
Should be overridden by all subclasses.
Although the recipe for the forward pass needs to be defined within
this function, one should call the Module instance afterwards
instead of calling this function directly, since calling the instance
runs the registered hooks while calling this function silently ignores them.
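The difference between calling the instance and calling forward directly can be shown with a small stand-in class. TinyModule below is hypothetical and only mimics the hook mechanism described above; it is not the framework's Module class.

```python
class TinyModule:
    """Hypothetical stand-in for a Module that supports forward hooks."""

    def __init__(self):
        self._forward_hooks = []

    def register_forward_hook(self, hook):
        self._forward_hooks.append(hook)

    def forward(self, x):
        # The actual computation lives here.
        return x * 2

    def __call__(self, x):
        # Calling the instance runs forward and then every registered hook.
        out = self.forward(x)
        for hook in self._forward_hooks:
            hook(self, x, out)
        return out


calls = []
module = TinyModule()
module.register_forward_hook(lambda mod, inp, out: calls.append((inp, out)))

module(3)          # goes through __call__, so the hook fires
module.forward(4)  # bypasses __call__, so the hook is skipped
print(calls)       # only the module(3) call was recorded
```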
Set the quantization parameters (qparams). Symmetric quantization is the default.
Wrapper that allows creation of class factories.
This can be useful when there is a need to create classes with the same constructor arguments, but different instances. It can be used in conjunction with _callable_args.
Example:
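A minimal sketch of such a factory wrapper, built on functools.partial. The names PartialBuilder, with_args, and Foo are hypothetical and chosen for illustration; the library's own wrapper may be structured differently.

```python
import functools


class PartialBuilder:
    """Hypothetical wrapper that stores constructor arguments for later use."""

    def __init__(self, partial):
        self.partial = partial

    def __call__(self, *args, **kwargs):
        # Each call constructs a fresh instance with the stored arguments.
        return self.partial(*args, **kwargs)

    def with_args(self, **kwargs):
        # Chain additional keyword arguments onto the existing ones.
        return PartialBuilder(functools.partial(self.partial, **kwargs))


def with_args(cls, **kwargs):
    """Return a builder that creates cls instances with the given kwargs."""
    return PartialBuilder(functools.partial(cls, **kwargs))


class Foo:
    def __init__(self, a, b, answer=None):
        self.a = a
        self.b = b
        self.answer = answer


foo_builder = with_args(Foo, a=3, b=4).with_args(answer=42)
foo_instance1 = foo_builder()
foo_instance2 = foo_builder()
print(foo_instance1 is foo_instance2)  # False: same arguments, distinct objects
```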