Writing Conventions for set_qconfig and QConfig Customization
Writing Conventions for set_qconfig
When defining a model to be quantized, the model must implement a set_qconfig method to configure how it is quantized.
The current QConfig interface is provided by hat.utils.qconfig_manager. Call it inside set_qconfig to set the QConfig of each module, e.g.:
    # Note: this sample code illustrates the rules for implementing the
    # set_qconfig method; it is not a complete quantized model.
    import torch.nn as nn


    class Head(nn.Module):
        def __init__(self):
            super(Head, self).__init__()
            self.out_conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=1)

        def forward(self):
            ...

        def set_qconfig(self):
            # If the final output layer of the network is a conv, you can give
            # it the out qconfig separately to get a more accurate output.
            from hat.utils import qconfig_manager

            self.out_conv.qconfig = qconfig_manager.get_default_qat_out_qconfig()


    class Backbone(nn.Module):
        def __init__(self):
            super(Backbone, self).__init__()
            self.conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=1)

        def forward(self):
            ...

        # When the backbone has no special layers and no layer needs
        # qconfig = None, i.e. the default QAT qconfig is enough,
        # the set_qconfig() method can be omitted.
        # def set_qconfig(self):


    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.backbone = Backbone()
            self.head = Head()
            self.loss = nn.CrossEntropyLoss()

        def forward(self):
            ...

        # The parent module must implement the set_qconfig method.
        def set_qconfig(self):
            from hat.utils import qconfig_manager

            # 1. First specify the qconfig of the parent module. If a child
            #    module has no qconfig of its own, it automatically inherits
            #    the parent's qconfig.
            self.qconfig = qconfig_manager.get_default_qat_qconfig()
            # 2. If a submodule contains special layers and implements its
            #    own set_qconfig method, call it.
            if self.head is not None:
                if hasattr(self.head, "set_qconfig"):
                    self.head.set_qconfig()
            # 3. If a submodule should not be quantized (e.g. the loss),
            #    set its qconfig to None.
            if self.loss is not None:
                self.loss.qconfig = None
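The inheritance rule in step 1 (a child without its own qconfig uses the parent's) can be sketched framework-free; `Module` and `propagate_qconfig` below are illustrative stand-ins, not HAT or PyTorch APIs:

```python
class Module:
    """Minimal stand-in for nn.Module; for illustration only."""

    def children(self):
        # Any attribute that is itself a Module counts as a child.
        return [v for v in vars(self).values() if isinstance(v, Module)]


def propagate_qconfig(module, parent_qconfig=None):
    # A child without an explicit qconfig inherits the parent's;
    # an explicit qconfig (including None) is kept as-is.
    if not hasattr(module, "qconfig"):
        module.qconfig = parent_qconfig
    for child in module.children():
        propagate_qconfig(child, module.qconfig)


# Mirror the Net example above:
net = Module()
net.backbone, net.head, net.loss = Module(), Module(), Module()
net.qconfig = "default_qat_qconfig"   # step 1: parent qconfig
net.head.qconfig = "qat_out_qconfig"  # step 2: set by head.set_qconfig()
net.loss.qconfig = None               # step 3: exclude the loss

propagate_qconfig(net)
assert net.backbone.qconfig == "default_qat_qconfig"  # inherited
assert net.head.qconfig == "qat_out_qconfig"          # kept
assert net.loss.qconfig is None                       # still excluded
```

This is why only modules that deviate from the parent's qconfig (the output conv, the loss) need explicit handling.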
Customize QAT QConfig Parameters
HAT supports using a custom QConfig in QAT training: simply configure the qconfig_params parameter in the qat_solver section of the config file:
    qat_solver = dict(
        trainer=qat_trainer,
        quantize=True,
        ...
        qconfig_params=dict(
            dtype="qint8",
            activation_fake_quant="fake_quant",
            weight_fake_quant="fake_quant",
            activation_qkwargs=dict(
                averaging_constant=0,
            ),
            weight_qkwargs=dict(
                averaging_constant=1,
            ),
        ),
        ...
    )
qconfig_params has five main configuration items: dtype, activation_fake_quant, weight_fake_quant, activation_qkwargs and weight_qkwargs.
- dtype: the quantization bit type. Supports "qint8" (default).
- activation_fake_quant: the quantizer for activations. Supports "fake_quant" (default), "lsq" and "pact".
- weight_fake_quant: the quantizer for weights. Supports "fake_quant" (default), "lsq" and "pact".
- activation_qkwargs: parameters of the activation quantizer.
  - When activation_fake_quant is "fake_quant", activation_qkwargs can be set as below:

        activation_qkwargs=dict(
            observer=MovingAverageMinMaxObserver,  # Specifies the observer. In general, the default can be used.
            averaging_constant=0.01,  # Sets the scale update factor.
        )

  - When activation_fake_quant is "lsq", activation_qkwargs can be set as below:

        activation_qkwargs=dict(
            observer=MovingAverageMinMaxObserver,  # Specifies the observer. In general, the default can be used.
            scale=1.0,  # Specifies the initial scale. In general, the default can be used.
            zero_point=0.0,  # Specifies the initial zero_point. In general, the default can be used.
            use_grad_scaling=False,  # Whether the gradients of scale and zero_point are normalized
                                     # by a constant. False by default. In general, the default can be used.
        )

  - When activation_fake_quant is "pact", activation_qkwargs can be set as below:

        activation_qkwargs=dict(
            observer=MovingAverageMinMaxObserver,  # Specifies the observer. In general, the default can be used.
            alpha=6.0,  # Specifies the clip parameter of the activation. The default value is 6.0.
                        # In general, the default can be used.
        )

- weight_qkwargs: parameters of the weight quantizer. Except that the default observer for weight_qkwargs is MovingAveragePerChannelMinMaxObserver, the parameters and usage are the same as for activation_qkwargs.
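Putting the pieces together, a qconfig_params fragment that switches both quantizers to LSQ might look like the sketch below; the concrete values (use_grad_scaling=True, the initial scale) are illustrative choices, not recommendations:

```python
# A sketch of a qconfig_params fragment selecting the LSQ quantizer
# for both activations and weights; values are illustrative only.
qconfig_params = dict(
    dtype="qint8",
    activation_fake_quant="lsq",
    weight_fake_quant="lsq",
    activation_qkwargs=dict(
        scale=1.0,              # initial scale
        zero_point=0.0,         # initial zero_point
        use_grad_scaling=True,  # normalize scale/zero_point gradients
    ),
    weight_qkwargs=dict(
        scale=1.0,
        use_grad_scaling=True,
    ),
)
```

This fragment would then be passed as the qconfig_params entry of qat_solver, as shown earlier.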
Note: Generally you can simply use the default configurations and leave activation_qkwargs and weight_qkwargs unchanged. However, when running QAT training after calibration, you may need to modify averaging_constant.
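To see why averaging_constant matters after calibration: moving-average observers (such as MovingAverageMinMaxObserver) typically update their statistics as new = old + averaging_constant * (observed - old), so a value of 0 freezes the calibrated statistic while 1 always adopts the latest observation. A small illustrative sketch, not HAT code:

```python
def moving_average_update(old, observed, averaging_constant):
    # Update rule typically used by moving-average observers.
    return old + averaging_constant * (observed - old)

# averaging_constant=0 freezes the statistic, which is useful for
# activations after calibration: the calibrated scale is kept in QAT.
assert moving_average_update(2.0, 5.0, 0.0) == 2.0
# averaging_constant=1 always adopts the latest observation; weights
# change every step, so their scale should track the current values.
assert moving_average_update(2.0, 5.0, 1.0) == 5.0
```

This matches the earlier qat_solver example, which uses averaging_constant=0 for activations and 1 for weights.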