Writing Specifications of set_qconfig and Customizing QConfig

Writing Specifications of set_qconfig

When defining a model to be quantized, implement the model's set_qconfig method to configure how each module is quantized.

The QConfig interface is provided by hat.utils.qconfig_manager. Call hat.utils.qconfig_manager inside set_qconfig to assign each module's QConfig, e.g.:

# Note: this sample code illustrates the rules for implementing the
# set_qconfig method; it is not a complete quantized-model definition.
class Head(nn.Module):
    def __init__(self):
        super(Head, self).__init__()
        self.out_conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=1)

    def forward(self, x):
        ...

    def set_qconfig(self):
        # If the final output layer of the network is a conv, you can set
        # out_qconfig on it separately to get a more accurate output.
        from hat.utils import qconfig_manager

        self.out_conv.qconfig = qconfig_manager.get_default_qat_out_qconfig()


class Backbone(nn.Module):
    def __init__(self):
        super(Backbone, self).__init__()
        self.conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=1)

    def forward(self, x):
        ...

    # When there is no special layer in the backbone and no layer needs
    # qconfig = None, i.e. the default_qat_qconfig is sufficient,
    # the set_qconfig() method can be omitted.
    # def set_qconfig(self):


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.backbone = Backbone()
        self.head = Head()
        self.loss = nn.CrossEntropyLoss()

    def forward(self, x):
        ...

    # The set_qconfig method must be implemented on the parent module.
    def set_qconfig(self):
        from hat.utils import qconfig_manager

        # 1. First set the qconfig of the parent module. A child module
        #    that does not set its own qconfig automatically inherits
        #    the parent's.
        self.qconfig = qconfig_manager.get_default_qat_qconfig()
        # 2. If a submodule contains special layers and implements the
        #    set_qconfig method, call it.
        if self.head is not None:
            if hasattr(self.head, "set_qconfig"):
                self.head.set_qconfig()
        # 3. Submodules that should not be quantized must have their
        #    qconfig set to None.
        if self.loss is not None:
            self.loss.qconfig = None
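The inheritance rule in step 1 can be sketched without HAT at all. The toy `propagate_qconfig` helper below is hypothetical (it is not part of HAT or PyTorch); it only illustrates the convention: a child without its own qconfig inherits the parent's, and an explicit `qconfig = None` excludes a subtree from quantization.

```python
class Module:
    """Minimal stand-in for nn.Module, just enough to show qconfig propagation."""

    def __init__(self, **children):
        self.__dict__.update(children)
        self._children = children

    def named_children(self):
        return self._children.items()


def propagate_qconfig(module, parent_qconfig=None):
    """Toy version of the propagation rule: a child without its own
    qconfig attribute inherits the parent's; an explicit None disables
    quantization for that subtree."""
    if not hasattr(module, "qconfig"):
        module.qconfig = parent_qconfig
    if module.qconfig is None:
        return  # subtree excluded from quantization
    for _, child in module.named_children():
        propagate_qconfig(child, module.qconfig)


backbone = Module()
head = Module()
loss = Module()
net = Module(backbone=backbone, head=head, loss=loss)

net.qconfig = "default_qat_qconfig"   # step 1: parent qconfig
head.qconfig = "qat_out_qconfig"      # step 2: special layer keeps its own
loss.qconfig = None                   # step 3: loss is not quantized

propagate_qconfig(net)
print(backbone.qconfig)  # backbone inherits "default_qat_qconfig"
```

After propagation, backbone picks up the parent's qconfig, head keeps the one it set for itself, and loss stays excluded.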

Customize QAT QConfig Parameters

HAT supports using a custom QConfig in QAT training: simply set the qconfig_params parameter in the qat_solver section of the config file:

qat_solver = dict(
    trainer=qat_trainer,
    quantize=True,
    ...
    qconfig_params=dict(
        dtype="qint8",
        activation_fake_quant="fake_quant",
        weight_fake_quant="fake_quant",
        activation_qkwargs=dict(
            averaging_constant=0,
        ),
        weight_qkwargs=dict(
            averaging_constant=1,
        ),
    ),
    ...
)

qconfig_params has five main parameter configuration items: dtype, activation_fake_quant, weight_fake_quant, activation_qkwargs and weight_qkwargs.

  • dtype: the quantization data type; "qint8" (default) is supported.

  • activation_fake_quant: Fake quantizer for activations, supporting "fake_quant" (default), "lsq", and "pact".

  • weight_fake_quant: Fake quantizer for weights, supporting "fake_quant" (default), "lsq", and "pact".

  • activation_qkwargs: Parameters for the activation fake quantizer.

    • When activation_fake_quant is "fake_quant", activation_qkwargs can be set as below:
    activation_qkwargs=dict(
        observer=MovingAverageMinMaxObserver,  # Specifies the observer. In general, the default can be used.
        averaging_constant=0.01,  # Sets the scale update factor.
    )
    • When activation_fake_quant is "lsq", activation_qkwargs can be set as below:
    activation_qkwargs=dict(
        observer=MovingAverageMinMaxObserver,  # Specifies the observer. In general, the default can be used.
        scale=1.0,  # Specifies the initial scale. In general, the default can be used.
        zero_point=0.0,  # Specifies the initial zero_point. In general, the default can be used.
        use_grad_scaling=False,  # Specifies whether the gradients of scale and zero_point
                                 # are normalized by a constant. False by default.
                                 # In general, the default can be used.
    )
    • When activation_fake_quant is "pact", activation_qkwargs can be set as below:
    activation_qkwargs=dict(
        observer=MovingAverageMinMaxObserver,  # Specifies the observer. In general, the default can be used.
        alpha=6.0,  # Specifies the clip parameter for activations. The default
                    # value is 6.0. In general, the default can be used.
    )
  • weight_qkwargs: Parameters for the weight fake quantizer. Except that the default observer for weight_qkwargs is MovingAveragePerChannelMinMaxObserver, the parameters and usage are the same as for activation_qkwargs.
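The difference between the two default observers can be illustrated with plain Python (a sketch of the statistics they track, not the observer implementations themselves): a per-tensor min/max observer keeps one (min, max) pair for the whole weight tensor, while a per-channel observer keeps one pair per output channel, so channels with very different magnitudes each get a tight quantization range.

```python
# Toy 2-channel weight "tensor": one row per output channel.
weight = [
    [-0.5, 0.25, 0.1],   # channel 0: small-magnitude weights
    [-4.0, 3.0, 2.0],    # channel 1: large-magnitude weights
]

# Per-tensor observation: a single (min, max) over all values,
# dominated here by the large channel.
flat = [v for row in weight for v in row]
per_tensor = (min(flat), max(flat))

# Per-channel observation: one (min, max) per output channel, so the
# small-magnitude channel 0 gets its own, much tighter, range.
per_channel = [(min(row), max(row)) for row in weight]

print(per_tensor)   # (-4.0, 3.0)
print(per_channel)  # [(-0.5, 0.25), (-4.0, 3.0)]
```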

Note: In general the default configurations can be used without changing activation_qkwargs and weight_qkwargs. However, when running QAT training after calibration, you may need to adjust averaging_constant.