Common Usage Mistakes

Setting Error

Error

A non-None qconfig is set for modules that do not need to be quantized, e.g., pre/post-processing modules, loss functions, etc.

Correct practice: set qconfig only for modules that need to be quantized.
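A minimal sketch of this practice, using plain PyTorch module attributes. The module layout and `dummy_qconfig` are made up for illustration; in practice the qconfig object (e.g., `default_qat_8bit_fake_quant_qconfig`) comes from horizon_plugin_pytorch.

```python
import torch.nn as nn

class Detector(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 16, 3)  # runs on the BPU, needs quantization
        self.loss_fn = nn.MSELoss()          # loss function, must stay in float

model = Detector()

# Stand-in for a real qconfig such as default_qat_8bit_fake_quant_qconfig.
dummy_qconfig = object()

# Set qconfig only on the submodule that needs quantization...
model.backbone.qconfig = dummy_qconfig
# ...and leave it as None on modules that should not be quantized.
model.loss_fn.qconfig = None
```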


Error

The march is not set correctly, which may result in model compilation failures or inconsistent deployment accuracy.

Correct practice: select the BPU architecture that matches the target processor, e.g., J6 requires NASH:

```python
horizon.march.set_march(horizon.march.March.NASH)
```

Error

The model output node is not set to high accuracy output, resulting in quantization accuracy that is not as expected.

An error example is shown below:

Assume that the model is defined as follows:

```python
class ToyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv0 = nn.Conv2d(4, 4, 3, 3)
        self.relu0 = nn.ReLU()
        self.classifier = nn.Conv2d(4, 4, 3, 3)

    def forward(self, x):
        out = self.conv0(x)
        out = self.relu0(out)
        out = self.classifier(out)
        return out


# Example of setting qconfig incorrectly:
float_model = ToyNet()
# Set the whole network, including the output node, to int8 quantization.
float_model.qconfig = default_qat_8bit_fake_quant_qconfig
qat_model = prepare(float_model, example_input)
```

Correct practice: to improve model accuracy, set the model output node to high precision, as shown in the example below:

```python
qat_model = horizon.quantization.prepare(
    float_model,
    example_input,
    # Use the default template to automatically enable high precision output.
    qconfig_setter=horizon.quantization.qconfig_template.default_qat_qconfig_setter,
)
```

Method Error

Error

The calibration process is run on multiple GPUs.

Due to underlying limitations, calibration currently does not support multi-GPU execution; please run calibration on a single GPU.
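A hedged sketch of a single-device calibration loop. The model and data below are toy stand-ins; in practice the calibrated model comes from `horizon.quantization.prepare` and the loader is your real calibration dataset.

```python
import torch
import torch.nn as nn

# Toy stand-ins for the prepared model and the calibration data.
calib_model = nn.Conv2d(3, 8, 3)
calib_batches = [torch.randn(1, 3, 8, 8) for _ in range(4)]

# Pin everything to one device; do not wrap the model in DataParallel/DDP.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
calib_model = calib_model.to(device).eval()

n_seen = 0
with torch.no_grad():  # calibration only observes activations; no gradients needed
    for batch in calib_batches:
        calib_model(batch.to(device))
        n_seen += 1
```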


Error

The model input image data is in a format other than centered YUV444 (e.g., RGB), which may result in inconsistent model deployment accuracy.

Correct practice: since the image format supported by Horizon hardware is centered YUV444, it is recommended to use the centered YUV444 format directly as the network input from the beginning of model training.
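As an illustration, here is one common RGB-to-YUV444 conversion (BT.601 full-range coefficients) followed by centering. The exact coefficients and the meaning of "centered" should be matched against your actual deployment pipeline; this sketch assumes all three channels are shifted by -128.

```python
import numpy as np

def rgb_to_centered_yuv444(rgb):
    """RGB (uint8, HxWx3) -> centered YUV444 (float, roughly in [-128, 127]).

    Uses BT.601 full-range coefficients; "centered" is assumed here to mean
    all channels shifted by -128. Verify against your deployment pipeline.
    """
    rgb = rgb.astype(np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128.0
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128.0
    yuv = np.stack([y, u, v], axis=-1)
    return yuv - 128.0  # center all channels around 0

white = np.full((2, 2, 3), 255, dtype=np.uint8)
centered = rgb_to_centered_yuv444(white)
```

For a white pixel this yields Y close to 127 and U, V close to 0 after centering.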


Error

The QAT model is used for accuracy evaluation and monitoring during quantization-aware training, which can delay the detection of accuracy anomalies that only appear in the deployed quantized model.

Correct practice: the accuracy gap between QAT and quantized models exists because the QAT stage cannot fully simulate the pure fixed-point computation logic of the quantized model, so it is recommended to use the quantized model for accuracy evaluation and monitoring.

```python
# Convert the QAT model to a quantized model, then evaluate the quantized model.
quantized_hbir_model = hbdk4.compiler.convert(qat_hbir_model)
acc = evaluate(quantized_hbir_model, eval_data_loader)
```

Network Error

Error

The same member defined with FloatFunctional() is called multiple times.

The error example is as follows:

```python
class ToyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.add = FloatFunctional()

    def forward(self, x, y, z):
        # Error: the same FloatFunctional member is used for two adds.
        out = self.add.add(x, y)
        return self.add.add(out, z)
```

Correct practice: do not call the same member defined with FloatFunctional() multiple times in forward; define a separate member for each call site.

```python
class ToyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.add0 = FloatFunctional()
        self.add1 = FloatFunctional()

    def forward(self, x, y, z):
        # Each call site uses its own FloatFunctional member.
        out = self.add0.add(x, y)
        return self.add1.add(out, z)
```

Operator Error

Error

Some operators in the quantized model have not gone through calibration or QAT. For example, if a post-processing operator is meant to be accelerated on the BPU but has skipped the quantization stage, quantized inference will fail or deployment accuracy will be abnormal.

Correct practice: it is not impossible to add operators directly in the quantized phase. Operators such as color space conversion can be added directly; see the documentation for details on how to add operators. However, not all operators can be added this way: operators such as cat must obtain their real quantization parameters from statistics collected during calibration or QAT, otherwise the final accuracy is affected. If you need to adjust the network structure in a similar way, consult the framework developers.

Model Error

Error

The floating-point model is overfitted.

Common signs that the model is overfitted:

  • The output changes significantly after a slight perturbation of the input data.
  • The model parameters contain large values.
  • The model activations are large.

Correct practice: resolve the floating-point model's overfitting problem on your own before proceeding with quantization.
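The three signs above can be checked numerically. Below is a hedged sketch; the function name and the use of the model output as the "activation" are choices made for this illustration, and what counts as "large" is model-dependent.

```python
import torch
import torch.nn as nn

def overfit_signals(model, x, eps=1e-3):
    """Return raw indicators matching the three signs of overfitting above.

    Thresholds for "large" are model-dependent; this only computes the numbers.
    """
    model.eval()
    with torch.no_grad():
        out = model(x)
        out_perturbed = model(x + eps * torch.randn_like(x))
        # 1. Output change under a slight input perturbation.
        sensitivity = (out - out_perturbed).abs().max().item()
        # 2. Largest parameter magnitude.
        max_weight = max(p.abs().max().item() for p in model.parameters())
        # 3. Largest activation magnitude (here, just the final output).
        max_activation = out.abs().max().item()
    return sensitivity, max_weight, max_activation

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
s, w, a = overfit_signals(model, torch.randn(1, 3, 16, 16))
```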