Accuracy Debug Tool

The model conversion toolchain quantizes the model based on the calibration samples you provide and ensures efficient deployment of the model on the Horizon computing platform. During model conversion, some accuracy loss is inevitably introduced by quantization from floating point to fixed point. The main reasons for accuracy loss are usually the following:

  1. Some nodes in the model are more sensitive to quantization and introduce larger errors, i.e., the sensitive node quantization problem.

  2. The errors of individual nodes accumulate and lead to a large calibration error in the model as a whole. This mainly includes error accumulation caused by weight quantization, error accumulation caused by activation quantization, and error accumulation caused by full quantization.

To address this, Horizon provides the accuracy debug tool to help you locate accuracy problems in the model quantization process on your own. The tool analyzes the quantization error of the calibration model at node granularity and helps you quickly locate the nodes with accuracy anomalies.

The accuracy debug tool provides a variety of analysis functions for you to use, such as:

  • Get the node quantization sensitivity.
  • Get the cumulative error curve of the model.
  • Get the data distribution of the specified node.
  • Get the box plot of data distribution between input data channels of a specified node, etc.

Quickstart

Using the accuracy debug tool involves the following steps:

  1. Configure the parameter debug_mode: "dump_calibration_data" in the model_parameters group in yaml to save the calibration data.

  2. Import the debug module and load the calibration model and data.

  3. Analyze the models with significant accuracy loss through the API provided by the accuracy debug tool.

The overall process is shown below:

accuracy_debug_process

Calibration Models and Data Storage

First you need to configure debug_mode: "dump_calibration_data" in the yaml file to enable the accuracy debug function. With this setting, the calibration data (calibration_data) is saved; the corresponding calibration model (calibrated_model.onnx) is saved by default. Where:

  • Calibration data (calibration_data): In the calibration phase, the model obtains the quantization parameters of each quantized node by forward inference on these data, including: scale and threshold.
  • Calibration model (calibrated_model.onnx): the quantization parameters of each quantized node calculated in the calibration phase are saved in the calibration node to obtain the calibration model.
Note

What is the difference between the calibration data saved here and the calibration data generated by 02_preprocess.sh?

In J6 toolchain, 02_preprocess.sh gets the same calibration data as the one saved here, both in .npy format, which can be used directly in the debug tool to test the model accuracy. Note that you need to make sure that the folder structure of the calibration data from 02_preprocess.sh is the same as the folder structure of the calibration data saved here before feeding it into the debug tool.

Note

Calibration model (calibrated_model.onnx) interpretation

The calibration model is an intermediate product of the model conversion toolchain. It is obtained by taking the floating-point model after structural optimization, calculating the quantization parameters of each node from the calibration data, and saving them in calibration nodes. The main feature of the calibration model is that it contains calibration nodes with the node type HzCalibration. These calibration nodes fall into two main categories: activation calibration nodes and weight calibration nodes.

The input of an activation calibration node is the output of the node that precedes it. The input data is quantized and then dequantized based on the quantization parameters (scales and thresholds) saved in the activation calibration node, and the result is output.

The input of a weight calibration node is the original floating-point weights of the model. These weights are quantized and then dequantized based on the quantization parameters (scales and thresholds) saved in the weight calibration node, and the result is output.
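The quantize-then-dequantize step performed by a calibration node can be illustrated with a minimal sketch. It assumes a symmetric scheme in which the scale is derived from the threshold as threshold / (2^(bits-1) - 1); the toolchain's actual formula is not given here, and fake_quantize is a hypothetical helper, not part of the debug module.

```python
def fake_quantize(values, threshold, bits=8):
    """Quantize then dequantize floats (assumed symmetric scheme)."""
    qmax = 2 ** (bits - 1) - 1        # 127 for 8-bit
    qmin = -qmax - 1                  # -128 for 8-bit
    scale = threshold / qmax          # assumption: scale derived from threshold
    out = []
    for v in values:
        q = round(v / scale)          # map to the nearest integer level
        q = max(qmin, min(qmax, q))   # saturate values beyond the threshold
        out.append(q * scale)         # dequantize back to floating point
    return out


# Values inside the threshold move by at most half a step;
# values outside it are clipped.
restored = fake_quantize([0.0, 0.3, -0.7, 2.0], threshold=1.0)
```

The difference between the input and the restored values is the quantization error that the sensitivity metrics in this document measure.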

In addition to the above calibration nodes, other nodes in the calibration model are called general nodes by the accuracy debug tool. The types of general nodes are: Conv, Mul, Add, etc.

debug_node

The folder structure of calibration_data is as follows:

|--calibration_data : calibration data
|----input.1 : folder named input node of the model and save the corresponding input data
|--------0.npy
|--------1.npy
|-------- ...
|----input.2 : for multi-input models multiple folders will be saved
|--------0.npy
|--------1.npy
|-------- ...
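Before feeding a folder into the debug tool, it can help to check that it follows the layout above. The sketch below only inspects file names and does not load any data. list_calibration_samples is a hypothetical helper, and the check that every input has the same number of samples is an assumption about multi-input models rather than a documented requirement.

```python
from pathlib import Path


def list_calibration_samples(root):
    """Map each input-node folder under root to its ordered .npy files."""
    samples = {}
    for input_dir in sorted(Path(root).iterdir()):
        if input_dir.is_dir():
            # Sort by (length, name) so that 2.npy comes before 10.npy.
            files = sorted(input_dir.glob("*.npy"),
                           key=lambda p: (len(p.stem), p.stem))
            samples[input_dir.name] = files
    counts = {name: len(files) for name, files in samples.items()}
    # Assumption: every model input should have the same number of samples.
    if len(set(counts.values())) > 1:
        raise ValueError(f"inputs have different sample counts: {counts}")
    return samples
```

Calling list_calibration_samples('./calibration_data/') returns one entry per input node, for example input.1 and input.2 for a two-input model.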

Accuracy Debug Module Import and Usage

Next, import the debug module into your code and get the quantization sensitivity of the nodes via the get_sensitivity_of_nodes interface (by default, the cosine similarity of the model output is used). A detailed description of the parameters of get_sensitivity_of_nodes can be found in the get_sensitivity_of_nodes section.

# Import the debug module
import horizon_nn.debug as dbg
# Import the log module
import logging

# If verbose=True, you need to set the log level to INFO first.
logging.getLogger().setLevel(logging.INFO)
# Get node quantization sensitivity
node_message = dbg.get_sensitivity_of_nodes(
    model_or_file='./calibrated_model.onnx',
    metrics=['cosine-similarity', 'mse'],
    calibrated_data='./calibration_data/',
    output_node=None,
    node_type='node',
    data_num=None,
    verbose=True,
    interested_nodes=None)

Analysis Result Display

The following is printed when verbose=True:

=================node sensitivity=================
node      cosine-similarity  mse
---------------------------------------------------
Conv_60   0.77795            68.02103
Conv_48   0.78428            64.36318
Conv_82   0.80394            61.09268
Conv_94   0.80499            65.05224
Conv_42   0.83787            49.4949
Conv_88   0.84614            49.81132
Conv_54   0.86602            41.69972
Conv_71   0.87148            39.96296
Conv_65   0.87495            40.45997
Conv_25   0.89214            34.30351
Conv_20   0.89829            32.35053
Conv_77   0.89916            31.9907
Conv_14   0.90058            32.40179
Conv_9    0.90107            34.08191
Conv_37   0.91162            28.21194
Conv_31   0.91637            28.79291

In addition, the API will return the node quantization sensitivity information to you in the form of a dictionary (Dict) for subsequent analysis.

Out:
{'Conv_60': {'cosine-similarity': 0.77795, 'mse': 68.02103},
 'Conv_48': {'cosine-similarity': 0.78428, 'mse': 64.36318},
 'Conv_82': {'cosine-similarity': 0.80394, 'mse': 61.09268},
 'Conv_94': {'cosine-similarity': 0.80499, 'mse': 65.05224},
 'Conv_42': {'cosine-similarity': 0.83787, 'mse': 49.4949},
 'Conv_88': {'cosine-similarity': 0.84614, 'mse': 49.81132},
 'Conv_54': {'cosine-similarity': 0.86602, 'mse': 41.69972},
 'Conv_71': {'cosine-similarity': 0.87148, 'mse': 39.96296},
 'Conv_65': {'cosine-similarity': 0.87495, 'mse': 40.45997},
 'Conv_25': {'cosine-similarity': 0.89214, 'mse': 34.30351},
 'Conv_20': {'cosine-similarity': 0.89829, 'mse': 32.35053},
 'Conv_77': {'cosine-similarity': 0.89916, 'mse': 31.9907},
 'Conv_14': {'cosine-similarity': 0.90058, 'mse': 32.40179},
 'Conv_9': {'cosine-similarity': 0.90107, 'mse': 34.08191},
 'Conv_37': {'cosine-similarity': 0.91162, 'mse': 28.21194},
 'Conv_31': {'cosine-similarity': 0.91637, 'mse': 28.79291}}
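As one possible follow-up analysis, the returned dictionary can be filtered to pick out the nodes whose similarity falls below a cutoff. This is a minimal sketch: nodes_below is a hypothetical helper, the 0.85 cutoff is an arbitrary illustrative value, and the sample entries are copied from the output above. float() is applied because the values may also be returned as strings.

```python
def nodes_below(sensitivity, metric="cosine-similarity", threshold=0.85):
    """Return node names whose metric is below threshold, in dict order."""
    return [name for name, scores in sensitivity.items()
            if float(scores[metric]) < threshold]


# A few entries from the example output above.
node_message = {
    'Conv_60': {'cosine-similarity': 0.77795, 'mse': 68.02103},
    'Conv_48': {'cosine-similarity': 0.78428, 'mse': 64.36318},
    'Conv_42': {'cosine-similarity': 0.83787, 'mse': 49.4949},
    'Conv_54': {'cosine-similarity': 0.86602, 'mse': 41.69972},
}
sensitive = nodes_below(node_message, threshold=0.85)
print(sensitive)  # ['Conv_60', 'Conv_48', 'Conv_42']
```

The resulting list could then be passed to the other analysis functions as a set of nodes worth a closer look.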

For more functions, see Function Description section.

For convenience, the accuracy debug tool can also be used from the command line. Run hmct-debugger -h/--help to view the subcommand corresponding to each function. The detailed arguments and usage of each subcommand can be found in the Function Description section.

Function Description

get_sensitivity_of_nodes

Function: Get the node quantization sensitivity.

Command line format:

hmct-debugger get-sensitivity-of-nodes MODEL_OR_FILE CALIBRATION_DATA --other options

The parameters can be viewed via hmct-debugger get-sensitivity-of-nodes -h/--help.

Parameters:

PARAMETER (abbr. for command line parameters; required/optional)

model_or_file (fixed parameter; required)
  PURPOSE: Specify the calibration model.
  RANGE: None.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Required, specify the calibration model to be analyzed.

metrics (-m; optional)
  PURPOSE: The measure of node quantization sensitivity.
  RANGE: 'cosine-similarity', 'mse', 'mre', 'sqnr', 'chebyshev'.
  DEFAULT VALUE: 'cosine-similarity'.
  DESCRIPTIONS: Specify how the node quantization sensitivity is calculated. This parameter can be a List, in which case the sensitivity is calculated with each listed metric. The output is sorted only by the first metric in the list; the higher a node ranks, the greater the error introduced by quantizing that node.

calibrated_data (fixed parameter; required)
  PURPOSE: Specify the calibration data.
  RANGE: None.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Required, specify the calibration data needed for the analysis.

output_node (-o; optional)
  PURPOSE: Specify the output node.
  RANGE: General nodes with corresponding calibration nodes in the calibration model.
  DEFAULT VALUE: None.
  DESCRIPTIONS: This parameter allows you to specify intermediate nodes as the output and calculate the node quantization sensitivity against them. If the default value None is kept, the accuracy debug tool obtains the final output of the model and calculates the quantization sensitivity of the nodes on that basis.

node_type (-n; optional)
  PURPOSE: The node type.
  RANGE: 'node', 'weight', 'activation'.
  DEFAULT VALUE: 'node'.
  DESCRIPTIONS: The type of nodes whose quantization sensitivity is calculated: node (general node), weight (weight calibration node), or activation (activation calibration node).

data_num (-d; optional)
  PURPOSE: The amount of data used to calculate the quantization sensitivity.
  RANGE: Greater than 0, less than or equal to the total number of data in calibration_data.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Set the amount of data used to calculate the node quantization sensitivity. The default is None, in which case the tool uses all the data in calibration_data for the calculation. The minimum setting is 1 and the maximum is the amount of data in calibration_data.

verbose (-v; optional)
  PURPOSE: Select whether to print the information to the terminal.
  RANGE: True, False.
  DEFAULT VALUE: False.
  DESCRIPTIONS: If set to True, the quantization sensitivity information is printed to the terminal. If metrics contains multiple metrics, the results are sorted by the first one.

interested_nodes (-i; optional)
  PURPOSE: Set the nodes of interest.
  RANGE: All nodes in the calibration model.
  DEFAULT VALUE: None.
  DESCRIPTIONS: If specified, only the quantization sensitivity of the specified nodes is obtained, and the remaining nodes are skipped. In addition, the node type specified by node_type is ignored, i.e., this parameter has a higher priority than node_type. If the default value None is kept, the quantization sensitivity of all quantizable nodes in the model is calculated.

API Usage:

# Import the debug module
import horizon_nn.debug as dbg
# Import the log module
import logging

# If verbose=True, you need to set the log level to INFO first.
logging.getLogger().setLevel(logging.INFO)
# Obtain node quantization sensitivity
node_message = dbg.get_sensitivity_of_nodes(
    model_or_file='./calibrated_model.onnx',
    metrics=['cosine-similarity', 'mse'],
    calibrated_data='./calibration_data/',
    output_node=None,
    node_type='node',
    data_num=None,
    verbose=True,
    interested_nodes=None)

Command Line Usage:

hmct-debugger get-sensitivity-of-nodes calibrated_model.onnx calibration_data -m ['cosine-similarity','mse'] -v True

Analysis results presentation:

Description: First, set node_type to choose the type of nodes whose sensitivity is calculated. The tool then collects all nodes in the calibration model that match node_type and calculates their quantization sensitivity. When verbose is set to True, the tool sorts the nodes by quantization sensitivity and prints the result to the terminal. The higher a node ranks, the greater the quantization error it introduces. The information displayed differs depending on node_type.

When verbose=True and node_type='node', the following results are printed:

=================node sensitivity=================
node      cosine-similarity  mse
---------------------------------------------------
Conv_60   0.77795            68.02103
Conv_48   0.78428            64.36318
Conv_82   0.80394            61.09268
Conv_94   0.80499            65.05224
Conv_42   0.83787            49.4949
Conv_88   0.84614            49.81132
Conv_54   0.86602            41.69972
Conv_71   0.87148            39.96296
Conv_65   0.87495            40.45997
Conv_25   0.89214            34.30351
Conv_20   0.89829            32.35053
Conv_77   0.89916            31.9907
Conv_14   0.90058            32.40179
Conv_9    0.90107            34.08191
Conv_37   0.91162            28.21194
Conv_31   0.91637            28.79291

Where,

  • node: The node name.
  • cosine-similarity, mse: Quantized sensitivity values of each node.
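For reference, the two metrics shown in this table can be sketched with their standard textbook definitions, applied to a flattened floating-point output and the corresponding quantized output. How the tool flattens tensors or averages over calibration samples is not specified here, so treat this as an illustration of the metrics rather than the tool's implementation. Both function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


float_out = [1.0, 2.0, 3.0]    # illustrative floating-point output
quant_out = [1.0, 2.0, 3.5]    # illustrative output after quantization
similarity = cosine_similarity(float_out, quant_out)
error = mse(float_out, quant_out)
```

A cosine similarity closer to 1 and an mse closer to 0 both indicate that quantization changed the output less.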

When verbose=True and node_type='weight', the following results are printed:

====================================node sensitivity====================================
weight                                            node     cosine-similarity  mse
-----------------------------------------------------------------------------------------
471_HzCalibration                                 Conv_2   0.99978            0.07519
480_HzCalibration                                 Conv_7   0.99986            0.04823
609_HzCalibration                                 Conv_88  0.99997            0.01145
573_HzCalibration                                 Conv_65  0.99997            0.00984
474_HzCalibration                                 Conv_4   0.99997            0.00963
468_HzCalibration                                 Conv_0   0.99997            0.0091
612_HzCalibration                                 Conv_90  0.99997            0.00871
585_HzCalibration                                 Conv_73  0.99997            0.0095
483_HzCalibration                                 Conv_9   0.99998            0.00818
600_HzCalibration                                 Conv_82  0.99998            0.00717
582_HzCalibration                                 Conv_71  0.99998            0.00659
603_HzCalibration                                 Conv_84  0.99998            0.00614
591_HzCalibration                                 Conv_77  0.99998            0.00558
489_HzCalibration                                 Conv_12  0.99998            0.00515
564_HzCalibration                                 Conv_60  0.99999            0.00495
618_HzCalibration                                 Conv_94  0.99999            0.00498
543_HzCalibration                                 Conv_46  0.99999            0.00502
552_HzCalibration                                 Conv_52  0.99999            0.00501
594_HzCalibration                                 Conv_78  0.99999            0.00451
555_HzCalibration                                 Conv_54  0.99999            0.00452
...99classifier.1.weight_conv_weight_HzCalibration  Gemm_99  0.99999            0.00359
558_HzCalibration                                 Conv_56  0.99999            0.00369

Where,

  • weight: The weight calibration node name.
  • node: The name of the common node corresponding to the weight calibration node, i.e., the output of the weight calibration node is its input.
  • cosine-similarity, mse: The quantization sensitivity values of each node.

When verbose=True and node_type='activation', the following results are printed:

===================================node sensitivity===================================
activation         node            threshold  bit  cosine-similarity  mse
---------------------------------------------------------------------------------------
406_HzCalibration  Conv_60         0.91501    8    0.77851            67.82422
388_HzCalibration  Conv_48         0.55422    8    0.78501            64.16379
440_HzCalibration  Conv_82         2.01577    8    0.8041             61.0322
458_HzCalibration  Conv_94         0.51507    8    0.80466            65.14515
379_HzCalibration  Conv_42         0.53759    8    0.83837            49.35648
449_HzCalibration  Conv_88         2.34071    8    0.84447            50.29965
397_HzCalibration  Conv_54         0.81528    8    0.86499            41.99234
423_HzCalibration  Conv_71         0.84753    8    0.87104            40.09385
414_HzCalibration  Conv_65         0.80155    8    0.87443            40.63319
353_HzCalibration  Conv_25         0.67858    8    0.89199            34.34844
345_HzCalibration  Conv_20         1.06324    8    0.8984             32.31664
432_HzCalibration  Conv_77         2.54515    16   0.89895            32.0417
336_HzCalibration  Conv_14         1.01407    8    0.90091            32.30325
328_HzCalibration  Conv_9          1.61622    8    0.90163            33.78012
371_HzCalibration  Conv_37         0.91038    8    0.91277            27.84864
362_HzCalibration  Conv_31         0.65606    8    0.91683            28.65791
403_HzCalibration  Conv_58         0.95119    8    0.93365            21.32314
391_HzCalibration  Conv_50;Add_55  2.92598    8    0.93984            19.72109
382_HzCalibration  Conv_44;Add_49  2.75416    8    0.95122            17.74137
417_HzCalibration  Conv_67;Add_72  2.85463    8    0.95139            15.81864

Where,

  • activation: The activation calibration node name.
  • node: The common node after the activation calibration node in the model structure, i.e., the output of the activation calibration node is its input.
  • threshold: The calibration threshold, and the maximum value is taken if there are multiple thresholds.
  • bit: The quantization bit.
  • cosine-similarity, mse: The quantization sensitivity values of each node.

API return value:

The API returns the value of the quantization sensitivity saved in dictionary format (Key is the node name, Value is the quantization sensitivity information of the node), in the following format:

Out:
{'Conv_3': {'cosine-similarity': '0.999009567957658', 'mse': '0.027825591154396534'},
 'MaxPool_2': {'cosine-similarity': '0.9993462241612948', 'mse': '0.017706592209064044'},
 'Conv_6': {'cosine-similarity': '0.9998359175828787', 'mse': '0.004541242333988731'},
 'MaxPool_5': {'cosine-similarity': '0.9998616805443397', 'mse': '0.0038416787014844325'},
 'Conv_0': {'cosine-similarity': '0.9999297948984', 'mse': '0.0019312848587735342'},
 'Gemm_19': {'cosine-similarity': '0.9999609772975628', 'mse': '0.0010773885699633795'},
 'Conv_8': {'cosine-similarity': '0.9999629625907311', 'mse': '0.0010301886404004807'},
 'Gemm_15': {'cosine-similarity': '0.9999847687207736', 'mse': '0.00041888411550854263'},
 'MaxPool_12': {'cosine-similarity': '0.9999853235024673', 'mse': '0.0004039733791544747'},
 'Conv_10': {'cosine-similarity': '0.999985763659844', 'mse': '0.0004040437432614943'},
 'Gemm_17': {'cosine-similarity': '0.9999913985912616', 'mse': '0.0002379088904350423'},
 ...}

plot_acc_error

Function: Quantize only the specified nodes of the floating-point model, then calculate, node by node, the error between the output of each node in that model and the output of the corresponding node in the floating-point model to obtain the cumulative error curve.

Command line format:

hmct-debugger plot-acc-error MODEL_OR_FILE CALIBRATION_DATA --other options

The parameters can be viewed via hmct-debugger plot-acc-error -h/--help.

API parameters:

PARAMETER (abbr. for command line parameters; required/optional)

save_dir (-s; optional)
  PURPOSE: The save path.
  RANGE: None.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Optional, specify the path to save the analysis results.

calibrated_data (fixed parameter; required)
  PURPOSE: Specify the calibration data.
  RANGE: None.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Required, specify the calibration data to be analyzed.

model_or_file (fixed parameter; required)
  PURPOSE: Specify the calibration model.
  RANGE: None.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Required, specify the calibration model to be analyzed.

quantize_node (-q; optional)
  PURPOSE: Quantize only the specified nodes in the model and view the error accumulation curve.
  RANGE: All nodes in the calibration model.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Optional parameter. Specifies the nodes in the model that need to be quantized, while ensuring that none of the remaining nodes are quantized. Whether the parameter is a nested list determines whether nodes are quantized one at a time or in groups.
  For example:
    • quantize_node=['Conv_2','Conv_9']: Quantize Conv_2 and Conv_9 separately, each time ensuring that the remaining nodes are not quantized.
    • quantize_node=[['Conv_2'],['Conv_9','Conv_2']]: Quantize only Conv_2 in one test, and both Conv_2 and Conv_9 in another, to measure the cumulative model error of each case separately.
    • quantize_node accepts two special values: 'weight' and 'activation'.
    • quantize_node = ['weight']: Quantize only weights, not activations.
    • quantize_node = ['activation']: Quantize only activations, not weights.
    • quantize_node = ['weight','activation']: Weights and activations are quantized separately.
  Note: quantize_node and non_quantize_node cannot both be None; one of them must be specified.

non_quantize_node (-nq; optional)
  PURPOSE: Leave the specified nodes in the model unquantized and view the error accumulation curve.
  RANGE: All nodes in the calibration model.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Optional parameter. Specifies the nodes in the model that are not quantized, while ensuring that all the remaining nodes are quantized. Whether the parameter is a nested list determines whether nodes are left unquantized one at a time or in groups.
  For example:
    • non_quantize_node=['Conv_2','Conv_9']: Leave Conv_2 and Conv_9 unquantized separately, each time ensuring that all the remaining nodes are quantized.
    • non_quantize_node=[['Conv_2'],['Conv_9','Conv_2']]: Leave only Conv_2 unquantized in one test, and both Conv_2 and Conv_9 unquantized in another, to measure the cumulative model error of each case separately.
  Note: quantize_node and non_quantize_node cannot both be None; one of them must be specified.

metric (-m; optional)
  PURPOSE: The error metric method.
  RANGE: 'cosine-similarity', 'mse', 'mre', 'sqnr', 'chebyshev'.
  DEFAULT VALUE: 'cosine-similarity'.
  DESCRIPTIONS: Set the method used to calculate the model error.

average_mode (-a; optional)
  PURPOSE: Specify the output mode of the cumulative error curve.
  RANGE: True, False.
  DEFAULT VALUE: False.
  DESCRIPTIONS: The default is False. If True, the average of the cumulative error is used as the result.
# Import the debug module
import horizon_nn.debug as dbg

dbg.plot_acc_error(
    save_dir: str,
    calibrated_data: str or CalibrationDataSet,
    model_or_file: ModelProto or str,
    quantize_node: List or str,
    non_quantize_node: List or str,
    metric: str = 'cosine-similarity',
    average_mode: bool = False)

Analysis results presentation:

1. Cumulative error test with specified nodes quantized:

  • Specify single-node quantization

Configuration method: quantize_node=['Conv_2', 'Conv_90'], quantize_node is a flat list.

API Usage:

# Import the debug module
import horizon_nn.debug as dbg

dbg.plot_acc_error(
    save_dir='./',
    calibrated_data='./calibration_data/',
    model_or_file='./calibrated_model.onnx',
    quantize_node=['Conv_2', 'Conv_90'],
    metric='cosine-similarity',
    average_mode=False)

Command Line Usage:

hmct-debugger plot-acc-error calibrated_model.onnx calibration_data -q ['Conv_2','Conv_90']

Description: When quantize_node is a flat list, the tool quantizes each node in quantize_node separately, keeping all other nodes in the model unquantized, to obtain one model per node. It then calculates the error between the output of each node in that model and the output of the corresponding node in the floating-point model to get the cumulative error curve.

When average_mode = False:

image

When average_mode = True:

image

Note

average_mode

average_mode defaults to False. For some models, the cumulative error curve alone does not show which quantization strategy is more effective. In that case, set average_mode to True, which takes the average of the cumulative errors of the first n nodes as the cumulative error of the nth node.

This is calculated as follows, for example:

When average_mode=False, accumulate_error=[1.0, 0.9, 0.9, 0.8].

But when average_mode=True, accumulate_error=[1.0, 0.95, 0.933, 0.9].
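The averaging described above is a running mean over the curve. The sketch below reproduces the numbers from this example. average_cumulative_error is a hypothetical helper written for illustration and is not part of the debug module.

```python
def average_cumulative_error(errors):
    """Replace the nth value with the mean of the first n values."""
    averaged, total = [], 0.0
    for n, value in enumerate(errors, start=1):
        total += value
        averaged.append(total / n)
    return averaged


curve = average_cumulative_error([1.0, 0.9, 0.9, 0.8])
print([round(v, 3) for v in curve])  # [1.0, 0.95, 0.933, 0.9]
```

Because each point averages everything before it, the curve is smoother, so a consistent difference between two strategies is easier to see than on the raw curve.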

  • Specify multiple nodes for quantization

Configuration method: quantize_node=[['Conv_2'], ['Conv_2', 'Conv_90']], quantize_node is a nested list.

API Usage:

# Import the debug module
import horizon_nn.debug as dbg

dbg.plot_acc_error(
    save_dir='./',
    calibrated_data='./calibration_data/',
    model_or_file='./calibrated_model.onnx',
    quantize_node=[['Conv_2'], ['Conv_2', 'Conv_90']],
    metric='cosine-similarity',
    average_mode=False)

Command Line Usage:

hmct-debugger plot-acc-error calibrated_model.onnx calibration_data -q [['Conv_2'],['Conv_2','Conv_90']]

Description: When quantize_node is a nested list, the tool quantizes the nodes of each inner list together, keeping all other nodes in the model unquantized, to obtain one model per inner list. It then calculates the error between the output of each node in that model and the output of the corresponding node in the floating-point model to get the cumulative error curve.

  • partial_qmodel_0: Only the Conv_2 node is quantized; the rest of the nodes are not quantized.
  • partial_qmodel_1: Only the Conv_2 and Conv_90 nodes are quantized; the rest of the nodes are not quantized.
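The difference between the flat and nested forms of quantize_node can be summarized as a grouping rule: each group produces one partially quantized model. The sketch below is only an illustration of that rule as described in this section. expand_node_groups is a hypothetical helper, and the tool's internal handling may differ.

```python
def expand_node_groups(nodes):
    """Flat list -> one group per node; nested list -> groups kept as given."""
    if isinstance(nodes, str):
        return [[nodes]]
    if nodes and all(isinstance(item, (list, tuple)) for item in nodes):
        return [list(group) for group in nodes]
    return [[node] for node in nodes]


# Flat list: Conv_2 and Conv_90 are each tested on their own.
flat = expand_node_groups(['Conv_2', 'Conv_90'])
# Nested list: Conv_2 alone, then Conv_2 and Conv_90 together.
nested = expand_node_groups([['Conv_2'], ['Conv_2', 'Conv_90']])
```

Under this reading, flat yields two single-node groups, while nested yields the two groups that correspond to partial_qmodel_0 and partial_qmodel_1.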

When average_mode = False:

image

When average_mode = True:

image

2. Cumulative error test with some nodes of the model left unquantized:

  • Specify single nodes to leave unquantized

Configuration method: non_quantize_node=['Conv_2', 'Conv_90'], non_quantize_node is a flat list.

API Usage:

# Import the debug module
import horizon_nn.debug as dbg

dbg.plot_acc_error(
    save_dir='./',
    calibrated_data='./calibration_data/',
    model_or_file='./calibrated_model.onnx',
    non_quantize_node=['Conv_2', 'Conv_90'],
    metric='cosine-similarity',
    average_mode=True)

Command Line Usage:

hmct-debugger plot-acc-error calibrated_model.onnx calibration_data -nq ['Conv_2','Conv_90'] -a True

Description: When non_quantize_node is a flat list, the tool leaves each node in non_quantize_node unquantized separately, keeping all other nodes quantized, to obtain one model per node. It then calculates the error between the output of each node in that model and the output of the corresponding node in the floating-point model to get the cumulative error curve.

When average_mode = False:

image

When average_mode = True:

image

  • Specify multiple nodes to leave unquantized

Configuration method: non_quantize_node=[['Conv_2'], ['Conv_2', 'Conv_90']], non_quantize_node is a nested list.

API Usage:

# Import the debug module
import horizon_nn.debug as dbg

dbg.plot_acc_error(
    save_dir='./',
    calibrated_data='./calibration_data/',
    model_or_file='./calibrated_model.onnx',
    non_quantize_node=[['Conv_2'], ['Conv_2', 'Conv_90']],
    metric='cosine-similarity',
    average_mode=False)

Command Line Usage:

hmct-debugger plot-acc-error calibrated_model.onnx calibration_data -nq [['Conv_2'],['Conv_2','Conv_90']]

Description: When non_quantize_node is a nested list, the tool leaves the nodes of each inner list unquantized together, keeping all other nodes in the model quantized, to obtain one model per inner list. It then calculates the error between the output of each node in that model and the output of the corresponding node in the floating-point model to get the cumulative error curve.

  • partial_qmodel_0: The Conv_2 node is not quantized; the remaining nodes are quantized.
  • partial_qmodel_1: The Conv_2 and Conv_90 nodes are not quantized; the remaining nodes are quantized.

When average_mode = False:

image

When average_mode = True:

image

Testing tips:

When testing the accuracy of partial quantization, you may want to compare several quantization strategies in order of quantization sensitivity. In that case, you can refer to the following usage:

# Import the debug module
import horizon_nn.debug as dbg

# First, use the quantization sensitivity sorting function to obtain the quantization sensitivity sorting of the nodes in the model
node_message = dbg.get_sensitivity_of_nodes(
    model_or_file='./calibrated_model.onnx',
    metrics='cosine-similarity',
    calibrated_data='./calibration_data/',
    output_node=None,
    node_type='node',
    verbose=False,
    interested_nodes=None)

# node_message is a dictionary type, and its key value is the node name
nodes = list(node_message.keys())

# Specify the unquantized nodes by nodes, which can be easily used
dbg.plot_acc_error(
    save_dir='./',
    calibrated_data='./calibration_data/',
    model_or_file='./calibrated_model.onnx',
    non_quantize_node=[nodes[:1], nodes[:2]],
    metric='cosine-similarity',
    average_mode=True)
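The nested list [nodes[:1], nodes[:2]] used above generalizes to the top-k nodes. A small sketch, assuming the node names are already ordered from most to least sensitive as in the returned dictionary; incremental_groups is a hypothetical helper:

```python
def incremental_groups(ranked_nodes, k):
    """Return the prefixes [top-1, top-2, ..., top-k] of a ranked node list."""
    k = min(k, len(ranked_nodes))
    return [ranked_nodes[:i] for i in range(1, k + 1)]


# Node names taken from the sensitivity example earlier in this document.
ranked = ['Conv_60', 'Conv_48', 'Conv_82']
groups = incremental_groups(ranked, 2)
print(groups)  # [['Conv_60'], ['Conv_60', 'Conv_48']]
```

The result has the nested-list shape that non_quantize_node accepts, so each extra group shows the effect of leaving one more sensitive node unquantized.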

3. Quantize weights and activations separately:

Configuration method: quantize_node=['weight','activation'].

API Usage:

# Import the debug module
import horizon_nn.debug as dbg

dbg.plot_acc_error(
    save_dir='./',
    calibrated_data='./calibration_data/',
    model_or_file='./calibrated_model.onnx',
    quantize_node=['weight', 'activation'],
    metric='cosine-similarity',
    average_mode=False)

Command Line Usage:

hmct-debugger plot-acc-error calibrated_model.onnx calibration_data -q ['weight','activation']

Description: quantize_node can also be specified directly as 'weight' or 'activation'.

  • quantize_node = ['weight']: Quantize only weights, not activations.
  • quantize_node = ['activation']: Quantize only activations, not weights.
  • quantize_node = ['weight', 'activation']: Weights and activations are quantized separately.

image

Note

In general, pay most attention to the part of the cumulative error curve near the model output. If the curve obtained with a certain quantization method shows a smaller cumulative error near the model output, i.e., a higher similarity, we recommend that you test that quantization method first.
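If you have the curve values available as plain lists, the advice above can be approximated by ranking strategies on the last few points of each curve. This is a sketch under clear assumptions: plot_acc_error is documented here as producing plots, so obtaining the values is left to you. rank_by_tail is a hypothetical helper, and the curve values below are made up for illustration.

```python
def rank_by_tail(curves, tail=3, higher_is_better=True):
    """Order strategy names by the mean of the last `tail` curve values."""
    def tail_mean(values):
        last = values[-tail:]
        return sum(last) / len(last)

    return sorted(curves, key=lambda name: tail_mean(curves[name]),
                  reverse=higher_is_better)


# Made-up cosine-similarity curves, one value per node from input to output.
curves = {
    'partial_qmodel_0': [1.0, 0.99, 0.95, 0.90, 0.88],
    'partial_qmodel_1': [1.0, 0.97, 0.96, 0.95, 0.94],
}
order = rank_by_tail(curves, tail=2)
print(order)  # ['partial_qmodel_1', 'partial_qmodel_0']
```

For error-style metrics such as mse, where smaller is better, pass higher_is_better=False.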

plot_distribution

Function: Select nodes and obtain the output of each node in the floating-point model and in the calibration model to get the output data distributions. In addition, the two outputs are subtracted to obtain the error distribution between them.

Command line format:

hmct-debugger plot-distribution MODEL_OR_FILE CALIBRATION_DATA --other options

The parameters can be viewed via hmct-debugger plot-distribution -h/--help.

Parameters:

PARAMETER (abbr. for command line parameters; required/optional)

save_dir (-s; optional)
  PURPOSE: The save path.
  RANGE: None.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Optional, specify the path to save the analysis results.

model_or_file (fixed parameter; required)
  PURPOSE: Specify the calibration model.
  RANGE: None.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Required, specify the calibration model to be analyzed.

calibrated_data (fixed parameter; required)
  PURPOSE: Specify the calibration data.
  RANGE: None.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Required, specify the calibration data needed for the analysis.

nodes_list (-n; required)
  PURPOSE: Specify the nodes to be analyzed.
  RANGE: All nodes in the calibration model.
  DEFAULT VALUE: None.
  DESCRIPTIONS: Required, specify the nodes to be analyzed. Depending on the type of each node in nodes_list:
    • Weight calibration node: Plot the data distribution of the original weights and of the weights after calibration.
    • Activation calibration node: Plot the input data distribution of the activation calibration node.
    • Common node: Plot the output data distribution of this node before and after quantization, and also plot the error distribution between the two.
  Note: nodes_list is a list, so a series of nodes can be specified, and all three types of nodes can be specified at the same time.
# Import the debug module
import horizon_nn.debug as dbg

dbg.plot_distribution(
    save_dir: str,
    model_or_file: ModelProto or str,
    calibrated_data: str or CalibrationDataSet,
    nodes_list: List[str] or str)

Analysis results presentation:

API Usage:

# Import the debug module
import horizon_nn.debug as dbg

dbg.plot_distribution(
    save_dir='./',
    model_or_file='./calibrated_model.onnx',
    calibrated_data='./calibration_data',
    nodes_list=['317_HzCalibration',  # Activation Node
                '471_HzCalibration',  # Weight Node
                'Conv_2'])            # Common Node

Command Line Usage:

hmct-debugger plot-distribution calibrated_model.onnx calibration_data -n ['317_HzCalibration','471_HzCalibration','Conv_2']

When node_type = 'node_output':

image

When node_type = 'weight':

image

When node_type = 'activation':

image

Note

In the three pictures above, the blue triangle marks the maximum absolute value of the data, and the red dashed line marks the maximum calibration threshold.
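The two markers can be read together: when the maximum absolute value lies well beyond the calibration threshold, part of the data is clipped during quantization. A minimal sketch of that comparison on a flat list of values follows. clipping_summary is a hypothetical helper, the numbers are illustrative, and the debug tool itself is only described as drawing the plots.

```python
def clipping_summary(values, threshold):
    """Report the largest magnitude and the share of values beyond threshold."""
    max_abs = max(abs(v) for v in values)
    clipped = sum(1 for v in values if abs(v) > threshold)
    return {"max_abs": max_abs, "clipped_ratio": clipped / len(values)}


# Illustrative data: two of the four values exceed the threshold of 1.5.
summary = clipping_summary([0.5, -2.0, 1.0, 3.0], threshold=1.5)
print(summary)  # {'max_abs': 3.0, 'clipped_ratio': 0.5}
```

A high clipped ratio suggests the threshold truncates a meaningful part of the distribution, while a threshold far above max_abs would spend quantization levels on values that never occur.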

get_channelwise_data_distribution

Function: Draw the box plot of the data distribution across the input data channels of the specified calibration node.

Command line format:

hmct-debugger get-channelwise-data-distribution MODEL_OR_FILE CALIBRATION_DATA --other options

The parameters can be viewed via hmct-debugger get-channelwise-data-distribution -h/--help.

Parameters:

PARAMETER | Abbr for command line parameters | DESCRIPTIONS | Required/Optional
save_dir | -s | PURPOSE: The save path.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: Optional, specify the path to save the analysis results.
optional
model_or_file | Fixed parameter | PURPOSE: Specify the calibration model.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: Required, specify the calibration model to be analyzed.
required
calibrated_data | Fixed parameter | PURPOSE: Specify the calibration data.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: Required, specify the calibration data needed for the analysis.
required
nodes_list | -n | PURPOSE: Specify the calibration nodes.
RANGE: All weight calibration nodes and activation calibration nodes in the calibration model.
DEFAULT VALUE: None.
DESCRIPTIONS: Required, specify the calibration nodes.
required
axis | -a | PURPOSE: Specify the dimension where the channel is located.
RANGE: Less than the number of dimensions of the node input data.
DEFAULT VALUE: None.
DESCRIPTIONS: The position of the channel information in the shape. The default is None. In that case, for an activation calibration node, the second dimension of the node input data is taken as the channel dimension, i.e., axis=1; for a weight calibration node, the axis parameter in the node's attributes is read as the channel dimension.
optional
# Import the debug module
import horizon_nn.debug as dbg

dbg.get_channelwise_data_distribution(
    save_dir: str,
    model_or_file: ModelProto or str,
    calibrated_data: str or CalibrationDataSet,
    nodes_list: List[str],
    axis: int = None)

Analysis results presentation:

Description: For the calibration node list you set (nodes_list), the tool takes the dimension where the channel is located from the parameter axis and obtains the data distribution across the input data channels of each node. axis is None by default: if the node is a weight calibration node, the channel dimension defaults to 0; if the node is an activation calibration node, the channel dimension defaults to 1.

Weight calibration node:

weight_calibration_node

Activation calibration node:

activate_calibration_node
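The default handling of axis described above can be summarized in a short sketch. This is one reading of the text rather than the tool's code: the function name resolve_channel_axis and the weight_axis_attr argument (standing in for the axis attribute stored on a weight calibration node) are hypothetical, and falling back to 0 when that attribute is absent is an assumption.

```python
def resolve_channel_axis(node_kind, axis=None, weight_axis_attr=None):
    """Return the channel dimension to use for a calibration node.

    node_kind: "activation" or "weight".
    axis: the value passed by the user; None means "use the default".
    weight_axis_attr: axis attribute of a weight calibration node, if any.
    """
    if axis is not None:
        return axis                      # an explicit axis always wins
    if node_kind == "activation":
        return 1                         # second dimension holds the channels
    if node_kind == "weight":
        # Read the node attribute; assume 0 when it is not present.
        return 0 if weight_axis_attr is None else weight_axis_attr
    raise ValueError(f"unsupported node kind: {node_kind!r}")


print(resolve_channel_axis("activation"))            # default for activations
print(resolve_channel_axis("weight"))                # default for weights
print(resolve_channel_axis("activation", axis=3))    # e.g. channels-last data
```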

The output is shown as follows:

box_plot

In the figure:

  • The horizontal axis shows the channel index of the node's input data; the example figure has 96 input channels.
  • The vertical axis shows the data distribution range of each channel. The red solid line is the median of the data in that channel, and the blue dashed line is the mean. The upper and lower limits of each box are the upper and lower quartiles, and the discrete points outside these limits are outliers. Check the maximum absolute value of these outliers to judge whether the input data of the current node fluctuates strongly.
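The quantities shown in the box plot can be reproduced by hand for a small example. The sketch below uses only the Python standard library; the 1.5 × IQR rule for flagging outliers is the common box plot convention and is an assumption here, as is the helper name channel_box_stats. The input is a hypothetical activation tensor of shape [samples, channels], so the channel axis is 1.

```python
import statistics


def channel_box_stats(channel):
    """Median, mean, quartiles and outliers of the values in one channel."""
    q1, median, q3 = statistics.quantiles(channel, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr      # assumed whisker rule
    outliers = [v for v in channel if v < low or v > high]
    return {
        "median": median,
        "mean": statistics.fmean(channel),
        "q1": q1,
        "q3": q3,
        "outliers": outliers,
        "max_abs_outlier": max((abs(v) for v in outliers), default=0.0),
    }


# Rows are samples, columns are channels (channel axis = 1).
samples = [
    [0.0, 1.0],
    [1.0, 2.0],
    [2.0, 3.0],
    [3.0, 4.0],
    [4.0, 50.0],
]
channels = [list(column) for column in zip(*samples)]
stats = [channel_box_stats(channel) for channel in channels]
for index, channel_stats in enumerate(stats):
    print(index, channel_stats)
```

In this toy data the second channel contains a single value far outside the box, which is the kind of per-channel fluctuation the plot is meant to expose.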

sensitivity_analysis

Function: For quantization-sensitive nodes, analyze and test the model accuracy separately for single-node quantization (each of these nodes quantized on its own) and for partial quantization.

Command line usage:

hmct-debugger sensitivity-analysis MODEL_OR_FILE CALIBRATION_DATA --other options

The parameters can be viewed via hmct-debugger sensitivity-analysis -h/--help.

Parameters:

PARAMETER | Abbr for command line parameters | DESCRIPTIONS | Required/Optional
model_or_file | Fixed parameter | PURPOSE: Specify the calibration model.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: Required, specify the calibration model to be analyzed.
required
calibrated_data | Fixed parameter | PURPOSE: Specify the calibration data.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: Required, specify the calibration data needed for the analysis.
required
pick_threshold | -p | PURPOSE: Set the sensitivity threshold used to select nodes.
RANGE: None.
DEFAULT VALUE: 0.999.
DESCRIPTIONS: Optional parameter. This function calculates the quantization sensitivity of common nodes and selects the nodes whose sensitivity is less than pick_threshold as sensitive nodes for analysis and testing.
Note: When sensitive_nodes is set, those nodes are tested directly; node sensitivity is not calculated and pick_threshold is not used to select nodes.
optional
data_num | -d | PURPOSE: Amount of data used to calculate the quantization sensitivity.
RANGE: Greater than 0, less than or equal to the total number of data in calibration_data.
DEFAULT VALUE: 1.
DESCRIPTIONS: Set the amount of data used to calculate the node quantization sensitivity.
optional
sensitive_nodes | -sn | PURPOSE: Specify the sensitive nodes to be analyzed.
RANGE: All nodes in the calibration model.
DEFAULT VALUE: None.
DESCRIPTIONS: Optional, specify the sensitive nodes to be analyzed.
Note: When this parameter is set, the nodes it lists are tested directly; node sensitivity is not calculated and pick_threshold is not used to select nodes.
optional
save_dir | -sd | PURPOSE: The save path.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: Optional, specify the path where the analysis results are saved.
optional

API Usage:

import horizon_nn.debug as dbg

dbg.sensitivity_analysis(
    model_or_file='calibrated_model.onnx',
    calibrated_data='calibration_data',
    pick_threshold=0.9999,
    data_num=1,
    sensitive_nodes=[])

Command Line Usage:

hmct-debugger sensitivity-analysis calibrated_model.onnx calibration_data

Analysis results presentation:

partial_quantization

In the figure:

  • Blue dashed line: the baseline, i.e., the cosine similarity of the floating-point model output to itself, which is 1.
  • Green cross: quantize only the current node to get a partially quantized model, and compute the similarity between the final output of this partially quantized model and that of the floating-point model.
  • Red solid line: leave the current node and all nodes before it unquantized, keep all remaining nodes quantized, and compute the similarity between this partially quantized model and the floating-point model. For example, the similarity value at Conv_92 in the figure above is around 0.995, which means that when Conv_2, Conv_7, and Conv_92 are left unquantized and all other nodes stay quantized, the cosine similarity between the final output of the partially quantized model and that of the floating-point model is around 0.995. The first entry on the horizontal axis, none, corresponds on the red solid line to the calibrated_model itself.
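The node selection and the two test series in the figure can be sketched as plain list manipulation. This is an illustration under assumptions, not the tool's code: the function names are hypothetical, the sensitivity values and the node Conv_50 are made up, and the nodes are assumed to be processed in the order in which they appear in the input mapping.

```python
def pick_sensitive_nodes(sensitivity, pick_threshold=0.999):
    """Keep the nodes whose cosine similarity is below pick_threshold."""
    return [name for name, value in sensitivity.items() if value < pick_threshold]


def single_node_plan(nodes):
    """Green crosses: one test per node, quantizing only that node."""
    return [[name] for name in nodes]


def partial_plan(nodes):
    """Red line: one test per node, leaving it and all earlier nodes unquantized."""
    return [nodes[: index + 1] for index in range(len(nodes))]


# Hypothetical sensitivities (cosine similarity) of common nodes.
sensitivity = {
    "Conv_2": 0.9951,
    "Conv_7": 0.9972,
    "Conv_50": 0.9998,   # above the threshold, so it is not selected
    "Conv_92": 0.9985,
}
picked = pick_sensitive_nodes(sensitivity)
print(single_node_plan(picked))
print(partial_plan(picked))
```

The last entry of the partial plan lists Conv_2, Conv_7, and Conv_92 together, matching the Conv_92 point described for the red solid line.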

runall

Function: Run all the functions in the original debug tool with one click.

Command line usage:

hmct-debugger runall MODEL_OR_FILE CALIBRATION_DATA --other options

The parameters can be viewed via hmct-debugger runall -h/--help.

Parameters:

PARAMETER | Abbr for command line parameters | DESCRIPTIONS | Required/Optional
model_or_file | Fixed parameter | PURPOSE: Specify the calibration model.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: Required, specify the calibration model to be analyzed.
required
calibrated_data | Fixed parameter | PURPOSE: Specify the calibration data.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: Required, specify the calibration data needed for the analysis.
required
save_dir | -s | PURPOSE: The save path.
RANGE: None.
DEFAULT VALUE: None.
DESCRIPTIONS: Specify the path to save the analysis results.
optional
ns_metrics | -nm | PURPOSE: The metric used to measure node quantization sensitivity.
RANGE: 'cosine-similarity', 'mse', 'mre', 'sqnr', 'chebyshev'.
DEFAULT VALUE: 'cosine-similarity'.
DESCRIPTIONS: Specify how the node quantization sensitivity is calculated. This parameter can be a list, in which case the sensitivity is calculated with each of the listed metrics. The output is sorted only by the first metric in the list; the higher a node ranks, the larger the error introduced by quantizing that node.
optional
output_node | -o | PURPOSE: Specify the output node.
RANGE: General nodes with corresponding calibration nodes in the calibration model.
DEFAULT VALUE: None.
DESCRIPTIONS: This parameter allows you to specify intermediate nodes as the output and calculate the node quantization sensitivity against them. If the default None is kept, the accuracy debug tool obtains the final output of the model and calculates the quantization sensitivity of the nodes on this basis.
optional
node_type | -nt | PURPOSE: The node type.
RANGE: 'node', 'weight', 'activation'.
DEFAULT VALUE: 'node'.
DESCRIPTIONS: The type of nodes whose quantization sensitivity is calculated: node (general node), weight (weight calibration node), or activation (activation calibration node).
optional
data_num | -dn | PURPOSE: The amount of data used to calculate the quantization sensitivity.
RANGE: Greater than 0, less than or equal to the total number of data in calibration_data.
DEFAULT VALUE: None.
DESCRIPTIONS: Set the amount of data used to calculate the node quantization sensitivity. The default is None, in which case the tool uses all the data in calibration_data for the calculation. The minimum setting is 1 and the maximum is the amount of data in calibration_data.
optional
verbose | -v | PURPOSE: Select whether to print the information to the terminal.
RANGE: True, False.
DEFAULT VALUE: False.
DESCRIPTIONS: If set to True, the quantization sensitivity information is printed to the terminal. If ns_metrics contains multiple metrics, the results are sorted by the first one.
optional
interested_nodes | -i | PURPOSE: Set the nodes of interest.
RANGE: All nodes in the calibration model.
DEFAULT VALUE: None.
DESCRIPTIONS: If specified, only the quantization sensitivity of the listed nodes is calculated, and all other nodes are skipped. When this parameter is set, the node type given by node_type is ignored, i.e., this parameter has a higher priority than node_type. If the default None is kept, the quantization sensitivity of all quantizable nodes in the model is calculated.
optional
dis_nodes_list | -dnl | PURPOSE: Specify the nodes to be analyzed.
RANGE: All nodes in the calibration model.
DEFAULT VALUE: None.
DESCRIPTIONS: Specify the nodes to be analyzed. Depending on the type of each node in dis_nodes_list:
  • Weight calibration node: plots the data distribution of the original weights and of the weights after calibration.
  • Activation calibration node: plots the input data distribution of the activation calibration node.
  • Common node: plots the output data distribution of the node before and after quantization, as well as the distribution of the error between the two.
Note: dis_nodes_list is a list, so several nodes can be specified, and all three node types can be mixed in the same list.
optional
cw_nodes_list | -cn | PURPOSE: Specify the calibration nodes.
RANGE: All weight calibration nodes and activation calibration nodes in the calibration model.
DEFAULT VALUE: None.
DESCRIPTIONS: Specify the calibration nodes.
optional
axis | -a | PURPOSE: Specify the dimension where the channel is located.
RANGE: Less than the number of dimensions of the node input data.
DEFAULT VALUE: None.
DESCRIPTIONS: The position of the channel information in the shape. The default is None. In that case, for an activation calibration node, the second dimension of the node input data is taken as the channel dimension, i.e., axis=1; for a weight calibration node, the axis parameter in the node's attributes is read as the channel dimension.
optional
quantize_node | -qn | PURPOSE: Quantize only the specified nodes in the model and view the error accumulation curve.
RANGE: All nodes in the calibration model.
DEFAULT VALUE: None.
DESCRIPTIONS: Optional parameter. Specifies the nodes in the model that are quantized, while all remaining nodes stay unquantized. Whether the parameter is a nested list determines whether single nodes or groups of nodes are quantized.
For example:
  • quantize_node=['Conv_2', 'Conv_9']: quantize only Conv_2 and only Conv_9 in two separate tests, keeping all remaining nodes unquantized in each.
  • quantize_node=[['Conv_2'], ['Conv_9', 'Conv_2']]: quantize only Conv_2 in one test and both Conv_2 and Conv_9 in another, to test the model cumulative error of each case separately.
  • quantize_node accepts two special values: 'weight' and 'activation'.
    • quantize_node = ['weight']: quantize only the weights, not the activations.
    • quantize_node = ['activation']: quantize only the activations, not the weights.
    • quantize_node = ['weight', 'activation']: quantize the weights and the activations in two separate tests.
Note: quantize_node and non_quantize_node cannot both be None; one of them must be specified.
optional
non_quantize_node | -nqn | PURPOSE: Leave the specified nodes unquantized and view the error accumulation curve.
RANGE: All nodes in the calibration model.
DEFAULT VALUE: None.
DESCRIPTIONS: Optional parameter. Specifies the nodes in the model that are left unquantized, while all remaining nodes are quantized. Whether the parameter is a nested list determines whether single nodes or groups of nodes are left unquantized.
For example:
  • non_quantize_node=['Conv_2', 'Conv_9']: leave only Conv_2 and only Conv_9 unquantized in two separate tests, keeping all remaining nodes quantized in each.
  • non_quantize_node=[['Conv_2'], ['Conv_9', 'Conv_2']]: leave only Conv_2 unquantized in one test and both Conv_2 and Conv_9 unquantized in another, to test the model cumulative error of each case separately.
Note: quantize_node and non_quantize_node cannot both be None; one of them must be specified.
optional
ae_metric | -am | PURPOSE: The error metric.
RANGE: 'cosine-similarity', 'mse', 'mre', 'sqnr', 'chebyshev'.
DEFAULT VALUE: 'cosine-similarity'.
DESCRIPTIONS: Set the method used to calculate the model error.
optional
average_mode | -avm | PURPOSE: Specify the output mode of the cumulative error curve.
RANGE: True, False.
DEFAULT VALUE: False.
DESCRIPTIONS: The default is False. If set to True, the average of the cumulative error is returned as the result.
optional
pick_threshold | -pt | PURPOSE: Set the sensitivity threshold used to select nodes.
RANGE: None.
DEFAULT VALUE: 0.999.
DESCRIPTIONS: Optional parameter. This function calculates the quantization sensitivity of common nodes and selects the nodes whose sensitivity is less than pick_threshold as sensitive nodes for analysis and testing.
Note: When sensitive_nodes is set, those nodes are tested directly; node sensitivity is not calculated and pick_threshold is not used to select nodes.
optional
sensitive_nodes | -sn | PURPOSE: Specify the sensitive nodes to be analyzed.
RANGE: All nodes in the calibration model.
DEFAULT VALUE: None.
DESCRIPTIONS: Optional, specify the sensitive nodes to be analyzed.
Note: When this parameter is set, the nodes it lists are tested directly; node sensitivity is not calculated and pick_threshold is not used to select nodes.
optional
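The metric names accepted by ns_metrics and ae_metric match common error measures. The sketch below shows their textbook definitions on flat lists of floats, comparing a floating-point output with a quantized one. These definitions are an assumption for illustration; the exact formulas used by the tool (for example the epsilon in the relative error or the unit of SQNR) may differ.

```python
import math


def cosine_similarity(ref, test):
    dot = sum(r * t for r, t in zip(ref, test))
    norm = math.sqrt(sum(r * r for r in ref)) * math.sqrt(sum(t * t for t in test))
    return dot / norm                      # assumes neither vector is all zeros


def mse(ref, test):
    """Mean squared error."""
    return sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)


def mre(ref, test, eps=1e-12):
    """Mean relative error; eps avoids division by zero."""
    return sum(abs(r - t) / (abs(r) + eps) for r, t in zip(ref, test)) / len(ref)


def sqnr_db(ref, test):
    """Signal-to-quantization-noise ratio in dB; assumes the outputs differ."""
    signal = sum(r * r for r in ref)
    noise = sum((r - t) ** 2 for r, t in zip(ref, test))
    return 10.0 * math.log10(signal / noise)


def chebyshev(ref, test):
    """Largest absolute element-wise difference."""
    return max(abs(r - t) for r, t in zip(ref, test))


# Hypothetical outputs of the floating-point and the quantized model.
float_out = [1.0, 2.0, 3.0, 4.0]
quant_out = [1.0, 2.0, 3.0, 3.5]
for metric in (cosine_similarity, mse, mre, sqnr_db, chebyshev):
    print(metric.__name__, metric(float_out, quant_out))
```

Note that the direction differs between metrics: for cosine similarity and SQNR a lower value means a larger error, while for mse, mre, and chebyshev a higher value does.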

API Usage:

# Import the debug module
import horizon_nn.debug as dbg

dbg.runall(
    model_or_file='calibrated_model.onnx',
    calibrated_data='calibration_data')

Command Line Usage:

hmct-debugger runall calibrated_model.onnx calibration_data
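The flat versus nested list convention of quantize_node and non_quantize_node in the parameter table can be illustrated with a small helper that expands the argument into one node group per test. The helper expand_node_groups is hypothetical and only mirrors the documented examples; it is not part of horizon_nn.debug.

```python
def expand_node_groups(spec):
    """Expand a quantize_node / non_quantize_node value into test groups.

    A flat list yields one single-node group per entry; a nested list is
    taken as given, one group per inner list.
    """
    if spec is None:
        return []
    groups = []
    for item in spec:
        if isinstance(item, (list, tuple)):
            groups.append(list(item))    # nested entry: a group of nodes
        else:
            groups.append([item])        # flat entry: a single node
    return groups


# Flat list: Conv_2 and Conv_9 are each tested on their own.
print(expand_node_groups(['Conv_2', 'Conv_9']))
# Nested list: Conv_2 alone, then Conv_9 and Conv_2 together.
print(expand_node_groups([['Conv_2'], ['Conv_9', 'Conv_2']]))
# Special values: weights and activations are tested separately.
print(expand_node_groups(['weight', 'activation']))
```

Each resulting group corresponds to one cumulative error curve.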

runall Process:

runall

When all parameters are left as default, the tool performs the following functions in sequence:

  • STEP1 and STEP2: Obtain the quantization sensitivity of the weight calibration nodes and of the activation calibration nodes, respectively.
  • STEP3: Based on the results of STEP1 and STEP2, take the top 5 weight calibration nodes and the top 5 activation calibration nodes and plot their data distributions.
  • STEP4: For the nodes obtained in STEP3, draw box plots of their inter-channel data distribution.
  • STEP5: Plot the cumulative error curves for quantizing only the weights and for quantizing only the activations.
  • STEP6: Run the partial quantization and single-node quantization accuracy analysis for sensitive nodes. Since the example in the figure does not specify sensitive_nodes, the debug tool calculates the quantization sensitivity of common nodes itself and selects the nodes whose sensitivity is less than the specified pick_threshold for testing and analysis.

When node_type='node' is specified, the tool takes the top 5 nodes, finds the calibration nodes corresponding to each of them, and produces the data distribution plots and box plots of those calibration nodes.
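The top-5 selection in STEP3 can be pictured as a simple ranking. The sketch below assumes the default cosine-similarity metric, where a lower value means a larger quantization error and therefore a higher rank; the function top_k_sensitive, the node names, and the sensitivity values are all hypothetical.

```python
def top_k_sensitive(sensitivity, k=5):
    """Return the k node names with the lowest cosine similarity."""
    ranked = sorted(sensitivity.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:k]]


# Hypothetical results of STEP1 (weights) and STEP2 (activations).
weight_sensitivity = {
    "w_a": 0.91, "w_b": 0.999, "w_c": 0.95,
    "w_d": 0.97, "w_e": 0.93, "w_f": 0.98,
}
activation_sensitivity = {"a_a": 0.99, "a_b": 0.90}

# STEP3 and STEP4 then work on the union of both selections.
nodes_to_plot = (top_k_sensitive(weight_sensitivity)
                 + top_k_sensitive(activation_sensitivity))
print(nodes_to_plot)
```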