Custom OP Development
General Descriptions
In most cases, your models can be deployed on Horizon's computing platform directly, because the algorithm toolchain provides a rich set of OPs.
The supported operators are listed in the ONNX Operator Support List section of the Toolchain Operator Support Constraint List.
If your model contains unsupported OP(s), we strongly suggest you first try to replace them with supported ones: this makes full use of Horizon's computing platform capabilities and keeps development cost lower.
The custom OP mechanism allows a user-defined operator to be computed on the CPU.
A complete custom OP development process includes template creation, OP implementation, OP compilation, model conversion with custom OPs, and model execution with custom OPs. Please refer to the following diagram:
As shown above, defining a custom OP involves two stages: at the model conversion stage, a Python implementation of the custom OP must be provided;
at the simulator/dev board inference stage, a C++ implementation of the custom OP must be provided.
In addition, the computation logic of the two implementations must be consistent.
Environment Configuration
The environment needs to be configured before performing model conversion with custom operators. Make sure that your environment meets the requirements described in this section.
Attention
Please note that if you need to compile or run inference on a model with custom operators in an X86 environment, make sure that your gcc version is greater than 5.0.
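As an illustrative aid (not part of the toolchain), the version requirement can be checked with a small Python sketch; the parsing helper below assumes `g++ --version` output in the common GNU format:

```python
import re

def parse_gcc_version(version_line):
    """Extract (major, minor, patch) from the first line of `g++ --version` output."""
    m = re.search(r"(\d+)\.(\d+)\.(\d+)", version_line)
    if m is None:
        raise ValueError("no version number found in: " + version_line)
    return tuple(int(x) for x in m.groups())

def gcc_new_enough(version, minimum=(5, 0)):
    """True when the version satisfies the 'greater than 5.0' requirement."""
    return version[:2] > minimum

# Example with a typical `g++ --version` first line:
print(gcc_new_enough(parse_gcc_version("g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0")))  # → True
```

On a real machine, feed the helper the first line printed by `g++ --version`.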
Please refer to the following to configure environmental variables:
1. GCC Environment Configuration
x86_gcc_path="/bin/g++-11"
x86_gcc_lib="/usr/lib/gcc/x86_64-linux-gnu/11/"
cross_gcc_root="/usr/"
cross_gcc_aarch64_lib="${cross_gcc_root}/aarch64-linux-gnu/lib/"
cross_gcc_x86_lib="${cross_gcc_root}/lib/x86_64-linux-gnu/"
2. Compiler Dependency Configuration
hbdk_path="$(pip show hbdk4-compiler |grep Location |awk -F " " '{printf$2}')/hbdk4"
hbtl_aarch64="${hbdk_path}/runtime/aarch64_unknown_linux_gnu/nash"
hbtl_x86_64="${hbdk_path}/runtime/x86_64_unknown_linux_gnu/nash"
export INTERPRETER_ENABLE=0
3. X86 Environment Variable Setting
export HBDK_TARGET_X86_64_UNKNOWN_LINUX_GNU_CXX="${x86_gcc_path}"
export HBDK_TARGET_X86_64_UNKNOWN_LINUX_GNU_CXXFLAGS="-I${hbtl_x86_64}/include/ -L${hbtl_x86_64}/lib -L${hbdk_path}/compiler -L${hbdk_path}/compiler/_mlir_libs -std=c++14"
export HBDK_TARGET_X86_64_UNKNOWN_LINUX_GNU_LDFLAGS="-L${hbtl_x86_64}/lib -static-libstdc++ -L${x86_gcc_lib} -Wl,-rpath,${x86_gcc_lib}"
4. Arm Environment Variable Setting
export HBDK_TARGET_AARCH64_UNKNOWN_LINUX_GNU_CXX="${cross_gcc_root}/bin/aarch64-linux-gnu-g++"
export HBDK_TARGET_AARCH64_UNKNOWN_LINUX_GNU_CXXFLAGS="-I${hbtl_aarch64}/include/ -L${hbtl_aarch64}/lib -L${cross_gcc_aarch64_lib} -L${hbdk_path}/compiler/_mlir_libs -v"
export HBDK_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LDFLAGS="-L${hbtl_aarch64}/lib -L${cross_gcc_aarch64_lib} -static-libstdc++ -Wl,-rpath,${cross_gcc_aarch64_lib}"
5. Associate Dynamic Libraries of the Custom Operator
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"${x86_gcc_lib}:${cross_gcc_x86_lib}"
Model Conversion with Custom Operators (intermediate nodes)
Custom operators (intermediate nodes) refer to the scenario where one or more operators in the model are registered as custom operators.
Modify Model File
After preparing your custom operator implementation, you need to modify both the original model file and the configuration file for model conversion in order to run the custom OP (the Caffe model and the ONNX model are taken as examples below):
Caffe Model
In the original model file, the operator type of the custom operator is marked as Custom, and a custom_param group is provided. An example is as follows.
layer {
  name: "hr_op"
  type: "Custom"
  bottom: "res3d_in"
  top: "res3d"
  custom_param {
    kind: "CustomIdentity"
    shape {
      dim: 1
      dim: 512
      dim: 28
      dim: 28
    }
    params: "'kernel_size': 10 \n'threshold': 0.5"
  }
}
In the above custom_param group:
- kind refers to the name of the custom OP's internal implementation; because this custom OP is an identity OP, it is named CustomIdentity. This name will appear in the subsequent Python and C++ code.
- shape refers to the OP's output shape and must be completely specified.
- params refers to the OP's input parameters, specified as 'param_name': param_value. Note that multiple parameters should be separated with \n.
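For illustration only (this helper is not part of the toolchain), the params string format can be read in Python by splitting on \n and evaluating each 'name': value pair:

```python
import ast

def parse_custom_params(params):
    """Parse a Caffe custom_param `params` string such as
    "'kernel_size': 10 \n'threshold': 0.5" into a dict."""
    result = {}
    for line in params.split("\n"):
        line = line.strip()
        if line:
            # each line is one 'name': value pair; wrap in braces to form a dict literal
            result.update(ast.literal_eval("{" + line + "}"))
    return result

print(parse_custom_params("'kernel_size': 10 \n'threshold': 0.5"))
# → {'kernel_size': 10, 'threshold': 0.5}
```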
In the configuration file for model conversion, a custom OP parameter group must be added as shown below:
#...
custom_op:
# custom OP's calibration method
custom_op_method: register
# custom OP's implementing file
op_register_files: sample_custom.py
For the Caffe model, both parameters in the above group must be configured: custom_op_method should be specified as register, and
op_register_files is the implementation file of the custom operator computation; please use a relative path.
When all configurations are done, the succeeding model conversion steps are the same as the other ordinary models.
ONNX Model
1. Obtain the ONNX model with custom operators:
- Converted from other frameworks such as PyTorch
import torch
from horizon_nn.horizon_onnx.onnx_pb import TensorProto
from torch.onnx.symbolic_helper import parse_args
from torch.onnx.utils import register_custom_op_symbolic
from torch import Tensor

model = torch.hub.load('pytorch/vision:v0.10.0', 'googlenet', pretrained=True)

def _transform_input(x: Tensor) -> Tensor:
    return x

model._transform_input = _transform_input

@parse_args("v", "v")
def horizon_pool(g, input, output_size):
    return g.op(
        'horizon.custom::PyOp',  # required, ! must be 'horizon.custom' domain !
        input,
        class_name_s="GlobalAveragePool",  # required ! must match the class def name in sample_custom python file !
        compute_s="compute",  # optional, 'compute' by default
        module_s="sample_custom",  # required ! must match the file name of the "op_register_files" !
        input_types_i=[TensorProto.FLOAT],  # required
        output_types_i=[TensorProto.FLOAT],  # required
        output_shape_s=["1, 1024, 1, 1"])  # required

d_input = torch.rand(1, 3, 224, 224)
register_custom_op_symbolic('::adaptive_avg_pool2d',
                            horizon_pool,
                            opset_version=11)
torch.onnx.export(model, d_input, "googlenet_cop.onnx", opset_version=11)
- Generate the onnx model directly
Reference Code:
import onnx
import numpy as np
from onnx import helper, checker, shape_inference, numpy_helper, TensorProto

def make_normal_data(shape):
    return np.random.normal(loc=0.0, scale=1.0, size=shape).astype(np.float32)

# build a simple model: two PyOp adds followed by a Conv
def make_simple_model():
    # create nodes
    conv_input_shape = (1, 3, 224, 224)
    conv_output_shape = (1, 3, 224, 224)
    add_param_shape = (1, 3, 224, 224)
    add_1_param_data = np.zeros(add_param_shape).astype(np.float32)
    add_2_param_data = np.ones(add_param_shape).astype(np.float32)
    conv_weight_shape = (3, 3, 3, 3)
    conv_weight_data = make_normal_data(conv_weight_shape)
    add_1_node = helper.make_node(
        "PyOp",  # required, the type must be 'PyOp'
        name="add_1",  # required, op names must be unique
        inputs=["input0", "add_1_param"],  # required, must be a list, consistent with the number of inputs in the implementation file
        outputs=["add_1_out"],  # required, must be a list, consistent with the number of outputs in the implementation file
        domain="horizon.cop1",  # required, custom operators with different implementation logic need different domain names
        class_name="Cop1",  # required, must match the class name in the custom operator's implementation file
        module="custom_op.horizon_ops",  # required, must match the path of the custom operator's implementation file
        compute="compute",  # required, must match the compute function in the custom operator implementation class
        ext_compute="func1",  # required when using numba; specifies the name of the numba function
        input_types=[
            TensorProto.FLOAT,
            TensorProto.FLOAT,
        ],  # required, must be a list whose length equals the operator's number of inputs and matches the implementation file
        output_types=[
            TensorProto.FLOAT
        ],  # required, must be a list whose length equals the operator's number of outputs and matches the implementation file
        output_shape=["1, 3, 224, 224"],  # optional, required if the PyOp's output value_info is not added to the model
    )
    add_2_node = helper.make_node(
        "PyOp",
        name="add_2",
        inputs=["input1", "add_1_out", "add_2_param"],
        outputs=["add_2_out", "output0"],
        domain="horizon.cop2",
        class_name="Cop2",
        module="custom_op.horizon_ops",
        compute='compute',
        input_types=[TensorProto.FLOAT, TensorProto.FLOAT,
                     TensorProto.FLOAT],  # required
        output_types=[TensorProto.FLOAT, TensorProto.FLOAT],  # required
        output_shape=["1, 3, 224, 224", "1, 3, 224, 224"])
    conv_1_node = helper.make_node("Conv",
                                   inputs=["add_2_out", "W0"],
                                   outputs=["output1"],
                                   dilations=(1, 1),
                                   group=1,
                                   kernel_shape=(3, 3),
                                   pads=(1, 1, 1, 1),
                                   name="conv_1")
    # nodes
    nodes = [add_1_node, add_2_node, conv_1_node]
    # inputs
    model_input_1 = helper.make_tensor_value_info("input0", TensorProto.FLOAT,
                                                  conv_input_shape)
    model_input_2 = helper.make_tensor_value_info("input1", TensorProto.FLOAT,
                                                  conv_input_shape)
    # outputs
    model_output_1 = helper.make_tensor_value_info("output0",
                                                   TensorProto.FLOAT,
                                                   conv_output_shape)
    model_output_2 = helper.make_tensor_value_info("output1",
                                                   TensorProto.FLOAT,
                                                   conv_output_shape)
    # intermediate tensors
    add_1_out = helper.make_tensor_value_info("add_1_out", TensorProto.FLOAT,
                                              conv_output_shape)
    add_2_out = helper.make_tensor_value_info("add_2_out", TensorProto.FLOAT,
                                              conv_output_shape)
    # create constant tensors
    W0_tensor = helper.make_tensor("W0", TensorProto.FLOAT, conv_weight_shape,
                                   conv_weight_data.flatten())
    add_1_param = helper.make_tensor("add_1_param",
                                     TensorProto.FLOAT, add_param_shape,
                                     add_1_param_data.flatten())
    add_2_param = helper.make_tensor("add_2_param",
                                     TensorProto.FLOAT, add_param_shape,
                                     add_2_param_data.flatten())
    # make graph
    graph = helper.make_graph(
        nodes,
        "simple_conv_model",
        inputs=[model_input_1, model_input_2],  # input
        outputs=[model_output_1, model_output_2],  # output
        initializer=[W0_tensor, add_1_param, add_2_param],  # initializer
        value_info=[add_1_out, add_2_out],  # value_info
    )
    # make model
    onnx_model = helper.make_model(graph,
                                   opset_imports=[
                                       helper.make_opsetid("", 11),
                                       helper.make_opsetid("horizon.cop1", 1),
                                       helper.make_opsetid("horizon.cop2", 1)
                                   ],
                                   producer_name="onnx-test")
    # shape inference
    onnx_model = shape_inference.infer_shapes(onnx_model)
    # model check
    checker.check_model(onnx_model)
    # save model
    onnx.save(onnx_model, "custom_op.onnx")

make_simple_model()
Attention
Points to note about the PyOp attributes in the ONNX model:
- The domain attribute must be set, otherwise it defaults to the ONNX standard domain and an error is reported. Different custom operator implementations must be set under different domains.
- The module must have the same name as the registration file used during registration. If the registration file is in a subfolder of the current directory, adjust module accordingly. For example, if sample_custom.py is in the custom_op folder of the current path, module should be set to custom_op.sample_custom.
- Currently only the ONNX model supports multiple types of custom operators; if you need multiple types of custom operators in other frameworks, please contact Horizon.
- As with the Caffe model, you need to add a custom_op parameter group to the model conversion configuration file in order to use a custom operator, as follows:
#...
custom_op:
# Customize the calibration method of op
custom_op_method: register
# Custom OP's implementation file
op_register_files: sample_custom.py
For the ONNX model, both parameters in the above parameter group must be configured.
custom_op_method always uses register; op_register_files is the implementation file of the custom operator computation, and a relative path should be used.
After completing these configurations, the subsequent steps of model conversion are consistent with other general model conversion processes.
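To illustrate how the dotted module value resolves (the directory layout below just mirrors the custom_op/sample_custom.py example; it follows normal Python import rules relative to the search path):

```python
import importlib
import os
import sys
import tempfile

# Recreate the layout from the example above in a scratch directory:
# <root>/custom_op/sample_custom.py
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "custom_op"))
with open(os.path.join(root, "custom_op", "sample_custom.py"), "w") as f:
    f.write("OP_NAME = 'CustomIdentity'\n")

# The dotted path is importable as long as the parent of custom_op/
# is on sys.path (the converter works from the current directory).
sys.path.insert(0, root)
mod = importlib.import_module("custom_op.sample_custom")
print(mod.OP_NAME)  # → CustomIdentity
```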
OP Implementation
In the model conversion phase, a Python implementation of the custom operator is provided; the tool uses it to run the inference required for model calibration.
Attention
Please note that the tool uses working_dir as the working directory during the PTQ conversion. If you need to reference a working directory in your operator implementation, we strongly recommend specifying it as an absolute path; if you must use a relative path, make it relative to working_dir.
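A minimal sketch of that recommendation (the file name here is hypothetical): resolve any auxiliary file against a known base directory rather than relying on the process's current directory:

```python
import os

# Hypothetical auxiliary file name used by a custom operator implementation.
WEIGHTS_FILE = "lut_table.bin"

def resolve_aux_path(working_dir):
    """Anchor a relative file name at working_dir and return an absolute path,
    as the Attention note recommends."""
    return os.path.abspath(os.path.join(working_dir, WEIGHTS_FILE))

print(resolve_aux_path("/tmp/model_output"))  # → /tmp/model_output/lut_table.bin
```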
A Python template file (sample_custom.py) is shown as follows:
from horizon_nn.custom.op_registration import op_implement_register, op_shape_infer_register

@op_implement_register("CustomIdentity")
class CustomIdentity(object):
    def __init__(self, kernel_size, threshold):
        self._kernel_size = kernel_size
        self._default_threshold = threshold

    def compute(self, X):
        return X

@op_shape_infer_register("CustomIdentity")
def infer_shape(inputs_shape):
    outputs_shape = inputs_shape
    return outputs_shape
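A quick standalone sanity check of the template's behavior (the registration decorator is stubbed out here, since it only has meaning inside the toolchain):

```python
# Stand-in for horizon_nn's decorator so the template can be exercised alone.
def op_implement_register(name):
    def wrapper(obj):
        return obj
    return wrapper

@op_implement_register("CustomIdentity")
class CustomIdentity(object):
    def __init__(self, kernel_size, threshold):
        self._kernel_size = kernel_size
        self._default_threshold = threshold

    def compute(self, X):
        # the identity OP simply passes its input through
        return X

op = CustomIdentity(kernel_size=10, threshold=0.5)
print(op.compute([1, 2, 3]))  # → [1, 2, 3]
```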
The configuration file (horizon_ops.py) in the custom_op example is shown as follows:
from horizon_nn.custom.op_registration import op_implement_register
import numba

@numba.njit
def func1(x1, x2):
    out = x1 + x2
    return out

@numba.njit
def func2(x1, x2, x3):
    out = x1 + x2 + x3 + 2
    return out, out

@op_implement_register("Cop1")
class Cop1(object):
    def __init__(self, ext_compute):
        self.func1 = ext_compute

    def compute(self, x1, x2):
        return eval(f"{self.func1}(x1, x2)")

@op_implement_register("Cop2")
class Cop2(object):
    def __init__(self, ext_compute):
        self.func2 = ext_compute

    def compute(self, x1, x2, x3):
        return eval(f"{self.func2}(x1, x2, x3)")
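The ext_compute indirection can be exercised without numba; the plain-Python stand-ins below (not the toolchain's code) show how the stored function name is dispatched via eval at compute time:

```python
# Plain stand-in for the numba-jitted function of the same name.
def func1(x1, x2):
    return x1 + x2

class Cop1(object):
    def __init__(self, ext_compute):
        # ext_compute is the *name* of the module-level function to call
        self.func1 = ext_compute

    def compute(self, x1, x2):
        # eval resolves the name in this module's globals at call time
        return eval(f"{self.func1}(x1, x2)")

op = Cop1("func1")
print(op.compute(3, 4))  # → 7
```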
The filename (sample_custom.py) must be filled into op_register_files in the YAML configuration file, otherwise the tool will not be able to import the custom operator definition;
the custom OP name CustomIdentity registered by the op_implement_register decorator must match the kind property of the Caffe custom OP or the class_name property of the ONNX custom OP.
For the Caffe model, the (kernel_size, threshold) parameters of the __init__ function are passed in from params in the prototxt file and are used to initialize the custom OP module.
op_shape_infer_register registers the operator's output shape for the Caffe model.
For the ONNX model, there are two ways to resolve the custom OP's shape: either add the value_info of the PyOp output to the ONNX model when creating it,
or set the output_shape attribute on the corresponding PyOp. Note also that the module attribute of the custom operator must be consistent with the file that holds the custom operator implementation:
if module is set to custom_op.horizon_ops, then the implementation file is named horizon_ops.py and should be placed in a custom_op folder at the same level as the ONNX model.
Since operators with the same name in the same domain must share the same implementation, different custom operators need different domain properties.
Once all the above operations are done, model conversion can be executed to obtain the hbm file and the dynamic library .so file used for deploying models containing custom operators.
Model Conversion with Custom Operators (pre- and post-processing)
Custom operators (pre/post-processing) refer to the scenario where pre/post-processing logic is integrated into the model by adding custom operators.
We provide this scenario to reduce the difficulty of board-side deployment for models containing custom operators and to improve the reusability of pre/post-processing code. It involves setting environment variables and writing pre/post-processing code; you can refer to the example below.
Attention
Please note that before performing the following actions, make sure that you have completed the configuration of the environment variables according to section Environment Configuration.
Pre- and Post-processing Example
The example code below, based on the GoogleNet model, integrates pre- and post-processing into the HBIR model.
import os
import sys
import numpy as np
from numba import njit, float32
from hbdk4.compiler import translate, compile, load, save, leap, convert
import hbdk4

# Pre-processing numba implementation
@njit
def preprocess(data):
    ...
    return preprocessed_data

# Post-processing numba implementation
@njit
def postprocess(result):
    ...
    return res

module = load("googlenet_224x224_yuv444_quantize.bc")

# pipeline link
def pipeline(image):
    # pre-processing
    preprocessed_data = preprocess(image)
    # schedule model inference
    infer_result = leap.call(module, "googlenet_224x224_yuv444", preprocessed_data)
    # post-processing
    res = postprocess(infer_result)
    return res

# The input data, used to associate the model and verify the pipeline link
random_data = np.random.randn(1, 3, 224, 224)
data = (random_data * 127.5 + 128).astype(np.uint8)
# Integrate pre- and post-processing into the HBIR model
module = translate(pipeline, data)
# HBIR model quantization
b25_module = convert(module, "b25")
# save(b25_module, "pipeline.bc")
# Recompile the HBIR model to generate a new HBM
compile(b25_module, "googlenet_pipeline.hbm", "b25", opt=0, progress_bar=True)
Output Description
After completing the above steps, you will get a series of output files. The files to focus on include:
- libpipeline_xxxxxxxx_arm.so: Arm custom operator dynamic library used for model inference on the board.
- *.hbm: Offline model file containing pre/post-processing, used for on-board deployment.
Model Run on the Board with Custom Operators
After getting the hbm file and the dynamic library .so file related to the custom operator, you cannot run them on the development board directly.
You also need to configure the corresponding dynamic library load path, as follows:
# Create a folder to store dynamic link libraries related to the running of the custom operator, xxx needs to be replaced by your real path (absolute path)
mkdir -p xxx/custom_libs
cp xxx.so xxx/custom_libs
# Configure environment variables
export HB_DNN_CUSTOM_HBTL_PATH=xxx/custom_libs