HBRuntime is an x86-side model inference library provided by Horizon. It supports inference on three kinds of models: the original ONNX models exported directly by commonly used training frameworks, the ONNX models generated at each stage of the PTQ conversion process of the Horizon toolchain, and the HBIR (*.bc) models generated during the Horizon toolchain conversion process. The usage flow is as follows:
The reference usage of HBRuntime for model inference is as follows:
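As a minimal sketch of this usage, the helper below wraps one inference call. It is written under stated assumptions: the import path `horizon_tc_ui.hb_runtime`, the single-argument constructor, and the `run(output_names, input_feed)` signature (modeled on ONNX Runtime-style sessions, returning outputs in the order of `output_names`) are assumptions to verify against your toolchain version. `run_inference` is a hypothetical helper name and not part of HBRuntime.

```python
def run_inference(sess, input_feed):
    """Run one forward pass and return the outputs keyed by output name.

    `sess` is expected to expose `input_names`, `output_names`, and a
    `run(output_names, input_feed)` method (the `run` signature is an assumption).
    """
    # Fail early if the caller forgot to supply one of the model inputs.
    missing = [name for name in sess.input_names if name not in input_feed]
    if missing:
        raise KeyError(f"missing model inputs: {missing}")

    # Assumption: outputs come back as a sequence aligned with output_names.
    outputs = sess.run(sess.output_names, input_feed)
    return dict(zip(sess.output_names, outputs))


# Typical use (requires the Horizon toolchain; the import path is an assumption):
# from horizon_tc_ui.hb_runtime import HBRuntime
# sess = HBRuntime("model.onnx")
# results = run_inference(sess, {sess.input_names[0]: data})
```

Keying the results by output name avoids relying on output order in later post-processing code.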
In addition, HBRuntime lets you view model attribute information; the supported attributes are listed in the table below. For example, to print the number of model inputs, you can use print(f"input_num: {sess.input_num}"), where sess is the loaded HBRuntime instance.
| Attribute | Description |
|---|---|
| input_num | Number of model inputs |
| output_num | Number of model outputs |
| input_names | Names of the model inputs |
| output_names | Names of the model outputs |
| input_types | Data types of the model inputs |
| output_types | Data types of the model outputs |
| input_shapes | Shapes of the model inputs |
| output_shapes | Shapes of the model outputs |
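To inspect all of these attributes at once, a small helper can collect them into a dictionary. This is only a sketch: `describe_model` and `print_model_info` are hypothetical names, and the last attribute is assumed to be spelled `output_shapes`, by analogy with `input_shapes`.

```python
# Attribute names from the table above (output_shapes spelling is an assumption).
MODEL_ATTRIBUTES = (
    "input_num",
    "output_num",
    "input_names",
    "output_names",
    "input_types",
    "output_types",
    "input_shapes",
    "output_shapes",
)


def describe_model(sess):
    """Return a dict mapping each attribute name to its value on `sess`."""
    return {name: getattr(sess, name) for name in MODEL_ATTRIBUTES}


def print_model_info(sess):
    """Print one 'name: value' line per model attribute."""
    for name, value in describe_model(sess).items():
        print(f"{name}: {value}")
```

Calling `print_model_info(sess)` right after loading a model is a quick way to confirm the input names, types, and shapes before preparing data.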
Below are HBRuntime usage examples for two scenarios: ONNX model inference and HBIR model inference.
The basic flow for loading an ONNX model and running inference with HBRuntime is shown below. This sample code applies to all ONNX models; prepare the input data according to each model's input type and layout requirements:
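Because the data must match each model's declared inputs, it can help to validate the prepared data before running inference. The sketch below checks input names and shapes only. It assumes that `sess.input_shapes` is either a mapping keyed by input name or a sequence aligned with `sess.input_names` (this document does not say which), and that each prepared input exposes a `.shape` attribute, as NumPy arrays do. `declared_shape` and `check_input_feed` are hypothetical helper names.

```python
def declared_shape(sess, name):
    """Look up the declared shape of one model input.

    Assumption: input_shapes is a dict keyed by input name, or a sequence
    in the same order as input_names.
    """
    shapes = sess.input_shapes
    if isinstance(shapes, dict):
        return tuple(shapes[name])
    return tuple(shapes[list(sess.input_names).index(name)])


def check_input_feed(sess, input_feed):
    """Raise ValueError if an input is missing or its shape does not match."""
    for name in sess.input_names:
        if name not in input_feed:
            raise ValueError(f"missing input: {name}")
        expected = declared_shape(sess, name)
        actual = tuple(input_feed[name].shape)
        # Non-integer entries (for example symbolic dims) act as wildcards.
        matches = len(expected) == len(actual) and all(
            not isinstance(e, int) or e == a for e, a in zip(expected, actual)
        )
        if not matches:
            raise ValueError(f"{name}: expected shape {expected}, got {actual}")


# Typical use (requires the Horizon toolchain; the import path is an assumption):
# from horizon_tc_ui.hb_runtime import HBRuntime
# sess = HBRuntime("model.onnx")
# input_feed = {sess.input_names[0]: data}  # data prepared as a NumPy array
# check_input_feed(sess, input_feed)
```

A layout mistake such as feeding NHWC data to an NCHW input shows up here as a shape mismatch instead of a less obvious runtime error.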
The basic flow for loading an HBIR model and running inference with HBRuntime is shown below. This sample code applies to all HBIR models; prepare the input data according to each model's input type and layout requirements:
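Since HBIR models are distributed as *.bc files, one simple safeguard is to confirm the file type before loading. The sketch below is an illustration under assumptions: `model_format` and `load_hbir` are hypothetical helpers, and the runtime class is passed in as a parameter on the assumption that HBRuntime is constructed from a single model path.

```python
from pathlib import Path

# File suffixes for the two model families described in this document.
SUPPORTED_SUFFIXES = {".onnx": "ONNX", ".bc": "HBIR"}


def model_format(model_path):
    """Return "ONNX" or "HBIR" based on the file suffix (case-insensitive)."""
    suffix = Path(model_path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(f"unsupported model file: {model_path}")
    return SUPPORTED_SUFFIXES[suffix]


def load_hbir(model_path, runtime_cls):
    """Load an HBIR model with `runtime_cls`, rejecting non-.bc files.

    Assumption: runtime_cls(model_path) builds the session, as HBRuntime
    is assumed to do when given a model path.
    """
    if model_format(model_path) != "HBIR":
        raise ValueError(f"expected a .bc HBIR model, got: {model_path}")
    return runtime_cls(str(model_path))


# Typical use (requires the Horizon toolchain; the import path is an assumption):
# from horizon_tc_ui.hb_runtime import HBRuntime
# sess = load_hbir("model.bc", HBRuntime)
```

Passing the runtime class as an argument keeps the helper usable in environments where the Horizon toolchain is not installed.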