Model Inference

hbDNNInferV2

int32_t hbDNNInferV2(hbUCPTaskHandle_t *taskHandle, hbDNNTensor *output, hbDNNTensor const *input, hbDNNHandle_t dnnHandle);

Creates a synchronous or asynchronous inference task, depending on the input parameters. For an asynchronous task, the caller can use the returned taskHandle across functions and threads.

  • Parameter
    • [out] taskHandle Task handle pointer.
    • [in/out] output Output of the inference task.
    • [in] input Input of the inference task.
    • [in] dnnHandle DNN handle pointer.
  • Return Value
    • If the return value is 0, the API executed successfully; otherwise, the execution failed.
Note
  1. If taskHandle is set to nullptr, a synchronous task will be created, and the interface will return only after the task has completed.
  2. If *taskHandle is set to nullptr, an asynchronous task will be created, and the taskHandle returned by the interface can be used later to wait for completion or to register a callback.
  3. If *taskHandle is not null and points to a previously created but not yet committed task, the new task will be created and appended to that existing task.

Up to 32 coexisting model tasks are supported.

hbDNNRoiInferV2

int32_t hbDNNRoiInferV2(hbUCPTaskHandle_t *taskHandle, hbDNNTensor *output, hbDNNTensor const *input, hbDNNRoi *rois, int32_t roiCount, hbDNNHandle_t dnnHandle);

Creates a synchronous or asynchronous ROI inference task, depending on the input parameters. For an asynchronous task, the caller can use the returned taskHandle across functions and threads.

  • Parameter
    • [out] taskHandle Task handle pointer.
    • [in/out] output Output of the inference task.
    • [in] input Input of the inference task.
    • [in] rois Info of the ROI box.
    • [in] roiCount Number of ROI boxes.
    • [in] dnnHandle DNN handle pointer.
  • Return Value
    • If the return value is 0, the API executed successfully; otherwise, the execution failed.
Note

Concept Description:

  • input_count: number of input branches for the model.
  • output_count: number of output branches for the model.
  • resizer_count: number of model input branches whose input source is the resizer (≤ input_count); each resizer input must be paired with one ROI.
  • roiCount: number of ROI boxes; its value must be an integer multiple of resizer_count.
  • data_batch: number of data batches the model needs to infer; its value is roiCount / resizer_count.
  • input: inputs of the inference task; the number of input tensors should be input_count * data_batch.
  • output: the number of inference task outputs equals output_count, and the memory required for each output is data_batch times that of the corresponding model tensor.

Input/Output Description:

Take a more complex multi-input model as an example: suppose the model has 3 input branches (2 resizer inputs and 1 ddr input) and 1 output branch, and needs to process 3 batches of data with 6 ROIs in total (i.e., 2 ROIs per batch). Then the following information is available:

  • input_count = 3
  • output_count = 1
  • resizer_count = 2
  • roiCount = 6
  • data_batch = roiCount / resizer_count = 3
  • input = input_count * data_batch = 9
  • output = output_count = 1

Additionally, suppose the static information for the model's inputs/outputs is as follows:

  • model input(model_info):
    • tensor_0_resizer: [1, 3, 128, 128]
    • tensor_1_resizer: [1, 3, 256, 256]
    • tensor_2_ddr: [1, 80, 1, 100]
  • model output(model_info):
    • tensor_out: [1, 100, 1, 56]

Then, the dynamic information during model inference would be:

  • model input( input ):
    • [1x3x128x128, 1x3x256x256, 1x80x1x100, 1x3x128x128, 1x3x256x256, 1x80x1x100, 1x3x128x128, 1x3x256x256, 1x80x1x100]
  • model output( output ):
    • [3x100x1x56]

Interface Limitation Description:

  • When committing tasks using this interface, taskHandle should be set to nullptr in advance, unless appending tasks to an existing taskHandle.
  • The original image size must satisfy 1 <= W <= 4096 and 32 <= stride <= 131072, and the stride must be a multiple of 32.
  • An ROI must not exceed the boundaries of the input image; this limit will be relaxed in subsequent versions.
  • The output size must satisfy 2 <= Wout and 2 <= Hout.
  • The combined ROI and output sizes must satisfy width * height + Wout * Hout < 1.5MB; this limit will be relaxed in subsequent versions.
  • The ROI scaling factor is limited to 0 <= step <= 229375, with the step calculation formula being step = (src_len * 65536 + dst_len / 2) / dst_len, where src_len is the W or H of the ROI, and dst_len is the corresponding W or H required by the model.
  • Up to 32 coexisting model tasks are supported.