Environment Deployment

Foreword

Horizon OpenExplorer currently provides two model quantization schemes in parallel:

  • PTQ: Post-training Quantization.
  • QAT: Quantization-aware training, which currently supports only the PyTorch framework.

Among them:

  • Neither scheme interferes with the floating-point training phase of the model, which remains your own responsibility. Horizon also provides open-source, public-version PyTorch implementations of efficient models for classification, detection, segmentation, and other scenarios in samples/ai_toolchain/horizon_model_train_sample for reference, with support for training and replication on the host.

  • For the PTQ scheme, you quantize the model in the host development environment and then copy the compiled .hbm model to the dev board environment for subsequent deployment.

  • For the QAT scheme, you complete QAT training of the model in the host development environment, perform the quantization conversion, and then copy the compiled .hbm model to the dev board environment for subsequent deployment.

For both quantization schemes and the efficient-model development environment, Horizon supports both local manual installation and Docker containers. We strongly recommend using Docker containers: they do not pollute the local environment and are easy to use. Both approaches are described in the following sections.

Development Environment Deployment

Development Machine Preparation

To use the toolchain smoothly, we recommend that your development machine meet the following requirements:

HW/OS         REQUIREMENTS
CPU           Intel i3 or above, or an equivalent E3/E5-level processor
Memory Size   16 GB or above
GPU           CUDA 11.8; driver version (Linux): >= 510.39.01
              (recommended driver version: 520.61.05)
              Adapted graphics cards include but are not limited to:
              1) GeForce RTX 3090
              2) GeForce RTX 2080 Ti
              3) NVIDIA TITAN V
              4) Tesla V100S-PCIE-32GB
              5) A100
OS            Native Ubuntu 22.04

For more information about CUDA compatibility with graphics cards, refer to NVIDIA website information.

Docker Container Deployment

Docker Base Environment

Horizon's toolchain requires a Docker base environment; please complete the Docker installation on your host machine in advance.

After installing the Docker environment, remember to add your non-root user to the docker group by running the following commands:

sudo groupadd docker
sudo gpasswd -a ${USER} docker
sudo service docker restart
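To confirm the group change worked, a quick membership check can be run (a minimal sketch; group membership only takes effect in new login sessions):

```shell
# Check whether the current user is already in the "docker" group.
# Note: new group membership takes effect only after logging out and
# back in (or after running `newgrp docker`).
user=$(id -un)
if id -nG "$user" | grep -qw docker; then
  status="in docker group"
else
  status="NOT in docker group yet"
fi
echo "user $user is $status"
```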

Please obtain the required Docker image. The image names take the following form:

  • GPU Docker: openexplorer/ai_toolchain_ubuntu_20_j6_gpu:{version}
  • CPU Docker: openexplorer/ai_toolchain_ubuntu_20_j6_cpu:{version}
Tip

Replace {version} with the actual version number.
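As an illustration, the placeholder can be filled in with a shell variable before pulling (the version number below is purely illustrative, not a real release):

```shell
# Build the full image names from a version number.
# "3.0.0" is a placeholder; substitute your actual OE release version.
VERSION="3.0.0"
GPU_IMAGE="openexplorer/ai_toolchain_ubuntu_20_j6_gpu:${VERSION}"
CPU_IMAGE="openexplorer/ai_toolchain_ubuntu_20_j6_cpu:${VERSION}"
echo "$GPU_IMAGE"
echo "$CPU_IMAGE"
```

You would then pull the chosen image with, for example, `docker pull "$GPU_IMAGE"`.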

Docker Image Usage

To help you quickly use the toolchain, we provide a Docker image containing the complete development environment, which greatly simplifies the deployment process of the development environment.

Note

If you have downloaded the offline image, first load it locally with the following command.

docker load -i docker_openexplorer_xxx.tar.gz

You can start the Docker container corresponding to the current OE version by running the following script from the top level of the OE package (the script automatically pulls the image from the official Docker Hub if no local image is present):

sh run_docker.sh data/

Here data/ is the path to the evaluation dataset folder. Please create this directory before running the command, otherwise loading problems may occur.

If you want to use the CPU version of the Docker image, you need to add the cpu parameter:

sh run_docker.sh data/ cpu

The download links for the relevant publicly available evaluation datasets that the OE package samples rely on can be accessed by referring to the introduction in section Dataset Download.

If you want to start the Docker container manually, you can refer to the following command, where {version} is the OE version number you are currently using.

Note

For your convenience, we provide CPU Docker and GPU Docker for you to choose on demand.

CPU Docker
GPU Docker
# CPU Docker: pull the image first
docker pull openexplorer/ai_toolchain_ubuntu_20_j6_cpu:{version}

# Start the CPU Docker image manually:
#   --network host : adjust network mode to host
#   first -v       : mount the OE package
#   second -v      : mount the dataset
docker run -it --rm --network host \
  -v {OE package path}:/open_explorer \
  -v ./dataset:/data/horizon_j6/data \
  openexplorer/ai_toolchain_ubuntu_20_j6_cpu:{version}
Attention
  1. Since the environment variables PATH and LD_LIBRARY_PATH are configured during the build of the OE Docker image, entering the container in a non-recommended way (e.g., docker attach) may result in these variables not being loaded correctly, which can cause abnormalities when using tools such as CMake, GCC, and CUDA.

  2. If you want the Docker container to be kept after it exits, start it manually with docker run -it, omitting the --rm option.

  3. If you want the Docker container to run in the background after startup, add the -d option to the docker run -it command line; the container ID is returned once the container starts, and you can then enter the container again with docker exec -it {container ID} /bin/bash.

Local Manual Installation

This section describes the local manual installation method and introduces the environment dependencies for each of the two quantization schemes and for Horizon's open-source efficient model training. After floating-point model training, we recommend starting with the easy-to-use PTQ quantization scheme and switching to the QAT scheme only when accuracy issues cannot otherwise be solved.

Local Manual Installation Environment Method

To manually install the environment locally, simply run the script below to complete the environment installation in one click.

cd package/host
bash install.sh

The installation script automatically checks the environment. If dependencies or configurations are missing, it interrupts the installation and prints suggestions for fixing them. Add the missing dependencies as suggested, then run the script again.

Note
  • If you need to generate a board-side executable, use the cross-compilation tools aarch64-linux-gnu-gcc and aarch64-linux-gnu-g++, version (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0.

  • If you need to generate an executable for the X86 emulation environment, use the X86 gcc. If the script reports that the gcc/g++ versions are incorrect, install the required versions and then recreate the gcc and g++ soft links to point to gcc-11.4.0 and g++-11.4.0.

  • For host-side (x86) dependent libraries (isl, gmp, mpc, mpfr, etc.) under lib/x86_64-linux-gnu, if compilation reports errors, specify them via LD_LIBRARY_PATH in your build project.

  • If a glibc library version conflict occurs during compilation, for example an error of undefined symbols for xxx@GLIBC_xxx, specify the path to the toolchain's aarch64-linux-gnu/lib via -rpath-link in your build project, and also add the corresponding -lxxx option, for example -lpthread.

    In addition, pay special attention to the variable SRCS that records the source files: it should be placed before the ${LIBS} link libraries, otherwise the linker may also report undefined symbols.

  • After the installation script completes successfully, it appends PATH and other environment variables to ~/.bashrc (LD_LIBRARY_PATH is frequently used, so it is recommended that you check it is as expected). Run source ~/.bashrc to make the configuration take effect in the current terminal.

  • The torch version should be 2.1.0 and the torchvision version should be 0.16.0.

PTQ Quantization Environment Dependence

The PTQ scheme has the following software dependencies on the base software of the development machine operating environment:

  • Operating system: Ubuntu 22.04
  • Python 3.10
  • libpython3.10
  • python3-devel
  • python3-pip
  • gcc & g++: 11.4.0
  • graphviz

QAT Quantization Environment Dependence

The QAT quantization environment is installed in your local environment; you need to ensure that the following basic environment conditions are met.

The environment dependencies required by the quantization-aware training tool are listed below:

HW/OS                        GPU                               CPU
OS                           Ubuntu 22.04                      Ubuntu 22.04
CUDA                         11.8                              N/A
Python                       3.10                              3.10
torch                        2.1.0+cu118                       2.1.0+cpu
torchvision                  0.16.0+cu118                      0.16.0+cpu
Recommended graphics cards   TITAN V / 2080 Ti / V100 / 3090   N/A
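For reference, the GPU column of the table above corresponds to a pinned requirements fragment like the following (the cu118 index URL is PyTorch's standard CUDA 11.8 wheel index; verify it against your actual CUDA setup before use):

```text
--index-url https://download.pytorch.org/whl/cu118
torch==2.1.0+cu118
torchvision==0.16.0+cu118
```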

After completing the training of the QAT model, you can install the relevant toolkits in the current training environment and complete the subsequent model conversion directly through the interface call.

Efficient Model Floating-point Training Environment Instruction

Horizon provides the source code of several open-source efficient models in samples/ai_toolchain/horizon_model_train_sample. For information on the floating-point and QAT base environments, refer to the section QAT Quantization Environment Dependence.

Runtime Environment Deployment

Once the model has been quantized, the compiled model can be deployed on the dev board environment for inference and execution.

To deploy the runtime environment, you need to prepare a dev board with the system image programmed, and then copy the relevant supplementary files to the dev board.

Dev Board Preparation

At this stage, you need to verify the usability of the dev board and program the available system images to the board.

Board Side Tool Installation

Some supplementary tools of the toolchain are not included in the system image; they can be copied to the dev board by running the installation script from the OE package in the host environment, as follows:

cd package/board
bash install.sh ${board_ip}
Note

Here ${board_ip} is the IP address you set for the dev board. Make sure this IP can be reached successfully from the development PC.
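Before running the installation script, you can sanity-check that the board is reachable (a minimal sketch; the address below is a placeholder for your actual board IP):

```shell
# Ping the dev board once to confirm network connectivity.
# 192.168.1.10 is a placeholder; substitute your board's IP address.
board_ip="192.168.1.10"
if ping -c 1 -W 2 "$board_ip" > /dev/null 2>&1; then
  result="reachable"
else
  result="unreachable - check cabling, IP configuration, and firewall"
fi
echo "board $board_ip is $result"
```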

After the supplementary files are successfully installed, restart the dev board and execute hrt_model_exec --help on it to verify that the installation succeeded.