This tutorial shows how to use HAT to train a GaNet model from scratch on the CuLane lane detection dataset, covering the floating-point, quantized, and fixed-point models.
CuLane is one of the most commonly used datasets for lane detection, and many state-of-the-art lane detection methods are validated on it.
Before training the model, we first need to prepare the dataset.
Here we download the official dataset and the corresponding annotation data CuLaneDataset.
Note that the annotations_new.tar.gz file must be extracted last.
The structure of the data directory after unzipping is as follows:
Here, list/train.txt contains the paths of the training data and list/test.txt contains the paths of the test data.
If you just want to quickly train a GaNet model, you can read this section first.
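As a concrete illustration, these list files can be consumed by joining each entry with the dataset root. The sketch below assumes each line of list/train.txt holds one image path relative to the dataset root; the helper name is hypothetical, not part of HAT:

```python
from pathlib import Path


def load_image_list(data_root, list_file):
    """Read a CuLane-style list file and return absolute image paths.

    Assumes (hypothetically) that each non-empty line is a path
    relative to the dataset root, possibly with a leading slash.
    """
    paths = []
    for line in Path(list_file).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        paths.append(str(Path(data_root) / line.lstrip("/")))
    return paths
```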
Similar to other tasks, HAT performs all training tasks and evaluation tasks in the form of tools + config.
After preparing the original dataset, follow the steps below to complete the whole training process.
To speed up training, we pack the original dataset into LMDB format.
Simply run the following scripts to complete the conversion:
The two commands above convert the training dataset and the validation dataset, respectively.
After packing, the file structure in the ${target-data-dir} directory should look as follows:
train_lmdb and test_lmdb are the packed training and validation datasets. You can then start training the model.
Before starting the training, you can compute the model's operation and parameter counts with the following command:
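The command itself ships with HAT. As a rough, back-of-envelope sketch of what such a tool computes, the parameter and operation counts of a single standard convolution layer can be derived as follows (the helper names are hypothetical):

```python
def conv2d_params(in_ch, out_ch, k, bias=True):
    """Parameter count of a standard k x k convolution:
    weights (out_ch * in_ch * k * k) plus an optional per-channel bias."""
    return out_ch * in_ch * k * k + (out_ch if bias else 0)


def conv2d_macs(in_ch, out_ch, k, out_h, out_w):
    """Multiply-accumulate count: each output pixel of each output
    channel sums over in_ch * k * k input values."""
    return out_ch * in_ch * k * k * out_h * out_w
```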
The next step is to start the training, which can also be done with the following script. Before training, make sure the dataset path specified in the config has been changed to the path of the packed dataset.
Since the HAT algorithm package uses a registry mechanism, every training task can be started as train.py plus a config file.
train.py is a unified, task-independent training script; the task to train, the datasets to use, and the training-related hyperparameters are all specified in the given config file.
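The registry mechanism can be sketched as a name-to-class mapping that the training script resolves from the config. The snippet below is a simplified illustration of the pattern, not HAT's actual API:

```python
REGISTRY = {}


def register(name):
    """Decorator that records a class under a string name, so config
    files can refer to components by name instead of importing them."""
    def deco(cls):
        REGISTRY[name] = cls
        return cls
    return deco


@register("ganet")
class GaNet:
    def __init__(self, num_classes=1):
        self.num_classes = num_classes


def build(cfg):
    """Instantiate a registered component from a config dict:
    the 'type' key selects the class, remaining keys become kwargs."""
    cfg = dict(cfg)
    cls = REGISTRY[cfg.pop("type")]
    return cls(**cfg)
```

With this pattern, adding a new model or dataset only requires registering it and naming it in the config; train.py itself never changes.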
The value passed to --stage in the above command can be "float", "calibration", etc., which indicate, respectively, training the floating-point model, training the quantized model, and converting the quantized model to a fixed-point model; quantized training depends on the floating-point model produced by the preceding floating-point training.
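The stage dependency described above forms a small chain: each stage starts from the artifact produced by the previous one. A minimal sketch (only the two quoted stage names are taken from the text; the helper is illustrative, not HAT's code):

```python
# Each stage maps to the stage it depends on; "float" has no prerequisite.
STAGE_DEPS = {"float": None, "calibration": "float"}


def stages_to_run(target):
    """Return the ordered list of stages needed to reach `target`,
    by walking the dependency chain back to the first stage."""
    chain = []
    stage = target
    while stage is not None:
        chain.append(stage)
        stage = STAGE_DEPS[stage]
    return list(reversed(chain))
```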
Once quantization training is complete, you can export the fixed-point model with the following command:
After training, we obtain the trained floating-point, quantized, or fixed-point model. In the same way as for training, we can run metric validation on the trained model and obtain the Float, Calibration, and Quantized metrics,
which are the floating-point, quantized, and fixed-point metrics, respectively.
As with training, we can pass --stage followed by "float" or "calibration" to validate the trained floating-point model or quantized model, respectively.
The accuracy of the fixed-point model can be verified with the following command; note that the hbir model must be exported first:
HAT provides the infer_hbir.py script to visualize the inference results for the fixed-point model:
In addition to the above model validation, we provide an accuracy validation method identical to the on-board environment, which can be accomplished by:
Since the quantization training toolchain integrated in HAT is mainly built for Horizon's processors, the quantized model must be checked and compiled.
HAT provides a model-checking interface that lets the user define a quantized model and first check whether it can run properly on the BPU.
After the model is trained, you can use the compile_perf_hbir script to compile the quantized model into an HBM file that can run on board.
The tool also predicts the model's performance on the BPU.
This is the whole process from data preparation to producing a quantized, deployable model.
In this note, we explain some considerations for model training, mainly the config-related settings.
The network structure of GaNet can be found in the paper and is not described in detail here.
We can easily define and modify the model by defining a dict-type variable such as model in the config file.
In addition to the backbone, the model also contains the neck, head, targets, post_process and losses modules.
In GaNet, the backbone extracts image features, the neck enhances the features, and the head predicts the scores and offsets of lane-line key points from the features.
targets generates the training targets from the ground truth, and post_process is the post-processing module used at inference time.
The losses part uses LaneFastFocalLoss and L1Loss from the paper as the training losses, and loss_weight is the weight of the corresponding loss.
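Put together, such a model definition in the config might look like the following sketch. The module names and parameters here are illustrative assumptions to show the dict structure, not HAT's exact interfaces:

```python
# Hypothetical config sketch: a nested dict where each sub-module is
# selected by its registered "type" name. Only LaneFastFocalLoss and
# L1Loss come from the text; everything else is a placeholder.
model = dict(
    type="GaNet",
    backbone=dict(type="MixVarGENet"),
    neck=dict(type="FPN"),
    head=dict(type="GaNetHead"),
    targets=dict(type="GaNetTargets"),
    post_process=dict(type="GaNetDecoder"),
    losses=dict(
        point_loss=dict(type="LaneFastFocalLoss", loss_weight=1.0),
        offset_loss=dict(type="L1Loss", loss_weight=0.5),
    ),
)
```

Changing the backbone or a loss weight then only requires editing this dict, with no change to the training script.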
As with the model definition, the data augmentation process is implemented by defining train_data_loader and val_data_loader in the config file, which handle the training set and validation set, respectively.
Taking train_data_loader as an example, the data augmentation pipeline uses FixedCrop, RandomFlip, Resize, RandomSelectOne, RGBShift, HueSaturationValue, JPEGCompress,
MeanBlur, MedianBlur, RandomBrightnessContrast, ShiftScaleRotate and RandomResizedCrop to increase the diversity of the training data and improve the generalization ability of the model.
Since the final model running on the BPU takes YUV444 image input, while training images are generally in RGB format, HAT provides the BgrToYuv444 transform to convert the input to YUV444.
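The color-space conversion itself is a fixed per-pixel linear transform. The following is a minimal BT.601 full-range sketch of what a BgrToYuv444-style transform computes; HAT's exact coefficients and implementation may differ:

```python
def rgb_to_yuv444(r, g, b):
    """Convert one RGB pixel (0-255 floats) to YUV using BT.601
    full-range coefficients; U and V are centered at 128."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128
    return y, u, v
```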
To optimize the training process, HAT uses a batch_processor, which allows some augmentation steps to be performed in batch_processor:
Here, loss_collector is a function that collects the losses of the current batch.
The data transforms for the validation set are relatively simpler, as follows:
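A loss_collector can be sketched as a small function that picks the loss entries out of the model's batch outputs. The key-naming convention below is an assumption for illustration, not HAT's actual convention:

```python
def loss_collector(outputs):
    """Collect loss values from a model's output dict for the current
    batch, assuming (hypothetically) that loss keys end with 'loss'."""
    return [v for k, v in outputs.items() if k.endswith("loss")]
```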
The floating-point model is trained on the CuLane dataset with a cosine learning-rate schedule with warmup, and L2 regularization is applied to the weight parameters.
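The cosine schedule with warmup can be sketched as follows; the function and parameter names are illustrative, not HAT's config keys:

```python
import math


def lr_at(step, total_steps, base_lr, warmup_steps, warmup_begin=0.0):
    """Learning rate at a given step: linear warmup from warmup_begin
    to base_lr, then cosine decay from base_lr down to zero."""
    if step < warmup_steps:
        return warmup_begin + (base_lr - warmup_begin) * step / warmup_steps
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1 + math.cos(math.pi * t))
```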
The float_trainer, calibration_trainer, and int_trainer in the configs/lane_pred/ganet/ganet_mixvargenet_culane.py file correspond to the training strategies of the floating-point, quantized, and fixed-point models, respectively.
The following is an example of the float_trainer training strategy:
For the key steps of quantization training, such as preparing the floating-point model, operator substitution, inserting quantization and dequantization nodes, setting quantization parameters, and operator fusion, please read the Quantization-Aware Training (QAT) section. Here we focus on how quantized models are defined and used in HAT's lane prediction task.
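The quantization and dequantization nodes mentioned above perform a fake-quantize operation during QAT: values are rounded to an integer grid and mapped back to floats, so the network trains against quantization error. A simplified symmetric-int8 sketch, not HAT's implementation:

```python
def fake_quant(x, scale, bits=8):
    """Quantize-dequantize a value: round x to the nearest multiple of
    `scale`, clamp to the signed integer range, and map back to float."""
    qmax = 2 ** (bits - 1) - 1
    q = max(-qmax - 1, min(qmax, round(x / scale)))
    return q * scale
```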
Once the model is ready and the relevant modules are quantizable, HAT uses the following code in the training script to uniformly convert the floating-point model into the quantized model.
The overall strategy of quantization training can directly follow the floating-point training strategy, but the learning rate and training schedule need to be adjusted appropriately.
Because a floating-point pre-trained model is available, the learning rate for quantization training can be very small.
Generally, you can start from 0.001 or 0.0001 and apply one or two learning-rate decays with scale=0.1 via StepLrUpdater, without extending the training time.
In addition, weight decay also has some effect on the training results.
The quantization training strategy of the GaNet example model can be found in the configs/lane_pred/ganet/ganet_mixvargenet_culane.py file.
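The decay described here can be sketched as a step schedule; the milestone values below are made up for illustration, and the parameter names are not HAT's config keys:

```python
def step_lr(step, base_lr, milestones, scale=0.1):
    """StepLrUpdater-style schedule: multiply the base learning rate
    by `scale` each time training passes a milestone step."""
    factor = 1.0
    for m in milestones:
        if step >= m:
            factor *= scale
    return base_lr * factor
```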