Framework

Core Modules

Abstract Flow Chart

The figure above shows the abstract flowchart of the overall organization of the HAT framework. As it illustrates, the training and validation processes of HAT are composed of four core modules, namely Data, Model, Callback, and Engine. Here is a brief introduction to each of these core modules.

  • Data: Responsible for all data production in HAT, including Dataset (iterative output), Transforms (task-specific data augmentation), Collate (concatenating samples and packing them into batches), and Sampler (controlling the sampling order). All of these data-producing processes are ultimately organized in a unified way through the Dataloader interface.

  • Model: Responsible for building all the models in HAT. A model is generally divided into sub-modules such as the backbone, neck, head, and task module, and a unified Structure links all the sub-modules together to build the final model. Beyond the common single-task case, Structure also uses a GraphModel to specifically handle model building for multi-task structures.

  • Callback: Responsible for dynamically adjusting the training state while the Engine is executing. Similar to module hooks in PyTorch, a Callback can make dynamic adjustments at the designated modifiable positions, based on the training state provided by the Engine, without modifying the Engine code. The modifiable positions of the whole Engine mainly include: on_loop_begin(end), on_epoch_begin(end), on_step_begin(end), and on_batch_begin(end).

  • Engine: Mainly responsible for building and executing the training or prediction process, including the training module Trainer and the prediction module Predictor. All other modules, such as Data, Model, and Callback, are fed to the Engine after being built, and the Engine schedules them uniformly to complete the whole training or prediction process.
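The division of labor among the four core modules can be sketched in a few lines of plain Python. Everything below (class names, hook names, method signatures) is illustrative only and does not reflect HAT's actual API:

```python
# Toy sketch of how Data, Model, Callback, and Engine cooperate.
# All names here are illustrative, not HAT's real classes.

class ToyData:
    """Stands in for a Dataloader: iterates over batches."""
    def __iter__(self):
        for x in range(3):
            yield {"input": x, "target": x * 3}

class ToyModel:
    """Stands in for a built model: maps a batch to a loss."""
    def __call__(self, batch):
        pred = batch["input"] * 2            # pretend forward pass
        return abs(pred - batch["target"])   # pretend loss

class PrintCallback:
    """Hooks into the modifiable positions exposed by the Engine."""
    def on_epoch_begin(self, epoch):
        print(f"epoch {epoch} begin")

class ToyEngine:
    """Schedules Data, Model, and Callbacks into one training loop."""
    def __init__(self, data, model, callbacks):
        self.data, self.model, self.callbacks = data, model, callbacks

    def fit(self, epochs=1):
        losses = []
        for epoch in range(epochs):
            for cb in self.callbacks:
                cb.on_epoch_begin(epoch)
            for batch in self.data:
                losses.append(self.model(batch))
        return losses

engine = ToyEngine(ToyData(), ToyModel(), [PrintCallback()])
print(engine.fit(epochs=2))
```

The point of the sketch is the dependency direction: Data, Model, and Callback know nothing about each other, and only the Engine wires them together.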

In addition to the four core modules, there are other supporting modules such as Profiler, Metric, and Visualize.

  • Profiler: The profiling tool of HAT; it mainly helps locate speed bottlenecks during training or validation.

  • Metric: Mainly used for metric evaluation during training or testing on a dataset; it is effectively a special case of Model and is strongly bound to specific datasets.

  • Visualize: Mainly used for visualizing the relevant datasets.
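A Metric typically accumulates statistics batch by batch and reports a final value, in the common update()/compute() style. The sketch below illustrates that pattern with a streaming accuracy metric; the class and method names are illustrative, not HAT's real Metric interface:

```python
class Accuracy:
    """Streaming accuracy metric in the update()/compute() style.
    Illustrative only; HAT's real Metric interface may differ."""
    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, preds, targets):
        # Accumulate statistics batch by batch during validation.
        self.correct += sum(p == t for p, t in zip(preds, targets))
        self.total += len(targets)

    def compute(self):
        # Guard against division by zero before the first update().
        return self.correct / max(self.total, 1)

m = Accuracy()
m.update([1, 0, 1], [1, 1, 1])  # 2 of 3 correct
m.update([0, 0], [0, 1])        # 1 of 2 correct
print(m.compute())              # 3 correct out of 5
```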

Training Building Process

1. For any dataset, build all the submodules required by Data.

First build a Dataset for iterative output, and apply Transforms to the iterated data, e.g., data augmentation operations during training or data preprocessing during testing. Then use a Sampler to control the output order of the Dataset, and use Collate to concatenate the samples one by one into a batch of training data. The DataLoader schedules all of these steps uniformly and feeds each batch into the training process.
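The Dataset → Transform → Sampler → Collate → DataLoader chain described above can be sketched as follows. All names are illustrative stand-ins, not HAT's actual classes:

```python
# Illustrative sketch of the Data building chain; not HAT's real API.

class Dataset:
    """Iterative output: holds samples and serves them by index."""
    def __init__(self, samples):
        self.samples = samples
    def __len__(self):
        return len(self.samples)
    def __getitem__(self, i):
        return self.samples[i]

def transform(sample):
    # Stand-in for augmentation (training) or preprocessing (testing).
    return {"value": sample * 10}

def sampler(dataset):
    # Controls the output order; here simply reversed.
    return reversed(range(len(dataset)))

def collate(samples):
    # Concatenates individual samples into one batch structure.
    return {"value": [s["value"] for s in samples]}

def dataloader(dataset, batch_size=2):
    """Schedules sampling, transforming, and collating uniformly.
    (Drops a trailing partial batch, for brevity.)"""
    batch = []
    for idx in sampler(dataset):
        batch.append(transform(dataset[idx]))
        if len(batch) == batch_size:
            yield collate(batch)
            batch = []

for batch in dataloader(Dataset([1, 2, 3, 4])):
    print(batch)
```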

2. For any model, build all the submodules required by Model, such as the Backbone and Neck.

Use Structure to link all the sub-modules together into a complete model with training state, which is also fed to the training process as the training object.

3. For the training task, select or define a suitable Callback to dynamically adjust the training state during training.

For example, a Callback can output training results at regular intervals or dynamically adjust the learning rate during each training run. Although Callbacks are defined separately from the Engine, their execution is embedded in the Engine's overall process.
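A learning-rate-adjusting Callback can be sketched as below. The hook name follows the modifiable positions listed earlier; the State and StepLrCallback classes are illustrative assumptions, not HAT's actual API:

```python
# Sketch of a Callback that halves the learning rate every few
# epochs, without modifying any Engine code. Illustrative names only.

class State:
    """Stand-in for the training state the Engine passes to callbacks."""
    def __init__(self, lr):
        self.lr = lr

class StepLrCallback:
    """Halves the learning rate every `interval` epochs."""
    def __init__(self, interval=2):
        self.interval = interval

    def on_epoch_begin(self, epoch, state):
        if epoch > 0 and epoch % self.interval == 0:
            state.lr *= 0.5

state = State(lr=0.1)
cb = StepLrCallback(interval=2)
for epoch in range(5):          # the Engine would drive this loop
    cb.on_epoch_begin(epoch, state)
print(state.lr)                 # halved at epochs 2 and 4
```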

4. For the training environment, build a suitable Engine as the training engine.

For example, you can choose DistributedDataParallelTrainer or DataParallelTrainer for common multi-GPU training environments. The Engine organizes all the already-built modules together and completes all the environment initialization needed for training, including Data, Model, and other modules such as Callback, Metric, and Profiler. Note that not all the modules in the Engine are required.

5. Finally, use the fit interface of the selected Engine to run the whole training process.
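Steps 4 and 5 can be sketched together: a Trainer-style Engine receives the already-built modules (some of them optional) and drives everything through fit. All names below are illustrative assumptions, not HAT's actual Trainer API:

```python
# Sketch of an Engine that receives built modules and exposes fit().
# Illustrative only; HAT's real Trainer signatures may differ.

class Trainer:
    def __init__(self, model, dataloader, callbacks=None, metric=None):
        # Not every module is required: callbacks and metric are optional.
        self.model = model
        self.dataloader = dataloader
        self.callbacks = callbacks or []
        self.metric = metric

    def fit(self, epochs=1):
        outputs = []
        for epoch in range(epochs):
            for cb in self.callbacks:
                cb.on_epoch_begin(epoch)
            for batch in self.dataloader:
                outputs.append(self.model(batch))
            for cb in self.callbacks:
                cb.on_epoch_end(epoch)
        return outputs

class Logger:
    def on_epoch_begin(self, epoch):
        print(f"epoch {epoch} begin")
    def on_epoch_end(self, epoch):
        print(f"epoch {epoch} end")

trainer = Trainer(model=lambda batch: batch * 2,
                  dataloader=[1, 2, 3],
                  callbacks=[Logger()])
print(trainer.fit(epochs=1))
```

Making fit the single entry point keeps every training run uniform: swapping datasets, models, or callbacks changes only what is fed to the Trainer, never the loop itself.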

The above is the overall structure of the HAT framework and the abstract flow of training. The diagram at the beginning of this section reflects not only the data flow of the building process but also the invocation relationships between modules. For training, the Engine is the core part: a comprehensive understanding of how the Engine operates is enough to understand the entire data flow of HAT.