[HALCON Deep Learning] The library procedure determine_dl_model_detection_param in the data preparation stage of object detection

determine_dl_model_detection_param

The name "determine_dl_model_detection_param" can be read literally as "determine deep learning model detection parameters".

This procedure automatically estimates certain high-level parameters of the model for a given dataset, and using it is highly recommended to optimize training and inference performance.


Procedure signature

determine_dl_model_detection_param(
    : : DLDataset, ImageWidthTarget, ImageHeightTarget, GenParam : DLDetectionModelParam)

Description

This procedure analyzes the provided deep learning dataset (DLDataset) for object detection in order to determine the model parameters relevant for anchor generation. The generated DLDetectionModelParam is a dictionary containing suggested values for various object detection model parameters.

Parameters

  • DLDataset: Dictionary of the deep learning dataset for object detection.
  • ImageWidthTarget: Target image width of the model input (image width after preprocessing).
  • ImageHeightTarget: Target image height of the model input (image height after preprocessing).
  • GenParam: Dictionary containing generic input parameters.
  • DLDetectionModelParam: Output dictionary containing the suggested model parameters.

Parameter analysis

The first parameter, DLDataset, is the dataset we read in, i.e., our annotated image dataset. A dataset in HALCON's own format (e.g., one exported by the MVTec Deep Learning Tool) can be read with read_dict(), and a generic COCO-format dataset can be read with read_dl_dataset_from_coco.
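For example, both ways of reading a dataset look like this (the file and directory names here are placeholders for your own data):

```
* Option 1: read a DLDataset exported by the MVTec Deep Learning Tool
* ('my_dataset.hdict' is a placeholder file name).
read_dict ('my_dataset.hdict', [], [], DLDataset)
* 
* Option 2: read a COCO-format dataset instead
* (placeholder annotation file and image directory).
read_dl_dataset_from_coco ('annotations.json', 'images', dict{}, DLDataset)
```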

Image scaling

The second and third parameters set the image size. The dataset stores the images at their original sizes; here you enter the size the images should have after preprocessing, which means these two parameters let you scale the images. Generally we set a smaller size to speed up training.

GenParam

GenParam is a dictionary of generic input parameters that influence how determine_dl_model_detection_param determines the parameters. By setting different key-value pairs, you can adjust anchor generation and the determination of the model parameters to your needs. The supported keys and their values are:

  1. 'anchor_num_subscales': An integer (greater than 0) that sets the upper limit for the number of anchor subscales to search. The default is 3.

  2. 'class_ids_no_orientation': A tuple of integer class IDs. Marks the classes whose orientation should be ignored; the bounding boxes of these classes are treated as axis-aligned boxes with orientation 0. Only applicable if the instance type is 'rectangle2'.

  3. 'display_histogram': Determines whether data histograms are displayed for visual analysis of the dataset. Possible values are 'true' and 'false' (default: 'false').

  4. 'domain_handling': Specifies how image domains are handled. Possible values:

    • 'full_domain' (default): The images are not cropped.
    • 'crop_domain': The images are reduced to their domain definition.
  5. 'ignore_direction': A boolean ('true'/'false') that determines whether the orientation of the bounding boxes is considered. Only applicable if the instance type is 'rectangle2'. See the get_dl_model_param documentation for more information on this parameter.

  6. 'max_level': An integer (greater than 1) that sets the upper limit for the search of the maximum level. The default is 6.

  7. 'max_num_samples': An integer (greater than 0, or -1) that sets the maximum number of samples used to determine the parameter values. If set to -1, all samples are used. Be careful not to set this value too high, since that can result in excessive memory consumption and a heavy load on the machine; however, if 'max_num_samples' is set too low, the determined detection parameters may not represent the dataset well. The default is 1500.

  8. 'min_level': An integer (greater than 1) that sets the lower limit for the search of the minimum level. The default is 2.

  9. 'preprocessed_path': Path to the preprocessing directory. This directory contains the DLDataset dictionary (.hdict file) and a subdirectory 'samples' with the preprocessed samples (as generated, e.g., by the procedure preprocess_dl_dataset). For an already preprocessed dataset, the input parameters ImageWidthTarget and ImageHeightTarget are ignored and can be set to []. This parameter only applies if the dataset has already been preprocessed for the application.

  10. 'image_size_constant': If set to 'true', all images in the dataset are assumed to have the same size in order to speed up processing; the image size is taken from the first sample of the dataset. Only applies if the dataset has not been preprocessed and 'domain_handling' is 'full_domain'. The default is 'true'.

  11. 'split': Determines the split of the dataset used for the analysis. Possible values are 'train' (default), 'validation', 'test', and 'all'. If the specified split is invalid or the dataset contains no split, all samples are used.

  12. 'compute_max_overlap': If set to 'true', the detection parameters 'max_overlap' and 'max_overlap_class_agnostic' are determined for the dataset.
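For instance, a GenParam dictionary combining several of these keys might be built as follows (the chosen values are illustrative, not recommendations):

```
* Analyze only the training split, display the data histograms,
* limit the number of analyzed samples, and also determine the
* overlap parameters (illustrative values).
GenParam := dict{}
GenParam.split := 'train'
GenParam.display_histogram := 'true'
GenParam.max_num_samples := 1000
GenParam.compute_max_overlap := 'true'
```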

Recommended model parameters DLDetectionModelParam

DLDetectionModelParam is the output dictionary of the procedure. It includes suggested values for the following parameters:

  • ‘class_ids’: category identifier
  • ‘class_names’: category names
  • ‘image_width’: image width
  • ‘image_height’: image height
  • ‘min_level’: minimum level
  • ‘max_level’: maximum level
  • ‘instance_type’: instance type
  • ‘anchor_num_subscales’: Number of anchor subscales
  • ‘anchor_aspect_ratios’: anchor aspect ratio
  • 'anchor_angles': anchor angles (only for models with 'instance_type' = 'rectangle2')
  • 'ignore_direction': whether orientation is ignored (only for models with 'instance_type' = 'rectangle2')
  • 'max_overlap': maximum overlap (only if 'compute_max_overlap' is 'true')
  • 'max_overlap_class_agnostic': maximum class-agnostic overlap (only if 'compute_max_overlap' is 'true')
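The suggested values can be read from the output dictionary either with dot notation or with get_dict_tuple, for example:

```
* Read two of the suggested values from the result dictionary.
MinLevel := DLDetectionModelParam.min_level
get_dict_tuple (DLDetectionModelParam, 'anchor_aspect_ratios', AnchorAspectRatios)
```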

Precautions

The returned values are approximations based on a trade-off between model runtime and detection performance; further experimentation may be required to optimize the parameters. Moreover, the suggested parameters are based on the original dataset and do not take possible data augmentation during training into account. If data augmentation methods such as 'mirror' or 'rotate' are applied, the generated parameters may need to be adjusted to cover all bounding box shapes.
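As a rough illustration of such an adjustment: if mirroring or rotation is used during augmentation, a box with aspect ratio r can also appear with ratio 1/r, so one heuristic (not an official MVTec recipe) is to extend the suggested ratios by their reciprocals:

```
* Heuristic adjustment for 'mirror'/'rotate' augmentation:
* extend the suggested aspect ratios by their reciprocals,
* removing duplicates with tuple_union.
AnchorAspectRatios := DLDetectionModelParam.anchor_aspect_ratios
tuple_union (AnchorAspectRatios, 1.0 / AnchorAspectRatios, AnchorAspectRatiosExt)
```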

Summary

determine_dl_model_detection_param derives a number of high-level model parameters from the input dataset. Training and inference need these parameters, so the procedure analyzes the dataset and suggests values that you can pass on to the subsequent steps, giving you a solid starting point for later parameter tuning.

Code context



* 
* ************************
* **   Set parameters  ***
* ************************
* 
* Set obligatory parameters.
Backbone := 'pretrained_dl_classifier_compact.hdl'
NumClasses := 10
* Image dimensions of the network. Later, these values are
* used to rescale the images during preprocessing.
ImageWidth := 512
ImageHeight := 320
* Further parameters used below (example values, adapt as needed).
ImageNumChannels := 3
Capacity := 'medium'
TrainingPercent := 70
ValidationPercent := 15
* Paths and file names (placeholders, adapt to your environment).
PillBagJsonFile := 'pill_bag.json'
HalconImageDir := 'pill_bag_images'
DLModelFileName := 'model_pill_bag_detection.hdl'
DataDirectory := 'detect_pills_data'
PreprocessParamFileName := 'dl_preprocess_param.hdict'


* Read in a DLDataset.
* Here, we read the data from a COCO file.
* Alternatively, you can read a DLDataset dictionary
* as created by e.g., the MVTec Deep Learning Tool using read_dict().
read_dl_dataset_from_coco (PillBagJsonFile, HalconImageDir, dict{read_segmentation_masks: false}, DLDataset)
* 
* Split the dataset into train/validation and test.
split_dl_dataset (DLDataset, TrainingPercent, ValidationPercent, [])
* 
* **********************************************
* **   Determine model parameters from data  ***
* **********************************************
* 
* Generate model parameters min_level, max_level, anchor_num_subscales,
* and anchor_aspect_ratios from the dataset in order to improve the
* training result. Please note that optimizing the model parameters too
* much on the training data can lead to overfitting. Hence, this should
* only be done if the actual application data are similar to the training
* data.
GenParam := dict{['split']: 'train'}
* 
determine_dl_model_detection_param (DLDataset, ImageWidth, ImageHeight, GenParam, DLDetectionModelParam)
* 
* Get the generated model parameters.
MinLevel := DLDetectionModelParam.min_level
MaxLevel := DLDetectionModelParam.max_level
AnchorNumSubscales := DLDetectionModelParam.anchor_num_subscales
AnchorAspectRatios := DLDetectionModelParam.anchor_aspect_ratios
* 
* *******************************************
* **   Create the object detection model  ***
* *******************************************
* 
* Create dictionary for generic parameters and create the object detection model.
DLModelDetectionParam := dict{}
DLModelDetectionParam.image_width := ImageWidth
DLModelDetectionParam.image_height := ImageHeight
DLModelDetectionParam.image_num_channels := ImageNumChannels
DLModelDetectionParam.min_level := MinLevel
DLModelDetectionParam.max_level := MaxLevel
DLModelDetectionParam.anchor_num_subscales := AnchorNumSubscales
DLModelDetectionParam.anchor_aspect_ratios := AnchorAspectRatios
DLModelDetectionParam.capacity := Capacity
* 
* Get class IDs from dataset for the model.
ClassIDs := DLDataset.class_ids
DLModelDetectionParam.class_ids := ClassIDs
* Get class names from dataset for the model.
ClassNames := DLDataset.class_names
DLModelDetectionParam.class_names := ClassNames
* 
* Create the model.
create_dl_model_detection (Backbone, NumClasses, DLModelDetectionParam, DLModelHandle)
* 
* Write the initialized DL object detection model
* to train it later in part 2.
write_dl_model (DLModelHandle, DLModelFileName)
* 
* 
* *********************************
* **   Preprocess the dataset   ***
* *********************************
* 
* Get preprocessing parameters from model.
create_dl_preprocess_param_from_model (DLModelHandle, 'none', 'full_domain', [], [], [], DLPreprocessParam)
* 
* Preprocess the dataset. This might take a few minutes.
GenParam := dict{overwrite_files: 'auto'}
preprocess_dl_dataset (DLDataset, DataDirectory, DLPreprocessParam, GenParam, DLDatasetFilename)
* 
* Write preprocessing parameters to use them in later parts.
write_dict (DLPreprocessParam, PreprocessParamFileName, [], [])


From here, we can see that these parameters are used when create_dl_model_detection creates the detection model. They are also used in the subsequent training process, which we will cover in the next article.

Origin blog.csdn.net/songhuangong123/article/details/135033234