Using the PaDiM unsupervised algorithm from the Anomalib project to train a model on a self-made industrial defect dataset and deploy it with ONNX (1): model training

Table of contents

Foreword

1. Introduction to Anomalib for unsupervised defect detection

2. Anomalib code structure

3. Task description, model training, and inference

4. Summary and Outlook


Foreword

        This article focuses on training the padim algorithm on a self-made dataset. My expertise is limited and I treat the neural network model as a ready-made tool, so the article does not cover the network structure or the details of the paper. Readers looking for those should consult other sources~

1. Introduction to Anomalib for unsupervised defect detection

        The task my group recently gave me is to detect various defects on the surface of metal materials. I previously used the supervised yolov5 network, and labeling the dataset was really painful. Moreover, industrial defect data has a prominent characteristic: sample imbalance. Most of the collected industrial images are defect-free. These positive (defect-free) samples contribute little to supervised training, while the negative (defective) samples are too few to train an effective model. Supervised learning has another problem: negative samples appear largely by chance, so a given defect type may be entirely absent from the dataset. A model trained this way is likely to fail on defects it has never seen, so I had to look for another approach.

        After some research, I found that unsupervised algorithms are better suited to industrial defect detection. An unsupervised algorithm is trained on positive samples only. After learning from a large number of positive samples, the network can tell that a negative sample "looks different" from them, and it outputs a probability map of the same size as the original image that indicates how likely each location is to be abnormal.
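
        To make the idea of a per-pixel probability map concrete, here is a minimal sketch of how such a map could be reduced to an image-level decision. The helper name, the use of the maximum as the image score, and the 0.5 threshold are my own illustrative assumptions, not Anomalib's actual implementation. Plain lists keep the sketch dependency-free; real code would normally use NumPy arrays.

```python
def summarize_anomaly_map(anomaly_map, threshold=0.5):
    """Reduce a per-pixel anomaly map to an image score and a binary mask.

    Hypothetical helper: taking the maximum as the image score and using a
    fixed threshold are assumptions made for illustration only.
    """
    # The most suspicious pixel decides the image-level score.
    image_score = max(max(row) for row in anomaly_map)
    # Pixels at or above the threshold are flagged as abnormal.
    defect_mask = [[value >= threshold for value in row] for row in anomaly_map]
    return image_score, defect_mask


# Toy 4x4 "anomaly map": mostly low scores with one suspicious region.
toy_map = [
    [0.05, 0.10, 0.08, 0.04],
    [0.07, 0.85, 0.90, 0.06],
    [0.05, 0.80, 0.75, 0.09],
    [0.03, 0.06, 0.07, 0.02],
]
score, mask = summarize_anomaly_map(toy_map)
abnormal_pixels = sum(sum(row) for row in mask)
print(f"image score: {score:.2f}, abnormal pixels: {abnormal_pixels}")
```

        A defect-free image would give a low maximum and an empty mask, while a defective one lights up a compact region as in the toy example.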

        The unsupervised approach sounds good, but where does someone who prefers ready-made code find an implementation? In my opinion the best place is Anomalib, which is implemented in Python, has complete and easy-to-understand documentation, and is still being updated. The link is as follows: GitHub - openvinotoolkit/anomalib: An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference. https://github.com/openvinotoolkit/anomalib

        As the English name suggests, this is an anomaly detection (anomaly) library (lib). It collects unsupervised learning methods such as the padim algorithm and the fastflow algorithm.

        Having found this treasure trove, our next step is to explore its code structure and usage.

2. Anomalib code structure

        I work locally in PyCharm. After creating a new project, the code structure of Anomalib is as follows:

        To understand the project's code structure and make later training and deployment easier, it helps to go through it briefly. The more important parts are:

1. anomalib folder: This folder is the source code of the library released by the team. It can also be installed through pip install.

        1.1 The models subfolder under this folder contains more than ten defect detection algorithms, which readers can call as they like;

        1.2 The pre_processing and post_processing subfolders hold the pre-processing and post-processing functions respectively, and the visualizer under post_processing is used to display the final inference results;

        1.3 The inferencers in the deploy folder are the various inference engines. Readers who want to use pytorch can look at torch_inferencer, and readers who want onnx inference should use openvino_inferencer.

2. datasets folder: As the name suggests, this folder stores the datasets to train on, such as the industrial dataset MVTec or our self-made dataset.

3. results folder: This folder stores the results of training and inference, and only appears after a training or inference task has completed.

4. tools folder: The inference subfolder in this folder contains a set of inference scripts, each of which calls a different inferencer in anomalib.deploy. The train.py in this folder is the entry point for model training.

        Summary: later we will use tools/train.py for model training and one of the scripts in tools/inference for model inference. With this background, we can finally start training the model.

3. Task description, model training, and inference

        If you only want to see how an algorithm performs inside PyCharm, follow the official example: train with train.py, then run inference with lightning_inference.py. The inference results appear in the newly created results folder. This is very convenient and is not described further here.

        But readers who, like me, need to deploy on other platforms later have to train on their own dataset and obtain an onnx model, which requires modifying the config.yaml file as follows:

        First, following the official instructions, see the section on training a Custom Dataset in the Readme:

        Here we use the padim algorithm, so we modify the dataset section of anomalib/models/padim/config.yaml as described there and run the following training command in the terminal:

python tools/train.py --model padim --config anomalib/models/padim/config.yaml

        But this reports an error: "normalization" and other attributes cannot be found. Don't worry. These attributes exist in the original config.yaml, but the official custom-dataset example leaves them out, probably an oversight. The dataset section of my config.yaml is given below. Readers can adapt it to the path of their own self-made dataset. I have commented the places that need to be added or that deserve attention:

dataset:
  name: tube               # dataset name (e.g. MVTec); the value is not important
  format: folder
  path: ./datasets/img_192 # path to the self-made dataset
  normal_dir: normal       # subfolder with positive (defect-free) samples
  abnormal_dir: abnormal   # subfolder with negative (defective) samples
  mask_dir: null           # path to binary masks; self-made datasets usually have none, so use null
  normal_test_dir: null # name of the folder containing normal test images.
  task: classification # classification or segmentation
  extensions: null
  normalization: imagenet  # added: imagenet
  split_ratio: 0.2 # ratio of the normal images that will be used to create a test split
  image_size: 256
  train_batch_size: 32
  test_batch_size: 32
  num_workers: 8
  transform_config:
    train: null
    val: null
  test_split_mode: from_dir # added
  test_split_ratio: 0.2
  val_split_mode: same_as_test
  val_split_ratio: 0.5
  create_validation_set: true
  tiling:
    apply: false
    tile_size: null
    stride: null
    remove_border_count: 0
    use_random_tiling: False
    random_tile_count: 16
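
        Before starting a training run, it can save time to confirm that the folder layout matches the path, normal_dir and abnormal_dir values above. The sketch below is a hypothetical helper of my own, not part of Anomalib; the list of image extensions is an assumption, and the demo builds a throwaway directory so it runs anywhere.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed extensions


def count_images(folder):
    """Count image files directly inside `folder` (0 if it does not exist)."""
    folder = Path(folder)
    if not folder.is_dir():
        return 0
    return sum(
        1 for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def check_folder_dataset(root, normal_dir="normal", abnormal_dir="abnormal"):
    """Return image counts per subfolder; raise if there are no normal images."""
    root = Path(root)
    counts = {
        normal_dir: count_images(root / normal_dir),
        abnormal_dir: count_images(root / abnormal_dir),
    }
    if counts[normal_dir] == 0:
        raise FileNotFoundError(f"no normal images found under {root / normal_dir}")
    return counts


# Demo on a temporary directory with empty stand-in files.
with tempfile.TemporaryDirectory() as tmp:
    layout = {"normal": ["a.jpg", "b.png"], "abnormal": ["c.jpg", "notes.txt"]}
    for sub, names in layout.items():
        (Path(tmp) / sub).mkdir()
        for name in names:
            (Path(tmp) / sub / name).touch()
    demo_counts = check_folder_dataset(tmp)
print(demo_counts)
```

        For the configuration above, the call would be check_folder_dataset("./datasets/img_192"). Training needs normal images, so the helper raises an error when none are found.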

        Pay special attention: since we use a self-made dataset, there is usually no ground-truth binary mask, so the task field must be set to classification instead of segmentation. The pixel entries in the metrics section must then be removed, otherwise an error is reported: without binary masks, pixel-level test metrics cannot be computed, so only the image entries are kept. Modify it as follows:

metrics:
  image:
    - F1Score
    - AUROC
#  pixel:
#    - F1Score
#    - AUROC
  threshold:
    method: adaptive #options: [adaptive, manual]
    manual_image: null
    manual_pixel: null
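
        As far as I understand it, the adaptive method picks the score threshold that gives the best F1 on labelled images, instead of a value typed in by hand. The sketch below illustrates that idea with a brute-force search. The function names and the scores are made up for illustration, and this is not Anomalib's code. Note that label 1 here means a defective image and label 0 a defect-free one.

```python
def f1_at_threshold(scores, labels, threshold):
    """F1 when images with score >= threshold are predicted defective (label 1)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pick_threshold(scores, labels):
    """Try every observed score as a threshold and keep the one with the best F1."""
    candidates = sorted(set(scores))
    best = max(candidates, key=lambda t: f1_at_threshold(scores, labels, t))
    return best, f1_at_threshold(scores, labels, best)


# Made-up image-level anomaly scores: 0 = defect-free, 1 = defective.
scores = [0.10, 0.20, 0.35, 0.40, 0.70, 0.90]
labels = [0, 0, 0, 1, 1, 1]
threshold, f1 = pick_threshold(scores, labels)
print(f"threshold={threshold:.2f}, F1={f1:.2f}")
```

        This also shows why a few negative samples are still needed for testing even though training uses only positive samples: without them there is nothing to compute F1 or AUROC against.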

        We also need the onnx model, so set the export_mode field in the optimization section of config.yaml to onnx:

optimization:
  export_mode: onnx # options: torch, onnx, openvino

        At this point, we can run the training command again to start training:

python tools/train.py --model padim --config anomalib/models/padim/config.yaml

        I haven't studied the padim paper, but the training process appears to need only one epoch. After training completes, you can see the following structure in the results folder:

        tube is the name of my own dataset. The images folder below it contains the test results for the positive and negative samples, given as probability heatmaps; the weights folder contains the pytorch-lightning model, the onnx model we need, and metadata.json, which holds information about the training dataset (very important for deployment; more on it in the next article).
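
        Since the exact subfolder names under results can differ between runs and versions, a small script can locate the exported files instead of hard-coding paths. This is a sketch with hypothetical helper names. It makes no assumption about what the metadata file contains, and the demo uses an empty stand-in file and a placeholder key, not a real model or real metadata.

```python
import json
import tempfile
from pathlib import Path


def find_export_artifacts(results_dir):
    """Return (onnx_files, json_files) found anywhere below `results_dir`."""
    root = Path(results_dir)
    return sorted(root.rglob("*.onnx")), sorted(root.rglob("*.json"))


def metadata_keys(json_path):
    """List the top-level keys of a metadata file without assuming its schema."""
    with open(json_path, "r", encoding="utf-8") as handle:
        return sorted(json.load(handle).keys())


# Demo on a temporary directory that mimics a weights/onnx folder.
with tempfile.TemporaryDirectory() as tmp:
    onnx_dir = Path(tmp) / "weights" / "onnx"
    onnx_dir.mkdir(parents=True)
    (onnx_dir / "model.onnx").write_bytes(b"")  # empty stand-in, not a real model
    (onnx_dir / "metadata.json").write_text(
        json.dumps({"placeholder_key": 0}), encoding="utf-8"
    )
    models, metas = find_export_artifacts(tmp)
    demo_models = [p.name for p in models]
    demo_keys = metadata_keys(metas[0])
print(demo_models, demo_keys)
```

        On a real run, find_export_artifacts("results/padim/tube") should list the onnx model and its metadata file, and metadata_keys shows which fields are available for deployment.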

        After the steps above we have the onnx model, but before deploying it we should check whether the converted model has lost accuracy. Enter the following command in the terminal:

python tools/inference/openvino_inference.py --weights results/padim/tube/run/weights/onnx/model.onnx --metadata results/padim/tube/run/weights/onnx/meta_data.json --input datasets/img_192/abnormal/cam0_17_04_54.jpg --output results/padim/tube/run/images --config src/anomalib/models/padim/config.yaml
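
        Looking at a single heatmap is a quick sanity check. To put a number on "accuracy loss", one option is to score the same test images with both the pytorch model and the onnx model and compare the results. The sketch below assumes you have already collected the image-level scores from each backend into two lists. The function name, the tolerance and the sample scores are made up for illustration.

```python
def compare_scores(torch_scores, onnx_scores, threshold, tolerance=1e-3):
    """Compare image-level anomaly scores produced by two inference backends."""
    if not torch_scores or len(torch_scores) != len(onnx_scores):
        raise ValueError("both backends must score the same, non-empty image list")
    diffs = [abs(a - b) for a, b in zip(torch_scores, onnx_scores)]
    # Count images whose normal/abnormal decision changes between backends.
    flipped = sum(
        1 for a, b in zip(torch_scores, onnx_scores)
        if (a >= threshold) != (b >= threshold)
    )
    return {
        "max_abs_diff": max(diffs),
        "within_tolerance": max(diffs) <= tolerance,
        "flipped_decisions": flipped,
    }


# Made-up scores for four test images, for illustration only.
report = compare_scores(
    torch_scores=[0.1200, 0.4800, 0.5100, 0.9300],
    onnx_scores=[0.1201, 0.4799, 0.5102, 0.9298],
    threshold=0.5,
)
print(report)
```

        Small numeric differences are expected after conversion. What matters for deployment is that no decision flips at the chosen threshold.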

        The following inference results are obtained:

        The result is good. For industrial defect detection, I can finally say goodbye to yolov5~

4. Summary and Outlook

        We used the padim algorithm to obtain good detection results on a self-made industrial dataset and obtained an onnx model through training. The most important point in this process is modifying the configuration file config.yaml: follow the official tutorial and this blog, and set the path of your own dataset correctly to prevent errors.

        With the onnx model in hand, we want to leave the large and complicated Anomalib project and environment behind and deploy the algorithm in our own software. But what preprocessing does the image data need before it is fed to the onnx model? What are the model's inputs and outputs? What post-processing does the output need? How do we draw a heat map as concise and clear as the one in Anomalib? These are the problems we need to solve.

        In the next blog, we will continue to analyze the code structure and workflow of Anomalib, use the onnxruntime engine to deploy the onnx model in C++, and put the algorithm into practice.


Origin blog.csdn.net/m0_57315535/article/details/131004027