Interpretation and reproduction of SCI paper [NO.3] MSFT-YOLO: Improved YOLOv5 steel surface defect detection based on transformer (code has been reproduced)

I previously published a column on improving object detection algorithms. Readers still ask which scenarios each improvement suits, which improvements actually work for their own application, and how many improvements are needed to publish at a given level. To address these questions, this series interprets SCI papers published in high-level academic journals and introduces the corresponding journals, to help with preparing and submitting research papers. For each paper interpreted in the series, I reproduce the code for its innovative points. If you need it, follow me and send a private message to get it.

Baidu network disk link: https://pan.baidu.com/s/10LeM9LPAG1q8fFPwtV1ngQ

Extraction code: Get it by private message after following.

1. Summary

With the development of artificial intelligence and the spread of intelligent production projects, intelligent detection systems have become a hot topic in industry. Object detection is a basic problem in computer vision, and achieving it in industrial settings while balancing accuracy and real-time performance is an important challenge for such systems. Steel surface defect detection is an important industrial application of object detection: correct and rapid detection of surface defects can greatly improve productivity and product quality. To this end, this paper introduces the MSFT-YOLO model, which improves on a single-stage detector. The model targets industrial scenarios where image background interference is strong, defect categories are easily confused, defect scales vary widely, and small defects are detected poorly. Adding the Transformer-based TRANS module to the backbone and the detection head combines the features with global information. A multi-scale feature fusion structure fuses features of different scales, which improves the detector's dynamic adaptation to objects of different scales. To further improve the performance of MSFT-YOLO, we also introduce a number of effective strategies, such as data augmentation and multi-step training. Test results on the NEU-DET dataset show that MSFT-YOLO can achieve real-time detection with an average detection accuracy of 75.2, about 7% higher than the baseline model (YOLOv5) and about 18% higher than Faster R-CNN, so the approach has clear advantages and offers useful inspiration.

2. Network model and core innovation points

The overall architecture of MSFT-YOLO is shown in the figure. It consists of three parts: the backbone network, the feature enhancement part, and the prediction part. In the backbone, we do not use the original convolutional layers of YOLOv5 but mainly the TRANS structure designed for this work, which is integrated into CSPDarknet to expand the receptive field of the convolutions. TRANS provides multi-level features with global information for detection, which enhances MSFT-YOLO's ability to recognize background features on steel surfaces. In the neck of the network, a simple and effective BiFPN structure replaces PANet and applies weights when combining the multi-level features from the backbone. The TRANS module is also integrated into the prediction head, replacing the original one. This exploits the predictive potential of self-attention in YOLOv5, so the model can accurately locate objects in high-density scenes and handle large variations in object scale. The specific details of TRANS are given in Section 3.2 of the original paper.
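The two ideas in this section can be illustrated with a minimal, dependency-free sketch. It is not the paper's TRANS module or its BiFPN implementation. It shows only scaled dot-product self-attention (with identity Q/K/V projections, a simplifying assumption) and the fast normalized fusion described for BiFPN in EfficientDet. The function names `self_attention` and `bifpn_fuse` and the toy inputs are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    tokens: list of feature vectors, one per spatial position.
    Every output vector is a weighted mix of ALL input vectors, which is how
    a Transformer block injects global context into a local feature.
    (Assumption: a real block also has learned projections, multiple heads,
    residual connections and an MLP, all omitted here.)
    """
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)])
    return out


def bifpn_fuse(features, raw_weights, eps=1e-4):
    """Fast normalized fusion: sum(w_i * f_i) / (eps + sum(w_j)), with w_i >= 0.

    features: same-length feature vectors from different levels, assumed to be
    already resized to a common resolution.
    raw_weights: one learnable scalar per input; ReLU keeps them non-negative.
    """
    weights = [max(0.0, w) for w in raw_weights]
    norm = eps + sum(weights)
    size = len(features[0])
    return [sum(w * f[i] for w, f in zip(weights, features)) / norm for i in range(size)]


if __name__ == "__main__":
    toy_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(self_attention(toy_tokens))
    print(bifpn_fuse([[1.0, 2.0], [3.0, 4.0]], raw_weights=[1.0, 1.0]))
```

With equal weights the fusion reduces to a plain average of the inputs. In training, the weights are learned, so the network can decide how much each scale contributes.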

3. Application data set

The main dataset used in the paper, NEU-DET, is a surface defect database released by Northeastern University. It covers six typical surface defects of hot-rolled steel strip: crazing, inclusion, patches, pitted surface, rolled-in scale and scratches. The database contains 1,800 grayscale images, 300 samples for each of the six defect types.
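Because every class has the same number of images, a per-class (stratified) split keeps the classes balanced in both subsets. The sketch below is only an illustration: the 80/20 ratio, the seed and the `<class>_<index>` id scheme are assumptions, not details taken from the paper.

```python
import random

# The six NEU-DET defect classes, 300 images each (1,800 in total).
CLASSES = [
    "crazing",
    "inclusion",
    "patches",
    "pitted_surface",
    "rolled-in_scale",
    "scratches",
]
IMAGES_PER_CLASS = 300


def stratified_split(train_ratio=0.8, seed=0):
    """Split image ids class by class so both subsets stay balanced.

    train_ratio, seed and the id naming scheme are illustrative assumptions.
    """
    rng = random.Random(seed)
    train, test = [], []
    for name in CLASSES:
        ids = [f"{name}_{i}" for i in range(1, IMAGES_PER_CLASS + 1)]
        rng.shuffle(ids)
        cut = round(len(ids) * train_ratio)
        train.extend(ids[:cut])
        test.extend(ids[cut:])
    return train, test


if __name__ == "__main__":
    train_ids, test_ids = stratified_split()
    print(len(train_ids), len(test_ids))
```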

4. Experimental results (partial display)

Results of the ablation study. The experimental data show that adding the TRANS module to the backbone greatly improves the detection of two of the more conspicuous defect types, patches and scratches. BiFPN improves the detection of crazing, patches, pitted surface, rolled-in scale and scratches, with the largest effect on scratches; the gains for patches, pitted surface, rolled-in scale and scratches all exceed 3%. Analysis of the detection results shows that TRANS lets the model adapt to a wider range of aspect ratios, which addresses the problem of samples whose defect aspect ratios are unevenly distributed.
At the same time, defects often appear as combinations of independent, irregular shapes. The high robustness of TRANS to severe interference, perturbation and region shifts, together with its ability to integrate high-level visual semantic information, lets it collect defect-related feature information from larger neighborhoods. MSFT-YOLO integrates both the TRANS module and the BiFPN module and can therefore capture information from different positions in the model, which further improves detection accuracy for the scratches and pitted surface categories. The BiFPN structure lets the model adapt to larger changes in defect size, which addresses the problem of large differences in the defect size distribution.
Our method takes YOLOv5 as the baseline. Although the detection speed drops by 40%, the model still has the potential for real-time detection, and the detection accuracy rises from 0.682 to 0.757, a substantial improvement. Analysis of the detection samples shows that adding the TRANS module and fusing features with BiFPN both improve the accuracy of the model. Combining global features with multi-level feature fusion has a clear effect on defect detection in industrial scenes.
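When reading accuracy gains like the one above, it helps to separate the absolute gain (in percentage points) from the relative gain (as a fraction of the baseline). The sketch below uses only the two figures quoted in this section; the helper name `accuracy_gains` is hypothetical.

```python
def accuracy_gains(baseline, improved):
    """Return (absolute gain in points, relative gain in percent)."""
    absolute_points = (improved - baseline) * 100
    relative_percent = (improved - baseline) / baseline * 100
    return absolute_points, relative_percent


if __name__ == "__main__":
    # 0.682 (YOLOv5 baseline) and 0.757 (MSFT-YOLO) are the values quoted above.
    points, percent = accuracy_gains(0.682, 0.757)
    print(f"absolute gain: {points:.1f} points")
    print(f"relative gain: {percent:.1f} %")
```

For these two values the absolute gain is 7.5 points and the relative gain is about 11%, so an improvement quoted as roughly 7% is best read as percentage points.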

Comparison of experimental results. Table 2 of the original paper shows the results of our models evaluated on the NEU-DET dataset. In industrial scenarios, accuracy is not the only thing that matters for an object detection task; detection efficiency is also one of the factors that decide whether a model can be put into use. Only when both detection quality and detection speed are ensured can the model make correct judgments and meet the requirements of industrial production. Therefore, in this part of the discussion, the defect detection model is evaluated comprehensively using mean average precision (mAP) and frames per second (FPS).
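As a rough illustration of the two metrics, the sketch below computes an all-point interpolated average precision for one class, averages per-class values into mAP, and times an inference callable to estimate FPS. It assumes detections have already been matched to ground-truth boxes at some IoU threshold; the function names and the exact AP variant are assumptions, not the paper's evaluation code.

```python
import time


def average_precision(is_true_positive, num_ground_truth):
    """All-point interpolated AP for a single class.

    is_true_positive: one bool per detection, sorted by descending confidence;
    True if the detection was matched to a previously unmatched ground truth.
    """
    if num_ground_truth == 0:
        return 0.0
    tp = fp = 0
    recalls, precisions = [], []
    for hit in is_true_positive:
        tp += hit
        fp += not hit
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangles under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the plain mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


def frames_per_second(infer, images):
    """Time a hypothetical inference callable over a list of images."""
    start = time.perf_counter()
    for image in images:
        infer(image)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against zero
    return len(images) / elapsed


if __name__ == "__main__":
    ap = average_precision([True, False, True], num_ground_truth=2)
    print(round(ap, 4))
    print(mean_average_precision([ap, 1.0]))
```

A real FPS measurement would also fix the input resolution and batch size and discard warm-up iterations, since both strongly affect the number.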

5. Experimental conclusion

This paper designs MSFT-YOLO, a steel surface defect detector based on YOLOv5. MSFT-YOLO combines several existing computer vision techniques, including Transformer encoder blocks, multi-level feature fusion, data augmentation and some training techniques. To address cluttered defect image backgrounds and easily confused defect categories, the Transformer-based TRANS module is added to the backbone and the detection head. To address large variation in defect scale and poor detection of small defects, a BiFPN structure is adopted, which fuses features of different scales and so improves the detector's ability to adapt to objects of different scales. In tests on the NEU-DET dataset, MSFT-YOLO reaches 0.752 mAP, 7.5% higher than the baseline, at 30.6 FPS. This indicates that the algorithm achieves good accuracy and has the potential for real-time detection, making it an object detection algorithm with practical value. In future research, richer datasets will be introduced to improve the model's generalization ability, and the model will be compressed to better suit real-time monitoring in industrial scenarios. During the experiments, we accumulated a lot of experience in data processing and detection algorithm design for steel surface defects. We hope this paper will be helpful to more developers and researchers working on steel surface defects.

6. Introduction to journals

Note: This post is based on the paper "MSFT-YOLO: Improved YOLOv5 Based on Transformer for Detecting Defects of Steel Surface". It is shared for academic purposes only. If there is any infringement, please contact me and it will be removed.

For the articles interpreted in this series, I have reproduced the code for the innovative points. If you need it, follow me and send a private message to get it.


Origin blog.csdn.net/m0_70388905/article/details/128525702#comments_26958945