To support "intelligent manufacturing" in industrial production, we develop and build a cloth defect detection and recognition system for textile production scenarios, based on the full YOLOv8 model family [n/s/m/l/x].

Manufacturing that stays purely traditional is unlikely to sustain long-term development. Moving toward intelligent manufacturing across the whole process and every scenario is likely to be the most competitive direction in the future. In earlier practice we worked on on-site projects in many industrial production scenarios, such as PCB circuit board defect detection, welding defect detection, and nut and screw defect detection. The main purpose of this article is to develop and build a cloth defect detection and recognition system for textile production scenarios based on the latest YOLOv8 model series.

We have covered cloth defect detection in earlier articles. If you are interested, you can read them:

"Practical Cloth Defect Detection Based on YOLO"

"A complete collection of cloth defect detection practices: developing and building cloth defect detection models based on the full range of YOLOv5 models [n/s/m/l/x], and comparing and analyzing the performance differences of each model"

"Developing and building a cloth defect detection and recognition system based on YOLOv5 with an integrated attention mechanism"

This article uses the latest YOLOv8 to develop and implement the detection model. We trained five models with different parameter counts for an overall comparative analysis. First, let’s look at the example results:

Let’s take a brief look at the example data:

The training data configuration file looks like this:

# Dataset
path: ./dataset
train:
  - /data/dataset/images/train
val:
  - /data/dataset/images/test
test:
  - /data/dataset/images/test
 
 
# Classes
names:
  0: chongzhan
  1: cuohua
  2: fengtou
  3: fengtouyin
  4: huamao
  5: laban
  6: louyin
  7: podong
  8: qita
  9: secha
  10: shuiyin
  11: wangzhe
  12: zhanwu
  13: zhezi
  14: zhici
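
The class indices above are what appear in the first column of each label file. As a minimal sketch, assuming the labels use the usual YOLO detection text format (one object per line: class index, then normalized center x, center y, width and height), a label line can be checked and converted to pixel coordinates like this. The helper name `parse_label_line` and the sample line are hypothetical and only serve as an illustration:

```python
# Class names copied from the dataset configuration above (index = position).
CLASS_NAMES = [
    "chongzhan", "cuohua", "fengtou", "fengtouyin", "huamao",
    "laban", "louyin", "podong", "qita", "secha",
    "shuiyin", "wangzhe", "zhanwu", "zhezi", "zhici",
]


def parse_label_line(line, img_w, img_h):
    """Convert one YOLO-format label line to (class_name, x1, y1, x2, y2) in pixels."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    cls_id = int(parts[0])
    if not 0 <= cls_id < len(CLASS_NAMES):
        raise ValueError(f"class index {cls_id} is outside 0..{len(CLASS_NAMES) - 1}")
    xc, yc, w, h = (float(v) for v in parts[1:])
    # Normalized center/size -> pixel corner coordinates.
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return CLASS_NAMES[cls_id], x1, y1, x2, y2


# Hypothetical label line: a "podong" box centered in a 640x480 image.
print(parse_label_line("7 0.5 0.5 0.25 0.5", 640, 480))
```

Running a check like this over the label files before training is a cheap way to catch class indices that fall outside the 15 names listed in the configuration.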

If you have questions about developing and building your own object detection project with YOLOv8, you can read the following article:

"Super detailed tutorial on developing and building a target detection model based on YOLOv8 [taking the weld quality inspection data scenario as an example]"

That tutorial is very detailed, so this article does not repeat it. Note that starting with YOLOv8 the project is distributed as an installable package, so its overall usage differs considerably from v5 and v7.

The core features and changes of YOLOv8 are as follows:
1. It provides a new SOTA (state-of-the-art) model family, including P5 640 and P6 1280 resolution object detection networks and a YOLACT-based instance segmentation model. As with YOLOv5, models at N/S/M/L/X scales are provided through scaling factors to meet the needs of different scenarios.
2. The backbone and neck appear to borrow the ELAN design idea from YOLOv7. The C3 structure of YOLOv5 is replaced by the C2f structure, which has richer gradient flow, and the channel counts are adjusted for each model scale. Rather than applying one set of parameters to all models, the structure is tuned per scale, which greatly improves model performance.
3. The head changes substantially compared with YOLOv5. It adopts the now-mainstream decoupled head structure, which separates the classification and detection heads, and it moves from anchor-based to anchor-free.
4. Loss calculation uses the TaskAlignedAssigner positive-sample assignment strategy and introduces Distribution Focal Loss.
5. For data augmentation during training, it adopts the YOLOX practice of turning off Mosaic augmentation for the last 10 epochs, which can effectively improve accuracy.
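
To make the anchor-free head and Distribution Focal Loss points more concrete, here is a minimal sketch of how a DFL-style regression branch can turn per-side distributions into a box. It is an illustration, not the Ultralytics implementation: the function names, the 4-bin toy input and the stride value are assumptions, and YOLOv8 itself works on tensors with more bins per side.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_distance(bin_logits):
    """Expected bin index under the softmax distribution (in grid-cell units)."""
    probs = softmax(bin_logits)
    return sum(i * p for i, p in enumerate(probs))


def decode_box(cx, cy, side_logits, stride):
    """Decode a box from an anchor point (cx, cy) in pixels.

    side_logits holds four lists of logits, one each for the distance to the
    left, top, right and bottom edges.
    """
    left, top, right, bottom = (expected_distance(s) * stride for s in side_logits)
    return cx - left, cy - top, cx + right, cy + bottom


# Toy example: every side has a distribution peaked on bin 1.
peaked = [0.0, 8.0, 0.0, 0.0]
print(decode_box(100.0, 100.0, [peaked] * 4, stride=8))
```

The point of predicting a distribution rather than a single number is that each edge distance is read out as an expectation, so an uncertain edge can spread its probability over neighbouring bins.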

The official project page is shown below:

At the time of writing, the project had more than 17,000 stars. The officially provided pre-trained models are as follows:

| Model | size (pixels) | mAPval 50-95 | Speed CPU ONNX (ms) | Speed A100 TensorRT (ms) | params (M) | FLOPs (B) |
|---|---|---|---|---|---|---|
| YOLOv8n | 640 | 37.3 | 80.4 | 0.99 | 3.2 | 8.7 |
| YOLOv8s | 640 | 44.9 | 128.4 | 1.20 | 11.2 | 28.6 |
| YOLOv8m | 640 | 50.2 | 234.7 | 1.83 | 25.9 | 78.9 |
| YOLOv8l | 640 | 52.9 | 375.2 | 2.39 | 43.7 | 165.2 |
| YOLOv8x | 640 | 53.9 | 479.1 | 3.53 | 68.2 | 257.8 |

Another set of pre-trained models is as follows:

| Model | size (pixels) | mAPval 50-95 | Speed CPU ONNX (ms) | Speed A100 TensorRT (ms) | params (M) | FLOPs (B) |
|---|---|---|---|---|---|---|
| YOLOv8n | 640 | 18.4 | 142.4 | 1.21 | 3.5 | 10.5 |
| YOLOv8s | 640 | 27.7 | 183.1 | 1.40 | 11.4 | 29.7 |
| YOLOv8m | 640 | 33.6 | 408.5 | 2.26 | 26.2 | 80.6 |
| YOLOv8l | 640 | 34.9 | 596.9 | 2.43 | 44.1 | 167.4 |
| YOLOv8x | 640 | 36.3 | 860.6 | 3.56 | 68.7 | 260.6 |

This second set is trained on the Open Images V7 dataset. You can choose whichever set suits your needs.

YOLOv8 is positioned not just as an object detector but as a powerful, comprehensive toolkit. It therefore supports multiple task types: pose estimation, detection, classification, segmentation, and tracking. You can choose among them according to your own needs; I won’t elaborate further here.

A simple example implementation is as follows:

from ultralytics import YOLO
 
# yolov8n
model = YOLO('yolov8n.yaml').load('yolov8n.pt')  # build from YAML and transfer weights
model.train(data='data/self.yaml', epochs=100, imgsz=640)
 
 
# yolov8s
model = YOLO('yolov8s.yaml').load('yolov8s.pt')  # build from YAML and transfer weights
model.train(data='data/self.yaml', epochs=100, imgsz=640)
 
 
# yolov8m
model = YOLO('yolov8m.yaml').load('yolov8m.pt')  # build from YAML and transfer weights
model.train(data='data/self.yaml', epochs=100, imgsz=640)
 
 
# yolov8l
model = YOLO('yolov8l.yaml').load('yolov8l.pt')  # build from YAML and transfer weights
model.train(data='data/self.yaml', epochs=100, imgsz=640)
 
 
# yolov8x
model = YOLO('yolov8x.yaml').load('yolov8x.pt')  # build from YAML and transfer weights
model.train(data='data/self.yaml', epochs=100, imgsz=640)

Here we train five models, n, s, m, l and x, with different parameter counts.

The YOLOv8 model configuration file is as follows:

# Parameters
nc: 15 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]  # YOLOv8n summary: 225 layers,  3157200 parameters,  3157184 gradients,   8.9 GFLOPs
  s: [0.33, 0.50, 1024]  # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients,  28.8 GFLOPs
  m: [0.67, 0.75, 768]   # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients,  79.3 GFLOPs
  l: [1.00, 1.00, 512]   # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512]   # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs
 
# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 9
 
# YOLOv8.0n head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2f, [512]]  # 12
 
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 4], 1, Concat, [1]]  # cat backbone P3
  - [-1, 3, C2f, [256]]  # 15 (P3/8-small)
 
  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 12], 1, Concat, [1]]  # cat head P4
  - [-1, 3, C2f, [512]]  # 18 (P4/16-medium)
 
  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 9], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2f, [1024]]  # 21 (P5/32-large)
 
  - [[15, 18, 21], 1, Detect, [nc]]  # Detect(P3, P4, P5)
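
The `scales` entries decide how the repeat counts and channel widths listed in the backbone and head are shrunk or grown for each model. The sketch below mirrors how the model parser applies them as far as I understand it (repeats multiplied by depth and rounded, channels capped by max_channels, multiplied by width and rounded up to a multiple of 8). Treat the rounding details and the helper names as assumptions rather than a copy of the library code:

```python
import math

# [depth, width, max_channels] copied from the `scales` section above.
SCALES = {
    "n": (0.33, 0.25, 1024),
    "s": (0.33, 0.50, 1024),
    "m": (0.67, 0.75, 768),
    "l": (1.00, 1.00, 512),
    "x": (1.00, 1.25, 512),
}


def make_divisible(value, divisor=8):
    """Round value up to the nearest multiple of divisor."""
    return math.ceil(value / divisor) * divisor


def scale_layer(repeats, channels, scale):
    """Return the (repeats, channels) actually used for a given model scale."""
    depth, width, max_channels = SCALES[scale]
    # Layers with a single repeat are left alone; others never drop below 1.
    n = max(round(repeats * depth), 1) if repeats > 1 else repeats
    c = make_divisible(min(channels, max_channels) * width)
    return n, c


# The "6 x C2f, 512" and "3 x C2f, 1024" backbone rows at every scale.
for name in SCALES:
    print(name, scale_layer(6, 512, name), scale_layer(3, 1024, name))
```

This is why a single YAML file can describe all five models: the n model ends up with far fewer repeats and channels than the x model, even though both read the same layer list.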

The file covers all five model scales. We kept the same parameter settings for every training run. After training, we compare the models side by side and visualize the results for an overall analysis.
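
One simple way to do such a side-by-side comparison is to read the per-epoch metrics that each training run logs. The sketch below assumes each run writes a `results.csv` with an `epoch` column and a `metrics/mAP50-95(B)` column; those column names are an assumption about the logging format and may need adjusting. The CSV text in the example is placeholder data, not a result from this article:

```python
import csv
import io


def best_map(results_csv_text, column="metrics/mAP50-95(B)"):
    """Return (epoch, value) for the best value of `column`, or None if empty."""
    reader = csv.DictReader(io.StringIO(results_csv_text))
    best = None
    for row in reader:
        # Header names and values may be padded with spaces, so strip both.
        row = {key.strip(): value.strip() for key, value in row.items()}
        value = float(row[column])
        epoch = int(float(row["epoch"]))
        if best is None or value > best[1]:
            best = (epoch, value)
    return best


# Placeholder CSV content for illustration only.
sample = "epoch, metrics/mAP50-95(B)\n1, 0.10\n2, 0.30\n3, 0.20\n"
print(best_map(sample))
```

In practice the text would come from reading each model's own results file, and the five best values can then be printed or plotted together.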

[Precision Curve]
The precision curve is a visual tool for evaluating how a model's precision changes under different confidence thresholds. It plots precision against the threshold, which helps us understand how the model behaves as the threshold moves. The description below is phrased for a binary (positive/negative) decision; for a detector it applies to each class separately.
Precision is the ratio of the number of samples correctly predicted as positive to the number of samples predicted as positive. Recall is the ratio of the number of samples correctly predicted as positive to the number of samples that are actually positive.
The steps for plotting a precision curve are as follows:
1. Convert predicted probabilities into binary class labels using different thresholds. Usually, a sample is classified as positive when its predicted probability is greater than the threshold, and as negative otherwise.
2. For each threshold, calculate the corresponding precision and recall.
3. Plot the precision obtained at each threshold against the threshold to form the precision curve.
4. Based on the shape and trend of the precision curve, select a threshold that meets the performance requirements.
By observing the precision curve, we can choose a threshold that balances precision and recall for our needs. Higher precision means fewer false positives, while higher recall means fewer false negatives. Depending on specific business needs and cost trade-offs, an appropriate operating point or threshold can be selected on the curve.
Precision curves are often used together with recall curves to provide a more comprehensive analysis of model performance and to help evaluate and compare different models.
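
The steps above can be sketched in a few lines. This is a minimal illustration for a single class, assuming each detection has already been marked as a true or false positive by matching against the ground truth. The function name and the toy detections are made up for the example:

```python
def precision_recall_at(threshold, detections, num_ground_truth):
    """Precision and recall when keeping detections with confidence > threshold.

    detections: list of (confidence, is_true_positive) pairs for one class.
    """
    kept = [is_tp for conf, is_tp in detections if conf > threshold]
    tp = sum(kept)
    fp = len(kept) - tp
    # With nothing predicted there are no false positives; use 1.0 by convention.
    precision = tp / (tp + fp) if kept else 1.0
    recall = tp / num_ground_truth if num_ground_truth else 0.0
    return precision, recall


# Toy data: five detections, four ground-truth defects.
toy = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
for t in (0.1, 0.3, 0.5, 0.7, 0.85):
    p, r = precision_recall_at(t, toy, num_ground_truth=4)
    print(f"threshold={t:.2f} precision={p:.3f} recall={r:.3f}")
```

Plotting the precision column against the threshold column gives the precision curve, and the recall column gives the recall curve.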

[Recall Curve]
The recall curve is a visualization tool for evaluating how a model's recall changes under different confidence thresholds. It plots recall against the threshold, which helps us understand how the model behaves as the threshold moves.
Recall is the ratio of the number of samples correctly predicted as positive to the number of samples that are actually positive. Recall is also called sensitivity or the true positive rate.
The steps for plotting a recall curve are as follows:
1. Convert predicted probabilities into binary class labels using different thresholds. Usually, a sample is classified as positive when its predicted probability is greater than the threshold, and as negative otherwise.
2. For each threshold, calculate the corresponding recall and precision.
3. Plot the recall obtained at each threshold against the threshold to form the recall curve.
4. Based on the shape and trend of the recall curve, select a threshold that meets the performance requirements.
By observing the recall curve, we can choose a threshold that balances recall and precision for our needs. Higher recall means fewer false negatives, while higher precision means fewer false positives. Depending on specific business needs and cost trade-offs, an appropriate operating point or threshold can be selected on the curve.
Recall curves are often used together with precision curves to provide a more comprehensive analysis of model performance and to help evaluate and compare different models.
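
Choosing an operating point from the recall side can be sketched as follows: given a recall target, find the lowest-confidence detection that still has to be kept to reach it. As before, this is a single-class illustration with made-up data, and the function name is hypothetical:

```python
def confidence_for_recall(detections, num_ground_truth, target_recall):
    """Return the confidence of the last detection needed to reach target_recall.

    Detections are taken from most to least confident. Any threshold just
    below the returned value keeps enough detections to hit the target.
    Returns None when the target cannot be reached.
    """
    if num_ground_truth <= 0:
        return None
    tp = 0
    for conf, is_tp in sorted(detections, key=lambda d: d[0], reverse=True):
        tp += int(is_tp)
        if tp / num_ground_truth >= target_recall:
            return conf
    return None


toy = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
for target in (0.5, 0.75, 1.0):
    print(target, confidence_for_recall(toy, 4, target))
```

For a defect inspection line, where a missed defect usually costs more than a false alarm, fixing a recall target first and then checking the precision at that point is a common way to read these curves.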

[F1 value curve]
The F1 value curve is a visualization tool for evaluating model performance under different confidence thresholds. It helps us understand the overall performance of the model by showing how the F1 score, which combines precision and recall, changes with the threshold.
The F1 score is the harmonic mean of precision and recall, so it takes both performance indicators into account. The F1 value curve helps us find a balance point between precision and recall and choose the best threshold.
The steps for plotting an F1 value curve are as follows:
1. Convert the predicted probabilities into binary class labels using different thresholds. Usually, a sample is classified as positive when its predicted probability is greater than the threshold, and as negative otherwise.
2. For each threshold, calculate the corresponding precision, recall and F1 score.
3. Plot the F1 score obtained at each threshold against the threshold to form the F1 value curve.
4. Based on the shape and trend of the F1 value curve, select a threshold that meets the performance requirements.
F1 value curves are often used together with receiver operating characteristic curves (ROC curves) to help evaluate and compare different models. Together they provide a more comprehensive analysis of model performance, allowing appropriate models and threshold settings to be selected for the specific application scenario.
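
A minimal sketch of the F1 computation and of picking the threshold with the highest F1. The (threshold, precision, recall) triples are toy values for illustration, and the function names are hypothetical:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def best_f1(points):
    """points: list of (threshold, precision, recall); return (threshold, f1) with the highest F1."""
    scored = [(t, f1_score(p, r)) for t, p, r in points]
    return max(scored, key=lambda item: item[1])


# Toy operating points for one class.
points = [(0.3, 0.75, 0.75), (0.5, 2 / 3, 0.5), (0.85, 1.0, 0.25)]
for t, p, r in points:
    print(f"threshold={t:.2f} F1={f1_score(p, r):.3f}")
print("best:", best_f1(points))
```

Because the harmonic mean is pulled toward the smaller of the two values, a threshold with perfect precision but low recall still scores poorly, which is what makes the peak of the F1 curve a reasonable default threshold.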

Overall comparison: there is no obvious performance gap between models of different parameter scales. The n-series model performs worst and the m-series model performs somewhat better, so in the end we chose the m-series model as our online inference model.

Next, let’s take a detailed look at the results of the m-series model:

[Batch example]

[Training Visualization]

If you are interested, you can try it yourself!


Origin blog.csdn.net/Together_CZ/article/details/135382432