Defect detection 4


Defect detection is widely used in cloth inspection, workpiece surface quality inspection, aerospace, and other fields. Traditional algorithms work well when the defects are regular and the scene is relatively simple, but they are no longer suitable when the features are inconspicuous, the shapes are varied, and the scene is cluttered. In recent years, recognition algorithms based on deep learning have become increasingly mature, and many companies have begun to apply deep learning algorithms to industrial scenarios.

Defect data

As shown in the figure below, cloth data is used here as an example. There are three common defect types: wear, white spots, and extra threads.

[Figure: examples of the three cloth defect types]

How is the training data made? Small patches are cropped from the original image. For example, the image above is 512x512, and it is cropped here into 64x64 patches. Taking the first defect type as an example, the figure below shows how the data is made.

[Figure: cropping 64x64 defect patches from the original image]

Note: when making defect data, the defect region should cover at least 2/3 of the cropped patch; otherwise the patch is discarded and not used as a defect sample.

Generally speaking, there is far less defect data than background data. After data augmentation, the ratio of defect to background patches is brought to 1:1, with roughly 1000 samples per class.
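As an illustration, here is a minimal sketch of this patch-making step, assuming a binary defect mask is available for each annotated image (the mask, the stride, and the helper name are assumptions, not the post's exact procedure):

import numpy as np

def make_patches(image, defect_mask, patch=64, stride=32, min_ratio=2.0 / 3.0):
    # Crop patch x patch windows; keep a window as a defect sample only if
    # the defect covers at least min_ratio of it, and keep fully clean
    # windows as background samples.
    defects, backgrounds = [], []
    h, w = image.shape[:2]
    for y in range(0, h - patch + 1, stride):
        for x in range(0, w - patch + 1, stride):
            window = image[y:y + patch, x:x + patch]
            ratio = defect_mask[y:y + patch, x:x + patch].mean()
            if ratio >= min_ratio:
                defects.append(window)
            elif ratio == 0:
                backgrounds.append(window)
    return defects, backgrounds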

Network structure

The network structure used is shown below. The input size is 64x64x3, i.e. the size of the cropped patches. Each Conv layer is followed by a BN layer; the layer parameters are as follows.

Conv1: 64x3x3
Conv2: 128x3x3
ResNetBlock and DenseNetBlock, two of each. For details, refer to the Residual Network and DenseNet papers.
Add: element-wise addition of the residual-module output and the DenseNetBlock output on the corresponding feature maps, in the same way as inside a residual module. Note that this is simply for better feature extraction; the combination does not have to be residual module + DenseNetBlock, it could also be Inception blocks or something else.
Conv3: 128x3x3
Maxpool: stride=2, size=2x2
FC1: 4096
Dropout1: 0.5
FC2: 1024
Dropout2: 0.5
Softmax: matches the number of classes to predict; here it is a binary classification.
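A minimal Keras sketch of the layer list above, for illustration only: the residual and dense blocks are reduced to one simplified block each, and the filter counts inside those blocks are assumptions rather than values from the post.

from keras.layers import (Input, Conv2D, BatchNormalization, Activation, Add,
                          Concatenate, MaxPooling2D, Flatten, Dense, Dropout)
from keras.models import Model

def conv_bn(x, filters, size=3):
    x = Conv2D(filters, (size, size), padding='same')(x)
    x = BatchNormalization()(x)
    return Activation('relu')(x)

inputs = Input(shape=(64, 64, 3))
x = conv_bn(inputs, 64)            # Conv1: 64x3x3 + BN
x = conv_bn(x, 128)                # Conv2: 128x3x3 + BN

# Simplified residual branch (stand-in for the two ResNetBlocks)
res = conv_bn(x, 128)
res = BatchNormalization()(Conv2D(128, (3, 3), padding='same')(res))
res = Activation('relu')(Add()([x, res]))

# Simplified dense branch (stand-in for the two DenseNetBlocks),
# projected back to 128 channels so it can be added to the residual branch
den = Concatenate()([x, conv_bn(x, 64)])
den = conv_bn(den, 128, size=1)

x = Add()([res, den])              # Add: element-wise sum of the two branches
x = conv_bn(x, 128)                # Conv3: 128x3x3 + BN
x = MaxPooling2D(pool_size=(2, 2), strides=2)(x)

x = Flatten()(x)
x = Dropout(0.5)(Dense(4096, activation='relu')(x))   # FC1 + Dropout1
x = Dropout(0.5)(Dense(1024, activation='relu')(x))   # FC2 + Dropout2
outputs = Dense(2, activation='softmax')(x)            # binary softmax

model_64 = Model(inputs, outputs)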

For the final loss function, Focal Loss is recommended, from the well-known paper co-authored by Kaiming He.
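A minimal Keras-style sketch of a binary focal loss, for reference only (the gamma and alpha values are common defaults, not values from the post):

from keras import backend as K

def focal_loss(gamma=2.0, alpha=0.25):
    # Focal loss for a softmax output with one-hot labels;
    # gamma down-weights easy examples, alpha balances the classes.
    def loss(y_true, y_pred):
        eps = K.epsilon()
        y_pred = K.clip(y_pred, eps, 1.0 - eps)
        pt = K.sum(y_true * y_pred, axis=-1)   # probability of the true class
        return -alpha * K.pow(1.0 - pt, gamma) * K.log(pt)
    return loss

# Usage: model.compile(optimizer='adam', loss=focal_loss(), metrics=['accuracy'])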

Once the data is ready, you can start training.

Defect detection of the entire scene image

The network trained above takes a 64x64x3 input, but the whole scene image is 512x512, which does not match the model input. What can be done? In fact, the trained model's parameters can be extracted and assigned to a new model whose input is changed to 512x512. The feature map produced by the conv3 + maxpool layers is then much larger and can be mapped back to the original image. For example, after the last maxpool layer of the original model, the output feature map is 8x8x128, where 128 is the number of channels. If the input is changed to 512x512, the output feature map becomes 64x64x128, and each 8x8 block of it corresponds to a 64x64 region of the original image. An 8x8 sliding window can therefore be slid over the 64x64x128 feature map to crop features; the cropped features are flattened and fed to the fully connected layers. The details are shown in the figure below.

The fully connected layers also need to be rebuilt as a separate model, whose input is the flattened features and whose output is the softmax layer. It is a simple little model.

Here is code that reads the trained model's parameters into another model:

from keras.layers import (Input, Conv2D, BatchNormalization, Activation,
                          MaxPooling2D, concatenate, Flatten, Dense, Dropout)
from keras.models import Model
from keras import regularizers

# Large model for extracting features
def read_big_model(inputs):
    # First convolution and max-pooling layer
    X = Conv2D(16, (3, 3), name="conv2d_1")(inputs)
    X = BatchNormalization(name="batch_normalization_1")(X)
    X = Activation('relu', name="activation_1")(X)
    X = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name="max_pooling2d_1")(X)
    # google_inception module
    conv_1 = Conv2D(32, (1, 1), padding='same', name='conv2d_2')(X)
    conv_1 = BatchNormalization(name='batch_normalization_2')(conv_1)
    conv_1 = Activation('relu', name='activation_2')(conv_1)
    conv_2 = Conv2D(32, (3, 3), padding='same', name='conv2d_3')(X)
    conv_2 = BatchNormalization(name='batch_normalization_3')(conv_2)
    conv_2 = Activation('relu', name='activation_3')(conv_2)
    conv_3 = Conv2D(32, (5, 5), padding='same', name='conv2d_4')(X)
    conv_3 = BatchNormalization(name='batch_normalization_4')(conv_3)
    conv_3 = Activation('relu', name='activation_4')(conv_3)
    pooling_1 = MaxPooling2D(pool_size=(2, 2), strides=(1, 1), padding='same', name='max_pooling2d_2')(X)
    X = concatenate([conv_1, conv_2, conv_3, pooling_1], name='merge_1')
    X = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='max_pooling2d_3')(X)  # size becomes 16x16x112
    X = Conv2D(64, (3, 3), kernel_regularizer=regularizers.l2(0.01), padding='same', name='conv2d_5')(X)
    X = BatchNormalization(name='batch_normalization_5')(X)
    X = Activation('relu', name='activation_5')(X)
    X = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), name='max_pooling2d_4')(X)  # size becomes 8x8x64
    X = Conv2D(128, (3, 3), padding='same', name='conv2d_6')(X)
    X = BatchNormalization(name='batch_normalization_6')(X)
    X = Activation('relu', name='activation_6')(X)
    X = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same', name='max_pooling2d_5')(X)  # size becomes 4x4x128
    return X

def read_big_model_classify(inputs_sec):
   X_ = Flatten(name='flatten_1')(inputs_sec)
   X_ = Dense(256, activation='relu', name="dense_1")(X_)
   X_ = Dropout(0.5, name="dropout_1")(X_)
   predictions = Dense(2, activation='softmax', name="dense_2")(X_)
   return predictions
# Build the first model: the feature-extraction network with 512x512 input
inputs = Input(shape=(512, 512, 3))
X = read_big_model(inputs)
model = Model(inputs=inputs, outputs=X)
# Load the trained model's parameters by layer name
model.load_weights('model_halcon.h5', by_name=True)
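
The second model, covering the fully connected layers, can be rebuilt and loaded in the same way. A minimal sketch, assuming the feature window fed to it is 4x4x128 as in the code comments above (it should match whatever the trained 64x64 model actually outputs):

# Build the second model: flatten + fully connected layers
inputs_sec = Input(shape=(4, 4, 128))
predictions = read_big_model_classify(inputs_sec)
model_classify = Model(inputs=inputs_sec, outputs=predictions)
# Reuse the trained fully connected weights by layer name
model_classify.load_weights('model_halcon.h5', by_name=True)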

Recognition and localization results

With the sliding-window method above, defects can be localized on the original image: an 8x8 window on the feature map corresponds to a 64x64 region of the original image. Because the window slides in steps (here 16 pixels horizontally and vertically on the original image), a defect is usually localized by more than one window, which raises the question of localization accuracy. A voting method is used: for every pixel of the original image, count how many windows marked it as defective; when the count exceeds a specified threshold, the pixel is judged to be a defect pixel.
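A rough sketch of this sliding-window voting, using the two models built above (scene_image, the window size, and the vote threshold are illustrative assumptions, not the post's exact values):

import numpy as np

win = 4                                                   # feature-map window matching the 64x64 model
feature_map = model.predict(scene_image[np.newaxis])[0]   # e.g. HxWx128
scale = scene_image.shape[0] // feature_map.shape[0]      # feature-cell -> pixel scale
votes = np.zeros(scene_image.shape[:2], dtype=np.int32)

for i in range(feature_map.shape[0] - win + 1):
    for j in range(feature_map.shape[1] - win + 1):
        crop = feature_map[i:i + win, j:j + win, :][np.newaxis]
        prob = model_classify.predict(crop)[0]
        if np.argmax(prob) == 1:                          # window classified as defect
            y0, x0 = i * scale, j * scale
            votes[y0:y0 + win * scale, x0:x0 + win * scale] += 1

defect_mask = votes > 3                                   # pixels with enough votes are defects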

The recognition result is shown in the figure below:

[Figure: recognition and localization result on the scene image]

Some tricks

For the case above, the 64x64 localization boxes are not accurate enough. One option is to also train a 32x32 model, apply it in the same way as the 64x64 model, and then vote using both the 32x32 and the 64x64 localization results. However, this raises a problem: the runtime increases considerably, so it should be used with caution.

When the background and the foreground do not differ much, try not to make the network too deep: at large depths the extracted features become almost identical and lose their discriminative power. This is also why object detection is not used here; those detection networks are typically 50+ layers deep, yet the results are poor even with a residual backbone.

But when the background and the foreground differ greatly, a deeper network can be chosen, and that is where object detection methods come in handy.


Source: blog.csdn.net/qq_29788741/article/details/131282282