We have already covered lightweight recognition models extensively in previous blog posts; if you are interested, you can read them here:
The core idea of this article is to develop and build a ball screw drive surface defect image recognition system, based on GhostNet, for industrial production and manufacturing scenarios. First, let's look at an example of the results:
GhostNet is a lightweight convolutional neural network specifically designed for use on mobile devices. Its main building block is the Ghost module, a novel plug-and-play module. The original intention of the Ghost module is to generate more feature maps while using fewer parameters.
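The Ghost module's idea can be sketched as follows: a standard convolution produces a small number of "intrinsic" feature maps, and cheap depthwise operations derive the remaining "ghost" maps from them, with the two sets concatenated. Below is a minimal PyTorch sketch of this idea; it is simplified from the paper, and the class name, `ratio`, and kernel sizes follow common implementations but are illustrative, not the project's exact code:

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Produce `oup` feature maps: a few from a normal convolution,
    the rest from cheap depthwise convs applied to those primary maps."""
    def __init__(self, inp, oup, kernel_size=1, ratio=2, dw_size=3):
        super().__init__()
        init_channels = -(-oup // ratio)      # ceil(oup / ratio)
        new_channels = oup - init_channels
        # Primary (expensive) convolution: few output channels
        self.primary_conv = nn.Sequential(
            nn.Conv2d(inp, init_channels, kernel_size,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(init_channels),
            nn.ReLU(inplace=True),
        )
        # Cheap operation: depthwise conv generating the "ghost" maps
        self.cheap_operation = nn.Sequential(
            nn.Conv2d(init_channels, new_channels, dw_size,
                      padding=dw_size // 2, groups=init_channels, bias=False),
            nn.BatchNorm2d(new_channels),
            nn.ReLU(inplace=True),
        )
        self.oup = oup

    def forward(self, x):
        x1 = self.primary_conv(x)
        x2 = self.cheap_operation(x1)
        return torch.cat([x1, x2], dim=1)[:, :self.oup]

x = torch.randn(1, 16, 32, 32)
y = GhostModule(16, 32)(x)
print(y.shape)  # torch.Size([1, 32, 32, 32])
```

The point of the design is that the depthwise "cheap operation" costs far fewer FLOPs per output channel than a full convolution, so the module reaches the same output width at a fraction of the cost.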
The paper is available here; see below:
The authors have also open-sourced the project; the address is here, as shown below:
You can read the official code examples in detail and then develop and build models based on your own dataset.
Here is the core implementation part of GhostNet, as shown below:
import torch.nn as nn

# Note: _make_divisible and GhostBottleneck come from the official GhostNet repository.
class GhostNet(nn.Module):
    def __init__(self, cfgs, num_classes=1000, width_mult=1.0):
        super(GhostNet, self).__init__()
        self.cfgs = cfgs
        # Stem: 3x3 stride-2 convolution
        output_channel = _make_divisible(16 * width_mult, 4)
        layers = [
            nn.Sequential(
                nn.Conv2d(3, output_channel, 3, 2, 1, bias=False),
                nn.BatchNorm2d(output_channel),
                nn.ReLU(inplace=True),
            )
        ]
        input_channel = output_channel
        # Stack GhostBottleneck blocks according to cfgs:
        # each entry is (kernel_size, expansion_size, out_channels, use_se, stride)
        block = GhostBottleneck
        for k, exp_size, c, use_se, s in self.cfgs:
            output_channel = _make_divisible(c * width_mult, 4)
            hidden_channel = _make_divisible(exp_size * width_mult, 4)
            layers.append(
                block(input_channel, hidden_channel, output_channel, k, s, use_se)
            )
            input_channel = output_channel
        self.features = nn.Sequential(*layers)
        # Squeeze head: 1x1 conv + global pooling
        # (exp_size here is the expansion size of the last bottleneck)
        output_channel = _make_divisible(exp_size * width_mult, 4)
        self.squeeze = nn.Sequential(
            nn.Conv2d(input_channel, output_channel, 1, 1, 0, bias=False),
            nn.BatchNorm2d(output_channel),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d((1, 1)),
        )
        input_channel = output_channel
        output_channel = 1280
        self.classifier = nn.Sequential(
            nn.Linear(input_channel, output_channel, bias=False),
            nn.BatchNorm1d(output_channel),
            nn.ReLU(inplace=True),
            nn.Dropout(0.2),
            nn.Linear(output_channel, num_classes),
        )
        self._initialize_weights()

    def forward(self, x, need_fea=False):
        if need_fea:
            features, features_fc = self.forward_features(x, need_fea)
            x = self.classifier(features_fc)
            return features, features_fc, x
        else:
            x = self.forward_features(x)
            x = self.classifier(x)
            return x

    def forward_features(self, x, need_fea=False):
        if need_fea:
            input_size = x.size(2)
            # Collect intermediate feature maps at 1/4, 1/8, 1/16, 1/32 resolution
            scale = [4, 8, 16, 32]
            features = [None, None, None, None]
            for idx, layer in enumerate(self.features):
                x = layer(x)
                if input_size // x.size(2) in scale:
                    features[scale.index(input_size // x.size(2))] = x
            x = self.squeeze(x)
            return features, x.view(x.size(0), -1)
        else:
            x = self.features(x)
            x = self.squeeze(x)
            return x.view(x.size(0), -1)

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode="fan_out", nonlinearity="relu")
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def cam_layer(self):
        # Last feature block, useful for CAM-style visualization
        return self.features[-1]
Let’s take a brief look at the data set:
The dataset distribution visualization looks like this:
The distribution visualization is implemented with the t-SNE algorithm, and it is clear that the two classes of data are well separated.
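A visualization like this can be reproduced with scikit-learn's t-SNE. The sketch below is illustrative only: random Gaussian features stand in for the per-image embeddings that would come from the model's `forward_features`, and the class sizes are made up:

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in for per-image feature vectors extracted by the model
features = np.vstack([rng.normal(0, 1, (50, 64)),    # class 0
                      rng.normal(5, 1, (50, 64))])   # class 1
labels = np.array([0] * 50 + [1] * 50)

# Project the 64-D features down to 2-D for plotting
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)
print(emb.shape)  # (100, 2)
```

The 2-D `emb` array can then be scattered with matplotlib, coloring each point by its label, to produce the kind of class-separation plot shown above.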
Overall, the model is relatively easy to train and the recognition task is not especially difficult. Let's take a look at the loss curve:
The accuracy curve:
It can be seen that the accuracy of the model is very high.
Examples of the effect of applying commonly used data augmentation algorithms to the original images are shown below:
The confusion matrix is as follows:
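A confusion matrix like this can be computed with scikit-learn. The labels below are illustrative stand-ins; in practice `y_true` and `y_pred` would come from running the trained model on the test split:

```python
from sklearn.metrics import confusion_matrix

# Illustrative ground-truth and predicted labels for the two classes
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 1, 1, 1, 0, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[3 1]
#  [1 3]]
```

Rows are true classes and columns are predicted classes, so the diagonal counts correct predictions and the off-diagonal cells show which class is being confused with which.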
Developing this project has been a process of continuous optimization and learning. If you are interested, you are welcome to participate!