Group Convolution in Deep Learning

Recently I was reading "Interleaved Group Convolutions for Deep Neural Networks" by Wang Jingdong, a researcher at MSRA. The paper mentions group convolution many times, so I took some time to learn about it.

Group convolution first appeared in AlexNet. To work around insufficient GPU memory, the network was split across two GTX 580 graphics cards for training. Alex Krizhevsky found that grouped convolution increases the diagonal correlation between filters and reduces the number of trainable parameters, making the network less prone to overfitting; the effect is similar to regularization.
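The parameter saving is easy to quantify: each filter in a grouped layer spans only 1/M of the input channels, so the layer has 1/M as many weights. Below is a minimal sketch; `conv_params` is a hypothetical helper, and the 96-in / 256-out / 5x5 figures follow AlexNet's conv2 layer, which used two groups.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a conv layer (bias ignored).

    With grouping, each filter only spans c_in // groups input
    channels, so the total shrinks by a factor of `groups`.
    """
    return c_out * (c_in // groups) * k * k

# AlexNet conv2-style layer: 96 input channels, 256 filters, 5x5 kernels
standard = conv_params(96, 256, 5)            # single group
grouped = conv_params(96, 256, 5, groups=2)   # AlexNet's two-GPU split
print(standard, grouped)  # grouped uses half the weights
```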

Suppose the output feature map of the previous layer has N channels (channel = N), which means the previous layer has N convolution kernels. Suppose also that the number of groups is M. A group convolutional layer first divides the channels into M groups, so each group contains N/M channels and is convolved independently. After each group's convolution is complete, the outputs are concatenated along the channel dimension to form the output of the layer.
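The split-convolve-concatenate operation above can be sketched in plain NumPy (valid padding, stride 1; the function name and shapes are my own illustration, not a library API):

```python
import numpy as np

def group_conv2d(x, w, groups):
    """Grouped 2D convolution sketch (valid padding, stride 1).

    x: input feature map, shape (C_in, H, W)
    w: filters, shape (C_out, C_in // groups, kH, kW)
    Channels are split into `groups` groups; each filter group
    sees only its own slice of input channels, and the group
    outputs are concatenated along the channel dimension.
    """
    c_in, H, W = x.shape
    c_out, c_per_group, kH, kW = w.shape
    assert c_in % groups == 0 and c_out % groups == 0
    assert c_per_group == c_in // groups
    out_per_group = c_out // groups
    oH, oW = H - kH + 1, W - kW + 1
    out = np.zeros((c_out, oH, oW))
    for g in range(groups):
        # this group's input channels and its filters
        xg = x[g * c_per_group:(g + 1) * c_per_group]
        wg = w[g * out_per_group:(g + 1) * out_per_group]
        for o in range(out_per_group):
            for i in range(oH):
                for j in range(oW):
                    out[g * out_per_group + o, i, j] = np.sum(
                        xg[:, i:i + kH, j:j + kW] * wg[o])
    return out
```

With C_in = 4, C_out = 6 and groups = 2, each of the 6 filters has only 2 input channels instead of 4, which is exactly where the parameter saving comes from.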

The figure below shows the structure of AlexNet. It can be seen that the network is divided into upper and lower parts.
[Figure: AlexNet architecture, split into an upper and a lower stream]
The following figure visualizes the learned conv1 filters of the upper and lower parts.
[Figure: visualization of AlexNet conv1 filters, one group per row]

AlexNet conv1 filter separation: as noted by the authors, filter groups appear to structure learned filters into two distinct groups, black-and-white and colour filters.
The visualization of the first convolutional layer shows that after training, one group has learned mostly black-and-white filters while the other has learned mostly colour filters.

The figure below shows a normal, ungrouped convolutional layer from a three-dimensional perspective; each filter corresponds to one output channel. As the network gets deeper, the number of channels grows sharply while the spatial dimensions shrink: the convolutional layers have more and more kernels, but the convolution and pooling operations make the feature maps smaller and smaller. In deep networks, the channel dimension therefore becomes increasingly important.

[Figure: a standard (ungrouped) convolutional layer]
The figure below shows a grouped convolutional CNN structure. The filters are divided into two groups, and each group operates on only half of the input feature map's channels.
[Figure: a convolutional layer with two filter groups]

Reference: A Tutorial on Filter Groups (Grouped Convolution)
