Depthwise separable convolution

Depthwise separable convolution extracts feature maps by combining two steps: a depthwise (DW) convolution, which filters each channel separately, and a pointwise (PW) convolution, which mixes the channels with 1x1 kernels.

Regular Convolution

[Figure: regular convolution of a 5x5x3 input with four 3x3x3 kernels]
As shown in the figure above, take a 5x5x3 input image and apply a regular convolution with 3x3 kernels and 4 output channels. The kernel tensor then has shape 3x3x3x4, and the convolution produces four feature maps.
Number of parameters of the convolutional layer: N = 4 × 3 × 3 × 3 = 108
Number of multiplications: C = 3 × 3 × (5-2) × (5-2) × 3 × 4 = 972, where 5-2 = 3 is the output height and width for stride 1 without padding
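The two counts above can be checked by actually running the convolution and counting the multiplications. The following is only a minimal sketch in plain Python, assuming stride 1 and no padding as in the example; the function name `conv2d` and the all-ones input and kernel values are made up for illustration and are not from the original post.

```python
def conv2d(image, kernels):
    """Regular convolution, stride 1, no padding (illustrative sketch).

    image:   nested lists indexed [C_in][H][W]
    kernels: nested lists indexed [C_out][C_in][k][k]
    Returns (output, mults): output is [C_out][H-k+1][W-k+1],
    mults is the number of multiplications performed.
    """
    c_in, h, w = len(image), len(image[0]), len(image[0][0])
    k = len(kernels[0][0])
    out_h, out_w = h - k + 1, w - k + 1
    output, mults = [], 0
    for kernel in kernels:                    # one output feature map per kernel
        fmap = [[0.0] * out_w for _ in range(out_h)]
        for y in range(out_h):
            for x in range(out_w):
                acc = 0.0
                for c in range(c_in):         # every kernel sums over all input channels
                    for i in range(k):
                        for j in range(k):
                            acc += image[c][y + i][x + j] * kernel[c][i][j]
                            mults += 1
                fmap[y][x] = acc
        output.append(fmap)
    return output, mults


# 5x5x3 input and four 3x3x3 kernels, all filled with ones (arbitrary values).
image = [[[1.0] * 5 for _ in range(5)] for _ in range(3)]
kernels = [[[[1.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(4)]

output, mults = conv2d(image, kernels)
params = sum(len(row) for kernel in kernels for ch in kernel for row in ch)

print(len(output), len(output[0]), len(output[0][0]))  # 4 3 3
print(params, mults)                                   # 108 972
```

Each of the 4 × 3 × 3 output values costs 3 × 3 × 3 multiplications, which is where the factor structure of C comes from.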

Depthwise Separable Convolution

  • Depthwise (channel-by-channel) convolution

In depthwise convolution, each kernel is responsible for exactly one channel, and each channel is convolved by exactly one kernel.

[Figure: depthwise convolution of a 5x5x3 input with one 3x3 kernel per channel]
As shown in the figure above, take the same 5x5x3 input image and 3x3 kernels. The number of kernels equals the number of input channels, so channels and kernels correspond one-to-one. The kernel tensor has shape 3x3x3, and the convolution produces three feature maps.
Number of parameters of the convolutional layer: N = 3 × 3 × 3 = 27
Number of multiplications: C = 3 × 3 × (5-2) × (5-2) × 3 = 243

After depthwise convolution, the number of feature maps equals the number of input channels, so this step cannot increase the channel count. It also convolves each input channel independently, so it makes no use of the information that different channels carry at the same spatial position. A pointwise convolution is therefore needed to combine these feature maps into new ones.
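To make the channel independence concrete, here is a small plain-Python sketch of the depthwise step, assuming stride 1 and no padding. The name `depthwise_conv` and the input values (channels filled with 1, 2, and 3) are chosen only for illustration.

```python
def depthwise_conv(image, kernels):
    """Depthwise convolution, stride 1, no padding (illustrative sketch).

    image:   nested lists indexed [C][H][W]
    kernels: nested lists indexed [C][k][k], one kernel per channel
    Returns C feature maps, each of size (H-k+1) x (W-k+1).
    """
    k = len(kernels[0])
    out_h = len(image[0]) - k + 1
    out_w = len(image[0][0]) - k + 1
    output = []
    for channel, kernel in zip(image, kernels):  # channel c only ever meets kernel c
        fmap = [[sum(channel[y + i][x + j] * kernel[i][j]
                     for i in range(k) for j in range(k))
                 for x in range(out_w)]
                for y in range(out_h)]
        output.append(fmap)
    return output


# 5x5x3 input whose channels are filled with 1.0, 2.0 and 3.0 respectively.
image = [[[float(c + 1)] * 5 for _ in range(5)] for c in range(3)]
# Three 3x3 kernels of ones, one per channel.
kernels = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]

dw_maps = depthwise_conv(image, kernels)
print(len(dw_maps), len(dw_maps[0]), len(dw_maps[0][0]))  # 3 3 3
print([fmap[0][0] for fmap in dw_maps])                   # [9.0, 18.0, 27.0]
```

Each output map reflects only its own input channel (9 × 1, 9 × 2, 9 × 3); nothing is summed across channels, and the result has as many maps as the input has channels.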

  • Pointwise convolution

Pointwise convolution works much like a regular convolution, except that each kernel has size 1×1×M, where M is the number of channels in the previous layer. At every spatial position, it forms a weighted combination of the feature maps from the previous step along the depth direction, producing a new feature map. The number of output feature maps equals the number of kernels.

[Figure: pointwise convolution of three 3x3 feature maps with four 1x1x3 kernels]
As shown in the figure above, the feature maps produced by the depthwise convolution are convolved with kernels of size 1x1x3. The kernel tensor has shape 1x1x3x4, and the convolution produces four feature maps.
Number of parameters of the convolutional layer: N = 1 × 1 × 3 × 4 = 12
Number of multiplications: C = 1 × 1 × 3 × 3 × 3 × 4 = 108
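The pointwise step can be sketched in the same way. This is a minimal plain-Python illustration, not a reference implementation; the name `pointwise_conv`, the input values, and the weight matrix are assumptions made up for the example.

```python
def pointwise_conv(feature_maps, weights):
    """Pointwise (1x1) convolution (illustrative sketch).

    feature_maps: nested lists indexed [M][H][W]
    weights:      nested lists indexed [C_out][M], one 1x1xM kernel per row
    Returns C_out feature maps of the same H x W size as the input.
    """
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    return [[[sum(wc * fmap[y][x] for wc, fmap in zip(kernel, feature_maps))
              for x in range(w)]
             for y in range(h)]
            for kernel in weights]


# Three 3x3 feature maps filled with 1.0, 2.0 and 3.0 respectively.
feature_maps = [[[float(c + 1)] * 3 for _ in range(3)] for c in range(3)]
# Four 1x1x3 kernels: three that copy a single channel, one that sums all three.
weights = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
]

pw_maps = pointwise_conv(feature_maps, weights)
print(len(pw_maps), len(pw_maps[0]), len(pw_maps[0][0]))  # 4 3 3
print([fmap[0][0] for fmap in pw_maps])                   # [1.0, 2.0, 3.0, 6.0]
```

The spatial size is unchanged, the channel count goes from 3 to 4, and the last kernel shows the cross-channel mixing that the depthwise step cannot do.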

Comparison

Regular convolution:
number of parameters N = 108
number of multiplications C = 972

Depthwise separable convolution:
number of parameters N = 27 + 12 = 39
number of multiplications C = 243 + 108 = 351

For the same input and the same four output feature maps, the depthwise separable convolution needs about one third of the parameters of the regular convolution (39/108 ≈ 0.36), and the number of multiplications shrinks by the same factor (351/972 ≈ 0.36). For the same parameter budget, a network built from separable convolutions can therefore have more layers.
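The comparison generalizes beyond this example. With k×k kernels, C_in input channels, and C_out output channels, the ratio of separable to regular cost is (k²·C_in + C_in·C_out) / (k²·C_in·C_out) = 1/C_out + 1/k². The sketch below checks this against the numbers above; the helper names `regular_cost` and `separable_cost` are hypothetical, and stride 1 without padding is assumed as before.

```python
from fractions import Fraction


def regular_cost(h, w, c_in, c_out, k):
    """(parameters, multiplications) of a regular k x k convolution."""
    out_h, out_w = h - k + 1, w - k + 1
    params = k * k * c_in * c_out
    return params, params * out_h * out_w


def separable_cost(h, w, c_in, c_out, k):
    """(parameters, multiplications) of depthwise + pointwise convolution."""
    out_h, out_w = h - k + 1, w - k + 1
    dw_params = k * k * c_in      # one k x k kernel per input channel
    pw_params = c_in * c_out      # c_out kernels of size 1 x 1 x c_in
    params = dw_params + pw_params
    return params, params * out_h * out_w


reg = regular_cost(5, 5, 3, 4, 3)
sep = separable_cost(5, 5, 3, 4, 3)
print(reg, sep)                                   # (108, 972) (39, 351)

ratio = Fraction(sep[0], reg[0])
print(ratio)                                      # 13/36
print(ratio == Fraction(1, 4) + Fraction(1, 9))   # True: 1/C_out + 1/k^2
```

Because the ratio is 1/C_out + 1/k², the savings grow with the number of output channels: with 3x3 kernels and many output channels it approaches 1/9.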

Origin blog.csdn.net/qq_40042726/article/details/122547348