Depthwise Convolution and Pointwise Convolution

Original link: https://blog.csdn.net/tintinetmilou/article/details/81607721

Depthwise (DW) convolution and Pointwise (PW) convolution are together called Depthwise Separable Convolution (see Google's Xception). The structure resembles a conventional convolution and can likewise be used to extract features, but it needs fewer parameters and less computation than a conventional convolution. For that reason it shows up in lightweight networks such as MobileNet.

Conventional convolution operation

Consider a 5 × 5 pixel, three-channel color input image (shape 5 × 5 × 3) passed through a convolution layer with 3 × 3 kernels. Assuming 4 output channels, the kernel tensor has shape 3 × 3 × 3 × 4 and the layer outputs 4 feature maps. With "same" padding each feature map keeps the input size (5 × 5); with no padding it shrinks to 3 × 3.

[Figure: conventional convolution]
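The output sizes quoted above can be checked with a small sketch. The function names here are illustrative and not from the original post, and the stride is assumed to be 1 unless stated otherwise.

```python
def conv_output_size(in_size, kernel_size, padding="valid", stride=1):
    """Output length of a convolution along one spatial dimension."""
    if padding == "same":
        # "same" padding: output = ceil(input / stride)
        return -(-in_size // stride)
    # "valid" (no padding): the kernel must fit entirely inside the input
    return (in_size - kernel_size) // stride + 1


def conv_output_shape(height, width, kernel_size, out_channels, padding="valid"):
    """Shape (H, W, N) of the output of a conventional convolution layer."""
    return (
        conv_output_size(height, kernel_size, padding),
        conv_output_size(width, kernel_size, padding),
        out_channels,
    )


# The running example: 5 x 5 x 3 input, 3 x 3 kernels, 4 output channels
print(conv_output_shape(5, 5, 3, 4, padding="same"))   # (5, 5, 4)
print(conv_output_shape(5, 5, 3, 4, padding="valid"))  # (3, 3, 4)
```

Note that the number of input channels does not appear in the output shape: each kernel spans all input channels and collapses them into one feature map.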

Depthwise Separable Convolution

Depthwise Separable Convolution decomposes a full convolution into two steps: Depthwise Convolution followed by Pointwise Convolution.
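To make the savings concrete, the sketch below counts weights for the running example (3 × 3 kernels, 3 input channels, 4 output channels). It assumes bias terms are ignored, that the depthwise step uses one k × k kernel per input channel, and that the pointwise step uses 1 × 1 × M kernels, one per output channel. The function names are illustrative.

```python
def standard_conv_params(k, in_channels, out_channels):
    """Weights in a conventional convolution: k x k x M x N."""
    return k * k * in_channels * out_channels


def separable_conv_params(k, in_channels, out_channels):
    """Weights in a depthwise separable convolution (bias ignored)."""
    depthwise = k * k * in_channels          # one k x k kernel per input channel
    pointwise = in_channels * out_channels   # N kernels of shape 1 x 1 x M
    return depthwise + pointwise


standard = standard_conv_params(3, 3, 4)
separable = separable_conv_params(3, 3, 4)
print(standard)   # 108
print(separable)  # 39
```

For this example the separable version needs 39 weights instead of 108, roughly 36% of the original count.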

Depthwise Convolution

Unlike a conventional convolution, each kernel in a Depthwise Convolution is responsible for a single channel, and each channel is convolved by only one kernel. In the conventional convolution described above, every kernel operates on all channels of the input image at once.
Take the same 5 × 5 pixel, three-channel color input image (shape 5 × 5 × 3). Depthwise Convolution is the first step, and unlike the conventional convolution above it is carried out entirely within two-dimensional planes. The number of kernels equals the number of channels in the previous layer (kernels and channels correspond one to one). A three-channel image therefore yields 3 feature maps after this operation (with "same" padding each is 5 × 5, the same size as the input), as shown below.

[Figure: Depthwise Convolution]
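The one-kernel-per-channel rule can be sketched in plain Python with nested lists. This is a minimal illustration and not the original author's code: it assumes stride 1 and no padding (so a 5 × 5 channel becomes 3 × 3, not 5 × 5), and it computes cross-correlation, which is what deep learning frameworks usually call convolution.

```python
def conv2d_single(channel, kernel):
    """'Valid' 2-D cross-correlation of one channel with one kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [
        [
            sum(channel[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def depthwise_conv(image, kernels):
    """image: list of C channels (H x W each); kernels: list of C 2-D kernels."""
    if len(image) != len(kernels):
        raise ValueError("depthwise convolution needs one kernel per channel")
    # Each channel is filtered on its own; channels are never mixed here.
    return [conv2d_single(ch, k) for ch, k in zip(image, kernels)]


# Three 5 x 5 channels filled with 1, 2 and 3; three 3 x 3 kernels of ones
image = [[[c + 1] * 5 for _ in range(5)] for c in range(3)]
kernels = [[[1] * 3 for _ in range(3)] for _ in range(3)]

out = depthwise_conv(image, kernels)
print(len(out), len(out[0]), len(out[0][0]))      # 3 3 3
print(out[0][0][0], out[1][0][0], out[2][0][0])   # 9 18 27
```

Three channels go in and three feature maps come out, and each output depends only on its own input channel.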

After Depthwise Convolution, the number of feature maps equals the number of input channels, so this step cannot increase the number of feature maps. It also convolves each input channel independently, so it makes no use of the information that different channels carry at the same spatial position. A Pointwise Convolution is therefore needed to combine these feature maps into new ones.

Pointwise Convolution

Pointwise Convolution works much like a conventional convolution, except that its kernels have size 1 × 1 × M, where M is the number of channels in the previous layer. This step therefore forms a weighted combination of the previous step's feature maps along the depth direction to produce each new feature map. There are as many output feature maps as there are kernels, as shown below.

[Figure: Pointwise Convolution]
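The weighted combination along the depth direction can be sketched as follows. This is an illustrative plain-Python version with made-up weights, assuming M = 3 input feature maps of size 3 × 3 and 4 kernels of shape 1 × 1 × 3, each written as a list of 3 scalars.

```python
def pointwise_conv(feature_maps, weights):
    """feature_maps: M maps of size H x W; weights: N kernels of M scalars each."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [
            # At each position, mix the M channel values with the kernel's M weights
            [sum(wm * fm[i][j] for wm, fm in zip(kernel, feature_maps))
             for j in range(w)]
            for i in range(h)
        ]
        for kernel in weights
    ]


# Three 3 x 3 feature maps filled with 1, 2 and 3
maps = [[[c + 1] * 3 for _ in range(3)] for c in range(3)]
# Four hypothetical 1 x 1 x 3 kernels
weights = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]

mixed = pointwise_conv(maps, weights)
print(len(mixed), len(mixed[0]), len(mixed[0][0]))  # 4 3 3
print(mixed[0][0][0], mixed[3][0][0])               # 1 6
```

The spatial size is unchanged, while the number of output feature maps is set by the number of kernels, which is how this step expands 3 channels to 4.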

Origin blog.csdn.net/qq_32642107/article/details/102729491