Depthwise Convolution and Depthwise Separable Convolution

"dw weight" most likely refers to the weight parameter of a depthwise convolution; "dw" is an abbreviation of "depthwise". Depthwise separable convolution is a lightweight operation that reduces computation by splitting a standard convolution into two parts: a depthwise convolution and a pointwise convolution. It is widely used in models for mobile devices, where it reduces both the amount of computation and the number of parameters and so improves inference speed. Within a depthwise separable convolution, "dw weight" usually refers to the weights of the depthwise step.

**Depthwise separable convolution** is a technique used in convolutional neural networks to reduce the number of parameters and the amount of computation.

Its parameter count can be worked out in two steps:

Depthwise Convolution
Depthwise convolution convolves each input channel separately. Suppose the input has size $H \times W \times C_{in}$, where $C_{in}$ is the number of input channels. A standard convolution kernel over this input has size $K \times K \times C_{in}$, where $K$ is the kernel size. In depthwise convolution, each kernel has size $K \times K \times 1$, and one such kernel is applied to each input channel. The parameter count of depthwise convolution is therefore $K \times K \times C_{in}$, and the output feature map still has $C_{in}$ channels.
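The per-channel behaviour described above can be sketched in plain Python. This is a minimal illustration, not a framework implementation: the function names `depthwise_conv2d` and `depthwise_param_count` are made up for this example, the input is stored channel-first as nested lists, and stride 1 with no padding is assumed. As in most deep learning frameworks, the kernel is applied without flipping.

```python
def depthwise_conv2d(x, kernels):
    """Depthwise convolution sketch (stride 1, no padding; both are assumptions).

    x:       input as nested lists, shape [C_in][H][W]
    kernels: one K x K filter per input channel, shape [C_in][K][K]
    returns: nested lists of shape [C_in][H-K+1][W-K+1]
    """
    c_in = len(x)
    assert len(kernels) == c_in, "need exactly one kernel per input channel"
    h, w = len(x[0]), len(x[0][0])
    k = len(kernels[0])
    out = []
    for c in range(c_in):  # each channel is filtered independently
        channel_out = []
        for i in range(h - k + 1):
            row = []
            for j in range(w - k + 1):
                acc = 0.0
                for di in range(k):
                    for dj in range(k):
                        acc += x[c][i + di][j + dj] * kernels[c][di][dj]
                row.append(acc)
            channel_out.append(row)
        out.append(channel_out)  # output keeps C_in channels
    return out


def depthwise_param_count(k, c_in):
    """Number of weights in the depthwise step (bias terms ignored)."""
    return k * k * c_in
```

Because channel `c` of the output depends only on channel `c` of the input, no information is mixed across channels at this stage; that mixing is left to the pointwise step.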

Pointwise Convolution
Pointwise convolution is a convolution with a $1 \times 1$ kernel. Suppose the input has size $H \times W \times C_{in}$ and the output has size $H \times W \times C_{out}$, where $C_{out}$ is the number of output channels. Each kernel has size $1 \times 1 \times C_{in}$ and, at every spatial position, combines the values of all input channels. One such kernel is needed per output channel, so the parameter count of pointwise convolution is $C_{in} \times C_{out}$.
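A pointwise convolution amounts to a weighted sum over channels at each pixel. The following sketch shows this under the assumption of a channel-first nested-list layout; `pointwise_conv2d` is a hypothetical helper name chosen for this example, and bias terms are left out.

```python
def pointwise_conv2d(x, weights):
    """1 x 1 convolution sketch.

    x:       input as nested lists, shape [C_in][H][W]
    weights: one length-C_in filter per output channel, shape [C_out][C_in]
    returns: nested lists of shape [C_out][H][W]
    """
    c_in = len(x)
    h, w = len(x[0]), len(x[0][0])
    out = []
    for filt in weights:  # one 1 x 1 x C_in kernel per output channel
        assert len(filt) == c_in, "each filter must cover all input channels"
        out.append([
            [sum(filt[c] * x[c][i][j] for c in range(c_in)) for j in range(w)]
            for i in range(h)
        ])
    return out


def pointwise_param_count(c_in, c_out):
    """Number of weights in the pointwise step (bias terms ignored)."""
    return c_in * c_out
```

The spatial size is unchanged; only the channel dimension goes from `C_in` to `C_out`.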

Combining depthwise convolution with pointwise convolution yields depthwise separable convolution. Its parameter count is $K \times K \times C_{in} + C_{in} \times C_{out}$, which is much smaller than the $K \times K \times C_{in} \times C_{out}$ of an ordinary convolution. Because each layer needs fewer parameters, depthwise separable convolutions make it possible to build deeper and wider networks within the same parameter budget.
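To get a feel for the size of the saving, the two formulas can be evaluated for concrete values. The layer sizes below ($K = 3$, $C_{in} = 64$, $C_{out} = 128$) are arbitrary example values, the function names are made up for this sketch, and bias terms are ignored. Dividing the two formulas gives a ratio of $1/C_{out} + 1/K^2$.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in an ordinary convolution: K * K * C_in * C_out."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution:
    depthwise (K * K * C_in) plus pointwise (C_in * C_out)."""
    return k * k * c_in + c_in * c_out


# Example layer sizes, chosen only for illustration
k, c_in, c_out = 3, 64, 128
standard = standard_conv_params(k, c_in, c_out)
separable = separable_conv_params(k, c_in, c_out)

print(standard)   # 73728
print(separable)  # 8768
print(round(separable / standard, 4))  # 0.1189, equal to 1/c_out + 1/k**2
```

For a 3 × 3 kernel the ratio is dominated by the $1/K^2$ term, so the separable version needs roughly one ninth of the weights once $C_{out}$ is large.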

Origin blog.csdn.net/qq_37464479/article/details/129238440