The difference between a 1×1 convolution and a fully connected layer

        While building a convolutional neural network that needed a dimensionality-reduction step, this question came up: the channel dimension can be reduced with a 1×1 convolution, but a fully connected (FC) layer also seems able to reduce dimensionality. Which one should be chosen?

Mathematical definition of the fully connected layer and the 1×1 convolution

The mathematical definitions of the two layers are essentially the same: both multiply the input by a weight and add a bias, y = w·x + b.

The difference

1×1 convolution

       A 1×1 kernel holds a single value a. On a single-channel image it has no real effect: it is equivalent to multiplying every pixel of the image by a, and no new information is produced.

      But if the image is multi-channel, say with C channels, then by the definition above a 1×1 convolution linearly weights the C values at the same pixel position to produce one new value. With N such kernels, the channel count can be raised or lowered from C to N channels. In other words, a 1×1 convolution is a linear weighting at the channel level.
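To make the channel-level weighting concrete, here is a minimal NumPy sketch (the sizes C=3, N=2, 4×5 and the helper name conv1x1 are illustrative assumptions, not from the original post):

```python
import numpy as np

# Hypothetical sizes: a feature map with C=3 channels on a 4x5 grid,
# reduced to N=2 channels by a 1x1 convolution.
C, N, H, W = 3, 2, 4, 5
rng = np.random.default_rng(0)
x = rng.standard_normal((C, H, W))      # input feature map
kernels = rng.standard_normal((N, C))   # N kernels, each of size 1x1xC
bias = rng.standard_normal(N)

def conv1x1(x, kernels, bias):
    """1x1 convolution: at every pixel, linearly mix the C channel values."""
    out = np.empty((kernels.shape[0], *x.shape[1:]))
    for n in range(kernels.shape[0]):
        # Weighted sum over channels; the same weights apply at every position.
        out[n] = np.tensordot(kernels[n], x, axes=(0, 0)) + bias[n]
    return out

y = conv1x1(x, kernels, bias)
print(y.shape)  # (2, 4, 5): channels went from C=3 to N=2, spatial size unchanged
```

Note that the spatial dimensions are untouched; only the channel axis is remixed.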

Fully connected layer

          A fully connected layer is similar to a 1×1 convolution, but with one difference: the 1×1 convolution linearly weights all channels at a single pixel position, while the fully connected layer first flattens the whole input (or applies a pooling layer) into a one-dimensional vector, i.e. it performs linear weighting at the pixel level.

In addition, a fully connected layer in the broad sense also includes an activation function.
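The flatten-then-weight behavior can be sketched in a few NumPy lines (the sizes C=3, 4×5, D=10 are illustrative assumptions):

```python
import numpy as np

# Hypothetical sizes: flatten a C x H x W feature map into one vector, then
# apply a fully connected layer mapping it to D=10 outputs.
C, H, W, D = 3, 4, 5, 10
rng = np.random.default_rng(1)
x = rng.standard_normal((C, H, W))
weight = rng.standard_normal((D, C * H * W))  # one weight per input element
bias = rng.standard_normal(D)

flat = x.reshape(-1)       # tile the whole map into a 1-D vector of length C*H*W
y = weight @ flat + bias   # every output mixes ALL pixels and channels at once
print(y.shape)             # (10,)
```

Unlike the 1×1 convolution, the spatial structure is discarded: every output element depends on every pixel of every channel.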

       In short, the difference between the two is that a 1×1 convolution operates at the channel level of the image, while a fully connected layer leans toward pixel-level operation.

How to choose

        Choosing a 1×1 convolution over a fully connected layer is mainly a question of input size. A fully connected layer multiplies every element of the feature map by a weight and sums them, and that weight vector must be fixed when the network is designed, so a fully connected layer cannot adapt to changes in input size. The output of a 1×1 convolution, by contrast, has the same spatial size as its input; the output size changes along with the input size, so nothing needs to be fixed in advance.
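The size-adaptivity point can be demonstrated directly (a minimal sketch; the sizes and the einsum-based conv1x1 helper are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
C, N = 3, 2
kernels = rng.standard_normal((N, C))  # fixed once: N kernels of size 1x1xC
bias = rng.standard_normal(N)

def conv1x1(x):
    # Mix the C channels at every pixel, whatever the spatial size is.
    return np.einsum('nc,chw->nhw', kernels, x) + bias[:, None, None]

# The SAME 1x1 kernels work on any spatial size:
for H, W in [(4, 5), (8, 8), (32, 17)]:
    y = conv1x1(rng.standard_normal((C, H, W)))
    print(y.shape)  # (2, H, W) in each case

# A fully connected layer, by contrast, fixes the input length at design time:
fc_weight = rng.standard_normal((10, C * 4 * 5))  # only fits a 3x4x5 input
```

Feeding the fully connected weight matrix anything other than a length-60 vector would fail, which is exactly the rigidity described above.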

        Put plainly, convolution uses weight sharing, so the learned parameters depend only on the convolution kernel and not on the feature map. As for the effect: for classification there is essentially no difference between the two; for segmentation, convolution is the more appropriate choice.
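Weight sharing also shows up in the parameter counts. A rough comparison, using hypothetical sizes (C=256 input channels, N=64 outputs, a 32×32 map) and keeping the output at the same spatial resolution in both cases:

```python
# Parameter counts: weight sharing means the 1x1 convolution's parameter
# count is independent of the feature-map size, while a fully connected
# layer producing a same-resolution output pays for H*W on both sides.
C, N, H, W = 256, 64, 32, 32

conv_params = N * C + N  # N kernels of size 1x1xC, plus N biases
fc_params = (N * H * W) * (C * H * W) + N * H * W  # dense weight matrix + biases

print(conv_params)  # 16448
print(fc_params)    # 17179934720 -- over a million times larger
```

The gap is why 1×1 convolutions are the standard tool for channel-level reduction inside convolutional networks.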


Origin blog.csdn.net/qq_37925923/article/details/126956049