Why does the fully connected layer require a fixed-size vector as input? What problems will occur if the vector input does not meet a specific size?


Questions and Answers:

The fully connected layer requires the input vector to have a fixed size because the size of its weight matrix is determined by the input and output dimensions. Specifically, if the input vector has dimension D_in and the output vector has dimension D_out, then the weight matrix of the fully connected layer has size D_in × D_out.
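As a minimal NumPy sketch (the variable names and shapes here are illustrative, not from the original post), the weight matrix is allocated with a fixed (D_in, D_out) shape the moment the layer is created, so every input must have exactly D_in elements:

```python
import numpy as np

# A minimal fully connected layer: the weight matrix shape is fixed
# at (D_in, D_out) as soon as the layer is created.
D_in, D_out = 4, 3
rng = np.random.default_rng(0)
W = rng.standard_normal((D_in, D_out))  # shape (4, 3)
b = np.zeros(D_out)

x = rng.standard_normal(D_in)   # input must have exactly D_in elements
y = x @ W + b                   # (4,) @ (4, 3) -> (3,)
print(y.shape)
```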

If the size of the input vector does not meet the expected fixed size, the following problems will result:

  1. Weight matrix incompatibility: if the dimension of the input vector does not match the input dimension the fully connected layer expects, the shapes of the input and the weight matrix no longer align, and the matrix multiplication cannot be performed.

  2. Inconsistent number of parameters: the number of parameters of a fully connected layer is determined by its input and output dimensions. If the size of the input vector changes, the number of parameters of the layer changes as well, which conflicts with the model as previously defined.

  3. Inconsistent network structure: The hierarchical structure of neural networks is usually designed to have fixed input and output dimensions to ensure consistency in the parameters and structure of the entire network. If the input vectors are of inconsistent sizes, the structural consistency of the network will be destroyed.
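The first problem above can be seen directly in code. A brief sketch (again using NumPy as an illustration): feeding a 5-element input into a layer whose weights were built for 4 inputs makes the matrix multiplication fail immediately:

```python
import numpy as np

D_in, D_out = 4, 3
W = np.ones((D_in, D_out))  # weights fixed for 4 inputs

bad_x = np.ones(5)          # 5 elements, but the layer expects 4
try:
    bad_x @ W               # shapes (5,) @ (4, 3) are incompatible
    failed = False
except ValueError as e:
    failed = True
    print("matmul failed:", e)
```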

To solve this problem, the input is usually brought to a fixed size before it reaches the fully connected layer, for example by resizing the input, or by placing pooling layers (in particular adaptive or global pooling) after the convolutional layers so that the flattened feature vector always has the same length. Batching relies on the same property: inputs are organized into batches of identical shape so that the network can process them together.
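One such preprocessing step can be sketched as a global average pool, which collapses each channel's spatial dimensions into a single number, so the output length depends only on the channel count, never on the input's height or width (the function and shapes below are illustrative assumptions, not from the original post):

```python
import numpy as np

def global_average_pool(feature_map):
    """Average each channel over its spatial dimensions.

    feature_map: array of shape (channels, H, W), for any H and W.
    Returns a fixed-length vector of shape (channels,).
    """
    return feature_map.mean(axis=(1, 2))

rng = np.random.default_rng(0)
shapes = []
for h, w in [(8, 8), (5, 13)]:           # two different spatial sizes
    fmap = rng.random((16, h, w))
    shapes.append(global_average_pool(fmap).shape)
print(shapes)  # same output size regardless of H and W
```

Because the pooled vector always has 16 elements here, a fully connected layer placed after it can keep a single fixed weight matrix.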

For example, if the number of neurons in the output layer is fixed at 3 and the number of inputs is 4, then the parameter matrix has 3 rows and 4 columns (one row per output neuron, one column per input), as shown in the following formula.

\begin{bmatrix} \theta_{10} & \theta_{11} & \theta_{12} & \theta_{13} \\ \theta_{20} & \theta_{21} & \theta_{22} & \theta_{23} \\ \theta_{30} & \theta_{31} & \theta_{32} & \theta_{33} \end{bmatrix}

However, if the number of inputs is 3, then the parameter matrix has 3 rows and 3 columns, as shown in the following formula.

\begin{bmatrix} \theta_{10} & \theta_{11} & \theta_{12} \\ \theta_{20} & \theta_{21} & \theta_{22} \\ \theta_{30} & \theta_{31} & \theta_{32} \end{bmatrix}
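The two matrices above can be compared with a short sketch: with the output size held at 3, changing the input size from 4 to 3 changes the parameter count from 12 to 9, which is exactly the "inconsistent number of parameters" problem described earlier (NumPy and the loop here are illustrative):

```python
import numpy as np

D_out = 3                        # output neurons fixed at 3
counts = {}
for D_in in (4, 3):              # the two input sizes from the matrices above
    W = np.zeros((D_out, D_in))  # one row per output, one column per input
    counts[D_in] = W.size
print(counts)  # 4 inputs -> 12 parameters, 3 inputs -> 9 parameters
```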


Origin blog.csdn.net/weixin_43501408/article/details/135424218