Receptive Field Computation in Convolutional Neural Networks

The receptive field is a very important concept in convolutional neural networks. When tuning a network, paying attention to the receptive field can sometimes help more than you would expect.

As a network gets deeper, working out the receptive field by hand, layer by layer, quickly becomes impractical. This article introduces the formula for the receptive field and its derivation. (I noticed that the calculation methods in some blogs are wrong. You still need to understand the essence of the problem and derive it by hand a few times yourself. Don't accept what you read uncritically.)

[The following calculations ignore padding, and ignore the truncated receptive fields of nodes at the border]

The receptive field of a node in a feature map is the region of the original image that the node covers.

Symbol definition: suppose that in the i-th layer the receptive field of a single node is rf_i, the size of the convolution kernel acting on this layer is k_i, and the convolution stride is s_i. (For convenience, the receptive field and the convolution kernel are both treated as rectangles, and all of the quantities above are sizes along a single dimension.)

According to the definition above, the first layer is the original image, so rf_1=1. We also set s_0=1 by convention.

See the following example:

Example of receptive field inference

| Layer id | Receptive field size (width) | Range covered by each node (start~end of each node's receptive field) | Kernel width and stride acting on this layer |
|---|---|---|---|
| 1 | rf_1=1 | 1~1, 2~2, 3~3, 4~4, ... | k_1=3, s_1=2 |
| 2 | rf_2=3 | 1\sim rf_2, (1+s_1)\sim(1+s_1+rf_2-1), (1+2s_1)\sim(1+2s_1+rf_2-1), ... that is, 1~3, 3~5, 5~7, ... | k_2=3, s_2=2 |
| 3 | rf_3=7 | 1\sim rf_3, (1+s_2s_1)\sim(1+s_2s_1+rf_3-1), (1+2s_2s_1)\sim(1+2s_2s_1+rf_3-1), ... that is, 1~7, 5~11, 9~15, ... | k_3=3, s_3=2 |
| 4 | rf_4=15 | 1\sim rf_4, (1+s_3s_2s_1)\sim(1+s_3s_2s_1+rf_4-1), (1+2s_3s_2s_1)\sim(1+2s_3s_2s_1+rf_4-1), ... | k_4, s_4 |
| 5 | | | |
| ... | | | |

To obtain the receptive field of a node in the 5th layer, note that it fuses k_4 adjacent nodes of the 4th layer. So as long as the coverage of the first k_4 nodes of the 4th layer is known (in particular that of the k_4-th node, which gives the right boundary), the coverage boundary of the 5th layer's receptive field can be obtained.
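The fusion step described above can be checked by brute force. The following is only a minimal sketch: the function name `propagate_ranges` and the 31-node input are assumptions made for illustration, and k=3, s=2 is taken from the example for all three convolutions. Positions are 1-based and padding is ignored, as in the rest of the article.

```python
def propagate_ranges(num_input_nodes, kernels, strides):
    """Return, for every layer, the (start, end) input range covered by each node.

    Layer 1 is the input itself. kernels[i] and strides[i] act on layer i + 1
    and produce layer i + 2. Positions are 1-based; padding is ignored.
    """
    # Layer 1: node n covers exactly input position n.
    layers = [[(n, n) for n in range(1, num_input_nodes + 1)]]
    for k, s in zip(kernels, strides):
        prev = layers[-1]
        cur = []
        # Slide a window of k nodes over the previous layer with step s.
        for first in range(0, len(prev) - k + 1, s):
            start = prev[first][0]        # left edge of the first fused node
            end = prev[first + k - 1][1]  # right edge of the k-th fused node
            cur.append((start, end))
        layers.append(cur)
    return layers


layers = propagate_ranges(31, kernels=[3, 3, 3], strides=[2, 2, 2])
for layer_id, ranges in enumerate(layers[1:], start=2):
    print(layer_id, ranges[:3])
# 2 [(1, 3), (3, 5), (5, 7)]
# 3 [(1, 7), (5, 11), (9, 15)]
# 4 [(1, 15), (9, 23), (17, 31)]
```

The printed ranges for layers 2 and 3 match the numeric rows of the example, and the width of each range (end minus start plus 1) is the receptive field size of that layer.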

Pay attention to the formulas in the third column of the table above. Across the nodes of a layer, the starting points and the end points of the receptive fields each form an arithmetic progression, and the difference between the end point and the starting point is always the current layer's receptive field size minus 1, for example 1~3 when rf_2=3 (this is easy to see; if it is not obvious, think it over for a moment). Therefore, as long as we know the receptive field size of this layer and the starting point of the target node, we can locate the exact position and size of that node's receptive field.

Try the following calculation: the starting point of the second node from the left in the feature map of the second layer is 1+s_1.

On the third layer, the stride acting on the feature map of the second layer is s_2, so the second node from the left of the third layer starts s_2 nodes after the first node of the second layer, that is, at node 1+s_2 of the second layer. The starting point of the second node from the left of the third layer is therefore 1+s_2s_1, and so on for deeper layers. This determines the formula for the starting point of any node.
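The starting-point rule generalizes to the n-th node of any layer: each step to the right shifts the starting point by the product of all strides below that layer, and the end point is the starting point plus rf minus 1 (as in 1~3 for rf_2=3). A small sketch follows; the helper name `node_range` and its argument layout are assumptions made for illustration, not something defined in the article.

```python
from math import prod


def node_range(layer, n, rf, strides):
    """Input range (start, end) of the n-th node (1-based) of `layer`.

    rf is the receptive field size of that layer, and strides[j - 1] is s_j,
    the stride acting on layer j. Layer 1 is the input. Padding is ignored.
    """
    # Shift between neighbouring nodes: s_1 * ... * s_{layer-1}.
    # For layer 1 the slice is empty and prod() returns 1.
    jump = prod(strides[:layer - 1])
    start = 1 + (n - 1) * jump
    end = start + rf - 1
    return start, end


strides = [2, 2, 2]
print(node_range(2, 2, rf=3, strides=strides))  # (3, 5)
print(node_range(3, 2, rf=7, strides=strides))  # (5, 11)
print(node_range(3, 3, rf=7, strides=strides))  # (9, 15)
```

With s_1=s_2=2 this reproduces the ranges 3~5, 5~11 and 9~15 from the example.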

The next step is to determine the receptive field size of each layer. Consider two adjacent layers: the first node of the i-th layer fuses nodes 1\sim k_{i-1} (counted from the left) of the (i-1)-th layer, so we need the receptive field range of the k_{i-1}-th node of the (i-1)-th layer. By the reasoning above, this node lies k_{i-1}-1 steps to the right of the first node, so its starting point is 1+(k_{i-1}-1)\prod_{j=0}^{i-2}s_j, and since it spans rf_{i-1} positions, its end point is (k_{i-1}-1)\prod_{j=0}^{i-2}s_j + rf_{i-1}.

Therefore, the receptive field size of the i-th layer is rf_i = (k_{i-1}-1)\prod_{j=0}^{i-2}s_j + rf_{i-1}, i \geqslant 2
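As a quick check of the recurrence, here is a minimal Python sketch. The function name `receptive_fields` is an assumption made for illustration; the kernel sizes and strides in the usage line are the k=3, s=2 values from the example.

```python
def receptive_fields(kernels, strides):
    """Receptive field sizes [rf_1, rf_2, ...] for layers 1..len(kernels)+1.

    kernels[i - 1] and strides[i - 1] are k_i and s_i, acting on layer i.
    """
    rfs = [1]   # rf_1 = 1: layer 1 is the original image
    jump = 1    # running product s_0 * s_1 * ... * s_{i-2}, with s_0 = 1
    for k, s in zip(kernels, strides):
        # rf_i = (k_{i-1} - 1) * prod_{j=0}^{i-2} s_j + rf_{i-1}
        rfs.append(rfs[-1] + (k - 1) * jump)
        jump *= s  # include this layer's stride for the next layer
    return rfs


print(receptive_fields([3, 3, 3], [2, 2, 2]))  # [1, 3, 7, 15]
```

Keeping the product of strides as a running value avoids recomputing it for every layer, so the whole computation is linear in the number of layers.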

The above is the derivation of the formula. With the formula and the derivation method above, the size and position of the receptive field of any node in any layer can be obtained.

If you have any questions, please leave a message to discuss.


Origin blog.csdn.net/yangyehuisw/article/details/105167930