Reprint: Calculating the output map size after convolution or pooling in a CNN

When learning about CNNs, it is often unclear how to calculate the size of the map produced by a convolution or pooling layer, especially where the borders are concerned.
 
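First of all, note that an input image of size input_height*input_width often needs padding before convolution or pooling. Padding is a way to deal with the border problem: extra rows are added at the top and bottom and extra columns at the left and right, so the original input becomes a larger, padded image.
[Figure: the input image surrounded by padding_top, padding_bottom, padding_left and padding_right]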
 
The output size is calculated as follows, where floor means the division is rounded down:
out_height = floor((input_height - filter_height + padding_top + padding_bottom) / stride_height) + 1
out_width = floor((input_width - filter_width + padding_left + padding_right) / stride_width) + 1
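The formula can be checked with a few lines of Python. This is only a minimal sketch: the function name conv_output_size and the example sizes are made up for illustration, and the division is assumed to round down, as most frameworks do for convolution.

```python
def conv_output_size(input_size, filter_size, pad_before, pad_after, stride):
    """Output length along one axis (height or width) of a conv or pooling layer."""
    padded = input_size + pad_before + pad_after
    if filter_size > padded:
        raise ValueError("filter is larger than the padded input")
    # Floor division: a window that would run past the border is dropped.
    return (padded - filter_size) // stride + 1


# Hypothetical example: 7x9 input, 3x3 filter, stride 2, padding 1 on every side.
out_height = conv_output_size(7, 3, 1, 1, 2)
out_width = conv_output_size(9, 3, 1, 1, 2)
print(out_height, out_width)
```

The same function serves both axes: pass the top and bottom padding for the height, and the left and right padding for the width.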
 
In practice, however, everything is often square and symmetric:
out_height = out_width
input_height = input_width
filter_height = filter_width
padding_top = padding_bottom = padding_left = padding_right
stride_height = stride_width
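In this square, symmetric case a single number per quantity is enough. The sketch below uses short hypothetical names (n for the input size, f for the filter size, p for the padding on each side, s for the stride), and the layer sizes are illustrative only.

```python
def square_output_size(n, f, p, s):
    """Output side length when input, filter, padding and stride are all symmetric."""
    # Padding p is applied on both sides of the axis, hence 2 * p.
    return (n - f + 2 * p) // s + 1


print(square_output_size(32, 5, 0, 1))  # 5x5 convolution, no padding: 28
print(square_output_size(28, 2, 0, 2))  # 2x2 pooling with stride 2: 14
print(square_output_size(5, 3, 1, 1))   # 3x3 convolution, padding 1: 5 (size preserved)
```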
 
 
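In addition, here is how the current mainstream CNN frameworks handle padding.
 
In TensorFlow it is calculated like this:
There are two options for padding, SAME and VALID. With VALID, all four padding values are 0, that is, there is no padding.
With SAME, the output size is fixed first, out_height = ceil(input_height / stride_height), and the total padding is derived from it: pad_along_height = max((out_height - 1) * stride_height + filter_height - input_height, 0), split into pad_top = pad_along_height / 2 (rounded down) and pad_bottom = pad_along_height - pad_top. The width is handled in the same way. out_height can be computed directly because the padding is completely determined by the input size, filter size and stride. If we rearrange the pad_along_height equation so that out_height is on the left side and everything else is on the right, we get back the general size formula above.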
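The two TensorFlow modes can be sketched in plain Python along one axis. This is not TensorFlow code: the helper name tf_output_and_padding is hypothetical, and the arithmetic follows the SAME and VALID rules as described in the TensorFlow documentation.

```python
def tf_output_and_padding(input_size, filter_size, stride, padding):
    """Return (output_size, pad_before, pad_after) along one axis."""
    if padding == "VALID":
        # No padding: only windows that fit entirely inside the input are kept.
        out = (input_size - filter_size) // stride + 1
        return out, 0, 0
    if padding == "SAME":
        # Output size depends only on the input size and the stride (ceiling division).
        out = -(-input_size // stride)
        pad_total = max((out - 1) * stride + filter_size - input_size, 0)
        pad_before = pad_total // 2          # top or left
        pad_after = pad_total - pad_before   # bottom or right gets the extra cell
        return out, pad_before, pad_after
    raise ValueError("padding must be 'SAME' or 'VALID'")


# Illustrative sizes: input 13, filter 6, stride 5.
print(tf_output_and_padding(13, 6, 5, "VALID"))  # (2, 0, 0)
print(tf_output_and_padding(13, 6, 5, "SAME"))   # (3, 1, 2)
```

Plugging the SAME result back into the general formula gives floor((13 - 6 + 1 + 2) / 5) + 1 = 3, so the two ways of computing the output size agree.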
 
 
In other frameworks, such as Caffe, you set the padding value yourself.
 
As for the values in the padded area, it is generally filled with 0. Other choices are possible, such as copying the values at the boundary.
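The two fill strategies can be compared on a tiny example. The helper pad_2d below is a hypothetical pure-Python sketch written only for illustration; in practice NumPy's np.pad with mode "constant" or "edge" does the same job.

```python
def pad_2d(image, pad, mode="zero"):
    """Pad a 2-D list of numbers by `pad` cells on every side."""
    if mode not in ("zero", "edge"):
        raise ValueError("mode must be 'zero' or 'edge'")
    rows = []
    for row in image:
        # Left and right borders: zeros, or copies of the row's end values.
        left = [0] * pad if mode == "zero" else [row[0]] * pad
        right = [0] * pad if mode == "zero" else [row[-1]] * pad
        rows.append(left + list(row) + right)
    width = len(rows[0])
    if mode == "zero":
        top = [[0] * width for _ in range(pad)]
        bottom = [[0] * width for _ in range(pad)]
    else:
        # Top and bottom borders: copies of the first and last padded rows.
        top = [list(rows[0]) for _ in range(pad)]
        bottom = [list(rows[-1]) for _ in range(pad)]
    return top + rows + bottom


image = [[1, 2],
         [3, 4]]
for row in pad_2d(image, 1, "zero"):
    print(row)
for row in pad_2d(image, 1, "edge"):
    print(row)
```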


Origin http://43.154.161.224:23101/article/api/json?id=326131039&siteId=291194637