Darknet source code reading notes (2)

Via https://github.com/BBuf/Darknet I came across a well-written blog post: https://mp.weixin.qq.com/s/RruZSl49vv5B0eRif-p9HQ

I am studying it first; there is more content to add later.

im2col analysis

From the code above, we can see that the core of the convolutional layer's forward propagation is the im2col operation, followed by an sgemm matrix multiplication on the data that im2col has rearranged. Let's analyze the im2col algorithm now. sgemm is simply called on the result once im2col has run, so I won't go into its details.

Since the idea behind im2col is easier to understand with pictures, I will borrow the figures from the CSDN blogger Tiger-Gao. First, what happens when a single-channel image with a height and width of 4 is rearranged by im2col? Look at the figure below:


[Figure: a single-channel 4x4 image before and after im2col]

Let's take a look at the change process in detail:

[Figure: step-by-step im2col rearrangement of the single-channel image]
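To make the rearrangement concrete, here is a minimal C sketch of single-channel im2col. It assumes stride 1 and no padding, and it uses the Darknet-style layout in which each row of the result corresponds to one kernel offset and each column to one output position. The function name im2col_single_channel is made up for illustration and is not part of Darknet.

```c
/* Rearranges a single-channel, row-major image (height x width) into a
 * (ksize*ksize) x (out_h*out_w) row-major matrix `col`.
 * Assumptions of this sketch: stride 1, no padding, square kernel. */
void im2col_single_channel(const float *im, int height, int width,
                           int ksize, float *col)
{
    int out_h = height - ksize + 1;
    int out_w = width - ksize + 1;

    /* One row of `col` per kernel offset (kh, kw). */
    for (int r = 0; r < ksize * ksize; ++r) {
        int kh = r / ksize;
        int kw = r % ksize;
        /* One column of `col` per output position (oh, ow). */
        for (int oh = 0; oh < out_h; ++oh) {
            for (int ow = 0; ow < out_w; ++ow) {
                col[(r * out_h + oh) * out_w + ow] =
                    im[(oh + kh) * width + (ow + kw)];
            }
        }
    }
}
```

As an example, take a 4x4 image holding the values 0 to 15 and ksize = 3 (a 3x3 kernel is my assumption here; it is the size that turns a 4x4 image into a 2x2 output). The result is a 9 x 4 matrix whose first row is 0, 1, 4, 5 and whose last row is 10, 11, 14, 15.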

That was the single-channel case. What about multiple channels? First look at the original image:

 

[Figure: the original multi-channel image]

Multi-channel im2col first rearranges the first channel, then the second channel, and finally the third channel. The im2col data of each channel is stored contiguously in memory, one channel after another. Look at the figure below:

[Figure: the multi-channel image after im2col]
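The channel-by-channel ordering can be written down as a simple index formula. The sketch below assumes a square ksize x ksize kernel and a layout with one row per (channel, kernel offset) pair; both helper names are hypothetical and only serve to illustrate the ordering.

```c
/* Row of the im2col matrix that holds kernel offset (kh, kw) of channel c.
 * All ksize*ksize rows of channel 0 come first, then those of channel 1,
 * and so on, which is what "stored contiguously per channel" means here. */
int col_row_index(int c, int kh, int kw, int ksize)
{
    return (c * ksize + kh) * ksize + kw;
}

/* Inverse mapping: recover (c, kh, kw) from a row index. */
void col_row_decompose(int row, int ksize, int *c, int *kh, int *kw)
{
    *kw = row % ksize;
    *kh = (row / ksize) % ksize;
    *c  = row / (ksize * ksize);
}
```

With a 3x3 kernel, for example, channel 1 starts at row 9 and channel 2 at row 18, so a three-channel image produces 27 rows in total.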

That is how the original image changes under im2col. What about the kernel? Here is the original kernel:

 

[Figure: the original kernel]

The kernel's channel data is also stored contiguously in memory, so after the im2col algorithm the kernel above can be represented as in the following figure:

[Figure: the kernel after im2col]

So how do we get the result of forward propagation? DarkNet and Caffe implement it in the same way, as Kernel * Img, that is, as a matrix multiplication of an M x K kernel matrix by a K x N image matrix, where:

M = 1 (a single kernel, hence a single output channel)
N = output_h * output_w
K = input_channels * kernel_h * kernel_w
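To show the roles of M, N and K, here is a naive C sketch of the Kernel * Img product. It assumes that all three matrices are stored row-major and uses a made-up name, gemm_naive; Darknet's own gemm routine has a different signature and is more elaborate, so treat this only as an illustration of the dimensions.

```c
/* C (M x N) = A (M x K) * B (K x N), all matrices row-major.
 *   A: the flattened kernels, one row per kernel
 *   B: the im2col output, one column per output position
 *   C: the output feature maps, one row per output channel */
void gemm_naive(int M, int N, int K,
                const float *A, const float *B, float *C)
{
    for (int m = 0; m < M; ++m) {
        for (int n = 0; n < N; ++n) {
            float sum = 0.0f;
            /* Dot product of kernel m with column n of the im2col matrix. */
            for (int k = 0; k < K; ++k) {
                sum += A[m * K + k] * B[k * N + n];
            }
            C[m * N + n] = sum;
        }
    }
}
```

For one 3x3 kernel over a single-channel input with a 2x2 output, this would be called with M = 1, N = 4 and K = 9.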

The results are as follows:

[Figure: result of the Kernel * Img matrix multiplication]

The image data is stored contiguously, so the output can also be viewed as the [output_h * output_w] = [2 * 2] image shown in the figure below:

 

[Figure: the 2x2 output image]

The process for multi-channel images is similar:

[Figure: the Kernel * Img matrix multiplication in the multi-channel case]

Similarly, the image data of multiple output channels is stored contiguously, so the output can also be viewed as the [output_channels * output_h * output_w] = [3 * 2 * 2] image shown in the figure below:

 

[Figure: the multi-channel output image]

The im2col algorithm is implemented by the im2col_cpu function in src/im2col.c.
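For orientation while reading that file, here is a sketch written in the spirit of im2col_cpu, with multiple channels, stride and zero padding. It is my own simplified version and not a verbatim copy: the names im2col_sketch and get_pixel_or_zero are hypothetical, and the real code in src/im2col.c remains the reference.

```c
/* Returns the pixel at (row, col) of `channel`, where row and col are given
 * in padded coordinates. Positions that fall into the padding read as 0. */
static float get_pixel_or_zero(const float *im, int height, int width,
                               int row, int col, int channel, int pad)
{
    row -= pad;
    col -= pad;
    if (row < 0 || col < 0 || row >= height || col >= width) return 0.0f;
    return im[(channel * height + row) * width + col];
}

/* im: channels x height x width, row-major.
 * col: (channels*ksize*ksize) x (out_h*out_w), row-major. */
void im2col_sketch(const float *im, int channels, int height, int width,
                   int ksize, int stride, int pad, float *col)
{
    int out_h = (height + 2 * pad - ksize) / stride + 1;
    int out_w = (width + 2 * pad - ksize) / stride + 1;
    int rows = channels * ksize * ksize;

    for (int r = 0; r < rows; ++r) {
        /* Decompose the row index into channel and kernel offset. */
        int kw = r % ksize;
        int kh = (r / ksize) % ksize;
        int c = r / (ksize * ksize);
        for (int oh = 0; oh < out_h; ++oh) {
            for (int ow = 0; ow < out_w; ++ow) {
                col[(r * out_h + oh) * out_w + ow] =
                    get_pixel_or_zero(im, height, width,
                                      kh + oh * stride, kw + ow * stride,
                                      c, pad);
            }
        }
    }
}
```

With channels = 3, a 4x4 image, ksize = 3, stride = 1 and pad = 0, the output has 27 rows and 4 columns, which matches K = input_channels * kernel_h * kernel_w and N = output_h * output_w from the matrix multiplication above.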

 

 

 

 

 


Origin blog.csdn.net/juluwangriyue/article/details/109185644