[Deep learning][Image processing][Complete][Common sense] Some basic common sense about image processing, plus notes from the last 5 days. 2018.5.2

"Deep Learning and Computer Vision"           

            to see

            Pages: 28, 30, 37, 51-53, 66-70,

                73-78、81、84、88-95、100、

                104、113-115、120-121、125-126

                125-138

               < 125 The calculations on page 125 also have channels, which is what I want. >

1 × 1 convolution on page            136 , is what I want. >

              The dot product of two vectors multiplies the corresponding elements and then adds the results.
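
            A minimal sketch of that definition in NumPy (np.dot computes the same thing):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Multiply corresponding elements, then sum: 1*4 + 2*5 + 3*6 = 32
manual = np.sum(a * b)
builtin = np.dot(a, b)  # the library routine gives the same result

assert manual == builtin
```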

 

            The larger the convolution stride, the smaller the spatial size of the output, which achieves downsampling.
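
            A minimal sketch of the size relation, assuming a 1-D input with no padding, where the output size is (n - k) // s + 1:

```python
def conv_output_size(n, k, s):
    # n: input size, k: kernel size, s: stride; no padding assumed
    return (n - k) // s + 1

print(conv_output_size(32, 3, 1))  # 30 -- stride 1 barely shrinks the input
print(conv_output_size(32, 3, 2))  # 15 -- stride 2 roughly halves it (downsampling)
```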

 

              The nonlinear transformation of the fully connected layer is the activation function, and the same holds for the convolutional layer.
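
            A minimal sketch, using ReLU as one common choice of activation; the same function would be applied elementwise to a convolutional layer's feature maps:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)  # the nonlinearity itself

x = np.random.randn(5)     # input vector
W = np.random.randn(3, 5)  # fully connected weights
b = np.zeros(3)

fc_out = relu(W @ x + b)   # linear map wx + b, then the activation
```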

 

            L2 regularization is weight decay; it applies to the w in wx + b (not the bias b).

            L1 regularization penalizes the absolute values of the weights instead.
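
            A minimal sketch of both penalties added to a loss (lam is an assumed regularization-strength hyperparameter; only w is penalized, not b):

```python
import numpy as np

def regularized_loss(data_loss, w, lam, kind="l2"):
    if kind == "l2":
        penalty = lam * np.sum(w ** 2)     # L2 / weight decay: squared weights
    else:
        penalty = lam * np.sum(np.abs(w))  # L1: absolute values of the weights
    return data_loss + penalty
```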

 

 

            After the input is convolved with the convolution kernel, the result is called the feature map.

 

            The function of a convolution kernel is to find the parts of the image that are most similar to its own texture.
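
            A minimal sketch of both notes, assuming valid (no-padding) 2-D convolution as computed in most deep learning frameworks (strictly, cross-correlation): sliding the kernel and taking a dot product at each location yields the feature map, which responds most strongly where the local patch resembles the kernel's texture.

```python
import numpy as np

def conv2d(image, kernel):
    H, W = image.shape
    k = kernel.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))  # the feature map
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + k, j:j + k]
            out[i, j] = np.sum(patch * kernel)  # dot product with the local patch
    return out
```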

 

            Each such map is called a channel.

 

            From one layer to the next, channels map many-to-many: each output channel draws on every input channel.

 

                 [Figure: an explanation of 1×1 convolution]
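
            In place of the figure, a minimal sketch of the idea, assuming a channels-first (C, H, W) layout: a 1×1 convolution is a per-pixel linear combination across channels, commonly used to change the number of channels without touching the spatial size.

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W) input; w: (C_out, C_in) weights of the 1x1 kernels
    return np.einsum('oc,chw->ohw', w, x)  # mix channels at every pixel

x = np.random.randn(64, 8, 8)  # 64 input channels
w = np.random.randn(16, 64)    # project down to 16 channels
y = conv1x1(x, w)              # shape (16, 8, 8): spatial size unchanged
```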

 

Regarding the Bounding Box Regression algorithm, I read https://blog.csdn.net/ap1005834/article/details/77915794 and https://blog.csdn.net/zijin0802034/article/details/77685438. The latter explains some of the hard-to-understand points of the former; the best part is its answer to "why the model can be approximated by linear regression when IoU > 0.6", which it justifies with equivalent infinitesimals from calculus.
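
For reference, a minimal sketch of the standard R-CNN-style regression targets (P is the proposal box, G the ground truth, both as center/size tuples); the log term is where the equivalent-infinitesimal argument applies, since ln(1 + x) ≈ x when the two boxes nearly coincide:

```python
import numpy as np

def bbox_regression_targets(P, G):
    # P, G: boxes as (center_x, center_y, width, height)
    px, py, pw, ph = P
    gx, gy, gw, gh = G
    tx = (gx - px) / pw   # center shift, normalized by proposal size
    ty = (gy - py) / ph
    tw = np.log(gw / pw)  # log size ratio; ~ (gw - pw) / pw when gw is close to pw
    th = np.log(gh / ph)
    return tx, ty, tw, th
```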
