CNN: Exploring Shift Invariance — Notes on "Making Convolutional Networks Shift-Invariant Again"

A network trained without shift augmentation loses robustness when the input is displaced. The paper "Making Convolutional Networks Shift-Invariant Again" explains this problem and proposes a solution:

The paper argues that the shift-variance of conventional CNNs comes from strided and downsampling operations, which subsample the signal without respecting the sampling theorem and therefore introduce aliasing.

This is easy to see with one-dimensional data. Applying max pooling with k=2, s=2 to [0,0,1,1,0,0,1,1] gives [0,1,0,1]; shift the input by one position to [0,1,1,0,0,1,1,0] and the output changes dramatically, to [1,1,1,1]. In layman's terms, any reduction operation in a CNN can introduce shift-variance. My own takeaway is this: shift-variance comes from the spectral leakage (aliasing) caused by stride and downsampling.

The author's solution is also very clear: smooth the feature map with a Gaussian-like blur kernel before downsampling, then downsample. The effect is simple to state: it attenuates the high-frequency components and spreads their energy over a larger spatial scale, so that aggressive downsampling no longer aliases away the relevant information.

In my own project, when Faster R-CNN detects small targets, a small displacement of the same target can lead to missed detections. Applying this smoothing to the strided convolutions and the RoI pooling can help.
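For 2D feature maps, the blur-then-subsample step can be packaged as a module and dropped in wherever a backbone uses a strided pooling layer. This is a sketch in PyTorch under my own assumptions (a fixed 3x3 binomial kernel and reflect padding; `BlurPool2d` is my name for it, not an official API — the paper's author also ships a reference implementation in the `antialiased-cnns` package):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlurPool2d(nn.Module):
    """Anti-aliased downsampling: fixed binomial blur, then strided subsampling."""
    def __init__(self, channels, stride=2):
        super().__init__()
        self.stride = stride
        self.channels = channels
        k = torch.tensor([1.0, 2.0, 1.0])
        k2 = k[:, None] * k[None, :]          # 3x3 binomial kernel
        k2 = k2 / k2.sum()
        # One copy of the kernel per channel, applied depthwise (groups=channels).
        self.register_buffer("kernel", k2.expand(channels, 1, 3, 3).contiguous())

    def forward(self, x):
        x = F.pad(x, (1, 1, 1, 1), mode="reflect")
        return F.conv2d(x, self.kernel, stride=self.stride, groups=self.channels)

# Drop-in idea: replace nn.MaxPool2d(2) in the backbone with dense max + blur:
#   nn.Sequential(nn.MaxPool2d(2, stride=1), BlurPool2d(channels))
# (the stride-1 max pool shrinks the map by one pixel unless you also pad, as the paper does)
```

The blur kernel is a fixed buffer, not a learned parameter, so the replacement adds no trainable weights.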

Reference: Making Convolutional Networks Shift-Invariant Again


Origin blog.csdn.net/dl643053/article/details/108274204