[Paper reading notes] Improved Regularization of Convolutional Neural Networks with Cutout

Paper address: Cutout

Paper summary

  The method in this paper is called cutout, a data augmentation method used mainly in classification tasks.
  Cutout randomly selects a point in the image as the center and covers a fixed-size square region around it with a zero mask. The mask size is a hyperparameter; in the paper the side length is chosen by grid search. Part of the mask may fall outside the image.
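
  The description above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name cutout, the parameter names, and the choice to sample the center uniformly over all pixels are assumptions made for this example.

```python
import numpy as np


def cutout(image, mask_size, rng=None):
    """Return a copy of `image` with one square region set to zero.

    image: array of shape (H, W) or (H, W, C).
    mask_size: side length of the square mask in pixels (a hyperparameter).
    The center is sampled anywhere in the image (an assumption of this
    sketch), so the square is clipped at the borders and may cover fewer
    than mask_size**2 pixels.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]

    # Random center point of the mask.
    cy = int(rng.integers(0, h))
    cx = int(rng.integers(0, w))

    # Corners of the square, clipped to the image bounds.
    y1 = max(cy - mask_size // 2, 0)
    y2 = min(cy - mask_size // 2 + mask_size, h)
    x1 = max(cx - mask_size // 2, 0)
    x2 = min(cx - mask_size // 2 + mask_size, w)

    out = image.copy()
    out[y1:y2, x1:x2] = 0  # zero mask over all channels
    return out


# Usage on a dummy CIFAR-sized image.
img = np.ones((32, 32, 3), dtype=np.float32)
augmented = cutout(img, mask_size=16, rng=np.random.default_rng(0))
```

  The input is copied rather than modified in place, so the same image can receive a different mask in every epoch.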

Introduction

  The motivation for cutout is to act as a regularizer that keeps a CNN from overfitting. The method is very simple: during training, a square mask is applied at a random location in each input image.
  The authors believe this encourages the network to use information from the whole image instead of relying on a small set of specific visual features.

  Compared with dropout, cutout is closer to a form of data augmentation than to noise injection.

  When first developing the mask, the authors also tried masking the key parts of the image (the regions with the largest activation values) and got good results (as shown in the figure below). They later found that removing a fixed-size region at a random location worked just as well as removing the targeted region, so the random fixed-size strategy was adopted.

  The authors also found that the size of the zero-mask region matters more than its shape. The size is chosen by grid search in the paper, though only on small datasets (CIFAR10/CIFAR100/SVHN). As for where to apply the mask, placing it at a random position, so that part of it can fall outside the image, works better. The authors state that allowing part of the mask to lie outside the image is key to achieving good performance.
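
  One consequence of letting the mask cross the image border is that, on average, less of the image is hidden than the nominal mask size suggests. The short calculation below illustrates this for a 16-pixel mask on a 32-pixel image. It is not taken from the paper: it assumes the center is sampled uniformly over all pixels, and the helper names are hypothetical.

```python
def visible_length(center, mask_size, image_size):
    """Length of the mask that stays inside the image along one axis."""
    start = max(center - mask_size // 2, 0)
    end = min(center - mask_size // 2 + mask_size, image_size)
    return end - start


def mean_masked_fraction(mask_size, image_size):
    """Average fraction of a square image zeroed out, over all center pixels.

    Rows and columns are sampled independently, so the mean masked area is
    the square of the mean visible length along one axis.
    """
    lengths = [visible_length(c, mask_size, image_size) for c in range(image_size)]
    mean_length = sum(lengths) / image_size
    return (mean_length / image_size) ** 2


nominal = (16 / 32) ** 2                # mask kept fully inside the image
average = mean_masked_fraction(16, 32)  # mask allowed to cross the border
print(f"nominal fraction: {nominal:.3f}, average with clipping: {average:.3f}")
```

  Under these assumptions the network sees some training examples with only a small corner masked, alongside others with the full square removed.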

Experiments

CIFAR-10 and CIFAR-100

  The image size of the CIFAR datasets is 32 × 32. The side length of the zero mask is chosen by grid search over the accuracy obtained at each side length: 16 × 16 pixels is selected on CIFAR-10 and 8 × 8 pixels on CIFAR-100. The authors note that the optimal cutout size decreases as the number of classes increases. Their explanation is that when more fine-grained recognition is required, the contextual information in the image is less useful for identifying the class, while smaller and more subtle details become more important.

SVHN

  The image size of the SVHN dataset is 32 × 32, and the final cutout size is 20 × 20.

STL-10

  The image size of the STL-10 dataset is 96 × 96, and the final cutout size is 24 × 24 (without data augmentation) or 32 × 32 (with data augmentation).
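
  To compare the selected sizes across datasets, it can help to express each mask as a fraction of the image area. The snippet below is only a convenience calculation on the numbers reported in these notes. It uses the nominal mask area and ignores clipping at the image border. The variable and function names are made up for this example.

```python
# (setting, image side, cutout side) as reported in these notes.
settings = [
    ("CIFAR-10", 32, 16),
    ("CIFAR-100", 32, 8),
    ("SVHN", 32, 20),
    ("STL-10 (no augmentation)", 96, 24),
    ("STL-10 (with augmentation)", 96, 32),
]


def area_fraction(image_side, cutout_side):
    """Nominal share of the image area covered by an unclipped square mask."""
    return (cutout_side / image_side) ** 2


fractions = {name: area_fraction(img, cut) for name, img, cut in settings}
for name, frac in fractions.items():
    print(f"{name:28s} {frac:.1%}")
```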

Effect of cutout on activation values

  After applying cutout, the authors found that activation strength generally increases in the shallow layers, while in the deep layers the increase appears mainly in the tail of the activation distribution.

Origin blog.csdn.net/qq_19784349/article/details/107325616