FCN Notes (Fully Convolutional Networks for Semantic Segmentation)


 

FCN replaces the fully connected layers with convolutional layers, so the final output is a spatial heatmap rather than a single class score.

The three converted layers produce feature maps of size (1, 1, 4096), (1, 1, 4096), and (1, 1, 1000).
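The conversion works because a fully connected layer is equivalent to a 1×1 convolution applied at every spatial position. A minimal numpy sketch of that equivalence (layer sizes follow the note above; the helper name `conv1x1` is illustrative):

```python
import numpy as np

# A 1x1 convolution with C_in input channels and C_out output channels
# applies the same (C_out x C_in) weight matrix at every spatial position --
# exactly a fully connected layer "slid" over the feature map.
def conv1x1(x, w):
    # x: (C_in, H, W), w: (C_out, C_in) -> returns (C_out, H, W)
    c_in, h, wd = x.shape
    return (w @ x.reshape(c_in, h * wd)).reshape(-1, h, wd)

rng = np.random.default_rng(0)
x = rng.standard_normal((4096, 3, 3))   # a 3x3 feature map with 4096 channels
w = rng.standard_normal((1000, 4096))   # weights of the former fc layer

y = conv1x1(x, w)                       # (1000, 3, 3) class-score heatmap

# At each position, the 1x1 conv equals the fc layer applied to that column:
assert np.allclose(y[:, 1, 2], w @ x[:, 1, 2])
print(y.shape)  # (1000, 3, 3)
```

On a 1×1 input this reproduces the original classifier exactly; on larger inputs it yields a heatmap for free.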

 

FCN is faster than the earlier patch-based method in both the forward and backward passes. In the paper's example, FCN produces a 10*10 grid of outputs in 22 ms, whereas the earlier method takes 1.2 ms per single output; producing the same 100 outputs one patch at a time would take 120 ms, so FCN is roughly 5x faster.
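The arithmetic behind this comparison, spelled out (numbers are the ones quoted above):

```python
# Timing comparison from the notes above:
patchwise_ms_per_output = 1.2      # one forward pass per single output
fcn_ms_for_10x10 = 22.0            # one FCN pass yields a 10x10 output grid

outputs = 10 * 10
patchwise_total = patchwise_ms_per_output * outputs  # 120.0 ms for 100 outputs
speedup = patchwise_total / fcn_ms_for_10x10
print(patchwise_total, round(speedup, 1))  # prints: 120.0 5.5
```

The shared computation across overlapping receptive fields is what makes the single fully convolutional pass cheaper than 100 independent ones.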

 

After converting to fully convolutional layers, there is no constraint on the size of the input image (at inference time); only the output resolution changes with the input.
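A toy numpy sketch of why this holds: with only convolution and pooling layers (here a 1×1 conv plus a 2×2 average pool, an assumed two-layer stand-in for a real network), nothing in the computation fixes H and W, so the output simply scales with the input:

```python
import numpy as np

def forward(x, w):
    # x: (C_in, H, W), w: (C_out, C_in)
    c, h, wd = x.shape
    y = (w @ x.reshape(c, -1)).reshape(-1, h, wd)  # 1x1 convolution
    h2, w2 = h // 2, wd // 2
    # 2x2 average pooling (crop odd edges, then pool non-overlapping blocks)
    y = y[:, :h2 * 2, :w2 * 2].reshape(-1, h2, 2, w2, 2).mean(axis=(2, 4))
    return y

w = np.random.default_rng(1).standard_normal((8, 3))
print(forward(np.zeros((3, 64, 64)), w).shape)   # (8, 32, 32)
print(forward(np.zeros((3, 100, 80)), w).shape)  # (8, 50, 40)
```

A network with fully connected layers would crash on the second input; the fully convolutional version just emits a proportionally sized map.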

 

 

 

After repeated convolution and pooling, the feature maps become smaller and smaller and their resolution drops. To recover spatial detail, FCN uses upsampling (via deconvolution, i.e. transposed convolution) to restore the original size. It restores not only the feature map after pool5, but also fuses in the feature maps after pool4 and pool3. The results show that these shallower feature maps still carry semantic information about the image, and as more of them are fused, the segmentation gets progressively better.
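A minimal numpy sketch of this skip-connection fusion. Shapes assume a 256×256 input and 21 classes (illustrative choices, not from the note), and nearest-neighbour repetition stands in for FCN's learned deconvolution:

```python
import numpy as np

def upsample2x(x):
    # Nearest-neighbour stand-in for FCN's learned (transposed) convolution.
    return x.repeat(2, axis=1).repeat(2, axis=2)

C = 21                              # number of classes (assumed)
score5 = np.random.rand(C, 8, 8)    # class scores after pool5 (stride 32)
score4 = np.random.rand(C, 16, 16)  # class scores after pool4 (stride 16)
score3 = np.random.rand(C, 32, 32)  # class scores after pool3 (stride 8)

fused16 = upsample2x(score5) + score4   # fuse pool5 into pool4 resolution
fused8 = upsample2x(fused16) + score3   # fuse again into pool3 resolution

# Final 8x upsample back to the input resolution:
full = fused8.repeat(8, axis=1).repeat(8, axis=2)
print(full.shape)  # (21, 256, 256)
```

Stopping after `score5`, `fused16`, or `fused8` corresponds to the coarse-to-fine family of variants: each added skip connection injects higher-resolution detail into the prediction.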

 

On patch-wise training and fully convolutional training

An answer from Stack Overflow:

The term "Fully Convolutional Training" just means replacing fully-connected layer with convolutional layers so that the whole network contains just convolutional layers (and pooling layers).

The term "Patchwise training" is intended to avoid the redundancies of full image training. In semantic segmentation, given that you are classifying each pixel in the image, by using the whole image you are adding a lot of redundancy in the input. A standard approach to avoid this while training segmentation networks is to feed the network with batches of random patches (small image regions surrounding the objects of interest) from the training set instead of full images. This "patchwise sampling" ensures that the input has enough variance and is a valid representation of the training dataset (the mini-batch should have the same distribution as the training set). This technique also helps to converge faster and to balance the classes. In this paper, they claim that it is not necessary to use patch-wise training, and if you want to balance the classes you can weight or sample the loss.

From a different perspective, the problem with full image training in per-pixel segmentation is that the input image has a lot of spatial correlation. To fix this, you can either sample patches from the training set (patchwise training) or sample the loss from the whole image. That is why the subsection is called "Patchwise training is loss sampling": "restricting the loss to a randomly sampled subset of its spatial terms excludes patches from the gradient computation." They tried this "loss sampling" by randomly ignoring cells from the last layer, so the loss is not calculated over the whole image.
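A minimal numpy sketch of that "loss sampling" idea: mask out a random subset of the per-pixel loss terms so only the kept cells contribute to the gradient (the keep rate of 25% is an illustrative choice, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
per_pixel_loss = rng.random((32, 32))  # e.g. cross-entropy at each output cell

# "Loss sampling": keep a random subset of spatial terms; the ignored
# cells contribute nothing to the loss (or to its gradient), which
# mimics training on randomly sampled patches.
keep = rng.random(per_pixel_loss.shape) < 0.25
sampled_loss = per_pixel_loss[keep].mean()
full_loss = per_pixel_loss.mean()
print(sampled_loss, full_loss)
```

In expectation the sampled loss matches the full-image loss; the paper's observation is that in practice the full-image version converges just as well, so the sampling adds no benefit.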

 
