Deep Learning Study Notes (3)

(1) Batch normalization and residual networks

Batch normalization (BatchNorm)

Standardizing the input (shallow models):
Process the dataset so that every feature has mean 0 and standard deviation 1 across all samples.
Standardization makes the distributions of the different input features similar.

Batch normalization (deep models):
Use the mean and standard deviation of each mini-batch to continuously adjust the intermediate outputs of the neural network, so that the intermediate output values of every layer across the whole network are more stable.

Batch normalization is applied separately to fully connected layers and to convolutional layers, and behaves differently at prediction time, where running statistics are used instead of batch statistics.
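The per-feature computation during training can be sketched in plain Python (a minimal illustration; real layers operate on tensors and keep learnable gamma/beta plus running statistics for prediction):

```python
def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a mini-batch of scalar activations of one feature to
    zero mean and unit variance, then scale by gamma and shift by beta."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / (var + eps) ** 0.5 + beta for x in batch]
```

With gamma=1 and beta=0 the output has (approximately) mean 0 and variance 1, which is exactly the stabilizing effect described above.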

Residual network (ResNet)

A problem in deep learning: once a deep CNN reaches a certain depth, blindly adding more layers cannot further improve classification performance; instead, it makes the network converge more slowly and the accuracy worse.

Residual block (Residual Block)
Identity mapping:
left: f(x) = x
right: f(x) - x = 0 (small fluctuations around the identity mapping are easier to capture)
In a residual block, the input can propagate forward faster through the cross-layer data path (the skip connection).
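A minimal numeric sketch of the f(x) + x computation (the dense-layer-with-ReLU residual branch here is hypothetical; real residual blocks use convolutions and batch normalization). Note that with all-zero weights the block reduces to the identity mapping, which is why small deviations from identity are easy to learn:

```python
def dense(x, w, b):
    # A single linear layer followed by ReLU (illustrative residual branch).
    return [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]

def residual_block(x, w, b):
    fx = dense(x, w, b)
    # The skip connection: output = f(x) + x
    return [fi + xi for fi, xi in zip(fx, x)]
```

If w and b are zero, the output equals the input exactly, so the block starts out as an identity mapping and only needs to learn the residual f(x) - x.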

ResNet model

  1. Convolution (64 channels, 7x7, 3)

  2. Batch normalization

  3. Max pooling (3x3, stride 2)

  4. Residual blocks x4 (between modules, a residual block with stride 2 halves the height and width)

  5. Global average pooling

  6. Fully connected layer

## Densely connected network (DenseNet)
Main building blocks:
Dense block: defines how inputs and outputs are concatenated.
Transition layer: controls the number of channels so that it does not grow too large.
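The channel bookkeeping behind these two blocks can be illustrated with simple arithmetic (the growth-rate convention and the channel-halving transition follow the DenseNet design; the concrete numbers are only examples):

```python
def dense_block_channels(in_channels, num_convs, growth_rate):
    # Each convolution's output (growth_rate channels) is concatenated
    # onto its input, so channels accumulate linearly.
    return in_channels + num_convs * growth_rate

def transition_channels(channels):
    # A transition layer commonly halves the channel count
    # (via a 1x1 convolution) to keep it from growing too large.
    return channels // 2
```

For example, a dense block with 4 convolutions and growth rate 32 turns 64 input channels into 64 + 4*32 = 192, and a following transition layer brings that back down to 96.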

(2) Convex optimization

Optimization and estimation
Although optimization methods can minimize the value of the loss function in deep learning, the goal of optimization and the goal of deep learning are not essentially the same:

  • Optimization goal: the loss function value on the training set
  • Deep learning goal: the loss function value on the test set (generalization)

(3) Gradient descent

Gradient: mathematically, the gradient is a vector that points in the direction in which the directional derivative of a function at a given point attains its maximum value. That is, the function changes fastest along the gradient direction, the maximum rate of change equals the norm of the gradient, and the gradient direction is the direction in which the function value increases.

Three commonly used variants of gradient descent:

  1. Batch gradient descent
  2. Stochastic gradient descent
  3. Mini-batch gradient descent
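The three variants differ only in how many samples form each gradient step. A pure-Python sketch minimizing a simple squared loss (the objective and hyperparameters here are illustrative): batch_size=len(data) gives batch gradient descent, batch_size=1 gives stochastic gradient descent, and anything in between is mini-batch gradient descent.

```python
import random

def minibatch_sgd(data, lr=0.1, batch_size=2, epochs=100, seed=0):
    """Minimize the mean of 0.5 * (w - x)^2 over the data points,
    so w converges toward the data mean."""
    random.seed(seed)
    w = 0.0
    for _ in range(epochs):
        random.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            # Gradient of the loss averaged over the mini-batch.
            grad = sum(w - x for x in batch) / len(batch)
            w -= lr * grad
    return w
```

Smaller batches give noisier but cheaper steps; larger batches give smoother steps at higher per-step cost.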


(4) Object detection basics

Anchor boxes
Object detection algorithms usually sample a large number of regions in the input image, determine whether these regions contain the target of interest, and then adjust the region boundaries so as to predict the ground-truth bounding box of the target more accurately. Different models may use different region-sampling methods. Here we introduce one of them: centered on each pixel, generate multiple bounding boxes with different sizes and aspect ratios. These bounding boxes are called anchor boxes (anchor box).
Labeling anchor boxes in the training set
In the training set, we treat each anchor box as a training sample. To train the object detection model, we need to mark two types of labels for each anchor box: first, the category of the target contained in the anchor box (the class label); second, the offset of the ground-truth bounding box relative to the anchor box (the offset). At detection time, we first generate multiple anchor boxes, then predict a class and an offset for each anchor box, adjust the anchor box position according to the predicted offset to obtain a predicted bounding box, and finally filter the predicted bounding boxes that need to be output.
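Generating the anchors at one pixel center can be sketched as follows (a simple full cross-product of sizes and aspect ratios; real implementations often generate only a subset of the combinations to save computation):

```python
def anchors_at_pixel(cx, cy, sizes, ratios):
    """Return (x_min, y_min, x_max, y_max) anchor boxes centered at
    (cx, cy), one per (size, aspect ratio) combination."""
    boxes = []
    for s in sizes:
        for r in ratios:
            w = s * r ** 0.5   # width grows with sqrt(aspect ratio)
            h = s / r ** 0.5   # height shrinks accordingly, keeping area ~ s^2
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes
```

Repeating this at every pixel of a feature map yields the dense set of candidate regions described above.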

Summary

  1. Generate multiple anchor boxes with different sizes and aspect ratios centered on each pixel.
  2. Intersection over union (IoU) is the ratio of the intersection area to the union area of two bounding boxes.
  3. In the training set, two types of labels are marked for each anchor box: the category of the target the anchor box contains, and the offset of the ground-truth bounding box relative to the anchor box.
  4. At prediction time, non-maximum suppression (NMS) can be used to remove similar predicted bounding boxes, keeping the results simple.
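Points 2 and 4 of the summary can be sketched together in plain Python (boxes are (x_min, y_min, x_max, y_max) tuples; the 0.5 IoU threshold is a common but arbitrary choice):

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda t: (t[2] - t[0]) * (t[3] - t[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, threshold=0.5):
    """Keep the highest-scoring box, drop boxes that overlap it above
    the IoU threshold, and repeat on the remainder."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < threshold]
    return keep
```

In the test below, the second box overlaps the first heavily (IoU about 0.68) and is suppressed, while the distant third box survives.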

(5) Neural style transfer

Style transfer
Use a convolutional neural network to automatically apply the style of one image to another image.

The content image in the figure below is a landscape photo taken by the book's author in Mount Rainier National Park, in the suburbs of Seattle, and the style image is an oil painting with an autumn-oak theme. The final synthesized output image retains the shapes of the objects in the content image while applying the brush strokes of the style image, and also makes the overall colors more vivid.
Summary

  1. The loss function commonly used in style transfer consists of three parts: the content loss makes the synthesized image close to the content image in content features, the style loss makes the synthesized image close to the style image in style features, and the total variation loss helps reduce the noise in the synthesized image.
  2. Image features can be extracted with a pretrained convolutional neural network, and the synthesized image is continuously updated by minimizing the loss function.
  3. The style of a style layer's output is expressed with its Gram matrix.
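Point 3 can be written out directly: the Gram matrix collects the inner products between every pair of channels of a layer's output, with each channel flattened to a vector (a minimal sketch; in practice the channels come from a pretrained CNN and the matrix is often normalized by its size):

```python
def gram_matrix(features):
    """features: a list of C channels, each a flat list of H*W activations.
    Returns the C x C matrix of channel-wise inner products."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]
```

The result is symmetric and discards spatial layout, which is why it captures style (which textures co-occur) rather than content (where things are).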

(6) GAN

Advantages and disadvantages of the GAN components:

Generator: can generate parts of an image, but does not take into account the relationships between the different parts of the image.

Discriminator: considers the image as a whole, but cannot generate images.


(7) DCGAN

DCGAN stands for Deep Convolutional Generative Adversarial Network. As the name suggests, it is a GAN built on deep convolutions; together these constitute what we call a DCGAN.

In a DCGAN we should note the following:

  1. At the end of the discriminator network D, the result is output through a fully connected layer with a single output unit.
  2. Use the LeakyReLU activation function on all layers of the discriminator network.
  3. Use the ReLU activation function on all layers of the generator network, except for the output layer, which uses the Tanh activation function.
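The activation choices in points 2 and 3 can be written out directly (the 0.2 leak slope follows the DCGAN paper's setting; Tanh bounds the generator output to (-1, 1), matching images scaled to that range):

```python
import math

def relu(x):
    # Generator hidden layers: zero out negative activations.
    return max(0.0, x)

def leaky_relu(x, slope=0.2):
    # Discriminator layers: keep a small gradient for negative inputs.
    return x if x > 0 else slope * x

def tanh(x):
    # Generator output layer: squash into (-1, 1).
    return math.tanh(x)
```

LeakyReLU in the discriminator avoids dead units, so the gradient signal keeps flowing back to the generator during adversarial training.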




Origin blog.csdn.net/u014536801/article/details/104458038