torch.nn.BatchNorm2d function
What is a batch?
A batch is a subset of the training set. The training set is usually too large to feed into the network all at once, so it is delivered in portions, one batch at a time.
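As a minimal sketch of this idea, the snippet below splits a toy dataset of 10 samples into batches of 4 (both the data and the batch size are made-up values for illustration):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

data = torch.randn(10, 3, 8, 8)        # 10 toy RGB images of size 8x8
labels = torch.randint(0, 2, (10,))    # toy binary labels

# batch_size=4 -> the network sees batches of 4, 4, and finally 2 samples
loader = DataLoader(TensorDataset(data, labels), batch_size=4)

for images, targets in loader:
    print(images.shape)                # torch.Size([4, 3, 8, 8]) ... [2, 3, 8, 8]
```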
What is Normalization?
Normalization means rescaling data to a standard distribution. It was introduced to reduce the internal covariate shift phenomenon: when training a deep network, any change in the parameters of one layer shifts the input distribution of every layer after it,
so training could only proceed with a low learning rate and carefully chosen initial parameters. Introducing a Batch Normalization (BN) layer lets us use a higher learning rate without worrying as much about parameter initialization.
The specific process of Batch Normalization:
For a batch of input data, first compute the mean and variance of each channel over the batch, height, and width dimensions. For example, a batch of RGB images yields three means and three variances, one per channel.
Then the batch is normalized with the following formula (when affine=True, a learnable scale and shift are applied afterwards):
x̂ = (x - mean) / sqrt(var + eps)
y = gamma * x̂ + beta
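The per-channel computation can be checked by reproducing BatchNorm2d's training-time output by hand; the batch shape below is an arbitrary choice, and affine is disabled so only the normalization step remains:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(8, 3, 4, 4)            # batch of 8 images, 3 channels

bn = nn.BatchNorm2d(3, affine=False)   # gamma/beta disabled for clarity
bn.train()
out = bn(x)

# Per-channel statistics over the batch, height, and width dimensions;
# training-time normalization uses the biased variance (unbiased=False).
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
manual = (x - mean) / torch.sqrt(var + bn.eps)

print(torch.allclose(out, manual, atol=1e-6))  # True
```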
Note: there are two modes for obtaining the mean and variance used in the testing phase:
The first: instead of using the statistics of the test batch, use the mean and variance computed from a large batch during the training phase.
The second: maintain a running average of the batch statistics, updated after every training batch by the following formulas:
running_mean = (1 - momentum) * running_mean + momentum * batch_mean
running_var = (1 - momentum) * running_var + momentum * batch_var
Here momentum is the weight of the update, batch_mean is the mean of the current training batch, and running_mean is the resulting exponentially weighted moving average, which is then used at test time.
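The update above matches PyTorch's convention, where momentum weights the *new* batch statistic (default momentum=0.1); the sketch below verifies this for running_mean with an arbitrary input batch:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3, momentum=0.1)
x = torch.randn(8, 3, 4, 4)

# Expected running mean after one training step, per the momentum formula.
expected = (1 - 0.1) * bn.running_mean.clone() + 0.1 * x.mean(dim=(0, 2, 3))

bn.train()
bn(x)                                  # forward pass updates the running stats
print(torch.allclose(bn.running_mean, expected, atol=1e-6))  # True
```

Note that running_var is updated the same way, except PyTorch accumulates the unbiased batch variance there.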
torch.nn.BatchNorm2d function
Signature: torch.nn.BatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
The output is a tensor with the same shape as the input.
where:
1. num_features is the number of channels of the input batch (normalization is done per channel)
2. eps is a small stability constant added to the variance
3. momentum is the weight used when updating running_mean and running_var
4. affine controls whether the scale and shift are learned: True means gamma and beta are learnable parameters, False means they are fixed
5. track_running_stats selects which statistics are used in the testing phase: True corresponds to the second method (running statistics are maintained during training and used at test time), False means no running statistics are kept and the current batch's statistics are used even at test time
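A short usage sketch, with arbitrary shapes, contrasting the two modes: in train() the layer normalizes with the batch's own statistics, while in eval() (with track_running_stats=True) it switches to the accumulated running statistics, so the same input produces different outputs.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(3, track_running_stats=True)
x = torch.randn(8, 3, 4, 4)

bn.train()
y_train = bn(x)    # normalized with this batch's own mean/variance

bn.eval()
y_eval = bn(x)     # normalized with the accumulated running statistics

# Different statistics are used, so the outputs differ.
print(torch.allclose(y_train, y_eval))  # False
```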