I. Batch Normalization: Concept
Batch Normalization: normalization applied over a batch of data
Batch: a batch of data, typically a mini-batch
Normalization: scale to zero mean and unit variance
Advantages:
1. A larger learning rate can be used, accelerating model convergence
2. No need for careful weight initialization
3. Dropout can be removed or its rate reduced
4. L2 weight decay can be removed or reduced
5. LRN (local response normalization) is no longer needed
Calculation:
1. Compute the mean and variance of the mini-batch
2. Normalize: subtract the mean and divide by the standard deviation (with eps added for numerical stability)
3. Affine transform (scale by gamma, shift by beta): enhances model capacity
Batch Normalization was proposed to reduce Internal Covariate Shift (ICS), the change in the distribution of layer inputs during training
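The three steps above can be sketched in plain Python for a single feature across a mini-batch. This is a minimal illustration of the training-time forward pass, not PyTorch's actual implementation; the function name `batch_norm_1d` is chosen here for illustration.

```python
import math

def batch_norm_1d(x, gamma, beta, eps=1e-5):
    """Batch-normalize one feature over a mini-batch x (a list of scalars)."""
    n = len(x)
    # Step 1: mini-batch mean and (biased) variance
    mean = sum(x) / n
    var = sum((xi - mean) ** 2 for xi in x) / n
    # Step 2: normalize to zero mean, unit variance (eps avoids division by zero)
    x_hat = [(xi - mean) / math.sqrt(var + eps) for xi in x]
    # Step 3: affine transform with learnable gamma and beta
    return [gamma * xh + beta for xh in x_hat]
```

With gamma = 1 and beta = 0 the output has zero mean and unit variance; learning other values of gamma and beta lets the layer undo the normalization if that helps, which is why the affine step restores capacity.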
_BatchNorm
nn.BatchNorm1d
nn.BatchNorm2d
nn.BatchNorm3d
Parameters:
num_features: number of features (channels) per sample (the most important)
eps: term added to the denominator for numerical stability
momentum: coefficient of the exponential moving average used to estimate the running mean / var
affine: whether to apply the learnable affine transform
track_running_stats: whether to track running statistics (training mode) or use the stored statistics (eval mode)
Main attributes:
running_mean: running estimate of the mean
running_var: running estimate of the variance
weight: gamma in the affine transform
bias: beta in the affine transform
nn.BatchNorm1d input = B × num_features × 1d feature shape
nn.BatchNorm2d input = B × num_features × 2d feature shape
nn.BatchNorm3d input = B × num_features × 3d feature shape
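A short sketch of the expected input shapes for the three layers (shapes chosen here are arbitrary examples):

```python
import torch
import torch.nn as nn

# nn.BatchNorm1d: input (B, C, L), e.g. a batch of 1d feature sequences
bn1d = nn.BatchNorm1d(num_features=3)
out1d = bn1d(torch.randn(4, 3, 5))

# nn.BatchNorm2d: input (B, C, H, W), e.g. a batch of feature maps
bn2d = nn.BatchNorm2d(num_features=3)
out2d = bn2d(torch.randn(4, 3, 8, 8))

# nn.BatchNorm3d: input (B, C, D, H, W), e.g. a batch of volumetric features
bn3d = nn.BatchNorm3d(num_features=3)
out3d = bn3d(torch.randn(4, 3, 2, 8, 8))
```

In each case `num_features` must match the channel dimension C; the output shape equals the input shape, since normalization is computed per channel over the batch and spatial dimensions.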