Group Norm, Batch Norm, Layer Norm

Group Norm (GN), Batch Norm (BN), and Layer Norm (LN) are commonly used normalization methods in deep learning; they stabilize and speed up model training.

Differences and connections:

BN normalizes each channel over the whole mini-batch, LN normalizes all the features of each individual sample, and GN splits the channels of each sample into several groups and normalizes within each group.
BN uses the mean and variance of the current mini-batch, so its statistics become unreliable when the batch is small; LN and GN compute their statistics within each sample, so they still perform well at small batch sizes.
Put differently, BN's statistics are per channel across the batch, LN's are per sample across all features, and GN's are per sample within each channel group; the sketch after this list makes the axes explicit.
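To make the axes concrete, here is a minimal sketch in PyTorch that computes each normalization by hand on a 4-D activation of shape [N, C, H, W]; the tensor sizes and the group count G are illustrative assumptions, not values from the original post.

```python
import torch

N, C, H, W = 8, 32, 16, 16    # batch, channels, height, width (illustrative)
x = torch.randn(N, C, H, W)
eps = 1e-5

# Batch Norm: one mean/var per channel, averaged over the batch and spatial dims.
bn_mean = x.mean(dim=(0, 2, 3), keepdim=True)                  # shape [1, C, 1, 1]
bn_var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
x_bn = (x - bn_mean) / torch.sqrt(bn_var + eps)

# Layer Norm: one mean/var per sample, averaged over all channels and spatial dims.
ln_mean = x.mean(dim=(1, 2, 3), keepdim=True)                  # shape [N, 1, 1, 1]
ln_var = x.var(dim=(1, 2, 3), unbiased=False, keepdim=True)
x_ln = (x - ln_mean) / torch.sqrt(ln_var + eps)

# Group Norm: split the C channels into G groups; one mean/var per (sample, group).
G = 8                                                          # number of groups (illustrative)
xg = x.view(N, G, C // G, H, W)
gn_mean = xg.mean(dim=(2, 3, 4), keepdim=True)                 # shape [N, G, 1, 1, 1]
gn_var = xg.var(dim=(2, 3, 4), unbiased=False, keepdim=True)
x_gn = ((xg - gn_mean) / torch.sqrt(gn_var + eps)).view(N, C, H, W)
```

Up to the learnable scale and shift parameters, these correspond to PyTorch's built-in nn.BatchNorm2d(C) (in training mode), nn.LayerNorm([C, H, W]), and nn.GroupNorm(G, C).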

Advantages and disadvantages:

BN can speed up convergence, but it must keep running estimates of the training mean and variance for use at test time, which adds extra state and bookkeeping (the snippet after this list shows this).
GN is more robust than BN, performs better at small batch sizes, and does not need to store any batch statistics.
LN works well in RNNs and sequence models, but tends to perform worse on image-domain tasks.
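The extra state that BN carries can be seen directly on PyTorch's built-in layers; a small sketch with illustrative sizes:

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(num_features=32)
bn.train()                              # training mode: normalize with the batch's own statistics
_ = bn(torch.randn(8, 32, 16, 16))
print(bn.running_mean.shape)            # torch.Size([32]) -- running estimates kept for inference

bn.eval()                               # eval mode: the stored running mean/var are used instead
_ = bn(torch.randn(1, 32, 16, 16))

gn = nn.GroupNorm(num_groups=8, num_channels=32)
print(hasattr(gn, "running_mean"))      # False -- GN stores no batch statistics
_ = gn(torch.randn(1, 32, 16, 16))      # behaves the same for any batch size, train or eval
```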

Applicable scenarios:

BN is suitable for larger batch sizes, for example 32 or more.
GN is suitable for small batch sizes and irregular training data.
LN is suitable for RNNs and sequence models; a short usage sketch of the two small-batch-friendly cases follows.
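A brief illustration of these scenarios, again with hypothetical layer sizes and PyTorch's built-in layers:

```python
import torch
import torch.nn as nn

# Sequence model: LayerNorm over the hidden dimension of [batch, seq_len, hidden].
hidden = 64                                    # illustrative hidden size
ln = nn.LayerNorm(hidden)
tokens = torch.randn(4, 10, hidden)            # 4 sequences of 10 steps each
out_seq = ln(tokens)                           # normalized per token, independent of batch size

# Vision model trained with a small batch: GroupNorm in place of BatchNorm.
block = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.GroupNorm(num_groups=8, num_channels=32),
    nn.ReLU(),
)
out_img = block(torch.randn(2, 3, 32, 32))     # a batch size of 2 is no problem for GN
```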


Origin blog.csdn.net/qq_44324007/article/details/130362707