Batch Normalization vs. Layer Normalization

 

The essential difference between BN and LN:

Batch normalization normalizes "vertically": each neuron in a layer is normalized along the batch dimension, so every neuron in the same layer gets its own mean and variance.

Layer normalization normalizes "horizontally": all neurons in the same layer are normalized together, so they share a single mean and variance, computed per sample. The sketch below illustrates the axis difference.
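
A minimal NumPy sketch of the two normalization directions, assuming an activation matrix x of shape [batch_size, num_neurons]; the learnable scale and shift parameters (gamma, beta) are omitted for brevity, and the epsilon value is an assumed default.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # BN: one mean/variance per neuron (column), computed over the batch axis.
    mean = x.mean(axis=0, keepdims=True)   # shape [1, num_neurons]
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # LN: one mean/variance per sample (row), shared by all neurons in the layer.
    mean = x.mean(axis=1, keepdims=True)   # shape [batch_size, 1]
    var = x.var(axis=1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.randn(4, 8)                  # 4 samples, 8 neurons
print(batch_norm(x).mean(axis=0))          # ~0 for each neuron
print(layer_norm(x).mean(axis=1))          # ~0 for each sample
```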


 

Differences in where BN and LN are used:

1. If the batch size is too small, the per-batch statistics are a poor estimate of the global statistics, so BN is not suitable; LN does not depend on the batch size.

2. An RNN encodes sequences, and the sentence lengths within a batch may differ, so applying batch normalization along the batch dimension of an RNN is problematic; LN normalizes horizontally across the features of each sample, so it can be used inside an RNN, as in the sketch after this list.
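
A minimal sketch of layer normalization inside a simple RNN step, reusing the layer_norm function above. The names W_xh, W_hh and the shapes are assumptions for illustration; the point is that the statistics are computed per sample at each time step, so neither the batch size nor the sentence lengths in the batch affect them.

```python
import numpy as np

def layer_norm(h, eps=1e-5):
    # Normalize over the feature (last) axis of each sample independently.
    mean = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mean) / np.sqrt(var + eps)

def rnn_step(x_t, h_prev, W_xh, W_hh):
    # Layer-normalize the pre-activation, then apply the nonlinearity.
    return np.tanh(layer_norm(x_t @ W_xh + h_prev @ W_hh))

hidden = 16
W_xh = np.random.randn(8, hidden) * 0.1    # assumed input-to-hidden weights
W_hh = np.random.randn(hidden, hidden) * 0.1
h = np.zeros((1, hidden))                  # works even with batch size 1
for t in range(5):                         # sequence length can vary freely
    h = rnn_step(np.random.randn(1, 8), h, W_xh, W_hh)
print(h.shape)
```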

 


Origin: www.cnblogs.com/zhufz/p/11352403.html