The role of Normalization: LN, BN, WN

Reference: https://zhuanlan.zhihu.com/p/33173246?utm_source=wechat_session&utm_medium=social&utm_oi=611573545537507328

In general, when BN is used, the requirements on model initialization are not as strict, but the final result may not be quite as good.
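As a concrete illustration (assuming PyTorch, which the post does not name, and with purely illustrative layer sizes), BN layers are usually inserted right after the linear or convolutional layers:

```python
import torch
import torch.nn as nn

# A minimal MLP with BatchNorm inserted after the hidden linear layer
# (784/256/10 are illustrative sizes, not from the original post).
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.BatchNorm1d(256),   # normalizes each feature over the mini-batch
    nn.ReLU(),
    nn.Linear(256, 10),
)

# With BN in place, even default initialization typically trains stably.
x = torch.randn(32, 784)   # a dummy mini-batch
logits = model(x)
print(logits.shape)        # torch.Size([32, 10])
```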

1. Why raw data needs whitening:

Whitening keeps the data as close to independent and identically distributed (i.i.d.) as possible, as sketched below.
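A minimal sketch of PCA whitening in NumPy (the toy data and mixing matrix are invented for illustration): after centering and decorrelating, the features have approximately identity covariance, i.e. zero mean, unit variance, and no correlation between features:

```python
import numpy as np

# Toy data: 1000 samples, 3 correlated features (hypothetical values).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3)) @ np.array([[2.0, 0.5, 0.0],
                                           [0.0, 1.0, 0.3],
                                           [0.0, 0.0, 0.5]])

# Step 1: center each feature to zero mean.
X_centered = X - X.mean(axis=0)

# Step 2: decorrelate and rescale (PCA whitening).
cov = np.cov(X_centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
X_white = X_centered @ eigvecs / np.sqrt(eigvals + 1e-5)

# The covariance of the whitened data is approximately the identity matrix.
print(np.round(np.cov(X_white, rowvar=False), 2))
```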

2. Why data that has already been regularized still needs the various kinds of Normalization:

Training deep networks is difficult because, after the data has passed through several layers, if it is not normalized, the input to each layer (produced by the freshly updated parameters of the layers below) changes its distribution again; at higher layers these distribution shifts become very large, which makes the parameter update strategy critical. If the output of each layer is normalized to a relatively uniform distribution, then each subsequent layer avoids the effects of receiving inputs drawn from constantly changing distributions.
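A minimal sketch of the per-layer normalization just described, in the style of a BatchNorm forward pass (the function name, shapes, and toy batch are assumptions, not from the post): each feature is standardized with the mini-batch mean and variance, then rescaled by the learned parameters gamma and beta:

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize a mini-batch per feature, then scale and shift.

    x:     (batch_size, num_features) activations of one layer
    gamma: (num_features,) learned scale
    beta:  (num_features,) learned shift
    """
    mean = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # standardized activations
    return gamma * x_hat + beta              # restore representational power

# Usage: normalize a random batch of 4 samples with 3 features.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(4, 3))
out = batch_norm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
print(out.mean(axis=0), out.std(axis=0))  # ~0 mean, ~1 std per feature
```

This way, whatever the layers below do to the distribution of their outputs, the next layer always sees inputs with a roughly stable mean and variance.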


Source: www.cnblogs.com/wb-learn/p/11695609.html