Cross entropy and maximum likelihood

Cross entropy is used to measure the distance between two probability distributions; the KL divergence is another common way to measure this distance.
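For two discrete distributions p (the target distribution) and q (the model's prediction), cross entropy and KL divergence are defined as:

$$H(p, q) = -\sum_x p(x)\log q(x), \qquad D_{KL}(p \,\|\, q) = \sum_x p(x)\log\frac{p(x)}{q(x)}$$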

In a neural network, the cross-entropy loss function can be understood as follows:
cross entropy measures how far the actual output (a probability distribution) is from the desired output (a probability distribution). The smaller the cross-entropy value, the closer the two distributions are, i.e., the better the model fits.
$$H(p, q) = H(p) + D_{KL}(p \,\|\, q)$$
When the distribution p is known and fixed, its entropy H(p) is a constant, so minimizing the cross entropy is equivalent to minimizing the KL divergence.
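As a quick numeric check of this decomposition, here is a minimal NumPy sketch; the distributions p and q below are made up purely for illustration:

```python
import numpy as np

# Illustrative distributions (made up for this example).
p = np.array([0.7, 0.2, 0.1])   # target ("true") distribution
q = np.array([0.5, 0.3, 0.2])   # model's predicted distribution

# Entropy of p, cross entropy H(p, q), and KL divergence D_KL(p || q).
entropy_p = -np.sum(p * np.log(p))
cross_entropy = -np.sum(p * np.log(q))
kl_divergence = np.sum(p * np.log(p / q))

# The decomposition H(p, q) = H(p) + D_KL(p || q) holds numerically.
print(cross_entropy, entropy_p + kl_divergence)  # both ~0.8869
```

Note that when p is a one-hot target, as in classification, H(p) = 0 and the cross entropy reduces to -log q(correct class), which is exactly the usual classification loss.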
Minimizing the KL divergence is consistent with estimating the model's parameters by maximum likelihood; this can be shown with a short derivation, sketched below.
This is why many models arrive at the cross-entropy loss function via maximum likelihood estimation.
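A sketch of that derivation, writing $\hat{p}_{\mathrm{data}}$ for the empirical distribution of the training data and $q_\theta$ for the model (this notation is introduced here for the sketch, not taken from the original post):

$$
\begin{aligned}
\arg\min_\theta D_{KL}(\hat{p}_{\mathrm{data}} \,\|\, q_\theta)
&= \arg\min_\theta \sum_x \hat{p}_{\mathrm{data}}(x)\left[\log \hat{p}_{\mathrm{data}}(x) - \log q_\theta(x)\right] \\
&= \arg\max_\theta \sum_x \hat{p}_{\mathrm{data}}(x)\log q_\theta(x) \\
&= \arg\max_\theta \frac{1}{N}\sum_{i=1}^{N} \log q_\theta(x_i)
\end{aligned}
$$

The first term drops out because $\log \hat{p}_{\mathrm{data}}(x)$ does not depend on $\theta$, and the last line is the average log-likelihood, so the minimizer of the KL divergence is exactly the maximum likelihood estimate.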

Origin www.cnblogs.com/ivyharding/p/11391008.html