Differentiating the softmax loss function

For reference, see: https://blog.csdn.net/qian99/article/details/78046329

In the first blog post, the loss   C=-\sum_{i} y_{i} \ln a_{i}   can instead be defined as:

   L_{i}=-\sum_{j=1}^{k} 1\left\{y_{(i)}=j\right\} \log \frac{e^{z_{j}}}{\sum_{l=1}^{k} e^{z_{l}}}=-\hat{y}_{i} \ln y_{i}

or, averaged over m samples:

  L=-\frac{1}{m}\left[\sum_{i=1}^{m} \sum_{j=1}^{k} 1\left\{y_{(i)}=j\right\} \log \frac{e^{z_{j}}}{\sum_{t=1}^{k} e^{z_{t}}}\right]

                 =-\frac{1}{m}\left[\sum_{i=1}^{m} \hat{y}_{i} \log \frac{e^{z_{i}}}{\sum_{t=1}^{k} e^{z_{t}}}\right]=-\frac{1}{m}\left[\sum_{i=1}^{m} \hat{y}_{i} \log y_{i}\right]

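This form is easier to understand. Here \hat{y}_{i} is the true label of sample i and y_{i} is the softmax probability the model assigns to that label.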
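
The indicator-sum definition can be checked numerically with a minimal Python sketch. The function names (`softmax`, `sample_loss`, `mean_loss`, `sample_grad`) and the toy batch are hypothetical, chosen only for illustration, and labels are 0-based here while the formulas index classes from 1. The gradient used in `sample_grad`, softmax(z)_j - 1{y_(i)=j}, is the standard result for this loss; it is not written out in the text above.

```python
import math

def softmax(z):
    """Numerically stable softmax of a list of logits."""
    m = max(z)  # subtracting the max does not change the result
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def sample_loss(z, label):
    """L_i = -sum_j 1{y_(i) = j} * log(softmax(z)_j); only j == label survives."""
    probs = softmax(z)
    return -sum((1.0 if j == label else 0.0) * math.log(p)
                for j, p in enumerate(probs))

def mean_loss(logits, labels):
    """L = (1/m) * sum_i L_i over a batch of m samples."""
    return sum(sample_loss(z, y) for z, y in zip(logits, labels)) / len(labels)

def sample_grad(z, label):
    """dL_i/dz_j = softmax(z)_j - 1{y_(i) = j} (standard result, assumed here)."""
    return [p - (1.0 if j == label else 0.0)
            for j, p in enumerate(softmax(z))]

# Hypothetical toy batch: m = 2 samples, k = 3 classes, 0-based labels.
logits = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
labels = [0, 1]
print(mean_loss(logits, labels))
print(sample_grad(logits[0], labels[0]))
```

Because the indicator keeps only the true-class term, `sample_loss(z, label)` equals `-math.log(softmax(z)[label])`, which is the -\hat{y}_{i} \ln y_{i} form.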

https://www.cnblogs.com/zongfa/p/8971213.html


Reposted from blog.csdn.net/HUNXIAOYI561/article/details/89958105