Derivation of the Backpropagation Formula for the Softmax Cross-Entropy Loss


Tags (space-separated): Caffe source code


The softmax cross-entropy loss is defined as:

$$J = -\sum_{i=1}^{K} y_i \ln(z_i), \qquad z_i = \frac{e^{x_i}}{\sum_{j=1}^{K} e^{x_j}}$$
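To make these definitions concrete, here is a minimal NumPy sketch (the names `softmax` and `softmax_cross_entropy` are illustrative and not taken from the Caffe source) that computes $z$ and $J$ for a one-hot label $y$:

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability; the result is unchanged.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def softmax_cross_entropy(x, y):
    # J = -sum_i y_i * ln(z_i), with z = softmax(x) and y a one-hot vector.
    z = softmax(x)
    return -np.sum(y * np.log(z)), z

x = np.array([2.0, 1.0, 0.1])
y = np.array([0.0, 1.0, 0.0])      # y_s = 1 at s = 1
J, z = softmax_cross_entropy(x, y)
print(J, z)
```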

Our goal now is to compute $\frac{\partial J}{\partial x_k}$.

For a given training sample, assume $y_s = 1$ and $y_k = 0$ for all $k \neq s$.

We consider two cases.

(1) When $k = s$:

$$
\begin{aligned}
\frac{\partial J}{\partial x_k}
&= \frac{\partial}{\partial x_s}\left(-y_s \ln\frac{e^{x_s}}{\sum_{j=1}^{K} e^{x_j}}\right) \\
&= -y_s \cdot \frac{\sum_{j=1}^{K} e^{x_j}}{e^{x_s}} \cdot \frac{e^{x_s}\sum_{j=1}^{K} e^{x_j} - e^{2x_s}}{\left(\sum_{j=1}^{K} e^{x_j}\right)^2} \\
&= -y_s \cdot \frac{\sum_{j=1}^{K} e^{x_j} - e^{x_s}}{\sum_{j=1}^{K} e^{x_j}} \\
&= -y_s\,(1 - z_s)
\end{aligned}
$$

(2) When $k \neq s$:

$$
\begin{aligned}
\frac{\partial J}{\partial x_k}
&= \frac{\partial}{\partial x_k}\left(-y_s \ln\frac{e^{x_s}}{\sum_{j=1}^{K} e^{x_j}}\right) \\
&= -y_s \cdot \frac{\sum_{j=1}^{K} e^{x_j}}{e^{x_s}} \cdot \frac{-e^{x_s} \cdot e^{x_k}}{\left(\sum_{j=1}^{K} e^{x_j}\right)^2} \\
&= y_s \cdot \frac{e^{x_k}}{\sum_{j=1}^{K} e^{x_j}} \\
&= y_s\,z_k
\end{aligned}
$$

Summary:

Since $y_s = 1$, case (1) gives $-y_s(1 - z_s) = z_s - 1 = z_s - y_s$; and since $y_k = 0$ for $k \neq s$, case (2) gives $y_s z_k = z_k = z_k - y_k$. Both cases are therefore covered by a single expression:

$$\frac{\partial J}{\partial x_k} = z_k - y_k$$
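As a numerical sanity check of this result, the short sketch below (illustrative code, not the Caffe implementation) compares the analytic gradient $z - y$ with a central finite-difference approximation of $\partial J / \partial x_k$:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def loss(x, y):
    # J = -sum_i y_i * ln(z_i) with z = softmax(x).
    return -np.sum(y * np.log(softmax(x)))

def numerical_grad(f, x, eps=1e-6):
    # Central differences along each coordinate x_k.
    g = np.zeros_like(x)
    for k in range(x.size):
        xp, xm = x.copy(), x.copy()
        xp[k] += eps
        xm[k] -= eps
        g[k] = (f(xp) - f(xm)) / (2 * eps)
    return g

x = np.array([2.0, 1.0, 0.1])
y = np.array([0.0, 1.0, 0.0])                 # one-hot label, y_s = 1 at s = 1
analytic = softmax(x) - y                     # dJ/dx_k = z_k - y_k
numeric = numerical_grad(lambda v: loss(v, y), x)
print(np.allclose(analytic, numeric, atol=1e-6))   # expected: True
```

The fact that the combined gradient reduces to the simple expression $z - y$ is the main reason softmax and the cross-entropy loss are usually fused into a single layer (as in Caffe's SoftmaxWithLoss layer) rather than backpropagated through separately.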


Reposted from blog.csdn.net/charel_chen/article/details/81266838