TensorFlow 2.0 Stochastic Gradient Descent: Loss Functions and Their Gradients

Copyright notice: this is an original article by the author ([email protected]) and may not be reproduced without permission. https://blog.csdn.net/z_feng12489/article/details/90033237

6.3 Loss Functions and Their Gradients

MSE

  1. $loss = \frac{1}{N}\sum (y - out)^2$, where $N = B \times NumOfClass$
  2. $L_{2\text{-}norm} = \sqrt{\sum (y - out)^2}$
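
Both quantities map directly onto TensorFlow ops. A minimal sketch (the tensors y and out below are made-up values for illustration):

import tensorflow as tf

y = tf.constant([1., 0.])                  # hypothetical target
out = tf.constant([0.8, 0.3])              # hypothetical network output
mse = tf.reduce_mean(tf.square(y - out))   # (0.2^2 + 0.3^2) / 2 = 0.065
l2 = tf.norm(y - out)                      # sqrt(0.2^2 + 0.3^2) ≈ 0.361
print(float(mse), float(l2))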

Derivative:

  1. $loss = \sum [y - f_\theta(x)]^2$
  2. $\frac{\partial loss}{\partial \theta} = -2\sum [y - f_\theta(x)] \cdot \frac{\partial f_\theta(x)}{\partial \theta}$, where the minus sign comes from the chain rule applied to $y - f_\theta(x)$.
import tensorflow as tf

x = tf.random.normal([1, 3])
w = tf.ones([3, 2])
b = tf.ones([2])
y = tf.constant([0., 1.])   # float labels, matching the dtype of the network output

with tf.GradientTape() as tape:
	tape.watch([w, b])      # w and b are plain tensors, so the tape must watch them explicitly
	out = tf.sigmoid(x @ w + b)
	loss = tf.reduce_mean(tf.losses.MSE(y, out))

grads = tape.gradient(loss, [w, b])
print('w grad:', grads[0])
print('b grad:', grads[1])
w grad: tf.Tensor(
[[-0.0309717   0.04595036]
 [-0.1207474   0.17914379]
 [ 0.01667333 -0.02473696]], shape=(3, 2), dtype=float32)
b grad: tf.Tensor([ 0.09684253 -0.14367795], shape=(2,), dtype=float32)
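
Note that tape.watch is only needed because w and b are plain tensors here. In real models the parameters are tf.Variable objects, which the tape tracks automatically; a minimal sketch of the same computation:

x = tf.random.normal([1, 3])
w = tf.Variable(tf.ones([3, 2]))   # Variables are watched by the tape automatically
b = tf.Variable(tf.ones([2]))
y = tf.constant([0., 1.])

with tf.GradientTape() as tape:
	out = tf.sigmoid(x @ w + b)    # no tape.watch needed
	loss = tf.reduce_mean(tf.losses.MSE(y, out))

grads = tape.gradient(loss, [w, b])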

cross entropy

softmax:
$p_i = \frac{e^{a_i}}{\sum_{k=1}^{N} e^{a_k}}$

  1. Converts raw scores into probability values in $(0, 1)$ that sum to 1.
  2. Enlarges the relative gaps between values (a "soft" version of max); see the quick check below.
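
A quick check of both properties (the input scores are arbitrary):

a = tf.constant([2.0, 1.0, 0.1])
p = tf.nn.softmax(a)         # -> approx. [0.659, 0.242, 0.099]
print(p, tf.reduce_sum(p))   # valid probabilities that sum to 1

The input ratio 2.0 : 1.0 becomes roughly 0.66 : 0.24 after softmax, so the largest score claims a disproportionate share of the probability mass.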

Derivative of softmax:

For $i = j$ (quotient rule; both numerator and denominator depend on $a_j$):
$\frac{\partial p_i}{\partial a_j} = \frac{e^{a_i}\sum_k e^{a_k} - e^{a_i} e^{a_j}}{(\sum_k e^{a_k})^2} = p_i(1 - p_j)$

For $i \neq j$ (only the denominator depends on $a_j$):
$\frac{\partial p_i}{\partial a_j} = \frac{-e^{a_i} e^{a_j}}{(\sum_k e^{a_k})^2} = -p_i p_j$

Both cases combine via the Kronecker delta $\delta_{ij}$ (1 if $i = j$, else 0):

$\frac{\partial p_i}{\partial a_j} = p_i(\delta_{ij} - p_j)$
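
This Jacobian can be verified against automatic differentiation. A minimal sketch (the input vector a is arbitrary):

a = tf.constant([2.0, 1.0, 0.1])

with tf.GradientTape() as tape:
	tape.watch(a)
	p = tf.nn.softmax(a)

jac = tape.jacobian(p, a)                                   # autodiff Jacobian, shape (3, 3)
analytic = tf.linalg.diag(p) - tf.tensordot(p, p, axes=0)   # diag(p) - p p^T, i.e. p_i (delta_ij - p_j)
print(tf.reduce_max(tf.abs(jac - analytic)))                # ~0: the two agree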

x = tf.random.normal([1, 3])
w = tf.random.normal([3, 2])
b = tf.random.normal([2])
y = tf.constant([[0., 1.]])   # one-hot label, same shape as the logits: (1, 2)

with tf.GradientTape() as tape:
	tape.watch([w, b])
	logits = x @ w + b        # raw scores; softmax is applied inside the loss
	loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y, logits, from_logits=True))

grads = tape.gradient(loss, [w, b])
print('w grad:', grads[0])
print('b grad:', grads[1])
w grad: tf.Tensor(
[[ 0.23242529 -0.23242529]
 [ 0.9089024  -0.9089024 ]
 [ 0.58031267 -0.58031267]], shape=(3, 2), dtype=float32)
b grad: tf.Tensor([ 0.6567008 -0.6567008], shape=(2,), dtype=float32)
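
Note that the two columns of w grad are exact negatives of each other. That is expected: when softmax and cross entropy are fused via from_logits=True (which is also the numerically stable way to compute this loss), the gradient with respect to the logits reduces to the well-known form $p - y$, whose entries sum to zero. A minimal check with hypothetical values:

y = tf.constant([[0., 1.]])
logits = tf.constant([[2.0, -1.0]])

with tf.GradientTape() as tape:
	tape.watch(logits)
	loss = tf.reduce_mean(tf.losses.categorical_crossentropy(y, logits, from_logits=True))

print(tape.gradient(loss, logits))   # equals tf.nn.softmax(logits) - y
print(tf.nn.softmax(logits) - y)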
