Loss Backpropagation

Copyright notice: this is an original article by the CSDN blogger Rosefun96. https://blog.csdn.net/rosefun96/article/details/88929748

1 Theory

Backpropagating the loss means applying the chain rule layer by layer: differentiate the loss with respect to each layer's output, propagate that gradient backward through the layer, obtain the update increment for every weight matrix and bias vector, and then update the parameters with it.
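
Concretely, for the two-layer sigmoid network used in the next section (the notation here is mine, not from the original post), write $\ell_1 = \sigma(xW_0)$ and $\ell_2 = \sigma(\ell_1 W_1)$ with $\sigma(z) = 1/(1+e^{-z})$, and take the squared-error loss $L = \tfrac{1}{2}\lVert y - \ell_2\rVert^2$. The chain rule then yields exactly the quantities the code below computes:

\begin{aligned}
\delta_2 &= (y - \ell_2) \odot \ell_2 \odot (1 - \ell_2) = -\frac{\partial L}{\partial(\ell_1 W_1)},\\
\delta_1 &= (\delta_2 W_1^{\top}) \odot \ell_1 \odot (1 - \ell_1) = -\frac{\partial L}{\partial(x W_0)},\\
W_1 &\leftarrow W_1 + \ell_1^{\top}\delta_2, \qquad W_0 \leftarrow W_0 + x^{\top}\delta_1.
\end{aligned}

Since each $\delta$ is the negative gradient of $L$, adding it to the weights performs plain gradient descent with learning rate 1.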

2 Practice


import numpy as np

# Toy dataset: 4 samples with 3 features; the target is the XOR of the
# first two inputs (the third column acts as a bias term)
x = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T

# Random weights in [-1, 1): a 3->4 hidden layer and a 4->1 output layer
w0 = 2 * np.random.random((3, 4)) - 1
w1 = 2 * np.random.random((4, 1)) - 1

for j in range(60000):
    # Forward pass: sigmoid activations at both layers
    l1 = 1 / (1 + np.exp(-np.dot(x, w0)))
    l2 = 1 / (1 + np.exp(-np.dot(l1, w1)))
    # Backward pass: each delta is the error signal times the sigmoid
    # derivative s*(1-s); these are the negative gradients of the loss
    l2_delta = (y - l2) * (l2 * (1 - l2))
    l1_delta = l2_delta.dot(w1.T) * (l1 * (1 - l1))
    # Gradient-descent update with learning rate 1 (+= because the
    # deltas are negative gradients)
    w1 += l1.T.dot(l2_delta)
    w0 += x.T.dot(l1_delta)

# Final forward pass with the trained weights
l1 = 1 / (1 + np.exp(-np.dot(x, w0)))
l2 = 1 / (1 + np.exp(-np.dot(l1, w1)))
print(l2)

Result (the exact numbers vary with the random initialization; the predictions approach the targets [0, 1, 1, 0]):

[[0.00321158]
 [0.99531495]
 [0.99592019]
 [0.00482494]]
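
A standard sanity check for a hand-derived backward pass (not part of the original post) is numerical gradient checking: perturb one weight at a time and compare the finite-difference slope of the loss against the analytic gradient. A minimal sketch for the network above, where `loss` and `numeric_grad` are hypothetical helpers introduced here for illustration:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss(x, y, w0, w1):
    # Same forward pass as above, squared-error loss L = 0.5 * ||y - l2||^2
    l2 = sigmoid(sigmoid(x.dot(w0)).dot(w1))
    return 0.5 * np.sum((y - l2) ** 2)

def numeric_grad(f, w, eps=1e-5):
    # Central finite differences, one entry of w at a time
    g = np.zeros_like(w)
    for idx in np.ndindex(w.shape):
        old = w[idx]
        w[idx] = old + eps
        fp = f()
        w[idx] = old - eps
        fm = f()
        w[idx] = old
        g[idx] = (fp - fm) / (2 * eps)
    return g

x = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0, 1, 1, 0]], dtype=float).T
w0 = 2 * np.random.random((3, 4)) - 1
w1 = 2 * np.random.random((4, 1)) - 1

# Analytic gradients from the backward pass (negated, because the deltas
# above are the negative gradients of L)
l1 = sigmoid(x.dot(w0))
l2 = sigmoid(l1.dot(w1))
l2_delta = (y - l2) * l2 * (1 - l2)
l1_delta = l2_delta.dot(w1.T) * l1 * (1 - l1)
grad_w0 = -x.T.dot(l1_delta)
grad_w1 = -l1.T.dot(l2_delta)

# Both maximum differences should be tiny (on the order of 1e-9 or less)
print(np.max(np.abs(grad_w1 - numeric_grad(lambda: loss(x, y, w0, w1), w1))))
print(np.max(np.abs(grad_w0 - numeric_grad(lambda: loss(x, y, w0, w1), w0))))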

In deep learning frameworks the gradients can be computed automatically. In PyTorch, for example:

import torch

N, D = 3, 4
x = torch.randn(N, D, requires_grad=True)  # track gradients with respect to x
y = torch.randn(N, D)
z = torch.randn(N, D)

a = x * y
b = a + z
c = torch.sum(b)   # scalar output: c = sum(x * y + z)

c.backward()       # autograd applies the chain rule backward through the graph
print(x.grad)

The printed gradient (since c = sum(x * y + z), we have dc/dx = y, so x.grad equals y; the values vary across runs):

tensor([[ 0.3665,  0.3097, -0.1923, -0.1062],
        [-0.7262,  0.1843, -1.3898,  1.7717],
        [ 0.1283,  0.1792, -1.3739,  1.4119]])
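
For comparison, autograd can replace the hand-written deltas in the NumPy network above: define the forward pass, call backward(), and read the gradients off the leaf tensors. A minimal sketch of such a rewrite (mine, not from the original post):

import torch

x = torch.tensor([[0., 0., 1.], [0., 1., 1.], [1., 0., 1.], [1., 1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

# Same shapes as the NumPy network: 3 -> 4 -> 1, sigmoid activations
w0 = (2 * torch.rand(3, 4) - 1).requires_grad_()
w1 = (2 * torch.rand(4, 1) - 1).requires_grad_()

for _ in range(60000):
    l2 = torch.sigmoid(torch.sigmoid(x @ w0) @ w1)
    loss = 0.5 * torch.sum((y - l2) ** 2)
    loss.backward()            # autograd computes dloss/dw0 and dloss/dw1
    with torch.no_grad():      # manual gradient-descent step, learning rate 1
        w0 -= w0.grad
        w1 -= w1.grad
        w0.grad.zero_()
        w1.grad.zero_()

with torch.no_grad():
    print(torch.sigmoid(torch.sigmoid(x @ w0) @ w1))  # approaches [0, 1, 1, 0]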

References:
1. cs231n lecture slides
