Copyright notice: original article by CSDN blogger Rosefun96. https://blog.csdn.net/rosefun96/article/details/88929748
1 Theory
Backpropagation of the loss function is just the chain rule applied layer by layer: differentiate the loss through each layer to obtain the gradient of every weight matrix (or vector), then use those gradients to update the weights.
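In symbols, for a two-layer sigmoid network with squared-error loss L = ½‖y − l₂‖² (the setup used in the code below), the chain rule gives the layer deltas and weight updates:

```latex
l_1 = \sigma(x w_0), \qquad l_2 = \sigma(l_1 w_1), \qquad \sigma'(u) = \sigma(u)\,(1 - \sigma(u))
```

```latex
\delta_2 = (y - l_2) \odot l_2 (1 - l_2) = -\frac{\partial L}{\partial (l_1 w_1)}, \qquad
\delta_1 = \left(\delta_2\, w_1^{\top}\right) \odot l_1 (1 - l_1)
```

```latex
\Delta w_1 = l_1^{\top}\,\delta_2, \qquad \Delta w_0 = x^{\top}\,\delta_1
```

Note the sign convention: δ₂ is defined as the *negative* of the loss gradient, so the updates are applied with `+=` (gradient descent with a learning rate of 1 folded in).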
2 Practice

```python
import numpy as np

# XOR-style dataset: 4 samples, 3 features (the third column acts as a bias input)
x = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])
y = np.array([[0,1,1,0]]).T

# weights initialized uniformly in [-1, 1)
w0 = 2*np.random.random((3,4)) - 1
w1 = 2*np.random.random((4,1)) - 1

for j in range(60000):
    # forward pass with sigmoid activations
    l1 = 1/(1 + np.exp(-np.dot(x, w0)))
    l2 = 1/(1 + np.exp(-np.dot(l1, w1)))
    # backward pass: error times the sigmoid derivative l*(1-l)
    l2_delta = (y - l2) * (l2*(1 - l2))
    l1_delta = l2_delta.dot(w1.T) * (l1*(1 - l1))
    # weight updates (learning rate of 1 folded in)
    w1 += l1.T.dot(l2_delta)
    w0 += x.T.dot(l1_delta)

# final forward pass after training
l1 = 1/(1 + np.exp(-np.dot(x, w0)))
l2 = 1/(1 + np.exp(-np.dot(l1, w1)))
print(l2)
```
Result: the final predictions l2 are close to the targets y = [0, 1, 1, 0] (exact values vary between runs, since the weights are randomly initialized):

```
[[0.00321158]
 [0.99531495]
 [0.99592019]
 [0.00482494]]
```
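A standard way to validate hand-derived gradients like these is a finite-difference check: compare the analytic gradient of the loss with a numerical central-difference estimate. The sketch below checks the gradient with respect to w1 only; the seed and epsilon are arbitrary choices, not part of the original post.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, for reproducibility only

x = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]], dtype=float)
y = np.array([[0,1,1,0]], dtype=float).T
w0 = 2*np.random.random((3,4)) - 1
w1 = 2*np.random.random((4,1)) - 1

def sigmoid(z):
    return 1/(1 + np.exp(-z))

def loss(w0, w1):
    # squared-error loss, consistent with the deltas used in training
    l1 = sigmoid(x @ w0)
    l2 = sigmoid(l1 @ w1)
    return 0.5*np.sum((y - l2)**2)

# analytic gradient: same formula as the training loop, with the sign
# flipped because l2_delta was defined as the negative loss gradient
l1 = sigmoid(x @ w0)
l2 = sigmoid(l1 @ w1)
l2_delta = (y - l2) * l2*(1 - l2)
grad_w1 = -l1.T @ l2_delta

# numerical gradient via central differences
eps = 1e-6
num_grad = np.zeros_like(w1)
for i in range(w1.shape[0]):
    for j in range(w1.shape[1]):
        w1[i, j] += eps
        hi = loss(w0, w1)
        w1[i, j] -= 2*eps
        lo = loss(w0, w1)
        w1[i, j] += eps  # restore original value
        num_grad[i, j] = (hi - lo) / (2*eps)

print(np.max(np.abs(grad_w1 - num_grad)))  # tiny if the derivation is right
```

If the analytic formulas were wrong (for example, a missing sigmoid-derivative factor), this difference would be large instead of near machine precision.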
In deep learning frameworks, gradients can be computed automatically. For example, in PyTorch:

```python
import torch

N, D = 3, 4
# requires_grad=True tells autograd to track operations on x
x = torch.randn(N, D, requires_grad=True)
y = torch.randn(N, D)
z = torch.randn(N, D)

a = x * y          # elementwise product
b = a + z
c = torch.sum(b)   # scalar output

c.backward()       # autograd computes dc/dx
print(x.grad)
```
The printed gradient (values depend on the random initialization):

```
tensor([[ 0.3665,  0.3097, -0.1923, -0.1062],
        [-0.7262,  0.1843, -1.3898,  1.7717],
        [ 0.1283,  0.1792, -1.3739,  1.4119]])
```
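The printed tensor is in fact exactly y. Since c sums every entry of x⊙y + z, each entry of x contributes linearly with coefficient y:

```latex
c = \sum_{i,j} \left( x_{ij}\, y_{ij} + z_{ij} \right)
\quad\Longrightarrow\quad
\frac{\partial c}{\partial x_{ij}} = y_{ij},
\qquad \text{i.e. } \texttt{x.grad} = y
```

This can be confirmed by also printing y after the backward pass.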
References:
1. CS231n lecture slides