I'm currently following the PyTorch practice course by "刘二大人" on Bilibili.
URL: https://www.bilibili.com/video/BV1Y7411d7Ys?p=3&vd_source=32b3ab6f83a7264145dc021d4ff722f6
1. Gradient Descent (GD)
Notes:
1. The GD (gradient descent) algorithm iterates in the direction opposite to the gradient.
2. The final result may be a local optimum rather than the global one (as shown in the figure).
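To make the update concrete: for the linear model ŷ = x * w with a mean-squared-error cost, the update rule and the gradient it needs work out as follows (this derivation is written out by me for reference; it matches the gradient() function in the code below):

cost(w) = (1/N) · Σ_n (x_n · w − y_n)²
∂cost/∂w = (1/N) · Σ_n 2 · x_n · (x_n · w − y_n)
w := w − α · ∂cost/∂w        (α is the learning rate; 0.01 in the code)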
# Gradient Descent (GD)
import matplotlib.pyplot as plt

# Training data
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

# Initial guess for the weight
w = 1.0

# Prediction (forward pass): y_hat = x * w
def forward(x):
    return x * w

# Cost function: squared error averaged over all training data
def cost(xs, ys):
    cost = 0.0
    for x, y in zip(xs, ys):
        y_pred = forward(x)  # prediction
        cost += (y_pred - y) ** 2
    return cost / len(xs)

# Gradient of the cost w.r.t. w, averaged over all training data
# (from d/dw (x*w - y)^2 = 2 * x * (x*w - y))
def gradient(xs, ys):
    grad = 0.0
    for x, y in zip(xs, ys):
        grad += 2 * x * (x * w - y)
    return grad / len(xs)

print('Predict (before training)', 4, forward(4))
epoch_list = []
cost_list = []

# Gradient-descent iterations
for epoch in range(100):
    cost_val = cost(x_data, y_data)
    grad_val = gradient(x_data, y_data)
    w -= 0.01 * grad_val  # 0.01 is the learning rate, chosen by hand
    print('Epoch=', epoch, 'w=', w, 'cost=', cost_val)
    epoch_list.append(epoch)
    cost_list.append(cost_val)
print('Predict (after training)', 4, forward(4))

# Plot cost vs. epoch
plt.plot(epoch_list, cost_list)
plt.xlabel('epoch')
plt.ylabel('cost')
plt.show()
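A quick way to sanity-check the hand-derived gradient is to compare it against a central finite difference of the cost. This check is my addition, not from the lecture (the helper names cost_at and analytic_grad are hypothetical; x_data and y_data are reused from the code above):

# Finite-difference check of the analytic gradient at w = 1.0
def cost_at(w_val, xs, ys):
    return sum((x * w_val - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def analytic_grad(w_val, xs, ys):
    return sum(2 * x * (x * w_val - y) for x, y in zip(xs, ys)) / len(xs)

eps = 1e-6
numeric = (cost_at(1.0 + eps, x_data, y_data)
           - cost_at(1.0 - eps, x_data, y_data)) / (2 * eps)
print(numeric, analytic_grad(1.0, x_data, y_data))  # both ≈ -9.33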
2. Stochastic Gradient Descent (SGD)
Differences between the two:
1. GD computes the gradient over all training data and averages it; the per-sample gradient terms do not depend on each other, so within one update they could be computed in parallel.
2. SGD computes a gradient and updates w for every single training sample; each gradient depends on the w left behind by the previous update, so the updates are inherently sequential. (A note on sample order follows below.)
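One detail before the code: the loop below sweeps the three samples in the same fixed order every epoch, while textbook SGD visits them in random order. A minimal self-contained sketch of the random-order variant (the shuffling is my addition, not from the lecture):

import random

samples = list(zip([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
w = 1.0
for epoch in range(100):
    random.shuffle(samples)              # visit samples in random order: the "stochastic" part
    for x, y in samples:
        w -= 0.01 * 2 * x * (x * w - y)  # one update per sample
print(w)  # also converges to ≈ 2.0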
# Stochastic Gradient Descent (SGD)
import matplotlib.pyplot as plt

# Training data
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

# Initial guess for the weight
w = 1.0

# Prediction (forward pass): y_hat = x * w
def forward(x):
    return x * w

# Loss for a single training sample
def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) ** 2

# Gradient of the single-sample loss w.r.t. w
def gradient(x, y):
    grad = 2 * x * (x * w - y)
    return grad

print('Predict (before training)', 4, forward(4))
epoch_list = []
loss_list = []

# SGD iterations: one update per training sample
for epoch in range(100):
    for x, y in zip(x_data, y_data):
        grad_val = gradient(x, y)
        w -= 0.01 * grad_val  # 0.01 is the learning rate, chosen by hand
        print('\tgrad:', x, y, grad_val)
        l = loss(x, y)        # loss of this sample after the update
    epoch_list.append(epoch)
    loss_list.append(l)       # record the last sample's loss this epoch
    print('progress:', epoch, 'w=', w, 'loss=', l)
print('Predict (after training)', 4, forward(4))

# Plot loss vs. epoch
plt.plot(epoch_list, loss_list)
plt.xlabel('epoch')
plt.ylabel('loss')
plt.show()
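For reference, the same SGD loop can be written with PyTorch's autograd instead of the hand-derived gradient. This is a minimal sketch of my own, assuming torch is installed; it anticipates how later videos in the series rewrite the example, so treat it as a preview rather than this lecture's code:

import torch

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]
w = torch.tensor([1.0], requires_grad=True)  # weight as a tensor that tracks gradients

for epoch in range(100):
    for x, y in zip(x_data, y_data):
        l = (x * w - y) ** 2      # single-sample loss, builds the autograd graph
        l.backward()              # fills w.grad with dl/dw
        with torch.no_grad():
            w -= 0.01 * w.grad    # same update rule as the manual version
        w.grad.zero_()            # clear the gradient before the next sample
print('w after training:', w.item())  # ≈ 2.0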