Teacher Bobo's Machine Learning Notes, Lesson 6: Gradient Descent

Mind map notes

Math background link:

Practice code

# -*- coding: utf-8 -*-

import numpy as np
import matplotlib.pyplot as plt

plot_x = np.linspace(-1, 6, 141)
plot_y = (plot_x - 2.5) ** 2 - 1


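def J(theta):
    """
    Loss function (the original function being minimized).
    :param theta: current parameter value
    :return: loss at theta
    """
    return ((theta - 2.5) ** 2) - 1

def DJ(theta):
    """
    Derivative of the loss function J.
    :param theta: current parameter value
    :return: slope of J at theta
    """
    return 2 * (theta - 2.5)


def gradient_descent(theta=0, eta=0.1, max_iters=1000):
    theta_list = [theta]
    j_plot = [J(theta)]
    # Method 1: cap the loop at max_iters iterations
    for _ in range(max_iters):
        # Step against the gradient, scaled by the step size eta
        delta = eta * (- DJ(theta))
        theta = theta + delta
        theta_list.append(theta)
        j_plot.append(J(theta))

        # Method 2: stop once the update is smaller than 1e-15.
        # abs() is required because delta is negative after an overshoot.
        if abs(delta) < 1e-15:
            break

    return theta_list, j_plot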

def draw_graph():
    fit_theta_list, fit_j_plot = gradient_descent(theta=0, eta=0.1)
    large_list, large_j_plot = gradient_descent(theta=0, eta=0.8)
    small_eta_list, small_j_plot = gradient_descent(theta=0, eta=0.05)
    plt.plot(plot_x, plot_y, label='y=(x-2.5)**2 - 1')
    plt.plot(fit_theta_list, fit_j_plot, color='red', marker='+', label='eta=0.1, min=%s' % (fit_j_plot[-1]))
    plt.plot(small_eta_list, small_j_plot, color='blue', marker='*', label='eta=0.05, min=%s' % (small_j_plot[-1]))
    plt.plot(large_list, large_j_plot, color='green', marker='.', label='eta=0.8, min=%s' % (large_j_plot[-1]))
    plt.legend()
    plt.show()


if __name__ == '__main__':
    draw_graph()

Output:

Coding summary:

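1. eta is the step size (learning rate). In the plot, a larger eta converges faster, but bigger is not always better: when eta is too large, each update overshoots the minimum and the search can fail to find the lowest point. A common starting value is eta = 0.1.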

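2. There are at least two ways to end the loop. First, cap the number of iterations, for example at 1000; as long as eta is set sensibly, the optimum is usually found within 1000 iterations. Second, compare the current theta with the previous one (either the step between them, as the code above does, or the difference in their loss values) and stop once that difference drops below a tiny threshold such as 1e-15.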

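3. Note the difference between the directional derivative and the gradient: the directional derivative is the dot product of the gradient with a unit vector in the chosen direction.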
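
To make point 1 concrete, here is a minimal sketch that counts how many updates the same parabola needs for several step sizes. The helper name `count_steps`, the tolerance `1e-8`, the divergence limit `1e10` and the cap of 10000 iterations are assumptions chosen for illustration; they are not part of the original exercise.

```python
def dj(theta):
    """Derivative of J(theta) = (theta - 2.5) ** 2 - 1."""
    return 2 * (theta - 2.5)


def count_steps(eta, theta=0.0, epsilon=1e-8, max_iters=10000):
    """Run gradient descent and return (steps_taken, final_theta).

    Stops when the update is smaller than epsilon, when theta has
    clearly diverged, or after max_iters steps.
    """
    for step in range(1, max_iters + 1):
        delta = -eta * dj(theta)
        theta += delta
        if abs(delta) < epsilon:   # converged
            return step, theta
        if abs(theta) > 1e10:      # diverged (assumed limit)
            return step, theta
    return max_iters, theta


for eta in (0.05, 0.1, 0.8, 1.1):
    steps, theta = count_steps(eta)
    print("eta=%-4s steps=%-5d theta=%s" % (eta, steps, theta))
```

For this particular function each update multiplies the distance to the minimum by (1 - 2 * eta), so the iteration converges only for 0 < eta < 1. That is why eta = 0.8 still reaches theta = 2.5 (while jumping from side to side) and eta = 1.1 moves further away on every step.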
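
The two stopping rules from point 2 can also be combined in one loop. The following is a sketch under stated assumptions: the function name `descend`, the parameters `n_iters` and `epsilon`, and the looser threshold `1e-8` (applied to the change in loss rather than to the step) are illustrative choices, not the values used in the exercise.

```python
def J(theta):
    """Loss function: a parabola with its minimum at theta = 2.5."""
    return (theta - 2.5) ** 2 - 1


def dJ(theta):
    """Derivative of J."""
    return 2 * (theta - 2.5)


def descend(theta=0.0, eta=0.1, n_iters=1000, epsilon=1e-8):
    """Gradient descent with both stopping rules.

    Returns (theta, iterations_used, converged).
    """
    for i in range(1, n_iters + 1):      # rule 1: hard cap on iterations
        last_theta = theta
        theta = theta - eta * dJ(theta)
        # rule 2: stop when the loss barely changes between two steps
        if abs(J(theta) - J(last_theta)) < epsilon:
            return theta, i, True
    return theta, n_iters, False


theta, used, converged = descend()
print(theta, used, converged)
```

Returning the `converged` flag lets the caller tell a real convergence apart from a run that simply hit the iteration cap, which is useful when eta was chosen badly.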
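
The relation in point 3 can be checked numerically. In this sketch the function f(x, y) = x**2 + 3*y**2, the point (1, 2) and the direction (3, 4) are arbitrary examples picked for illustration; the directional derivative computed as gradient times unit vector is compared with a finite-difference estimate along the same direction.

```python
import numpy as np


def f(p):
    """Example function f(x, y) = x**2 + 3*y**2 (chosen for illustration)."""
    x, y = p
    return x ** 2 + 3 * y ** 2


def grad_f(p):
    """Analytic gradient of f."""
    x, y = p
    return np.array([2 * x, 6 * y])


point = np.array([1.0, 2.0])
direction = np.array([3.0, 4.0])
unit = direction / np.linalg.norm(direction)   # normalize to a unit vector

# Directional derivative from the formula: gradient dot unit vector
by_formula = grad_f(point) @ unit

# The same quantity estimated by a central finite difference along `unit`
h = 1e-6
by_difference = (f(point + h * unit) - f(point - h * unit)) / (2 * h)

print(by_formula, by_difference)
```

Both numbers come out at about 10.8. The gradient itself is a vector, here (2, 12), while the directional derivative is a single number that depends on the direction chosen; it is largest when the unit vector points along the gradient.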

