Fitting the parameters of univariate linear regression with gradient descent

Copyright: reprinted with attribution from https://blog.csdn.net/hpu2022/article/details/90409289

Recommended blogs:

https://blog.csdn.net/winone361/article/details/88786513

https://blog.csdn.net/wyl1813240346/article/details/78366390

https://blog.csdn.net/winone361/article/details/88787256

Gradient descent, also known as the method of steepest descent, was proposed in 1847 by the mathematician Augustin-Louis Cauchy.

Written with vectors, the derivation goes like this:

1. Assume the hypothesis (fit) function:

h(x) = θ0·x0 + θ1·x1 + ... + θn·xn (with x0 = 1)

In vector form: h(x) = θ^T · x

2. Construct the loss function:

J(θ) = 1/(2m) · Σ_{i=1..m} (h(x_i) − y_i)^2

The loss is the squared difference between the predicted value and the true value, summed over all m samples (a short code sketch of this computation follows the list). The factor 1/2 is there only to simplify the derivative.

3. Minimize the loss function, so that the fitted function approximates the target values y as closely as possible.
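As a concrete illustration of step 2 (a minimal sketch of my own, with a hypothetical helper name, not part of the original post), the loss can be computed directly from its definition:

def loss(theta0, theta1, xs, ys):
    # J(theta) = 1/(2m) * sum over i of (h(x_i) - y_i)^2, with h(x) = theta0 + theta1*x
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)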

To minimize the loss function we use the gradient descent algorithm. First choose an initial vector θ; then iterate: on each step, subtract from every element of θ the step size (learning rate) multiplied by the partial derivative of the loss function with respect to that element. After each update, check whether the change in every element is smaller than a preset threshold (a small number close to zero); if so, stop.

Example:

Suppose the function is h(x) = θ0 + θ1·x; initialize the vector θ to the zero vector, then iterate until the stopping condition is satisfied.
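Taking the partial derivatives of J(θ) for this hypothesis gives the standard batch update rules (these are exactly what the code below accumulates as sum0 and sum1):

∂J/∂θ0 = (1/m) · Σ (θ0 + θ1·x_i − y_i)
∂J/∂θ1 = (1/m) · Σ (θ0 + θ1·x_i − y_i) · x_i

θ0 := θ0 − α · ∂J/∂θ0
θ1 := θ1 − α · ∂J/∂θ1

where α is the learning rate (step size).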

BGD code:

import numpy as np
from matplotlib import pyplot

theta = []  # stores the intermediate (theta0, theta1) results
area = [150, 200, 250, 300, 350, 400, 600]  # data
price = [6450, 7450, 8450, 9450, 11450, 15450, 18450]

def BGDSolve():  # batch gradient descent
    alpha = 0.00000001  # step size (learning rate)
    kec = 0.00001       # termination threshold
    theta0 = 7          # initial values
    theta1 = 7
    m = len(area)       # number of samples
    theta.append((theta0, theta1))
    while True:
        sum0 = sum1 = 0
        # accumulate the derivative of the loss function over all samples
        for i in range(m):
            sum0 = sum0 + theta0 + theta1 * area[i] - price[i]
            sum1 = sum1 + (theta0 + theta1 * area[i] - price[i]) * area[i]
        theta0 = theta0 - sum0 / m * alpha  # per the formula: alpha/m * sum0
        theta1 = theta1 - sum1 / m * alpha
        print(theta0, theta1)
        theta.append((theta0, theta1))  # save the iteration result
        # stop iterating when both updates are smaller than kec
        if abs(sum0 / m * alpha) < kec and abs(sum1 / m * alpha) < kec:
            return theta0, theta1

def Plot(theta0, theta1):  # plotting function
    pyplot.scatter(area, price)
    x = np.arange(100, 700, 100)
    y = theta0 + theta1 * x
    pyplot.plot(x, y)
    pyplot.xlabel('area')
    pyplot.ylabel('price')
    pyplot.show()

if __name__ == '__main__':
    theta0, theta1 = BGDSolve()  # solve once, then pass the result to Plot
    Plot(theta0, theta1)
    print(len(theta))

Figure: BGD fitting result (scatter of the data with the fitted line).
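As an aside (a sketch of my own, not from the original post; grad_step is a hypothetical helper), the per-sample loop in BGDSolve can also be written with NumPy vector operations. One call performs a batch update equivalent to the sum0/sum1 loop:

import numpy as np

area = np.array([150, 200, 250, 300, 350, 400, 600], dtype=float)
price = np.array([6450, 7450, 8450, 9450, 11450, 15450, 18450], dtype=float)

def grad_step(theta0, theta1, alpha=0.00000001):
    # one batch gradient-descent step, vectorized
    err = theta0 + theta1 * area - price           # residuals h(x_i) - y_i
    return (theta0 - alpha * err.mean(),           # theta0 - alpha/m * sum0
            theta1 - alpha * (err * area).mean())  # theta1 - alpha/m * sum1

theta0 = theta1 = 7.0
for _ in range(100000):
    theta0, theta1 = grad_step(theta0, theta1)
print(theta0, theta1)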

SGD code:

import numpy as np
from matplotlib import pyplot
import random

theta = []  # stores the intermediate (theta0, theta1) results
area = [150, 200, 250, 300, 350, 400, 600]  # data
price = [6450, 7450, 8450, 9450, 11450, 15450, 18450]

def SGDSolve():  # stochastic gradient descent
    alpha = 0.00000001  # step size (learning rate)
    kec = 0.00001       # termination threshold
    theta0 = 7          # initial values
    theta1 = 7
    m = len(area)       # number of samples
    theta.append((theta0, theta1))
    while True:
        # stochastic gradient descent: use a single random sample per step
        i = random.randint(0, m - 1)
        sum0 = theta0 + theta1 * area[i] - price[i]
        sum1 = (theta0 + theta1 * area[i] - price[i]) * area[i]

        theta0 = theta0 - sum0 * alpha
        theta1 = theta1 - sum1 * alpha
        theta.append((theta0, theta1))  # save the iteration result
        # stop iterating when both updates are smaller than kec
        if abs(sum0 * alpha) < kec and abs(sum1 * alpha) < kec:
            return theta0, theta1


def Plot(theta0, theta1):  # plotting function
    pyplot.scatter(area, price)
    x = np.arange(100, 700, 100)
    y = theta0 + theta1 * x
    pyplot.plot(x, y)
    pyplot.xlabel('area')
    pyplot.ylabel('price')
    pyplot.show()

if __name__ == '__main__':
    theta0, theta1 = SGDSolve()  # solve once, then pass the result to Plot
    Plot(theta0, theta1)
    print(len(theta))
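Both solvers record every intermediate (theta0, theta1) pair in the module-level theta list, so the convergence path can be inspected afterwards. A small sketch of my own, meant to be appended to the SGD script above (it assumes SGDSolve and theta are already defined there):

from matplotlib import pyplot

theta0, theta1 = SGDSolve()          # populates the module-level theta list
t1_history = [t[1] for t in theta]   # trajectory of theta1 across iterations
pyplot.plot(range(len(t1_history)), t1_history)
pyplot.xlabel('iteration')
pyplot.ylabel('theta1')
pyplot.show()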

Figure: SGD fitting result (scatter of the data with the fitted line).
