Univariate linear regression learning record

1. Regression

        In a regression model, the variable we want to predict is called the dependent variable (for example, product quality), and the variable chosen to explain changes in the dependent variable is called the independent variable (for example, user satisfaction). The purpose of regression is to build a regression equation that predicts the target value; the whole regression-solving process amounts to finding the regression coefficients of that equation.

        Given a set of points, we construct a function that fits them so that the error between the points and the fitted function is as small as possible. If the fitted curve is a straight line, this is called linear regression; if the curve is a cubic curve, it is called cubic polynomial regression.

2. Linear regression

 Suppose we have a data set containing the cost and profit figures of an enterprise. The data from 2002 to 2016 form the training set, which contains 15 sample data points. We focus on the two variables cost and profit: cost is the input variable (a feature), and profit is the output variable (the target variable).

To build the model, let x denote the cost of the enterprise, y the profit, and h (the hypothesis) the function that maps the input variable x to the output variable y. The linear regression formula with a single independent variable (univariate linear regression) is as follows:

$$h_\theta(x) = \theta_0 + \theta_1 x$$

        So the problem to solve now is how to find the two parameters θ_0 and θ_1. The idea is to choose the parameters so that h_θ(x) is as close as possible to the true y values of the training set (x, y). For this we use the squared error function (least squares method). In a regression equation, minimizing the sum of squared errors is the standard way to find the regression coefficients of the features. The error is the difference between the predicted y value and the true y value; simply summing the errors would let positive and negative differences cancel each other out, so the squared error (least squares) is used instead, as follows:

$$\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$
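For example, if one prediction is 3 above its true value and another is 3 below, the plain sum of errors is 3 + (−3) = 0, which wrongly suggests a perfect fit, while the squared errors sum to 9 + 9 = 18 and correctly register the misfit.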

Mathematically, the problem becomes finding the values of θ_0 and θ_1 that minimize the formula above. The most common solution method is gradient descent. Based on the squared error, the loss function of the linear regression model is defined as follows:

$$J(\theta_0, \theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$

The fitting process uses gradient descent to find parameters that minimize J(θ_0, θ_1). Based on the example above, we can define the linear regression model as follows: given the sample coordinates x and y, estimate the function h and find the approximate functional relationship between the variables. In the general case with n features, the formula is:

$$h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$$

 Here n is the number of features and x_i denotes the i-th feature value of a training sample. When there is only one independent variable x, this is univariate linear regression, as in the formula h_θ(x) = θ_0 + θ_1 x above; when there are several independent variables, it becomes multiple linear regression. Our goal is to minimize J(θ) so that the model best fits the sample data set and predicts new data as well as possible.

Notes on the symbols in the formula:

  • m is the number of points in the data set
  • the factor ½ is a constant included so that, when the gradient is computed, the 2 produced by differentiating the square cancels it; this leaves no extra constant coefficient, simplifies later calculations, and does not affect the location of the minimum
  • y is the true y-coordinate of each point in the data set
  • h is our prediction function: for each input x, it computes the predicted y value from the parameters Θ

The cost function has two variables, θ_0 and θ_1, so this is a multi-variable gradient descent problem: to obtain the gradient of the cost function, we take the partial derivative with respect to each of the two variables separately.
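Differentiating the cost function J(θ_0, θ_1) defined above with respect to each parameter gives the following (note how the ½ cancels the 2 from the square, as mentioned in the list above), and the gradient descent step then subtracts α times each partial derivative:

$$\frac{\partial J}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right), \qquad \frac{\partial J}{\partial \theta_1} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x^{(i)}$$

$$\theta_j := \theta_j - \alpha\,\frac{\partial J}{\partial \theta_j}, \qquad j = 0, 1 \quad (\alpha \text{ is the learning rate})$$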

What I found hard to understand:

        To make the code easier to write, we convert all the formulas to matrix form. We have two parameters. To express the formula as a matrix product, we add one extra dimension to every point x whose value is fixed at 1; this extra dimension is the one multiplied by Θ_0. In this way all the calculations can be handled uniformly as matrix operations.

       We can then rewrite the cost function and the gradient as matrix-vector multiplications:
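In matrix form (this is exactly what the code in section 3 below implements), with X the m×2 matrix whose first column is all 1s and whose second column holds the x values, y the m×1 vector of targets, and θ = (θ_0, θ_1)^T:

$$J(\theta) = \frac{1}{2m}\,(X\theta - y)^{T}(X\theta - y)$$

$$\nabla J(\theta) = \frac{1}{m}\,X^{T}(X\theta - y)$$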

3. Univariate linear regression example

from numpy import *

# Size of the data set, i.e. 20 data points
m = 20
# The x coordinates and the corresponding design matrix
X0 = ones((m, 1))  # an m-by-1 column vector of ones, i.e. x0
X1 = arange(1, m+1).reshape(m, 1)  # an m-by-1 column vector, i.e. x1, with values 1 to m
X = hstack((X0, X1))  # stack the columns to form the sample data matrix
# The corresponding y coordinates
Y = array([
    3, 4, 5, 5, 2, 4, 7, 8, 11, 8, 12,
    11, 13, 13, 16, 17, 18, 17, 19, 21
]).reshape(m, 1)
# Learning rate
alpha = 0.01


# Define the cost function
def cost_function(theta, X, Y):
    diff = dot(X, theta) - Y  # dot() is needed so the arrays are multiplied like matrices
    return (1/(2*m)) * dot(diff.transpose(), diff)


# Define the gradient function corresponding to the cost function
def gradient_function(theta, X, Y):
    diff = dot(X, theta) - Y
    return (1/m) * dot(X.transpose(), diff)


# Gradient descent iteration
def gradient_descent(X, Y, alpha):
    theta = array([1, 1]).reshape(2, 1)  # initial guess for (theta0, theta1)
    gradient = gradient_function(theta, X, Y)
    while not all(abs(gradient) <= 1e-5):  # stop once every gradient component is tiny
        theta = theta - alpha * gradient
        gradient = gradient_function(theta, X, Y)
    return theta


optimal = gradient_descent(X, Y, alpha)
print('optimal:', optimal)
print('cost function:', cost_function(optimal, X, Y)[0][0])


# Plot the data points and the fitted line
def plot(X, Y, theta):
    import matplotlib.pyplot as plt
    ax = plt.subplot(111)  # this line is my own change
    ax.scatter(X, Y, s=30, c="red", marker="s")
    plt.xlabel("X")
    plt.ylabel("Y")
    x = arange(0, 21, 0.2)  # range of x values for the fitted line
    y = theta[0] + theta[1]*x
    ax.plot(x, y)
    plt.show()


plot(X1, Y, optimal)
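As a small addition (not in the original post), the learned parameters can also be used to predict the profit for a new cost value, which is the "predict new data" goal mentioned earlier. The snippet below is a sketch that simply continues the script above and reuses the optimal vector it computed:

# Predict y for a new input x = 21 (outside the training range 1..20).
# The leading 1 matches the bias column x0 in the design matrix.
new_point = array([1, 21]).reshape(1, 2)
prediction = dot(new_point, optimal)
print('prediction for x = 21:', prediction[0][0])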

 

 The fitted straight line:

 The related library function for regression models, LinearRegression:

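A sketch of how the same model could be fitted with scikit-learn's LinearRegression (assuming that is the library class referred to above), reusing the X1 and Y arrays from section 3:

from sklearn.linear_model import LinearRegression

# Fit the same univariate model; X1 is the m-by-1 column of x values, Y the targets.
reg = LinearRegression()  # fit_intercept=True by default, so theta0 is learned automatically
reg.fit(X1, Y)
print('intercept (theta0):', reg.intercept_)
print('slope (theta1):', reg.coef_)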

 Learning articles:

Explanation of multiple regression examples: http://t.csdn.cn/6grAp

Linear regression basics: Gradient Descent Method and Its Implementation, Explained Simply – Jianshu (jianshu.com)

Origin blog.csdn.net/weixin_52093896/article/details/129990143