Principle and Implementation of Linear Regression Algorithm

We have previously covered several machine learning algorithms, all of which are used for classification. For a change of pace, let's learn how to perform regression. Regression predicts a continuous value for new data based on existing data, such as forecasting product sales.


Let's start with the simplest case: standard linear regression. Many other linear regression algorithms can be built on top of it, such as locally weighted linear regression based on kernel functions, lasso, and so on; refer to the relevant literature if you want to dig deeper. Here we focus on the most basic principles of linear regression.


From linear algebra, we can write a system of linear equations Xw = y. In a linear regression problem, X is our sample data matrix and y is our vector of target values; both are known. The problem we need to solve is to find the most suitable vector w so that the linear system fits the distribution of the sample points as closely as possible. We can then use the resulting w to make predictions for new data points.
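To make this concrete, here is a tiny illustrative sketch (the numbers are made up purely for the example) of what X, y, and w look like, using NumPy's built-in least-squares solver as a reference:

import numpy as np

# Toy data: 5 samples, 2 features (values chosen only for illustration)
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])                   # sample data matrix, shape (m, n)
y = np.array([5.0, 4.0, 11.0, 10.0, 15.0])   # target values, shape (m,)

# Find the w (shape (n,)) that makes X @ w as close to y as possible
w, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print(w)       # fitted coefficients
print(X @ w)   # predictions for the training samples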


Here we use the squared error to evaluate the difference between the actual y values and the predicted values:

    sum_i (y_i - x_i^T w)^2

Written in matrix form, this is:

    (y - Xw)^T (y - Xw)

We want to minimize this squared error. Using basic calculus, we differentiate the expression above with respect to w:

    d/dw [(y - Xw)^T (y - Xw)] = -2 X^T (y - Xw)

Setting this derivative equal to 0 and solving for w gives:

    w_hat = (X^T X)^{-1} X^T y

Everything on the right-hand side is known data, so this formula gives the best estimate of w.
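As a quick sanity check of this closed-form solution (a minimal sketch, not taken from the original code), we can evaluate the normal equation directly with NumPy on a toy data set:

import numpy as np

# Normal equation: w_hat = (X^T X)^{-1} X^T y
# Toy data roughly following y = 1 + 2x; the first column of X is the intercept term
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([3.0, 5.1, 6.9, 9.0])

w_hat = np.linalg.inv(X.T @ X) @ X.T @ y
print(w_hat)   # close to [1, 2]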


As you can see, the principle of linear regression is fairly simple, but if you look closely at the formula above you will notice that it requires inverting a matrix. What if that matrix is not invertible? There are two ways around this. The first borrows the idea of ridge regression and introduces a parameter lambda:

    w_hat = (X^T X + lambda * I)^{-1} X^T y

By the theory of linear algebra, as long as lambda > 0 the matrix X^T X + lambda * I is guaranteed to be invertible. Take a look at the code below:

import numpy as np

def linear_regression(x_arr, y_arr, lam=0.2):
    # Arrange the samples as column vectors: x_mat is (m, 1), y_mat is (m, 1)
    x_mat = np.mat(x_arr).T
    y_mat = np.mat(y_arr).T

    # Ridge solution: (X^T X + lambda * I)^{-1} X^T y
    x_tx = x_mat.T * x_mat
    denom = x_tx + np.eye(np.shape(x_mat)[1]) * lam

    # The denominator can only be singular when lam == 0.0
    if np.linalg.det(denom) == 0.0:
        print('This matrix is singular, cannot do inverse')
        return

    ws = denom.I * (x_mat.T * y_mat)
    return ws

Now take a look at a demo that calls it:

if __name__ == '__main__':
    # y = x plus standard normal noise
    x_vals = np.linspace(0, 1, 1000)
    y_vals = x_vals + np.random.normal(0, 1, 1000)
    ws = linear_regression(x_vals, y_vals)

    # Predict y for x = 20
    predict = 20 * ws
    print(predict.A[0][0])

We constructed x and y so that they roughly satisfy y = x, with standard normal noise added. After solving for ws, we predict the value of y for x = 20. Here is the result of one run; the prediction looks reasonable:

19.690649736617942
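As an optional sanity check (not part of the original article), the fitted coefficient can be compared with the closed-form least-squares slope for a line through the origin, reusing x_vals, y_vals, and ws from the demo above:

# Closed-form slope for y = w * x (no intercept): w = (x . y) / (x . x)
ols_w = (x_vals @ y_vals) / (x_vals @ x_vals)
print(ols_w, ws[0, 0])   # the two estimates should be close, both near 1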


The other idea is to use gradient descent directly to minimize the squared error; here we implement it with TensorFlow. First, import the required packages and prepare the data:

import numpy as np
import tensorflow as tf


learning_rate = 0.05
batch_size = 50

# y = x plus standard normal noise, reshaped to column vectors to match the [None, 1] placeholders
x_vals = np.linspace(0, 1, 1000)
y_vals = x_vals + np.random.normal(0, 1, 1000)
x_vals.resize((x_vals.shape[0], 1))
y_vals.resize((y_vals.shape[0], 1))

Then construct the model. x_data and y_target are placeholders that are fed in during training, and w is the variable whose value we want the training to find:

sess = tf.Session()
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
w = tf.Variable(tf.random_normal([1, 1]))
model_output = tf.matmul(x_data, w)

Define the mean squared error loss, then use gradient descent to minimize it:

loss = tf.reduce_mean(tf.square(y_target - model_output))
my_opt = tf.train.GradientDescentOptimizer(learning_rate)
train_step = my_opt.minimize(loss)

# Initialize the variables (just w here) before training
init = tf.global_variables_initializer()
sess.run(init)

Finally, train with stochastic (mini-batch) gradient descent and print a test prediction:

for i in range(500):
    # Sample a random mini-batch of size batch_size
    rand_index = np.random.choice(len(x_vals), size=batch_size)
    rand_x = x_vals[rand_index]
    rand_y = y_vals[rand_index]
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})

# Extract the learned coefficient and predict y for x = 20
[k] = sess.run(w)
predict = 20 * k
print(predict[0])

One run result:

19.260855
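For reference, here is a minimal plain-NumPy sketch of the same idea (not from the original article): batch gradient descent on the mean squared error for the same kind of data, without any framework:

import numpy as np

# Illustrative gradient-descent sketch on the mean squared error
x_vals = np.linspace(0, 1, 1000)
y_vals = x_vals + np.random.normal(0, 1, 1000)

w = 0.0
learning_rate = 0.05
for _ in range(500):
    # d/dw of mean((y - w*x)^2) is -2 * mean(x * (y - w*x))
    grad = -2 * np.mean(x_vals * (y_vals - w * x_vals))
    w -= learning_rate * grad

print(20 * w)   # prediction for x = 20, should come out near 20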


So far, we have implemented linear regression in two ways. The first is relatively direct and implements the closed-form algorithm itself; the second uses TensorFlow to find the optimal value via gradient descent.


