Follow AI to learn AI (1): Linear regression model

Question: Please use NumPy to implement a linear regression model

Answer:

import numpy as np

class LinearRegression:
    def __init__(self, lr=0.01, n_iters=1000):
        # Initialize the model with a learning rate and number of iterations
        self.lr = lr
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        # Fit the model to X and y, updating the weights and bias by gradient descent
        n_samples, n_features = X.shape

        self.weights = np.zeros(n_features)
        self.bias = 0

        for _ in range(self.n_iters):
            y_predicted = np.dot(X, self.weights) + self.bias

            # Compute the gradients of the loss with respect to the weights and bias
            dw = (1 / n_samples) * np.dot(X.T, (y_predicted - y))
            db = (1 / n_samples) * np.sum(y_predicted - y)

            # Update the weights and bias (one step per iteration)
            self.weights -= self.lr * dw
            self.bias -= self.lr * db

    def predict(self, X):
        # Return the model's predictions for the rows of X
        y_predicted = np.dot(X, self.weights) + self.bias
        return y_predicted
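As a quick sanity check, here is an illustrative usage example (not part of the original post) that fits the model to points drawn from y = 2x + 1:

# Illustrative usage example: recover y = 2x + 1 from four points
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression(lr=0.05, n_iters=1000)
model.fit(X, y)
print(model.weights, model.bias)         # should be close to [2.0] and 1.0
print(model.predict(np.array([[5.0]])))  # should be close to [11.0]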

Question: Please use unittest to generate test cases

Answer:

import unittest
import numpy as np

# The LinearRegression class defined above is assumed to be available here,
# e.g. defined in the same file or imported from your own module.

class TestLinearRegression(unittest.TestCase):
    def test_fit(self):
        # Here y = x1 + x2 + x3, so gradient descent from a zero initialization
        # should drive the weights toward [1, 1, 1] and the bias toward 0
        X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
        y = np.array([6, 15, 24])
        lr = LinearRegression()
        lr.fit(X, y)
        np.testing.assert_allclose(lr.weights, [1.0, 1.0, 1.0], atol=1e-2)
        self.assertAlmostEqual(lr.bias, 0.0, places=2)

    def test_predict(self):
        X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
        y = np.array([6, 15, 24])
        lr = LinearRegression()
        lr.fit(X, y)
        y_pred = lr.predict(X)
        # gradient descent is approximate, so compare with a tolerance
        np.testing.assert_allclose(y_pred, y, atol=1e-2)

if __name__ == '__main__':
    unittest.main()

Question: Please describe the gradient descent of the linear regression model

Answer:
Gradient descent is an optimization algorithm used to minimize a loss function. In linear regression, we use gradient descent to find the best-fit line that minimizes the squared error. Specifically, we compute the partial derivative of the loss function with respect to each parameter, multiply it by the learning rate, and subtract the result from the current parameter value. This process is repeated until the loss function converges or a predetermined number of iterations is reached.
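In symbols, with learning rate α and loss J, each iteration applies the updates below, which correspond exactly to the self.weights -= self.lr * dw and self.bias -= self.lr * db lines in fit() above:

w := w - α * ∂J/∂w
b := b - α * ∂J/∂b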

Question: Please describe the mathematical derivation of the linear regression model

Answer:
In linear regression, our goal is to find a straight line that fits the data as closely as possible, where the fit is measured by the squared error between the predicted and true values. Therefore, our goal is to minimize the squared error. Suppose our model is y = wx + b, where w is the slope and b is the intercept. We can use gradient descent to find the w and b that minimize the squared error: we compute the partial derivatives of the loss function with respect to w and b, multiply them by the learning rate, and subtract the results from the current values of w and b. This process is repeated until the loss function converges or a predetermined number of iterations is reached. Eventually, we obtain the best-fit straight line that minimizes the squared error.
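To make a single gradient step concrete (an illustrative numeric example, not from the original post), take one training point (x, y) = (2, 5), start from w = 0, b = 0, and use learning rate 0.1. With m = 1:

y_pred = w * x + b = 0
dw = (y_pred - y) * x = (0 - 5) * 2 = -10
db = (y_pred - y) = -5
w := w - 0.1 * dw = 1.0
b := b - 0.1 * db = 0.5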

Question: What is the loss function of the linear regression model?

Answer:
The squared error loss function: the mean of the squared differences between the predicted and true values, conventionally scaled by 1/2 to simplify the derivative (see the next answer for the formula).

Question: Please write the loss function of the linear regression model

Answer:

J(w, b) = 1/(2m) * sum((y_pred - y)^2)

where m is the number of training samples and the sum runs over all samples.
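In NumPy this loss is a one-liner (a minimal sketch, assuming y_pred and y are 1-D arrays of equal length):

import numpy as np

def mse_loss(y_pred, y):
    # J = 1/(2m) * sum((y_pred - y)^2)
    m = len(y)
    return np.sum((y_pred - y) ** 2) / (2 * m)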

Question: Please give the partial derivatives of the loss function of the linear regression model

Answer:

∂J(w,b)/∂w = 1/m * sum((y_pred - y) * x)
∂J(w,b)/∂b = 1/m * sum(y_pred - y)
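Vectorized with NumPy, these are exactly the dw and db computed inside fit() above (a sketch assuming X has shape (m, n_features) and y, y_pred have shape (m,)):

import numpy as np

def gradients(X, y, y_pred):
    # dw[j] = 1/m * sum_i (y_pred_i - y_i) * X[i, j]
    # db    = 1/m * sum_i (y_pred_i - y_i)
    m = X.shape[0]
    dw = X.T @ (y_pred - y) / m
    db = np.sum(y_pred - y) / m
    return dw, db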

Question: Please use the chain rule to derive the partial derivatives of the loss function of the linear regression model

Answer:
We can use the chain rule to derive the partial derivatives of the loss function of the linear regression model. First, we compute the partial derivative of the loss with respect to each individual prediction y_pred_i:

∂J/∂y_pred_i = 1/m * (y_pred_i - y_i)

Then, we compute the partial derivatives of each prediction y_pred_i = w * x_i + b with respect to w and b:

∂y_pred_i/∂w = x_i
∂y_pred_i/∂b = 1

Next, applying the chain rule and summing over all samples gives the partial derivatives of the loss with respect to w and b:

∂J/∂w = sum_i (∂J/∂y_pred_i * ∂y_pred_i/∂w) = 1/m * sum((y_pred - y) * x)
∂J/∂b = sum_i (∂J/∂y_pred_i * ∂y_pred_i/∂b) = 1/m * sum(y_pred - y)

This gives us the formulas for the partial derivatives of the loss function of the linear regression model. Using these formulas, we can implement the gradient descent algorithm and find the best-fit line that minimizes the squared error.
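A quick way to confirm these formulas (an illustrative check, not part of the original post) is to compare the analytic gradients against numerical central differences:

import numpy as np

def loss(w, b, X, y):
    # J(w, b) = 1/(2m) * sum((X @ w + b - y)^2)
    m = len(y)
    return np.sum((X @ w + b - y) ** 2) / (2 * m)

# a small random problem
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = rng.normal(size=5)
w = rng.normal(size=3)
b = 0.5
eps = 1e-6
m = len(y)

# analytic gradients from the formulas above
y_pred = X @ w + b
dw = X.T @ (y_pred - y) / m
db = np.sum(y_pred - y) / m

# numerical gradient for each weight
for j in range(3):
    w_plus, w_minus = w.copy(), w.copy()
    w_plus[j] += eps
    w_minus[j] -= eps
    num_dw = (loss(w_plus, b, X, y) - loss(w_minus, b, X, y)) / (2 * eps)
    assert abs(num_dw - dw[j]) < 1e-6

# numerical gradient for the bias
num_db = (loss(w, b + eps, X, y) - loss(w, b - eps, X, y)) / (2 * eps)
assert abs(num_db - db) < 1e-6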

–end–

Origin blog.csdn.net/huanhuilong/article/details/129718706