Introduction to Simple Linear and Logistic Regression

Introduction

Linear regression and logistic regression are among the most basic models in machine learning, and they are important tools for solving many practical problems. Both are supervised learning models: linear regression is used to predict continuous numerical values, while logistic regression is used for classification problems.

In this tutorial, we will introduce the basic concepts of linear regression and logistic regression, and implement simple linear regression and logistic regression models with Python code.

Linear Regression

Basic Concepts

Linear regression is a modeling method for continuous numerical data, mainly used to model the relationship between a dependent variable and one or more independent variables. In simple (univariate) linear regression, the model assumes a linear relationship between the dependent variable and a single independent variable; more generally, a linear regression model with $n$ independent variables can be written as:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_n x_n + \epsilon$
where $y$ is the dependent variable, $x_1, x_2, ..., x_n$ are the independent variables, $\beta_0, \beta_1, \beta_2, ..., \beta_n$ are the coefficients of the model (often called weights or parameters), and $\epsilon$ is the error term, which captures the difference between the true value and the predicted value. When training the model, our goal is to learn the most suitable coefficients $\beta_0, \beta_1, \beta_2, ..., \beta_n$ from the data.

Our goal is to find a set of parameters $\beta_0, \beta_1, \beta_2, ..., \beta_n$ such that the model best predicts the value of the dependent variable. A commonly used method for estimating them is the least squares method.
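
As an illustration of the idea (not part of the scikit-learn workflow used below), here is a minimal sketch of solving the least squares problem directly with NumPy; the variable names x, y, X and beta are chosen for this example only.

import numpy as np

# Toy data that follows y = 2x + 1 exactly
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([3, 5, 7, 9, 11], dtype=float)

# Design matrix with a column of ones for the intercept beta_0
X = np.column_stack([np.ones_like(x), x])

# Solve the least squares problem min ||X beta - y||^2
# (the same solution the normal equations would give)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta)  # approximately [1. 2.], i.e. beta_0 = 1 and beta_1 = 2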

Python Implementation

We will implement our linear regression model using the LinearRegression class from the scikit-learn library.

# Import the necessary packages
from sklearn.linear_model import LinearRegression
import numpy as np

# Create the data
x = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([3, 5, 7, 9, 11])

# Create the linear regression object
lr = LinearRegression()

# Train the model
lr.fit(x, y)

# Make a prediction
x_test = np.array([6]).reshape(-1, 1)
print(lr.predict(x_test))

The output is:

[13.]
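
To see what the model actually learned, we can inspect the fitted slope and intercept; coef_ and intercept_ are standard attributes of a fitted scikit-learn LinearRegression object. Since the data above follows y = 2x + 1 exactly, the fitted line should recover those values.

from sklearn.linear_model import LinearRegression
import numpy as np

x = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([3, 5, 7, 9, 11])

lr = LinearRegression().fit(x, y)

print(lr.coef_)       # approximately [2.], the slope beta_1
print(lr.intercept_)  # approximately 1.0, the intercept beta_0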

Logistic Regression

Basic Concepts

Logistic regression is a model widely used for classification problems. Rather than predicting a continuous value directly, it passes a linear combination of the features through the sigmoid function, mapping it to a probability between 0 and 1, which can then be thresholded to assign a class label of 0 or 1. The probability form of the logistic regression model can be expressed as:
$p(y=1 \mid x) = \frac{1}{1 + e^{-(w_0 + w_1 x_1 + w_2 x_2 + ... + w_n x_n)}}$
where $p(y=1 \mid x)$ is the probability that $y = 1$ given the features $x$, $w_0, w_1, w_2, ..., w_n$ are the regression coefficients, and $x_1, x_2, ..., x_n$ are the features. Our goal is to find the best set of model parameters $w_0, w_1, w_2, ..., w_n$.

We can rewrite the expression as follows:
$\mathrm{logit}(p(y=1 \mid x)) = w_0 + w_1 x_1 + w_2 x_2 + ... + w_n x_n$
where the $\mathrm{logit}$ function is defined as:
$\mathrm{logit}(x) = \log\left(\frac{x}{1-x}\right)$
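
As a quick sanity check on the relationship above, the following small sketch (not part of the original model code) defines the sigmoid and logit functions and shows that they are inverses: the sigmoid maps a linear score to a probability, and the logit maps that probability back to the score.

import numpy as np

def sigmoid(z):
    # Map a real-valued score z to a probability in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    # Map a probability p in (0, 1) back to a real-valued score (the log-odds)
    return np.log(p / (1.0 - p))

z = 0.75           # an arbitrary linear score w_0 + w_1*x_1 + ... + w_n*x_n
p = sigmoid(z)     # p(y=1 | x), about 0.68
print(p)
print(logit(p))    # recovers 0.75 up to floating-point error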

Python Implementation

We will implement our logistic regression model using the LogisticRegression class from the scikit-learn library.

# Import the necessary packages
from sklearn.linear_model import LogisticRegression
import numpy as np

# Create the data (with a binary variable as the dependent variable)
x = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([0, 0, 1, 1, 1])

# Create the logistic regression object
lr = LogisticRegression()

# Train the model
lr.fit(x, y)

# Predict
x_test = np.array([6]).reshape(-1, 1)
print(lr.predict(x_test))

The output is:

[1]
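
Besides the hard class label, we can also ask for the estimated probability p(y=1 | x); predict_proba is a standard method of a fitted scikit-learn LogisticRegression object and returns one column per class, in the order given by lr.classes_.

from sklearn.linear_model import LogisticRegression
import numpy as np

x = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([0, 0, 1, 1, 1])

lr = LogisticRegression().fit(x, y)

x_test = np.array([6]).reshape(-1, 1)
# Columns are [p(y=0 | x), p(y=1 | x)]; the second should be well above 0.5 here
print(lr.predict_proba(x_test))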

Commonalities and Differences

Commonalities:

  • Both linear regression and logistic regression are commonly used supervised learning models;
  • Both model the relationship between the features and the target;
  • Both can be optimized with algorithms such as gradient descent;
  • Both rely on certain assumptions (the linearity assumption for linear regression and the sigmoid/log-odds assumption for logistic regression).

Differences:

  • The goal of linear regression is to predict continuous values, while the goal of logistic regression is to predict discrete categories;
  • Linear regression models the target directly as a linear function of the features, while logistic regression passes that linear combination through the sigmoid function to model a probability;
  • Linear regression is typically optimized with least squares or gradient descent, while logistic regression is typically optimized with maximum likelihood estimation or gradient descent (a small sketch of the corresponding loss functions follows this list).
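
To make the last point concrete, here is a minimal sketch of the two loss functions involved, computed with NumPy on small hand-made arrays (the values are illustrative only): mean squared error, which least squares minimizes for linear regression, and log-loss, whose minimization is equivalent to maximum likelihood estimation for logistic regression.

import numpy as np

# Linear regression: mean squared error between continuous targets and predictions
y_true_reg = np.array([3.0, 5.0, 7.0])
y_pred_reg = np.array([2.8, 5.1, 7.3])
mse = np.mean((y_true_reg - y_pred_reg) ** 2)
print(mse)

# Logistic regression: log-loss between binary labels and predicted probabilities;
# minimizing it is equivalent to maximizing the likelihood of the labels
y_true_clf = np.array([0, 1, 1])
p_pred = np.array([0.2, 0.8, 0.9])
log_loss = -np.mean(y_true_clf * np.log(p_pred) + (1 - y_true_clf) * np.log(1 - p_pred))
print(log_loss)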

Summary

In this tutorial, we introduced the basic concepts of linear regression and logistic regression and implemented simple linear regression and logistic regression models with Python code. In practice, more complex models can be used to further improve prediction accuracy on real-world problems.

Origin blog.csdn.net/qq_36693723/article/details/130389623