Multiple Linear Regression and Cases (Python)

1 Introduction to Multiple Linear Regression

The multiple linear regression model can be expressed as the formula shown below.

Among them, x1, x2, x3... are different characteristic variables, k1, k2, k3... are the coefficients before these characteristic variables, and k0 is a constant item.

2 Case: Customer Value Prediction Model

Multiple linear regression models can be used to predict customer value based on multiple factors. After the model is built, different business strategies can be adopted for customers of different values.

2.1 Case Background

Here we take the customer value of credit card customers as an example to explain the specific meaning of customer value forecasting: customer value forecasting refers to predicting how much profit a customer can bring in a certain period of time in the future, and the profit may come from the annual fee of the credit card, cash withdrawal fee, installment Handling fees, overseas transaction fees, etc. After analyzing the customer value, when conducting marketing, telephone answering, collection, product consultation and other businesses, we can provide high-value customers with services that are different from ordinary customers, so as to further tap the value of these high-value customers and improve their value. loyalty.

2.2 specific code

import pandas as pd
from sklearn.linear_model import LinearRegression


df = pd.read_excel('客户价值数据表.xlsx')


X = df[['历史贷款金额','贷款次数','学历','月收入','性别']]
Y = df['客户价值']
model = LinearRegression()
model.fit(X,Y)

model.coef_,model.intercept_

Here, what is obtained through model.coef_ is a list of coefficients, corresponding to the coefficients in front of different feature variables, namely k1, k2, k3, k4, k5, so the multiple linear regression equation at this time is as follows.

y=-208+0.057x^1+96x^2+113x^3+0.056x^4+1.98x^5

3 Model evaluation

import statsmodels.api as sm
X2 = sm.add_constant(X)
est = sm.OLS(Y,X2).fit()
est.summary()

4 Advantages and disadvantages of linear regression

The linear regression model has the following advantages and disadvantages.

        Advantages : fast; no tuning parameters; easily interpretable; understandable.

        Disadvantages : Compared with other complex models, its prediction accuracy is not high, because it assumes that there is a definite linear relationship between features and responses. This assumption is obviously not good for nonlinear relationships. data modeling.

reference books

"Python Big Data Analysis and Machine Learning Business Case Practice"

Guess you like

Origin blog.csdn.net/qq_42433311/article/details/124104841