What is regression analysis
Regression analysis is a statistical method for determining the quantitative relationship between two or more interdependent variables. It is very widely used. By the number of variables involved, it is divided into univariate and multivariate regression; by the number of dependent variables, into simple and multiple regression analysis; and by the type of relationship between the independent and dependent variables, into linear and non-linear regression. If the analysis includes only one independent variable and one dependent variable, and the relationship between the two can be approximated by a straight line, it is called simple linear regression. If the analysis includes two or more independent variables, and the dependent variable is linearly related to them, it is called multiple linear regression.
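In symbols, the two linear cases are usually written as below. The names beta_i (coefficients) and epsilon (error term) are notation assumed here for illustration; the original text does not use them.

```latex
% simple linear regression: one independent variable x
y = \beta_0 + \beta_1 x + \varepsilon

% multiple linear regression: p independent variables x_1, ..., x_p
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon
```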
In practice it comes down to this: given some points, find the equation of the line that fits them.
Let's get straight to it.
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
x = np.linspace(0,30,50)
y = x+ 2*np.random.rand(50)
plt.figure(figsize=(10,8))
plt.scatter(x, y)
There are 50 points on the plot. Now let's find the equation of the line.
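from sklearn.linear_model import LinearRegression  # import linear regression
model = LinearRegression()  # initialize the model
x1 = x.reshape(-1,1)  # reshape the row into a column: x coordinates
y1 = y.reshape(-1,1)  # reshape the row into a column: y coordinates
model.fit(x1,y1)  # fit the model to the data
model.predict([[40]])  # predict y at x=40 (predict expects a 2D array)
array([[40.90816511]])  # predicted value at x=40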
A plot makes this easier to see:
plt.figure(figsize=(12,8))
plt.scatter(x,y)
x_test = np.linspace(0,40).reshape(-1,1)
plt.plot(x_test,model.predict(x_test))
We have the plot, but we still don't know the parameters of the equation.
We have a fitted model, so of course we can find out:
model.coef_  # array([[1.00116024]]) slope
model.intercept_  # array([0.86175551]) intercept
y = 1.00116024 * x + 0.86175551
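To see where a slope and intercept like these come from, here is a minimal sketch of the closed-form least-squares formulas for a single feature, cross-checked against `np.polyfit`. The fixed seed and the names `rng`, `slope`, `intercept`, `slope_np` and `intercept_np` are assumptions made for this illustration, so the numbers will not match the ones above exactly.

```python
import numpy as np

# Assumed sample data: generated like the post's data, but with a fixed seed
rng = np.random.default_rng(0)
x = np.linspace(0, 30, 50)
y = x + 2 * rng.random(50)

# Closed-form least-squares estimates for one independent variable
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# Cross-check with NumPy's degree-1 polynomial fit (returns slope first)
slope_np, intercept_np = np.polyfit(x, y, 1)

print(slope, intercept)
print(slope_np, intercept_np)
```

Both pairs should agree up to floating-point error, with a slope close to 1 because the data was built as x plus noise.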
How good or bad is this model?
The closer the points are to the line, the better: the sum of the squared vertical distances from the points to the line should be as small as possible.
np.sum(np.square(model.predict(x1) - y1))
16.63930773735106  # not bad, fairly small
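The raw sum of squares grows with the number of points, so it can be hard to judge on its own. As a hedged sketch, the same idea can be expressed with two helpers from `sklearn.metrics`: `mean_squared_error` (the sum divided by the number of points) and `r2_score`. The seeded data and the names `X`, `pred`, `sse`, `mse` and `r2` are assumptions for this example, not part of the original walkthrough.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Assumed sample data: generated like the post's data, but with a fixed seed
rng = np.random.default_rng(0)
x = np.linspace(0, 30, 50)
y = x + 2 * rng.random(50)
X = x.reshape(-1, 1)  # scikit-learn expects a 2D feature array

model = LinearRegression().fit(X, y)
pred = model.predict(X)

sse = np.sum((pred - y) ** 2)      # sum of squared errors, as computed above
mse = mean_squared_error(y, pred)  # the same sum averaged over the points
r2 = r2_score(y, pred)             # share of the variance in y explained by the line

print(sse, mse, r2)
```

An R^2 near 1 says the line explains almost all of the variation in y; `model.score(X, y)` returns the same R^2 value.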
Not convinced this is the best fit? Then add a small 0.01 to the intercept and compare:
y2 = model.coef_*x1 + model.intercept_ + 0.01
np.sum(np.square(y2 - y1)) # 16.64430773735106
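The increase from 16.6393... to 16.6443... is exactly 0.005, which is 50 * 0.01^2. That is no accident: the residuals of a least-squares line with an intercept sum to zero, so shifting the intercept by delta in either direction adds n * delta^2 to the sum of squares. A small sketch that checks this on seeded data; the helper `sse` and the names `best`, `delta`, `shifted_up` and `shifted_down` are hypothetical, introduced only for this example.

```python
import numpy as np

# Assumed sample data: generated like the post's data, but with a fixed seed
rng = np.random.default_rng(0)
x = np.linspace(0, 30, 50)
y = x + 2 * rng.random(50)

# Least-squares line (np.polyfit returns the slope first, then the intercept)
slope, intercept = np.polyfit(x, y, 1)

def sse(a, b):
    """Sum of squared vertical distances between y and the line a*x + b."""
    return np.sum((a * x + b - y) ** 2)

best = sse(slope, intercept)
delta = 0.01
shifted_up = sse(slope, intercept + delta)
shifted_down = sse(slope, intercept - delta)

# Both differences should match n * delta^2 up to floating-point error
print(shifted_up - best, shifted_down - best, len(x) * delta ** 2)
```

Moving the intercept up or down makes the fit worse by the same amount, which is what one would expect at a minimum.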
Again, a plot shows it best:
plt.figure(figsize=(10,10))
plt.scatter(x1,y1)
plt.plot(x1,model.predict(x1),color = 'r')
plt.plot(x1 , model.coef_*x1 + model.intercept_ +1, color = 'b')  # offset by a larger 1 to tell the two lines apart; with 0.01 they almost coincide