Linear regression
Regression analysis: using observed data to determine the quantitative relationship between two or more interdependent variables.
Functional expression: $y = f(x_1, x_2, \ldots, x_n)$
Classification of regression:
Classification according to the number of variables: simple (univariate) regression, multiple regression
Classification according to functional relationship: linear regression, nonlinear regression
Gradient descent method:
A method for finding a minimum of a function. Starting from the current point, it repeatedly moves by a specified step size in the direction opposite to the gradient (or an approximate gradient) at that point, until it converges to a minimum.
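The iteration described above can be sketched for the simplest case, fitting a line $y = ax + b$ by minimizing the mean squared error. This is a minimal illustration in plain Python, not a reference implementation: the function name `gradient_descent`, the step size `learning_rate=0.05`, the iteration count, and the toy data are all assumptions chosen for the example.

```python
def gradient_descent(xs, ys, learning_rate=0.05, n_iters=2000):
    """Fit y = a * x + b by minimizing mean squared error with gradient descent."""
    a, b = 0.0, 0.0  # arbitrary starting point (assumption)
    m = len(xs)
    for _ in range(n_iters):
        # Prediction error of the current line on every sample.
        errors = [a * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to a and b.
        grad_a = 2.0 / m * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / m * sum(errors)
        # Move one step in the direction opposite to the gradient.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Toy data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
a, b = gradient_descent(xs, ys)
print(round(a, 3), round(b, 3))
```

The step size matters: if it is too large the iteration diverges, and if it is too small convergence becomes slow, so the value used here only suits this toy data.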
Commonly used package: Scikit-Learn
An open-source framework (algorithm library) developed specifically for machine learning in Python. It covers commonly used machine learning tasks and algorithms, such as data preprocessing, classification, regression, dimensionality reduction, and model selection.
Disadvantages: it supports only Python, and it does not support deep learning or reinforcement learning.
Installation with pip (using the Tsinghua mirror):
pip install scikit-learn -i https://pypi.tuna.tsinghua.edu.cn/simple
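Once installed, a linear regression can be fitted in a few lines. The sketch below uses `LinearRegression` from `sklearn.linear_model`; the toy data and the variable names (`X`, `y`, `model`, `y_predict`) are assumptions made for illustration. Note that `LinearRegression` solves the least-squares problem directly rather than by gradient descent.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data generated from y = 2x + 1 (illustrative only).
# scikit-learn expects X as a 2-D array: one row per sample, one column per feature.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

model = LinearRegression()
model.fit(X, y)                 # estimate the slope and intercept
y_predict = model.predict(X)    # predictions for the training inputs

print(model.coef_, model.intercept_)
```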
Evaluate model performance
- Mean squared error (MSE)
$$MSE = \frac{1}{m}\sum_{i=1}^{m}{(y_i' - y_i)^2}$$
- R-squared value ($R^2$)
$$R^2 = 1 - \frac{\sum_{i=1}^{m}{(y_i' - y_i)^2}}{\sum_{i=1}^{m}{(y_i - \overline{y})^2}} = 1 - \frac{MSE}{\text{variance}}$$
Here $y_i'$ is the predicted value, $y_i$ is the actual value, $\overline{y}$ is the mean of the actual values, and $m$ is the number of samples.
The smaller the MSE, the better; the closer $R^2$ is to 1, the better.
$y'$ vs $y$: the more tightly the points cluster, the better (the closer they lie to a straight line).
- Visualization: plot $y$ against $y_{predict}$
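The MSE and $R^2$ formulas in this section can be checked by hand with a short sketch in plain Python. The helper names `mse_score` and `r_squared` and the sample values are hypothetical, chosen only for illustration; scikit-learn provides equivalents as `mean_squared_error` and `r2_score` in `sklearn.metrics`.

```python
def mse_score(y_true, y_pred):
    """Mean squared error: the average of the squared prediction errors."""
    m = len(y_true)
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / m


def r_squared(y_true, y_pred):
    """R^2 = 1 - MSE / variance of the actual values."""
    m = len(y_true)
    mean_y = sum(y_true) / m
    variance = sum((t - mean_y) ** 2 for t in y_true) / m
    return 1.0 - mse_score(y_true, y_pred) / variance


# Hypothetical actual values and predictions (illustrative only).
y = [1.0, 3.0, 5.0, 7.0, 9.0]
y_predict = [1.5, 2.5, 5.0, 7.5, 8.5]

mse = mse_score(y, y_predict)
r2 = r_squared(y, y_predict)
print(mse, r2)
```

For these values the squared errors sum to 1.0 over 5 samples, so the MSE is 0.2. The variance of `y` is 8, which gives $R^2 = 1 - 0.2/8 = 0.975$.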