I am a beginner just getting started. I hope to record what I learn, like taking notes, and also to help others who are also getting started.
Table of contents
1. Problem description
2. Problem analysis
3. Solve the problem - find w and b
  1. Vector form conversion
  2. The objective function
  3. Set the derivative to zero
  4. The final model result
4. Hidden problem - the matrix may not be full rank
5. Solution to the hidden problem - regularization
  1. L1 regularization - Lasso regression
  2. L2 regularization - ridge regression
6. Variations and applications of linear regression
7. Python implementation
8. Linear models - from regression to classification
1. Problem description
We now have a data set $D = \{(\mathbf{x}_1, y_1), (\mathbf{x}_2, y_2), \ldots, (\mathbf{x}_m, y_m)\}$ at hand: each sample is described by $d$ attributes, that is, $\mathbf{x} = (x_1; x_2; \ldots; x_d)$, where $x_i$ is the value of sample $\mathbf{x}$ on the $i$-th attribute. Each sample also has a corresponding real-valued result $y \in \mathbb{R}$.
Now a new sample $\mathbf{x}$ arrives, and we want to know its result value $y$.
2. Problem Analysis
We need to find a linear model $f(\mathbf{x}) = \mathbf{w}^\mathrm{T}\mathbf{x} + b$ that predicts well according to the data set $D$, that is, to find the appropriate $\mathbf{w}$ and $b$.
3. Solve the problem - find w and b
We can use the method of least squares to solve this problem.
1. Vector form conversion
First, combine $\mathbf{w}$ and $b$ into a single vector $\hat{\mathbf{w}} = (\mathbf{w}; b)$ of size $(d+1) \times 1$;
Then rewrite the data matrix $\mathbf{X}$ so that each row is one sample with a constant 1 appended:
$$\mathbf{X} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1d} & 1 \\ x_{21} & x_{22} & \cdots & x_{2d} & 1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{md} & 1 \end{pmatrix}$$
Its size is $m \times (d+1)$, so that $\mathbf{X}\hat{\mathbf{w}}$ is the $m \times 1$ vector of predictions.
Then write the labels in vector form as well: $\mathbf{y} = (y_1; y_2; \ldots; y_m)$.
2. The objective function
The goal is to minimize the total squared error:
$$\hat{\mathbf{w}}^* = \arg\min_{\hat{\mathbf{w}}} \; (\mathbf{y} - \mathbf{X}\hat{\mathbf{w}})^\mathrm{T}(\mathbf{y} - \mathbf{X}\hat{\mathbf{w}})$$
3. Set the derivative to zero to obtain the solution
Let $E_{\hat{\mathbf{w}}} = (\mathbf{y} - \mathbf{X}\hat{\mathbf{w}})^\mathrm{T}(\mathbf{y} - \mathbf{X}\hat{\mathbf{w}})$. Taking the derivative with respect to $\hat{\mathbf{w}}$ and setting it to zero gives
$$\frac{\partial E_{\hat{\mathbf{w}}}}{\partial \hat{\mathbf{w}}} = 2\mathbf{X}^\mathrm{T}(\mathbf{X}\hat{\mathbf{w}} - \mathbf{y}) = 0 \quad\Rightarrow\quad \hat{\mathbf{w}}^* = (\mathbf{X}^\mathrm{T}\mathbf{X})^{-1}\mathbf{X}^\mathrm{T}\mathbf{y},$$
assuming $\mathbf{X}^\mathrm{T}\mathbf{X}$ is invertible.
4. The final model result
$$f(\hat{\mathbf{x}}_i) = \hat{\mathbf{x}}_i^\mathrm{T}(\mathbf{X}^\mathrm{T}\mathbf{X})^{-1}\mathbf{X}^\mathrm{T}\mathbf{y}, \quad \text{where } \hat{\mathbf{x}}_i = (\mathbf{x}_i; 1).$$
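The closed-form solution can be sketched directly with NumPy. This is a minimal illustration on made-up toy data (the variable names and numbers are my own, not from the original):

```python
import numpy as np

# Toy data (hypothetical): m = 5 samples, d = 2 attributes.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # here y happens to equal the first attribute

# Append a column of ones so that w_hat = (w; b) absorbs the bias b.
X_hat = np.hstack([X, np.ones((X.shape[0], 1))])

# Closed-form least squares: w_hat* = (X^T X)^{-1} X^T y
w_hat = np.linalg.inv(X_hat.T @ X_hat) @ X_hat.T @ y

def predict(x):
    # f(x) = w^T x + b, with the appended 1 picking up b
    return np.append(x, 1.0) @ w_hat

print(w_hat)                          # approximately [1, 0, 0]
print(predict(np.array([6.0, 6.0])))  # approximately 6
```

In practice `np.linalg.lstsq` is preferred over explicitly inverting $\mathbf{X}^\mathrm{T}\mathbf{X}$, since it also handles the non-full-rank case discussed next.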
4. Hidden problem - $\mathbf{X}^\mathrm{T}\mathbf{X}$ may not be a full-rank matrix
$\mathbf{X}^\mathrm{T}\mathbf{X}$ may not be a full-rank matrix, in which case it is not invertible and multiple optimal solutions exist. Which one should be selected as $\hat{\mathbf{w}}^*$?
For example, when the number of samples is small and the number of feature attributes is large, or even exceeds the number of samples, $\mathbf{X}^\mathrm{T}\mathbf{X}$ is not a full-rank matrix, and the equations have multiple solutions.
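This rank deficiency is easy to verify numerically. A small sketch (the numbers are arbitrary, chosen only to make the shapes concrete): with 2 samples and 3 attributes, the 4x4 matrix $\mathbf{X}^\mathrm{T}\mathbf{X}$ cannot be full rank:

```python
import numpy as np

# Hypothetical extreme case: m = 2 samples, d = 3 attributes.
# The augmented data matrix is 2 x 4, so X_hat^T X_hat is 4 x 4,
# but its rank can be at most 2.
X_hat = np.array([[1.0, 2.0, 3.0, 1.0],
                  [4.0, 5.0, 6.0, 1.0]])
gram = X_hat.T @ X_hat

print(gram.shape)                   # (4, 4)
print(np.linalg.matrix_rank(gram))  # 2 -- not full rank, so not invertible
```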
5. Solution to the hidden problem - regularization
The role of regularization is to select a model that has both small empirical risk and low model complexity.
1. L1 regularization - Lasso regression
Add a penalty term $\lambda \lVert \mathbf{w} \rVert_1$ to the objective function.
Then the objective function becomes
$$\min_{\mathbf{w}, b} \; \sum_{i=1}^{m} \left(y_i - \mathbf{w}^\mathrm{T}\mathbf{x}_i - b\right)^2 + \lambda \lVert \mathbf{w} \rVert_1$$
The first term is the empirical risk mentioned above, and the second term controls the complexity of the model.
Here $\lambda > 0$ controls the strength of the penalty.
This is also known as Lasso regression.
As shown in the figure below (assuming there are only two attributes): the contours of the squared-error term and the contours of the L1 regularization term often intersect on a coordinate axis, which means one of the attributes is discarded. This reflects a feature-selection property: compared with the L2 regularization below, L1 more easily yields a sparse solution, that is, the resulting $\mathbf{w}$ vector contains fewer non-zero entries.
2. L2 regularization - ridge regression
Add a penalty term $\lambda \lVert \mathbf{w} \rVert_2^2$ to the objective function.
Then the objective function becomes
$$\min_{\mathbf{w}, b} \; \sum_{i=1}^{m} \left(y_i - \mathbf{w}^\mathrm{T}\mathbf{x}_i - b\right)^2 + \lambda \lVert \mathbf{w} \rVert_2^2$$
This is also known as ridge regression.
L2 regularization shrinks the parameters evenly, so the coefficients of the fitted curve stay similar in magnitude. Although it cannot reduce the number of terms the way L1 does, it keeps the coefficients balanced; this is the principal difference between the two.
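The contrast is easy to observe with scikit-learn. In this sketch (synthetic data of my own making), only 2 of 10 attributes actually influence y: Lasso zeroes out most of the irrelevant coefficients, while ridge merely shrinks them toward zero:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)

# Synthetic data: 50 samples, 10 attributes, but only the first two
# attributes actually determine y (plus a little noise).
X = rng.normal(size=(50, 10))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 0.1 * rng.normal(size=50)

lasso = Lasso(alpha=0.1).fit(X, y)  # L1 penalty
ridge = Ridge(alpha=0.1).fit(X, y)  # L2 penalty

# L1 tends to give a sparse solution: most irrelevant coefficients
# become exactly zero. L2 only shrinks them, so they stay non-zero.
print("Lasso non-zero coefficients:", int(np.sum(lasso.coef_ != 0)))
print("Ridge non-zero coefficients:", int(np.sum(ridge.coef_ != 0)))
```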
6. Variations and applications of linear regression
If the underlying relation is not well modeled by plain linear regression, we can instead let the linear model approximate some transformation of $y$.
For example, log-linear regression: $\ln y = \mathbf{w}^\mathrm{T}\mathbf{x} + b$.
More generally: $y = g^{-1}(\mathbf{w}^\mathrm{T}\mathbf{x} + b)$, which is called a generalized linear model; the function $g(\cdot)$ is called the link function.
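The log-linear case can be sketched as follows (synthetic, noise-free data of my own making): fit the linear model to ln(y) instead of y, then map predictions back with exp:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Synthetic data following an exact exponential relation y = e^(2x + 1).
x = rng.uniform(0.0, 2.0, size=(100, 1))
y = np.exp(2.0 * x[:, 0] + 1.0)

# Log-linear regression: the linear model approximates ln(y) = w x + b.
model = LinearRegression().fit(x, np.log(y))

print(model.coef_[0], model.intercept_)            # approximately 2.0 and 1.0
y_pred = np.exp(model.predict(np.array([[1.5]])))  # map back to the original scale
print(y_pred[0])                                   # approximately e^4, about 54.6
```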
7. Python implementation
1. Multiple Linear Regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.3, random_state=1)  # x and y are the prepared attribute data and label data
model = LinearRegression()
model.fit(X_train, Y_train)
score = model.score(X_test, Y_test)
print('Model test score: ' + str(score))
Y_pred = model.predict(X_test)
print(Y_pred)
2. Ridge regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.3, random_state=1)  # x and y are the prepared attribute data and label data
model = Ridge(alpha=1)
model.fit(X_train, Y_train)
score = model.score(X_test, Y_test)
print('Model test score: ' + str(score))
Y_pred = model.predict(X_test)
print(Y_pred)
3. Lasso regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(x, y, test_size=0.3, random_state=1)  # x and y are the prepared attribute data and label data
model = Lasso(alpha=0.1)
model.fit(X_train, Y_train)
score = model.score(X_test, Y_test)
print('Model test score: ' + str(score))
Y_pred = model.predict(X_test)
print(Y_pred)
8. Linear models - from regression to classification
Everything above uses linear models to solve regression problems. In fact, linear models can also be used to solve classification problems, via logistic regression (also called log-odds regression).
For details, see the separate article on logistic regression.
Everyone is welcome to offer criticism and corrections in the comments, thank you~