Note: This article is based on: Bilibili (Station B): Classical Machine Learning Algorithms - Regression Algorithms
1. An introduction to machine learning
Machine learning is applied in a very wide range of fields:
- Data mining: analyzing user information to increase users' engagement with products.
- Computer vision: real-time detection tasks, for example in driverless cars.
Steps of machine learning:
1. Training samples
2. Feature extraction (the data scientist's job: knowing how to turn a piece of raw data into features that a machine can identify well)
3. Learning a function
4. Prediction
Commonly used Python libraries:
NumPy: scientific computing library (matrix operations)
Pandas: data analysis and processing library (handling missing values, outliers, and similar tasks)
Matplotlib: data visualization library (plotting)
Scikit-learn: machine learning library (machine learning algorithms)
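The four steps above can be sketched with NumPy and Scikit-learn. This is only a minimal illustration: the records, the field names (size_m2, rooms, price) and the helper extract_features are made up for the example and do not come from the original course.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# 1. Training samples: made-up raw records, for illustration only.
raw_samples = [
    {"size_m2": 50, "rooms": 2, "price": 150},
    {"size_m2": 80, "rooms": 3, "price": 220},
    {"size_m2": 100, "rooms": 3, "price": 260},
    {"size_m2": 120, "rooms": 4, "price": 310},
]

# 2. Feature extraction: turn each raw record into a numeric feature vector.
def extract_features(record):
    return [record["size_m2"], record["rooms"]]

X = np.array([extract_features(r) for r in raw_samples], dtype=float)
y = np.array([r["price"] for r in raw_samples], dtype=float)

# 3. Learning a function: fit a model that maps features to the target.
model = LinearRegression().fit(X, y)

# 4. Prediction: apply the learned function to a new, unseen sample.
new_sample = {"size_m2": 90, "rooms": 3}
new_features = np.array([extract_features(new_sample)], dtype=float)
predicted_price = model.predict(new_features)[0]
print(predicted_price)
```

In real projects most of the effort usually goes into step 2, because the model can only learn from what the features expose.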
2. Regression algorithm
Classification: the final result is a category.
Regression: the final result is not a category but a specific numeric value.
Example:
Determine how much money the bank can lend to a customer based on the customer's salary and age.
Since the final prediction result is a concrete value, this is a regression problem.
The two indicators, salary and age, are called features, and the two features influence the result to different degrees.
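As a rough sketch of the bank example, a linear model gives each feature its own weight, and that weight expresses how strongly the feature influences the prediction. The customers, the units (thousands) and the "true" weights below are all synthetic assumptions used only to generate data. The fit uses NumPy's least-squares solver.

```python
import numpy as np

# Synthetic customers: [salary in thousands, age]. All numbers are made up.
X = np.array([
    [4.0, 25.0],
    [8.0, 30.0],
    [5.0, 28.0],
    [7.5, 33.0],
    [12.0, 40.0],
])

# Hypothetical relationship used only to generate the targets (in thousands):
# loan = 1.0 + 2.5 * salary + 0.1 * age
true_theta = np.array([1.0, 2.5, 0.1])

# Prepend a column of ones so that theta[0] acts as the bias term.
X_b = np.hstack([np.ones((X.shape[0], 1)), X])
y = X_b @ true_theta

# Least-squares fit: find theta that minimizes ||X_b @ theta - y||^2.
theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)

print("bias:", theta[0])
print("weight of salary:", theta[1])
print("weight of age:", theta[2])

# Predict the loan for a new customer (salary 6, age 35).
new_customer = np.array([1.0, 6.0, 35.0])
predicted_loan = new_customer @ theta
print("predicted loan:", predicted_loan)
```

Because the targets were generated without noise, the fit recovers the weights used to create them. With real data the weights would only be estimates.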
The linear regression algorithm is generally derived using the likelihood function and the log-likelihood function from probability theory and mathematical statistics. The full derivation is not shown here. The objective function is as follows:
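The figure with the objective function is missing from this copy. As a sketch, under the usual assumption that the errors are independent and Gaussian with zero mean, maximizing the log-likelihood is equivalent to minimizing the least-squares objective. The notation here (\theta for the weights, x^{(i)} and y^{(i)} for the i-th sample, m for the number of samples, \sigma for the noise scale) is chosen for this sketch and may differ from the course.

```latex
\begin{align}
y^{(i)} &= \theta^{T} x^{(i)} + \varepsilon^{(i)}, \qquad \varepsilon^{(i)} \sim \mathcal{N}(0, \sigma^{2}) \\
\log L(\theta) &= m \log \frac{1}{\sqrt{2\pi}\,\sigma}
  - \frac{1}{2\sigma^{2}} \sum_{i=1}^{m} \left( y^{(i)} - \theta^{T} x^{(i)} \right)^{2} \\
J(\theta) &= \frac{1}{2} \sum_{i=1}^{m} \left( y^{(i)} - \theta^{T} x^{(i)} \right)^{2}
\end{align}
```

The first term of the log-likelihood does not depend on \theta, so making \log L(\theta) as large as possible means making J(\theta) as small as possible.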
Logistic regression:
Logistic regression is a classic binary classification algorithm. Although its name contains "regression", the final result it produces is a category.
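A minimal sketch of why a "regression" ends up producing a category: the model first computes a probability with the sigmoid function and then thresholds it at 0.5. The data (hours studied, exam passed or not), the learning rate and the number of iterations are assumptions made up for this example. The weights are learned with plain gradient descent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up data: hours studied and whether the exam was passed (1) or not (0).
hours = np.array([0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5])
passed = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Standardize the feature so that plain gradient descent behaves well.
mean, std = hours.mean(), hours.std()
x = (hours - mean) / std

w, b = 0.0, 0.0
learning_rate = 0.1
for _ in range(1000):
    p = sigmoid(w * x + b)               # predicted probability of class 1
    grad_w = np.mean((p - passed) * x)   # gradient of the mean log-loss w.r.t. w
    grad_b = np.mean(p - passed)         # gradient of the mean log-loss w.r.t. b
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

def predict(h):
    """Return (probability of passing, predicted category) for h hours."""
    prob = sigmoid(w * (h - mean) / std + b)
    return prob, int(prob >= 0.5)

train_pred = (sigmoid(w * x + b) >= 0.5).astype(int)
print(train_pred)
print(predict(1.0), predict(4.0))
```

The continuous output is the probability. The category only appears in the last step, when the probability is compared with the threshold.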
For gradient descent, see: Explanation of the principle of the gradient descent algorithm - machine learning
3. The problem to be solved by the support vector machine
Support vector machines perform classification tasks and solve binary classification problems.
Whereas a decision tree only needs each split to be pure, a support vector machine must also separate the classes well, that is, with as wide a gap as possible, so that it can tolerate larger errors.
Therefore, the problem to be solved by the support vector machine is to find the optimal line, the one that best separates the samples into their classes.
4. The objective of the support vector machine
According to the figure, our purpose is to find a line that meets the following requirement: the points closest to the line should be as far away from it as possible.
Those closest points, X1 and X3 in the figure, are the support vectors in the support vector machine algorithm.
For a linear support vector machine, it is enough to find such a hyperplane.
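Finding such a hyperplane can be sketched with Scikit-learn's SVC and a linear kernel. The six points are made up, and the very large C is an assumption used to approximate a hard margin on this cleanly separable data. By construction, the points nearest to the boundary are (1, 0), (0, 1) and (3, 3).

```python
import numpy as np
from sklearn.svm import SVC

# Made-up, linearly separable data.
X = np.array([
    [0.0, 0.0], [1.0, 0.0], [0.0, 1.0],   # class -1
    [3.0, 3.0], [4.0, 3.0], [3.0, 4.0],   # class +1
])
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C leaves almost no room for violations (close to a hard margin).
clf = SVC(kernel="linear", C=1e6).fit(X, y)

# The hyperplane is w . x + b = 0; the margin width is 2 / ||w||.
w = clf.coef_[0]
b = clf.intercept_[0]
margin_width = 2.0 / np.linalg.norm(w)

print("w =", w, "b =", b)
print("support vectors:")
print(clf.support_vectors_)
print("margin width:", margin_width)
```

Only the support vectors determine the hyperplane. Moving any of the other points slightly would leave the result unchanged.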
However, if the sample data contains an outlier, that outlier should be ignored rather than allowed to determine the line.
Observe the solid line in the figure: because it accommodates the outlier, both the O points and the X points lie close to it. If the dashed line is used instead, the O and X points are farther from the line, which better meets our needs.
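The effect of an outlier can be sketched with the C parameter of Scikit-learn's SVC, which controls how heavily margin violations are penalized. The data and the two C values are assumptions chosen for illustration: one point of class -1 is placed right next to class +1 to play the role of the outlier.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([
    [0.0, 0.0], [1.0, 0.0], [0.0, 1.0],   # class -1
    [2.5, 2.5],                            # class -1 outlier, close to class +1
    [3.0, 3.0], [4.0, 3.0], [3.0, 4.0],   # class +1
])
y = np.array([-1, -1, -1, -1, 1, 1, 1])

def margin_width(C):
    """Fit a linear SVM with penalty C and return its margin width 2 / ||w||."""
    clf = SVC(kernel="linear", C=C).fit(X, y)
    return 2.0 / np.linalg.norm(clf.coef_[0])

# Large C: every point must be respected, so the outlier squeezes the margin
# (this plays the role of the solid line).
strict = margin_width(1e6)

# Small C: violating the margin is cheap, so the outlier is largely ignored
# and the margin stays wide (this plays the role of the dashed line).
tolerant = margin_width(0.1)

print("margin with C=1e6:", strict)
print("margin with C=0.1:", tolerant)
```

The outlier is not literally deleted from the data. A small C simply makes it cheap for the model to leave that point inside the margin.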
5. Kernel function transformation
Through kernel functions, a support vector machine maps data from a low-dimensional space into a high-dimensional space, where samples that could not be separated by a line may become separable.
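The idea can be sketched with an explicit mapping on made-up one-dimensional data. The feature map phi(x) = (x, x^2), the threshold 2.5 and the helper separable_by_threshold are assumptions chosen for this example. A real kernel function performs such a mapping only implicitly.

```python
import numpy as np

# Made-up 1-D data: class +1 lies on both sides of class -1,
# so no single threshold on x can separate the two classes.
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = np.array([1, 1, -1, -1, -1, 1, 1])

def separable_by_threshold(values, labels):
    """Check whether one threshold splits the labels perfectly in 1-D."""
    sorted_labels = labels[np.argsort(values)]
    # Separable means the sorted labels change class at most once.
    changes = np.sum(sorted_labels[1:] != sorted_labels[:-1])
    return bool(changes <= 1)

# Explicit feature map into 2-D: phi(x) = (x, x^2).
phi = np.column_stack([x, x ** 2])

# In the mapped space the horizontal line x^2 = 2.5 separates the classes.
predictions = np.where(phi[:, 1] > 2.5, 1, -1)

print("separable in 1-D:", separable_by_threshold(x, y))
print("separable on the x^2 axis:", separable_by_threshold(phi[:, 1], y))
print("predictions:", predictions)
```

In practice an SVM does not build phi explicitly. A kernel function K(a, b) directly computes the inner product of the mapped points, which is what keeps high-dimensional mappings affordable.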