General steps of machine learning

1. Overall framework

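The overall workflow has four stages: data collection, feature extraction, choosing the model, learning criterion and optimization algorithm, and using the trained model.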

2. Data collection

Machine learning is a method of learning from data, so the first step is to collect data for the problem you want to solve. There are two main ways to do this: collect the data yourself, or find a public dataset on the Internet. The result of this step is the raw data.
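As a minimal sketch of what "raw data" might look like once collected, the snippet below parses a small CSV table with Python's standard library. The column names and the three watermelon samples are invented for illustration; a real project would read a file you recorded yourself or a downloaded public dataset instead of an inline string.

```python
import csv
import io

# Hypothetical raw data (invented for illustration). In practice this text
# would come from your own measurements or a public dataset file.
RAW_CSV = """id,knock_sound,color_gloss,texture_clarity,shape,ripe
1,dull,glossy,clear,round,yes
2,crisp,matte,blurry,oval,no
3,dull,glossy,blurry,round,yes
"""


def load_records(text):
    """Parse CSV text into a list of dictionaries, one per sample."""
    return list(csv.DictReader(io.StringIO(text)))


records = load_records(RAW_CSV)
print(len(records), "raw samples collected")
```

Each record still contains every column, including ones that may turn out to be irrelevant to the problem.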

3. Feature extraction

Raw data is usually large and contains much that is irrelevant, so we need to extract from it the parts that relate to the problem we want to solve and use them as features. Some deep learning methods can learn features from the data on their own, but traditional machine learning methods usually require features to be designed by hand, which is called feature engineering. For example, when judging whether a watermelon is ripe, the knocking sound, color gloss and texture clarity can serve as features, whereas the shape of the watermelon may have nothing to do with ripeness and should not be used as a feature. (What happens if features irrelevant to the problem are extracted anyway?)
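The watermelon example can be sketched in code as follows. This is only an illustration under stated assumptions: the records, the choice of the three feature columns, and the 0/1 encoding of each attribute are all hypothetical, and real feature engineering is usually more involved.

```python
# Hypothetical raw records (invented for illustration).
raw_records = [
    {"id": "1", "knock_sound": "dull", "color_gloss": "glossy",
     "texture_clarity": "clear", "shape": "round", "ripe": "yes"},
    {"id": "2", "knock_sound": "crisp", "color_gloss": "matte",
     "texture_clarity": "blurry", "shape": "oval", "ripe": "no"},
]

# Columns assumed to be related to ripeness; "id" and "shape" are left out.
FEATURES = ["knock_sound", "color_gloss", "texture_clarity"]

# Assumed numeric encoding of the categorical values.
ENCODING = {
    "dull": 1.0, "crisp": 0.0,
    "glossy": 1.0, "matte": 0.0,
    "clear": 1.0, "blurry": 0.0,
}


def extract_features(record):
    """Keep only the selected columns and turn them into numbers."""
    return [ENCODING[record[name]] for name in FEATURES]


def extract_label(record):
    """Map the 'ripe' column to 1 (ripe) or 0 (not ripe)."""
    return 1 if record["ripe"] == "yes" else 0


X = [extract_features(r) for r in raw_records]
y = [extract_label(r) for r in raw_records]
print(X, y)
```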

4. Determination of models, learning criteria and optimization algorithms

The model, the learning criterion and the optimization algorithm are the three elements of machine learning:

The role of the model is to produce an output from the input features for a specific problem, so a model can be understood as a function. Different machine learning models (such as LR, SVM and NB) are essentially different families of candidate functions. Once the type of model is chosen, the general form of the function is fixed, and what remains is to learn the parameters in that function. In essence, then, machine learning is an optimization problem: selecting the best function from a family of functions that differ only in their parameters.
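To make the idea of a family of functions concrete, here is a minimal sketch that uses a one-dimensional linear model as the example family. The name `make_linear_model` and the parameter values are hypothetical and chosen only for illustration; the article itself does not prescribe a particular model.

```python
def make_linear_model(w, b):
    """Return one member of the family f(x) = w * x + b."""
    def f(x):
        return w * x + b
    return f


# Same model type, different parameters, therefore different functions.
f1 = make_linear_model(2.0, 0.0)
f2 = make_linear_model(-1.0, 3.0)

print(f1(1.5), f2(1.5))  # 3.0 1.5
```

Choosing the model type fixes the form `w * x + b`; learning then means choosing `w` and `b`.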

The role of the learning criterion is to evaluate how good a given model is for the problem you want to solve. In supervised learning, it is generally based on the difference between the model's output and the true value in the dataset: the smaller the difference, the better the model.
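One common way to measure that difference is the mean squared error. The sketch below assumes this criterion purely as an example (the article does not name a specific one) and uses invented data generated from y = 2x + 1.

```python
def mean_squared_error(model, xs, ys):
    """Average squared difference between model outputs and true values."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Invented data that follows y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]


def good_model(x):
    return 2.0 * x + 1.0


def poor_model(x):
    return x


loss_good = mean_squared_error(good_model, xs, ys)
loss_poor = mean_squared_error(poor_model, xs, ys)
print(loss_good, loss_poor)
```

The model whose outputs are closer to the true values receives the smaller loss, so the criterion ranks `good_model` above `poor_model`.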

The role of the optimization algorithm is to solve this optimization problem, that is, to find the best model according to the learning criterion.
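As one possible illustration, the sketch below uses plain gradient descent to fit a one-dimensional linear model by minimizing the mean squared error. Gradient descent, the linear model, the learning rate, the step count and the data are all assumptions made for this example; other optimization algorithms would serve the same role.

```python
def fit_linear(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors under the current parameters.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Invented data that follows y = 2x + 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = fit_linear(xs, ys)
print(round(w, 4), round(b, 4))
```

On this data the parameters should approach w = 2 and b = 1, the values that generated the samples.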

Once these three elements are determined, the dataset is fed in, and training yields the model that is optimal for the current dataset.

5. Use of the model

Training yields an optimal function. To make a prediction, the features of the sample to be predicted are fed into the model as its input, and the model's output is the predicted result.
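A minimal sketch of this last step is shown below. The parameter values stand in for the result of training and are hypothetical, as are the new inputs; the linear form is again only an example model.

```python
def predict(w, b, x):
    """Apply the trained linear model to one new input."""
    return w * x + b


# Hypothetical parameters standing in for the output of training.
trained_w, trained_b = 2.0, 1.0

# New inputs that were not part of the training data.
new_inputs = [4.0, 5.5]
predictions = [predict(trained_w, trained_b, x) for x in new_inputs]
print(predictions)  # [9.0, 12.0]
```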


Origin blog.csdn.net/weixin_43795921/article/details/113435521