deeplearning.ai Andrew Ng online course notes (14) - General machine learning tuning strategy

Reference link: https://www.missshi.cn/api/view/blog/5a8d4064c55cb02519000000

This post mainly summarizes the original article:



1. Machine Learning Strategies


2. Orthogonalization of parameters:



3. Evaluation Metrics, Optimizing Metrics, and Satisficing Metrics

We need a single real-numbered evaluation metric: a function that maps each model to one real number, so that competing models can be compared directly. When several measures matter at once, pick one as the optimizing metric and treat the others as satisficing metrics that only need to meet a threshold, as sketched below.
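
A minimal sketch of this idea (the models, their precision/recall numbers, and the latency threshold are all hypothetical): F1 serves as the single real-numbered optimizing metric, and inference latency is a satisficing metric that only has to stay under a limit.

```python
# Hypothetical models with precomputed precision, recall, and latency.
def f1(precision, recall):
    # Harmonic mean of precision and recall: one real number per model.
    return 2 * precision * recall / (precision + recall)

models = [
    {"name": "A", "precision": 0.95, "recall": 0.90, "latency_ms": 80},
    {"name": "B", "precision": 0.98, "recall": 0.85, "latency_ms": 300},
]

MAX_LATENCY_MS = 100  # satisficing threshold (assumed)

# Keep only models that satisfy the latency constraint, then pick the
# one with the best optimizing metric.
feasible = [m for m in models if m["latency_ms"] <= MAX_LATENCY_MS]
best = max(feasible, key=lambda m: f1(m["precision"], m["recall"]))
print(best["name"], f1(best["precision"], best["recall"]))
```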



4. Training set, validation set and test set:

The validation set and the test set should come from the same distribution, and that distribution should match the real requirements: we should pool and randomly shuffle the data from all the different regions, then randomly draw examples from the pool for the validation set and the test set, so that the two are guaranteed to come from the same distribution. This way developers can be confident that the carefully trained and tested model matches the model the application actually needs.

The sizes of the training set, validation set, and test set should be allocated sensibly: when samples are plentiful, most of the data can go to training; it is only necessary to ensure that the validation and test sets contain enough data to give reliable estimates. For example (a rule of thumb from the course), with a million examples a 98%/1%/1% split is common rather than the traditional 60%/20%/20%, as in the sketch below.
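
A minimal sketch of both points (the two regions and the sample counts are hypothetical): pool examples from all regions, shuffle, and draw the validation and test sets from the same pool so they share one distribution, using a 98/1/1 split.

```python
import random

random.seed(0)
# Assumed: data collected from two regions, pooled together.
data = [("region_a", i) for i in range(500_000)] + \
       [("region_b", i) for i in range(500_000)]
random.shuffle(data)  # mix the regions so splits share one distribution

n = len(data)
n_val, n_test = n // 100, n // 100      # 1% each
val_set   = data[:n_val]
test_set  = data[n_val:n_val + n_test]
train_set = data[n_val + n_test:]       # remaining 98% for training
print(len(train_set), len(val_set), len(test_set))
```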

Adjusting the evaluation metric or the validation/test sets: learn to adapt them to the situation. If the current metric or data no longer ranks models in the order that matters for the application, change the metric (for example, by weighting unacceptable errors more heavily) or refresh the validation/test sets so they reflect the data actually encountered, as in the sketch below.
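
A hedged sketch of adjusting a metric in the style the course describes (the labels and weights here are assumptions): misclassifications that are unacceptable for the application get a much larger weight, so the adjusted metric ranks models the way the application needs.

```python
def weighted_error(y_true, y_pred, weights):
    # weights[i] is large for examples whose misclassification is unacceptable
    total = sum(weights)
    wrong = sum(w for yt, yp, w in zip(y_true, y_pred, weights) if yt != yp)
    return wrong / total

y_true  = [1, 0, 1, 1]
y_pred  = [1, 1, 0, 1]
weights = [1, 10, 1, 1]   # the second example must not be misclassified
print(weighted_error(y_true, y_pred, weights))  # 11/13, about 0.846
```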

5. Model performance and human performance

A deep learning model can usually reach the accuracy of normal human recognition within a fairly short time. Once the model's accuracy exceeds human-level accuracy, however, the rate of improvement drops sharply, although accuracy does keep inching upward. The model's accuracy can never reach 100% in theory; it approaches a theoretical upper limit that we call the Bayes optimal error.
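
A worked sketch of how the course uses this (the error numbers are hypothetical): treating human-level error as a proxy for the Bayes optimal error, the gap between training error and human error is the avoidable bias, and the gap between validation error and training error is the variance; whichever is larger tells us where to focus next.

```python
human_error = 0.01   # proxy for the Bayes optimal error
train_error = 0.08
dev_error   = 0.10

avoidable_bias = train_error - human_error   # 0.07
variance       = dev_error - train_error     # 0.02

if avoidable_bias > variance:
    print("Focus on reducing bias (bigger model, train longer, ...)")
else:
    print("Focus on reducing variance (more data, regularization, ...)")
```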



6. How to perform error analysis


(1) If some of the labels in the data are themselves wrong:

① For the training set: deep learning algorithms are themselves quite robust to random label noise, so a small number of mislabeled samples will not have a large impact on training. As long as the proportion of errors is small (and the errors are random rather than systematic), we can ignore these samples.

② If the mislabeled examples are in the validation set or the test set, we can add an "incorrectly labeled" column to the error analysis table and count what fraction of the errors it explains; if that fraction is large, correcting the labels is worth the effort, as in the sketch below.
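
A minimal sketch of such an error analysis table (the error categories and the tagged examples are hypothetical): each misclassified validation example is tagged with the causes that apply, including the extra "mislabeled" column, and the tallies show which cause dominates.

```python
errors = [
    # each validation-set mistake, tagged with the causes that apply
    {"blurry": True,  "mislabeled": False},
    {"blurry": False, "mislabeled": True},
    {"blurry": True,  "mislabeled": False},
    # ... more tagged examples would follow ...
]

causes = ["blurry", "mislabeled"]
counts = {c: sum(e[c] for e in errors) for c in causes}
for cause, count in counts.items():
    print(f"{cause}: {count / len(errors):.0%} of errors")
# If "mislabeled" accounts for a large fraction, fixing labels is worthwhile.
```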


(2) When data from the real application scenario is not plentiful enough to build a large dataset:

Images uploaded by users are often relatively blurry and unclear, and such images may be hard to collect in quantity; what we can easily collect are high-definition pictures crawled from the Internet. In this case we can train mainly on the crawled images while keeping the validation and test sets drawn from the real user-upload distribution; another option is to synthesize training data that resembles the real distribution, as sketched below.
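
A hedged sketch of artificial data synthesis (not from the original post; the file paths and degradation parameters are assumptions): degrade a crawled high-definition image so it looks more like a blurry user upload before adding it to the training data.

```python
from PIL import Image, ImageFilter

hd_image = Image.open("crawled_hd_photo.jpg")           # assumed input file
# Blur and downscale to mimic the blurry, low-resolution user uploads.
synthetic = hd_image.filter(ImageFilter.GaussianBlur(radius=2))
synthetic = synthetic.resize((hd_image.width // 4, hd_image.height // 4))
synthetic.save("synthetic_user_upload.jpg")             # add to training data
```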


(3) Distinguishing variance from data mismatch:

Suppose we think the Bayes optimal error for this problem is close to 0%. The current training set error is 1%, while the validation set error is 10%.

Comparing the training set and the validation set, they differ in two points:

  • The data in the validation set was never seen during training
  • The training set and the validation set come from different distributions

So the 9% gap could stem from variance (the model fails to generalize to unseen data) or from data mismatch (the two distributions simply differ). To separate the two causes, we can carve a training-dev set out of the training data: it shares the training distribution but is never trained on, as in the sketch below.
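
A minimal sketch of the training-dev diagnosis (the error numbers are hypothetical): the gap from training error to training-dev error is attributable to variance, and the remaining gap to the validation error is attributable to data mismatch.

```python
train_error     = 0.01
train_dev_error = 0.09   # same distribution as training, never trained on
dev_error       = 0.10   # different distribution (real application data)

variance      = train_dev_error - train_error   # 0.08: mostly a variance problem
data_mismatch = dev_error - train_dev_error     # 0.01: little data mismatch
print(f"variance: {variance:.2f}, data mismatch: {data_mismatch:.2f}")
```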



7. Transfer Learning


8. Multi-task learning



9. End-to-End Learning

In earlier machine learning practice, a relatively complex task was usually split into multiple steps, such as image preprocessing, feature extraction, and feature classification. End-to-end learning instead ignores these individual steps and represents all of them with a single large neural network: the input data and the labels are fed directly into the network to train it.

The advantage of end-to-end learning is that it removes the need to hand-design models and split the process into stages, making the whole pipeline simpler and more natural and the system architecture leaner. The disadvantage of end-to-end learning is that it requires a much larger number of training samples; otherwise it is hard to obtain the desired results.
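
A minimal sketch of the end-to-end idea (not from the original post; the input shape, layer sizes, and class count are hypothetical): one network maps raw pixels directly to class scores, with no hand-designed preprocessing, feature-extraction, or classification stages in between.

```python
import torch
import torch.nn as nn

# End-to-end: a single network from raw input to output.
end_to_end = nn.Sequential(
    nn.Flatten(),               # raw pixels go straight in
    nn.Linear(64 * 64, 256),
    nn.ReLU(),
    nn.Linear(256, 10),         # class scores come straight out
)

x = torch.randn(8, 1, 64, 64)   # a batch of raw 64x64 grayscale images
logits = end_to_end(x)
print(logits.shape)             # torch.Size([8, 10])
```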



