1.9 Avoidable Bias — Deep Learning Course 3 "Structuring Machine Learning Projects" — Professor Andrew Ng (Stanford)

Avoidable Bias

We have discussed that you want your learning algorithm to perform well on the training set, but sometimes you don't actually want it to do *too* well. Knowing what human-level performance is tells you exactly how well the algorithm could, and should, perform on the training set. Let me explain what I mean.


We often use the cat classifier as an example. Suppose humans have near-perfect accuracy on this task, so human-level error is 1%. In that case, if your learning algorithm reaches an 8% training error rate and a 10% dev error rate, you would probably want to do better on the training set: a large gap between your algorithm's performance on the training set and human-level performance means the algorithm is not fitting the training set well. So, thinking in terms of tools for reducing bias and variance, in this case I would focus on reducing bias. You could, for example, train a larger neural network, or run gradient descent longer, and try to do better on the training set.


Now look at the same training error rate and dev error rate, but suppose human-level performance is not 1%. In a different application, or on a different data set, human-level error might actually be 7.5% — say, because the images in your data set are so blurry that even a human cannot tell whether there is a cat in the photo. (This example is a bit contrived, since humans are actually very good at looking at photos and spotting cats; but suppose your images are very blurry and low-resolution, so that even human error is 7.5%.) In this case, even though your training and dev error rates are the same as in the previous example, your system is actually doing well on the training set — it is only a little worse than human-level performance. In this second example you would instead focus on reducing the gap between training and dev error, that is, on reducing the variance of the learning algorithm. You might try regularization to bring the dev error rate closer to the training error rate.
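As a quick sketch of the arithmetic behind these two scenarios (the percentages are the lecture's; the variable names are my own), the same training/dev error pair leads to opposite diagnoses depending on the human-level estimate:

```python
# Two scenarios from the lecture: identical training/dev errors,
# but different human-level (proxy for Bayes) error rates.
train_err, dev_err = 0.08, 0.10

for human_err in (0.01, 0.075):
    bias_gap = train_err - human_err      # "avoidable bias"
    variance_gap = dev_err - train_err    # training -> dev gap
    focus = "bias" if bias_gap > variance_gap else "variance"
    print(f"human={human_err:.3f}: bias gap={bias_gap:.3f}, "
          f"variance gap={variance_gap:.3f} -> focus on {focus}")
```

With human-level error at 1% the bias gap (7%) dominates; at 7.5% the variance gap (2%) dominates, flipping the recommendation.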

In the earlier discussion of bias and variance, we mainly assumed that the Bayes error rate of the task was nearly 0. To explain what is happening here, think of this cat classifier and use the human-level error rate as an estimate of, or proxy for, the Bayes error rate (the Bayes optimal error rate). For computer vision tasks this substitution is quite reasonable: humans are very good at vision tasks, so the level humans can achieve is not far from the Bayes error rate. By definition, human-level error is a bit higher than the Bayes error rate, because the Bayes error rate is the theoretical lower bound on the error rate, but human-level error is usually not too far above it. The open question is how close human-level error really is to the Bayes error rate; here we assume they are close, though that depends on what level of performance you believe is achievable.


In both cases, with the same training and dev error rates, we chose to focus on different strategies: reducing bias or reducing variance. What happened in the example on the left? An 8% training error rate is really high when you believe it could be brought down to 1%, so bias-reduction techniques are likely to help. In the example on the right, if you believe the Bayes error rate is about 7.5% — again using human-level error as the estimate of the Bayes error — then there is not much headroom: you cannot push the training error much below 7.5%, and you would not want to, because that target is essentially the Bayes error rate. On this side there is instead more room for improvement in the gap between training error and dev error; you could narrow that 2% gap with variance-reduction techniques such as regularization, or by collecting more training data.

To give these concepts names — this is not widely used terminology, but I find it makes things easier to think about — the difference between the Bayes error rate (or your estimate of it) and the training error rate is called the avoidable bias. You want to keep improving training-set performance until you get close to the Bayes error rate, but you don't actually want to do better than the Bayes error rate; that is theoretically impossible without overfitting. The difference between the training error rate and the dev error rate, in turn, roughly indicates how much room for improvement your algorithm has on the variance problem.
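The two gaps just defined can be wrapped into a small decision helper. This is an illustrative sketch, not code from the course; the function name `diagnose` and its return convention are my own:

```python
def diagnose(human_err, train_err, dev_err):
    """Suggest where to focus, using human-level error as a proxy
    for the Bayes error rate (reasonable for vision tasks).

    Returns (focus, avoidable_bias, variance)."""
    avoidable_bias = train_err - human_err   # training vs. Bayes estimate
    variance = dev_err - train_err           # dev vs. training gap
    focus = "reduce bias" if avoidable_bias > variance else "reduce variance"
    return focus, avoidable_bias, variance

# Left example: human-level 1% -> avoidable bias (~7%) dominates.
print(diagnose(0.01, 0.08, 0.10))
# Right example: human-level 7.5% -> variance (~2%) dominates.
print(diagnose(0.075, 0.08, 0.10))
```

Note that the comparison only tells you which gap is larger; in practice you would also weigh how cheap each remedy (bigger model, longer training, regularization, more data) is.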


The term avoidable bias acknowledges that there is some bias, some minimum error rate, that cannot be surpassed: if the Bayes error rate is 7.5%, you don't actually want to get below that level. So you would not say the 8% training error measures the bias in this example; rather, the avoidable bias is about 0.5% (8% − 7.5%), and the 2% gap between training and dev error is a measure of the variance. Reducing this 2% therefore has much more potential than reducing the 0.5%. In the example on the left, by contrast, 7% (8% − 1%) measures the avoidable bias while 2% measures the variance, so there it is focusing on reducing the avoidable bias that probably has the greater potential.

So once you know the human-level error rate and have your estimate of the Bayes error rate, you can focus on different strategies in different scenarios — bias-reduction tactics or variance-reduction tactics. There are more subtle details in how to factor human-level performance into this decision during training, so in the next video we will look more deeply at what human-level performance really means.




Source: blog.csdn.net/weixin_36815313/article/details/105493290