Early stopping in deep learning tuning

In machine learning, hyperparameters have proliferated, and choosing among the feasible algorithms has become more and more complicated. I have found that machine learning becomes easier if one set of tools is dedicated to optimizing the cost function J. While optimizing the cost function, you only need to pay attention to w and b: the smaller the value of J(w,b), the better. You only need to find a way to reduce this value and can ignore everything else. Avoiding overfitting is then a second task to pay attention to.

The main disadvantage of early stopping is that you can no longer deal with these two problems independently. By stopping gradient descent early, you also stop optimizing the cost function, so the value of the cost function may not be small enough. At the same time, you are hoping not to overfit. Instead of using different tools for the two problems, you are using one method to handle both at once, and as a result the things you have to consider become more complicated.
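
To make the coupling concrete, here is a minimal sketch of early stopping in plain Python. Everything in it is an assumption made for illustration: the toy linear model, the synthetic data, the held-out dev set, and the names make_data, cost, gradient_descent and patience do not come from the original notes.

```python
import random

def make_data(n, rng):
    """Noisy samples of y = 2x + 1 (an illustrative toy data set)."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.3)) for x in xs]

def cost(w, b, data):
    """Cost J(w, b): half the mean squared error over `data`."""
    return sum((w * x + b - y) ** 2 for x, y in data) / (2 * len(data))

def gradient_descent(train, dev, lr=0.1, max_iters=2000, patience=None):
    """Minimise J on `train` by gradient descent, starting from w = b = 0.

    If `patience` is set, stop once the dev cost has not improved for
    `patience` consecutive iterations and return the best parameters seen.
    Otherwise run all `max_iters` iterations and return the final ones.
    """
    w, b = 0.0, 0.0
    best_w, best_b, best_dev, best_it = w, b, cost(w, b, dev), 0
    m = len(train)
    for it in range(1, max_iters + 1):
        # Gradient of J with respect to w and b.
        dw = sum((w * x + b - y) * x for x, y in train) / m
        db = sum((w * x + b - y) for x, y in train) / m
        w, b = w - lr * dw, b - lr * db
        if patience is None:
            continue
        dev_cost = cost(w, b, dev)
        if dev_cost < best_dev:
            best_w, best_b, best_dev, best_it = w, b, dev_cost, it
        elif it - best_it >= patience:
            break  # the dev cost stopped improving: stop optimizing J as well
    if patience is None:
        return w, b, max_iters
    return best_w, best_b, best_it

rng = random.Random(0)
train, dev = make_data(20, rng), make_data(20, rng)
w_es, b_es, stop_it = gradient_descent(train, dev, patience=10)
w_full, b_full, _ = gradient_descent(train, dev)
print("early stopping at iteration", stop_it)
print("training cost, early stopping:", cost(w_es, b_es, train))
print("training cost, full run:      ", cost(w_full, b_full, train))
```

The single stopping rule decides both how far J is reduced on the training set and how well the model does on the dev set, which is exactly why the two problems can no longer be tuned separately.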

The alternative to early stopping is regularization, in which case the neural network can be trained for a long time. I have found that this makes the hyperparameter search space easier to decompose and easier to search. The disadvantage is that you have to try many values of the regularization parameter lambda, which makes the search computationally expensive.
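
The cost of that search can be sketched as follows. This is only an illustration under stated assumptions: the notes do not say which regularizer is meant, so the sketch assumes an L2 penalty on w, and the toy data, the lambda grid and the names train_l2 and mse are all hypothetical.

```python
import random

def make_data(n, rng):
    """Noisy samples of y = 2x + 1 (an illustrative toy data set)."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.3)) for x in xs]

def mse(w, b, data):
    """Unregularized cost J(w, b): half the mean squared error."""
    return sum((w * x + b - y) ** 2 for x, y in data) / (2 * len(data))

def train_l2(train, lam, lr=0.1, iters=2000):
    """Gradient descent on J(w, b) + lam / (2m) * w**2, run for all `iters`
    iterations. Only w is penalized; b is left unregularized."""
    w, b, m = 0.0, 0.0, len(train)
    for _ in range(iters):
        dw = sum((w * x + b - y) * x for x, y in train) / m + lam * w / m
        db = sum((w * x + b - y) for x, y in train) / m
        w, b = w - lr * dw, b - lr * db
    return w, b

rng = random.Random(0)
train, dev = make_data(20, rng), make_data(20, rng)

lambdas = [0.0, 0.01, 0.1, 1.0, 10.0]  # hypothetical search grid
results = []
for lam in lambdas:
    w, b = train_l2(train, lam)  # one complete training run per value
    results.append((mse(w, b, dev), lam, w, b))

best_dev, best_lam, best_w, best_b = min(results)
print("best lambda on the dev set:", best_lam)
```

Each value of lambda costs one full training run, so the total work grows with the size of the grid. In return, the optimization itself stays untouched: every run simply minimizes its cost function as far as it can.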

The advantage of early stopping is that you run gradient descent only once and, in that single run, pass through small, intermediate and large values of w, without having to try many values of the regularization hyperparameter.
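
A small sketch can show how one run sweeps through these values. It assumes w is initialized to zero and, for brevity, uses a linear model without the bias b; the data, the target weights and the name norm_path are hypothetical.

```python
import math
import random

rng = random.Random(0)
true_w = [2.0, -1.0, 0.5]  # hypothetical target weights
xs = [[rng.uniform(-1.0, 1.0) for _ in true_w] for _ in range(30)]
ys = [sum(wj * xj for wj, xj in zip(true_w, x)) + rng.gauss(0.0, 0.1)
      for x in xs]

def norm_path(xs, ys, lr=0.1, iters=100, snapshots=(1, 10, 100)):
    """Run gradient descent on the mean squared error from w = 0 and
    record the Euclidean norm of w at the requested iterations."""
    w = [0.0] * len(xs[0])
    m = len(xs)
    path = {}
    for it in range(1, iters + 1):
        # Prediction error on every example, then the gradient per weight.
        errs = [sum(wj * xj for wj, xj in zip(w, x)) - y
                for x, y in zip(xs, ys)]
        grad = [sum(e * x[j] for e, x in zip(errs, xs)) / m
                for j in range(len(w))]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
        if it in snapshots:
            path[it] = math.sqrt(sum(wj * wj for wj in w))
    return path

path = norm_path(xs, ys)
for it, norm in path.items():
    print(f"iteration {it:3d}: ||w|| = {norm:.3f}")
```

In this setting the norm of w grows as training proceeds, so stopping at an earlier iteration selects a smaller w, which is roughly the effect a larger regularization parameter would have.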

Although regularization has its shortcomings, many people are willing to use it. Andrew Ng (Wu Enda) personally prefers to use regularization and try many different values, assuming you can afford the computation. Early stopping, however, can give similar results without having to try so many values.

Origin blog.csdn.net/qq_38574975/article/details/107574241