How to set the epoch: the most understandable introduction to epoch, iteration, and batchsize in a neural network model

This article introduces, in the most straightforward way, three terms that come up constantly in neural networks and are easily confused. It is about 1,162 characters long and takes roughly 6 minutes to read.

batchsize: simply put, the number of samples we feed into the model for training at a time. Its value lies between 1 and the total number of training samples.

A batchsize that is either too large or too small causes problems. If the value is too small, say batchsize=1, the model is trained on one sample at a time; with a large dataset (say 100,000 samples), the model must be fed 100,000 separate times to complete one pass over the data, so training is very slow and inefficient. If the value is too large, say batchsize=100000, throwing all 100,000 samples into the model at once may overflow memory and prevent training from running at all.

Therefore, we need to choose an appropriate batchsize that strikes the best balance between training speed and memory capacity.
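As a concrete illustration, splitting a dataset into mini-batches can be sketched in a few lines of plain Python (the function name `make_batches` and the toy dataset are hypothetical, chosen only for this example):

```python
def make_batches(data, batch_size):
    """Split a dataset into consecutive mini-batches of at most batch_size items."""
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]

samples = list(range(100))           # a toy dataset of 100 samples
batches = make_batches(samples, 10)  # batchsize = 10
print(len(batches))                  # → 10 mini-batches
print(len(batches[0]))               # → each holds 10 samples
```

Each call to the model during training then consumes one of these mini-batches rather than the whole dataset at once.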

A few experiences:

On a typical dataset, if the batchsize is too small, the training process will struggle to converge, resulting in underfitting. Increasing the batchsize makes processing relatively faster, but the required memory capacity grows at the same time. To reach a comparable training quality, a larger batchsize generally requires more passes over all the samples (that is, more epochs, discussed below), which in turn increases total training time. Therefore, an appropriate batchsize must be found to achieve the best balance between the model's overall efficiency and memory capacity.

iteration: the number of iterations, i.e. how many times a batch of data is fed into the model.

An example makes this clear at a glance. Suppose there are 100 training samples in total and batchsize is set to 10, meaning 10 samples are fed into the model per training step. How many steps does it take to train on all the data once? 100 / 10 = 10, so the whole dataset is covered once in ten feeds (iterations); the number of feeds, i.e. the number of iterations, is iteration = 10.
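The arithmetic above generalizes directly; a small helper (the name `iterations_per_epoch` is made up for this sketch) also covers the case where the dataset size is not an exact multiple of the batchsize, in which case the last, smaller batch still counts as one iteration:

```python
import math

def iterations_per_epoch(num_samples, batch_size):
    """How many times data must be fed to the model to see every sample once."""
    return math.ceil(num_samples / batch_size)

print(iterations_per_epoch(100, 10))  # → 10, as in the example above
print(iterations_per_epoch(105, 10))  # → 11 (the final batch has only 5 samples)
```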

Notice:

The result of each iteration is used as the starting point of the next. In the example above, ten iterations are needed: all parameters begin at their initial values; after the first iteration, the model's parameters take on a new set of values, and this set of values becomes the input to the second iteration; the second iteration produces a further-optimized set of parameter values, which feed the third iteration, and so on. In this way, with each iteration the model's parameters approach the optimal parameters step by step. One iteration = one forward pass + one backward pass over the same batch of batchsize samples.
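The chain of iterations described above can be sketched in plain Python with a hypothetical toy model (fitting the single weight w in y = w·x by mini-batch gradient descent; the dataset, learning rate, and convergence here are illustrative assumptions, not part of the original article):

```python
# Toy dataset: y = 2x, with x scaled into [0, 1) so a simple fixed learning
# rate behaves well. 100 samples, batchsize 10 → 10 iterations per epoch.
data = [(i / 100, 2 * i / 100) for i in range(100)]
batchsize, lr = 10, 0.5
w = 0.0                                          # initial parameter value

for i in range(0, len(data), batchsize):         # 100 / 10 = 10 iterations
    batch = data[i:i + batchsize]
    # forward + backward pass: gradient of the mean squared error w.r.t. w
    grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
    w -= lr * grad                               # updated w seeds the next iteration

print(round(w, 1))  # → 2.0: w has approached the true value after one epoch
```

Each loop body is one iteration: the parameter value produced by one update is exactly the value the next iteration starts from.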

Passing all the data from left to right through the network and then back again is one epoch.

Epoch: all the samples in the training set passing through the model once and returning once (one forward pass and one backward pass) is one epoch.

Notice:

Generally, passing the full dataset through the neural network once is not enough; we need to pass the same dataset through the network many times, for example 20,000 times. The number of epochs also needs tuning: if it is too large, overfitting is likely.

For example: with 10,000 training samples, batchsize set to 20, and all training samples passed through the same model 5 times, we have epoch = 5, batchsize = 20, and iteration = 10000 / 20 = 500 per epoch.

Related resources: https://www.zhihu.com/question/65196241
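The relationship in the final example can be checked with two lines of arithmetic (variable names here are invented for illustration):

```python
num_samples, batchsize, epochs = 10000, 20, 5
iters_per_epoch = num_samples // batchsize   # 10000 / 20 = 500 iterations per epoch
total_iters = iters_per_epoch * epochs       # 500 * 5 = 2500 parameter updates overall
print(iters_per_epoch, total_iters)          # → 500 2500
```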


Origin blog.csdn.net/Adam897/article/details/126493516