What is the difference between Epoch, Batch and Iteration?

Epoch: using all the data in the training set to conduct one complete pass of model training is called "one generation of training".

Batch: using a small sample of the training set to update the model's weights by backpropagation; this small sample is called "one batch of data".

Iteration: the process of using one batch of data to update the model's parameters is called "one iteration of training".

Epoch (period):

When the complete dataset is passed through the neural network once, forward and back, this process is called one epoch. (That is, all training samples have undergone one forward pass and one backpropagation through the neural network.)

Put more plainly, one Epoch is the process of training on all training samples once.

However, when the number of samples in an Epoch (that is, all training samples) is too large for the computer to process at once, it is divided into multiple smaller blocks, that is, multiple Batches, for training.
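Splitting a dataset into batches can be sketched with plain list slicing. This is a minimal illustration with toy values (the dataset of 10 samples and Batch_Size of 3 are assumptions for the example, not from the text):

```python
dataset = list(range(10))      # a toy "training set" of 10 samples
batch_size = 3                 # an assumed Batch_Size

# Slice the dataset into consecutive batches; the last batch may be smaller.
batches = [dataset[i:i + batch_size] for i in range(0, len(dataset), batch_size)]

print(len(batches))            # 4 batches: three of size 3, one of size 1
print(batches[-1])             # [9]
```

In real training the data is usually shuffled before each epoch, but the slicing idea is the same.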

Batch (batch/batch of samples):

The entire training set is divided into several batches of samples.

Batch_Size (batch size):

The number of samples in each batch.

Iteration (one iteration):

Training on one Batch is one Iteration (the concept is similar to an iterator in a programming language).
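The relationship between the three terms can be sketched as a nested loop: the outer loop runs once per Epoch, the inner loop runs once per Batch, and each inner-loop pass is one Iteration. The sample counts below are illustrative values, not from the text, and the actual forward/backward pass is left as a comment:

```python
num_samples = 12   # assumed size of the training set
batch_size = 4     # assumed Batch_Size
num_epochs = 3     # assumed number of Epochs

iterations = 0
for epoch in range(num_epochs):                  # one epoch = one full pass over the data
    for start in range(0, num_samples, batch_size):
        batch = list(range(start, start + batch_size))  # one batch of samples
        # ... forward pass, backpropagation, and weight update would go here ...
        iterations += 1                          # one iteration = one batch processed

print(iterations)  # 3 epochs * (12 / 4) batches per epoch = 9 iterations
```

Each pass through the inner loop updates the weights once, which is why "number of iterations" and "number of weight updates" are used interchangeably below.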

For example:

The MNIST dataset has 60,000 images as training data and 10,000 images as test data. Suppose we choose Batch_Size = 100 to train the model, and iterate 30,000 times.

The number of images trained in each Epoch: 60,000 (all images in the training set)

The number of batches in the training set: 60,000 / 100 = 600

The number of batches that must be completed in each Epoch: 600

The number of Iterations in each Epoch: 600 (completing one Batch of training is one parameter update, i.e., one Iteration)

Number of model weight updates per Epoch: 600

After training for 10 Epochs, the number of model weight updates: 600 * 10 = 6,000

Different Epochs actually train on the same data. Although the 1st Epoch and the 10th Epoch both use the 60,000 images of the training set, the weight updates they produce are completely different, because at different Epochs the model sits at different positions in the cost-function landscape: the later the training generation, the closer the model is to the bottom, and the lower the cost.

Completing a total of 30,000 iterations is therefore equivalent to completing 30,000 / 600 = 50 Epochs.
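The bookkeeping in the MNIST example above can be checked with plain integer arithmetic:

```python
train_samples = 60_000           # images in the MNIST training set
batch_size = 100                 # chosen Batch_Size

# Batches (and therefore iterations / weight updates) per epoch.
batches_per_epoch = train_samples // batch_size
print(batches_per_epoch)         # 600

# Weight updates after training for 10 epochs.
updates_after_10_epochs = batches_per_epoch * 10
print(updates_after_10_epochs)   # 6000

# How many full epochs do 30,000 iterations correspond to?
total_iterations = 30_000
epochs_completed = total_iterations // batches_per_epoch
print(epochs_completed)          # 50
```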

Origin: blog.csdn.net/u014723479/article/details/130784839