If the total number of samples is not divisible by batch_size, the last batch will contain fewer than batch_size samples. There are two ways to handle this:
Method 1: Drop the incomplete last batch by passing drop_last=True to the DataLoader.
Advantage: saves a little training time.
Disadvantage: the dropped samples are wasted; when the dataset is small, the discarded fraction is larger and can lead to increased loss.
Method 2: Keep the last, smaller batch and send it to training as usual (PyTorch's default, drop_last=False).
Advantage: every sample is used, preserving training integrity.
Disadvantage: slightly increases training time.
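A minimal sketch of both methods, using a toy dataset of 10 samples with batch_size=3 (10 is not divisible by 3, so the last batch holds 1 sample):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset: 10 samples, each a single float feature
dataset = TensorDataset(torch.arange(10).float().unsqueeze(1))

# Method 2 (PyTorch default, drop_last=False): the final batch is smaller
loader_default = DataLoader(dataset, batch_size=3)
sizes_default = [batch[0].shape[0] for batch in loader_default]
print(sizes_default)  # [3, 3, 3, 1]

# Method 1 (drop_last=True): the incomplete final batch is discarded
loader_drop = DataLoader(dataset, batch_size=3, drop_last=True)
sizes_drop = [batch[0].shape[0] for batch in loader_drop]
print(sizes_drop)  # [3, 3, 3]
```

Note that with drop_last=True, up to batch_size - 1 samples are silently skipped each epoch (though with shuffle=True, different samples are skipped each time).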