PyTorch distributed training --- DistributedSampler for data loading

1. One-sentence summary:

In DDP, the DistributedSampler for the test set should be created with shuffle=False so that the test data is loaded in a fixed order that is identical in every process (when shuffle=True, a random seed drives the shuffling, so a fixed loading order is not guaranteed). The training set's sampler should keep shuffle=True, which is DistributedSampler's default.
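A minimal sketch of those sampler settings, using a toy 8-sample dataset and explicit `num_replicas`/`rank` so it runs without an initialized process group (in real DDP code these are inferred from the process group):

```python
import torch
from torch.utils.data import TensorDataset
from torch.utils.data.distributed import DistributedSampler

# Toy dataset of 8 samples, standing in for a real Dataset.
dataset = TensorDataset(torch.arange(8))

# Training sampler: shuffle=True is the default; the permutation is derived
# deterministically from (seed + epoch).
train_sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True)

# Test sampler: shuffle=False gives every process a fixed, reproducible order.
test_sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=False)

# With shuffle=False, rank 0 of 2 gets every 2nd index in order.
print(list(test_sampler))  # → [0, 2, 4, 6]

# Call set_epoch() before each training epoch so the shuffle changes across
# epochs; without it, every epoch replays the same permutation.
train_sampler.set_epoch(0)
epoch0 = list(train_sampler)
train_sampler.set_epoch(1)
epoch1 = list(train_sampler)
```

Each rank receives a disjoint quarter of the (possibly shuffled) index list, so together the ranks cover the whole dataset once per epoch.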
In DDP, the DataLoader for both the training set and the test set must keep shuffle=False (False is the DataLoader default), because sampling is already handled by the sampler; passing shuffle=True together with a sampler is a conflict and raises an error. Without DDP, the training DataLoader should set shuffle=True and the test DataLoader shuffle=False.
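The DataLoader rule above can be demonstrated directly: with a sampler attached, leave shuffle at its default, and note that combining the two raises a ValueError (again using explicit `num_replicas`/`rank` so the sketch runs without a process group):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(8))
sampler = DistributedSampler(dataset, num_replicas=2, rank=0)

# Correct: shuffling is the sampler's job, so shuffle stays at its
# default of False in the DataLoader.
loader = DataLoader(dataset, batch_size=2, sampler=sampler)

# Incorrect: sampler and shuffle=True are mutually exclusive.
conflict_raised = False
try:
    DataLoader(dataset, batch_size=2, sampler=sampler, shuffle=True)
except ValueError:
    conflict_raised = True

print(conflict_raised)  # → True
```

The same exclusivity applies to batch_sampler, which additionally conflicts with batch_size and drop_last.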

2. Reference link:

  1. Random factors in DistributedSampler() in PyTorch
  2. PyTorch loads samples in a fixed order
  3. PyTorch distributed-related code learning (1)
  4. [Source Code Analysis] PyTorch Distributed (1) — DistributedSampler for Data Loading
  5. Understanding DistributedSampler
  6. PyTorch distributed series 3 — what does torch.utils.data.distributed.DistributedSampler do during distributed training?
  7. Shuffle and random seed in PyTorch's DataLoader
  8. PyTorch multi-GPU parallel training: DistributedDataParallel usage and pitfall notes
  9. How to use PyTorch multi-GPU distributed training with DistributedDataParallel
  10. Official: https://pytorch.org/docs/stable/data.html#
  11. Official: https://pytorch.org/tutorials/intermediate/ddp_tutorial.html
  12. Official: https://pytorch.org/docs/stable/notes/ddp.html
  13. On the use of the DistributedSampler function in PyTorch
  14. PyTorch DistributedDataParallel data sampling shuffle

Origin blog.csdn.net/flyingluohaipeng/article/details/128996516