The PyTorch DataLoader() function

When training a neural network we usually want to operate on a batch of data at a time, and we also need to shuffle the data and load it in parallel for speed. For this, PyTorch provides DataLoader to help us implement these functions.

The DataLoader function is defined as follows:

DataLoader(dataset, batch_size=1, shuffle=False, sampler=None, 
num_workers=0, collate_fn=default_collate, pin_memory=False, 
drop_last=False)

dataset: the dataset to load data from (a Dataset object)
batch_size: the size of each batch
shuffle: whether to shuffle the data
sampler: the strategy for drawing samples, detailed in a follow-up post
num_workers: the number of worker processes used for loading; 0 means loading in the main process without multiprocessing
collate_fn: how multiple samples are spliced into one batch; the default splicing usually suffices
pin_memory: whether to keep the data in pinned (page-locked) memory; moving data from pinned memory to the GPU is faster
drop_last: the number of samples in the dataset may not be an integral multiple of batch_size; when drop_last is True, the last incomplete batch is discarded (see the short example below)
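
As a quick illustration of batch_size, shuffle, and drop_last, here is a minimal sketch using a toy TensorDataset (not the Pokemon dataset used later in this post):

import torch
from torch.utils.data import TensorDataset, DataLoader

# a toy dataset: 10 samples, each a single feature with an integer label
data = torch.arange(10, dtype=torch.float32).unsqueeze(1)
labels = torch.arange(10)
ds = TensorDataset(data, labels)

# batch_size=4 on 10 samples gives two full batches plus one batch of 2;
# with drop_last=True the incomplete final batch is discarded
loader = DataLoader(ds, batch_size=4, shuffle=True, drop_last=True)
for x, y in loader:
    print(x.shape, y.shape)   # torch.Size([4, 1]) torch.Size([4])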


def main():
    import visdom
    import time
    from torch.utils.data import DataLoader

    viz = visdom.Visdom()

    # Pokemon is the custom Dataset class defined in the earlier post on defining your own dataset
    db = Pokemon('pokeman', 224, 'train')

    # take a single sample directly from the Dataset
    x, y = next(iter(db))
    print('sample:', x.shape, y.shape, y)

    viz.image(db.denormalize(x), win='sample_x', opts=dict(title='sample_x'))

    loader = DataLoader(db, batch_size=32, shuffle=True)

    for x, y in loader:   # each iteration yields one batch of 32 samples
        viz.images(db.denormalize(x), nrow=8, win='batch', opts=dict(title='batch'))
        viz.text(str(y.numpy()), win='label', opts=dict(title='batch-y'))

        time.sleep(10)

In data processing it sometimes happens that a sample cannot be read, for example because an image file is corrupted. In that case the dataset's __getitem__ will raise an exception, and the best solution is to eliminate the faulty sample.
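
One common remedy (a minimal sketch, not from the original post) is to have __getitem__ catch the exception and return None, and then pass a custom collate_fn that filters out the None entries before splicing the batch:

from torch.utils.data import DataLoader
from torch.utils.data.dataloader import default_collate

def safe_collate(batch):
    # drop samples whose __getitem__ failed and returned None
    batch = [sample for sample in batch if sample is not None]
    return default_collate(batch)

# hypothetical usage with the Pokemon dataset from above
loader = DataLoader(db, batch_size=32, shuffle=True, collate_fn=safe_collate)

An alternative is to return a randomly chosen replacement sample instead of None, which keeps every batch at its full size.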

