PytorchCNN Project Construction 3 --- CIFAR-10 Dataset Processing

The complete code can be viewed on my GitHub.

  • In real convolutional neural network experiments on images, we usually work with the images themselves rather than a ready-made dataset, so we need to learn how to turn images into a dataset and then train on it. Today's main content is converting images into the familiar dataset format~

  • Take the CIFAR-10 dataset as an example: first download CIFAR-10 from the official website, then convert the dataset into pictures, and finally convert those pictures back into a dataset.


Preliminary preparation:

Experimental setup:

  • I am using a remote server and programming in PyCharm
  • Create the PytorchCNN folder first; all subsequent files and folders are placed inside this folder

1. Download the CIFAR-10 dataset from the official website

The CIFAR-10 dataset contains 60,000 pictures in total: 50,000 training pictures and 10,000 test pictures. Each picture is 3*32*32 (3 channels, image size 32*32).

Open the CIFAR-10 official website, download the required files, and follow the instructions on the official website for the subsequent extraction.
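If you downloaded the Python version of the archive (cifar-10-python.tar.gz), the extraction can also be done with a short script. A minimal sketch, assuming example paths under the dataset root mentioned below:

import tarfile

# Example paths only: adjust to wherever the archive was downloaded and where datasets are kept
archive_path = '/DATASET/cifar-10-python.tar.gz'
extract_dir = '/DATASET/Cifar10/'

# Extracts the cifar-10-batches-py folder (data_batch_1..5, test_batch, batches.meta)
with tarfile.open(archive_path, 'r:gz') as tar:
    tar.extractall(path=extract_dir)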


Note:

  • Because the dataset is relatively large, it is best to download it once to a root-level directory and reference it directly every time it is used, which saves space. I put CIFAR-10 under '/DATASET/Cifar10/'
  • We can see that the dataset contains 6 files in total: data_batch_1, data_batch_2, …, data_batch_5, and test_batch, and each batch contains 10,000 pictures. Each file is processed below; a quick way to inspect one batch first is shown right after this list.
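A minimal sketch for inspecting one batch before any conversion (the path is an example); it just unpickles the file and prints its keys and array shapes:

import pickle as pkl
import numpy as np

# Example path: wherever the extracted cifar-10-batches-py folder lives
batch_file = '/DATASET/Cifar10/cifar-10-batches-py/data_batch_1'

with open(batch_file, 'rb') as f:
    batch = pkl.load(f, encoding='bytes')

print(batch.keys())                  # b'batch_label', b'labels', b'data', b'filenames'
data = np.array(batch[b'data'])      # shape (10000, 3072): each row is a flattened 3*32*32 picture, channels in R, G, B order
labels = np.array(batch[b'labels'])  # shape (10000,), class indices 0-9
print(data.shape, labels.shape)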

2. Convert the training set to pictures, save each picture's path and label to a txt file, and split the training set into a training set and a validation set according to a given ratio

'''seg dataset to pic'''
import os
import pickle as pkl
import random

import numpy as np
from PIL import Image


def Trainset2Pic(cfg):
    classes = ('airplane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
    # Open in 'w' mode at the very beginning so both txt files are cleared and rewritten on every run
    valid_pic_txt = open(cfg.PARA.cifar10_paths.valid_data_txt, 'w')
    train_pic_txt = open(cfg.PARA.cifar10_paths.train_data_txt, 'w')
    for i in range(1, 6):
        label_batch = ('A', 'B', 'C', 'D', 'E')
        traindata_file = os.path.join(cfg.PARA.cifar10_paths.original_trainset_path, 'data_batch_' + str(i))
        with open(traindata_file, 'rb') as f:
            # encoding can be 'bytes' or 'latin1'; train_dict is a dict with four keys: b'batch_label', b'labels', b'data', b'filenames'
            train_dict = pkl.load(f, encoding='bytes')
            data_train = np.array(train_dict[b'data']).reshape(10000, 3, 32, 32)
            label_train = np.array(train_dict[b'labels'])
            num_val = int(data_train.shape[0] * cfg.PARA.cifar10_paths.validation_rate)  # number of validation samples
            # Randomly choose which indices of this batch go into the validation set
            val_list = random.sample(list(range(0, int(data_train.shape[0]))), num_val)

        for j in range(10000):
            imgs = data_train[j]
            # Each picture is stored as three 32*32 planes in R, G, B order
            r = Image.fromarray(imgs[0])
            g = Image.fromarray(imgs[1])
            b = Image.fromarray(imgs[2])
            img = Image.merge("RGB", (r, g, b))

            picname_valid = cfg.PARA.cifar10_paths.after_validset_path + classes[label_train[j]] + label_batch[i - 1] + str("%05d" % j) + '.png'
            picname_train = cfg.PARA.cifar10_paths.after_trainset_path + classes[label_train[j]] + label_batch[i - 1] + str("%05d" % j) + '.png'

            if j in val_list:  # if the index was randomly chosen for validation, save under the validation folder
                img.save(picname_valid)
                valid_pic_txt.write(picname_valid + ' ' + str(label_train[j]) + '\n')
            else:
                img.save(picname_train)
                train_pic_txt.write(picname_train + ' ' + str(label_train[j]) + '\n')

    valid_pic_txt.close()
    train_pic_txt.close()
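For reference, a minimal sketch of how this function might be driven. The config object here is a hypothetical stand-in built with SimpleNamespace that only mimics the cfg.PARA.cifar10_paths attributes used above; in the real project the config comes from a config file, and the paths below are examples only:

from types import SimpleNamespace

# Hypothetical config mirroring the attributes Trainset2Pic reads; all paths are examples
cifar10_paths = SimpleNamespace(
    original_trainset_path='/DATASET/Cifar10/cifar-10-batches-py/',
    after_trainset_path='/DATASET/cifar10/trainset/',
    after_validset_path='/DATASET/cifar10/validset/',
    train_data_txt='/DATASET/cifar10/train.txt',
    valid_data_txt='/DATASET/cifar10/valid.txt',
    validation_rate=0.1,
)
cfg = SimpleNamespace(PARA=SimpleNamespace(cifar10_paths=cifar10_paths))

Trainset2Pic(cfg)  # the output folders must already exist before img.save() is called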

3. The test set (test_batch) gets the same processing

def Testset2Pic(cfg):
    classes = ('airplane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
    testdata_file = os.path.join(cfg.PARA.cifar10_paths.original_testset_path, 'test_batch')
    test_pic_txt = open(cfg.PARA.cifar10_paths.test_data_txt, 'w')  # 'w' mode clears and rewrites the txt file on every run
    with open(testdata_file, 'rb') as f:
        # test_dict is a dict with four keys: b'batch_label', b'labels', b'data', b'filenames'
        test_dict = pkl.load(f, encoding='bytes')
        data_test = np.array(test_dict[b'data']).reshape(10000, 3, 32, 32)
        label_test = np.array(test_dict[b'labels'])

    for j in range(10000):
        imgs = data_test[j]

        r = Image.fromarray(imgs[0])
        g = Image.fromarray(imgs[1])
        b = Image.fromarray(imgs[2])
        img = Image.merge("RGB", (r, g, b))

        picname_test = cfg.PARA.cifar10_paths.after_testset_path + classes[label_test[j]] + 'F' + str("%05d" % j) + '.png'
        img.save(picname_test)
        test_pic_txt.write(picname_test + ' ' + str(label_test[j]) + '\n')
    test_pic_txt.close()

Description:

After the pictures are processed, we can see the corresponding txt files and the folders holding the saved dataset pictures under the DATASET/cifar10 folder.
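To peek at the txt format, a minimal sketch (the path is an example; the real one comes from cfg.PARA.cifar10_paths.train_data_txt). Each line is simply "<picture path> <label index>":

from itertools import islice

# Example path only
with open('/DATASET/cifar10/train.txt', 'r') as f:
    for line in islice(f, 3):
        print(line.strip())  # e.g. ".../trainset/airplaneA00029.png 0"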


4. Then convert the pictures into a dataset by subclassing the torch.utils.data.Dataset class

'''3. pic to dataset'''
from torch.utils.data import Dataset
from PIL import Image


class Cifar10Dataset(Dataset):
    def __init__(self, txt, transform):
        super(Cifar10Dataset, self).__init__()
        imgs = []
        with open(txt, 'r') as fh:
            for line in fh:
                line = line.strip('\n')
                words = line.split()  # split the line into [picture path, label]
                imgs.append((words[0], int(words[1])))
        self.imgs = imgs
        self.transform = transform

    def __getitem__(self, index):
        file_path, label = self.imgs[index]
        label = ToOnehot(label, 10)  # convert the integer label to a one-hot vector (helper defined elsewhere in the project)
        img = Image.open(file_path).convert('RGB')
        if self.transform is not None:
            Trans = DataPreProcess(img)
            if self.transform in ('for_train', 'for_valid'):
                img = Trans.transform_train()
            elif self.transform == 'for_test':
                img = Trans.transform_test()
        return img, label

    def __len__(self):
        return len(self.imgs)
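A minimal usage sketch, wrapping the dataset in a standard torch.utils.data.DataLoader; the txt path is an example, and ToOnehot is assumed to be defined elsewhere in the project:

from torch.utils.data import DataLoader

# Example txt path and transform flag; in the project these come from the config
train_data = Cifar10Dataset(txt='/DATASET/cifar10/train.txt', transform='for_train')
train_loader = DataLoader(train_data, batch_size=128, shuffle=True, num_workers=2)

for imgs, labels in train_loader:
    print(imgs.shape)    # torch.Size([128, 3, 32, 32])
    print(labels.shape)  # depends on what ToOnehot returns, e.g. [128, 10] for one-hot labels
    break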

5. Preprocess the image data

from torchvision import transforms


class DataPreProcess():
    def __init__(self, img):
        self.img = img

    def transform_train(self):
        return transforms.Compose([
            transforms.RandomCrop(32, padding=4),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),  # three per-channel means and three per-channel standard deviations
        ])(self.img)

    def transform_test(self):
        return transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
        ])(self.img)

Description:

Image preprocessing includes a normalization step: ToTensor first scales the pixel values to [0, 1], and Normalize then standardizes each channel with a mean and standard deviation, which makes subsequent training converge more easily. But where do these numbers come from? Let's compute them from the images ourselves~

import cv2
import torch
from torchvision import transforms


class Normalization():
    def __init__(self):
        super(Normalization, self).__init__()

    def Mean_Std(self, file_path):  # compute the mean and std over all pictures (averaging per-picture statistics)
        MEAN_B, MEAN_G, MEAN_R = 0, 0, 0
        STD_B, STD_G, STD_R = 0, 0, 0
        count = 0
        with open(file_path, 'r') as f:
            for line in f.readlines():
                count += 1
                words = line.strip('\n').split()
                img = torch.from_numpy(cv2.imread(words[0])).float()
                MEAN_B += torch.mean(img[:, :, 0] / 255)
                MEAN_G += torch.mean(img[:, :, 1] / 255)
                MEAN_R += torch.mean(img[:, :, 2] / 255)
                STD_B += torch.std(img[:, :, 0] / 255)
                STD_G += torch.std(img[:, :, 1] / 255)
                STD_R += torch.std(img[:, :, 2] / 255)
        MEAN_B, MEAN_G, MEAN_R = MEAN_B / count, MEAN_G / count, MEAN_R / count
        STD_B, STD_G, STD_R = STD_B / count, STD_G / count, STD_R / count
        return (MEAN_B, MEAN_G, MEAN_R), (STD_B, STD_G, STD_R)

    def Normal(self, img, mean, std):  # standardize a single picture by hand: scale to [0, 1], then subtract the mean and divide by the std
        img = img / 255
        img = img.transpose(2, 0, 1)  # [H,W,C] --> [C,H,W], the same layout change as pytorch's ToTensor, to make the comparison easier
        img[0, :, :] = (img[0, :, :] - mean[0]) / std[0]
        img[1, :, :] = (img[1, :, :] - mean[1]) / std[1]
        img[2, :, :] = (img[2, :, :] - mean[2]) / std[2]
        return img


def test():
    img = cv2.imread('../../DATASET/cifar10/trainset/airplaneA00029.png')  # read one picture; cv2 returns channels in [b,g,r] order with shape [H,W,C]
    my_normal = Normalization()
    img1 = my_normal.Normal(img, (0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))
    normal = transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010))

    converToTensor = transforms.ToTensor()
    img2 = converToTensor(img)  # [H,W,C] --> [C,H,W], values scaled to [0, 1]
    img2 = normal(img2)
    print(img1)
    print(img2)


if __name__ == '__main__':
    test()
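As a usage note, a minimal sketch (the txt path is an example) of computing the statistics over the pictures listed in the training txt file; since cv2 reads channels as B, G, R, the returned tuples are in (B, G, R) order:

# Example only: compute per-channel mean/std over the training pictures
my_normal = Normalization()
mean_bgr, std_bgr = my_normal.Mean_Std('/DATASET/cifar10/train.txt')
print(mean_bgr, std_bgr)  # reverse the tuples if the values are needed in R, G, B order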

References

Cifar10 official website

Finally, I would like to thank my senior, who taught me how to build the whole project, and my lab mates who studied together with me~ I hope everything goes smoothly for them and that they have bright futures!


Origin: blog.csdn.net/qq_44783177/article/details/113747416