Training a classification network with Mini-ImageNet

Dataset download link

Baidu network disk download:
link: https://pan.baidu.com/s/1Uro6RuEbRGGCQ8iXvF2SAQ Password: hl31

Introduction to the dataset

When it comes to ImageNet, most people know it as a very large and well-known open-source dataset. New classification networks are usually trained and validated on the ImageNet-1000 class data, and the backbones of common object detection networks are generally pre-trained on ImageNet as well. But for ordinary researchers or developers this dataset is simply too large (the full download is about 100 GB), and training on it demands serious hardware: typically many high-end GPUs running in parallel, and even with such a setup training often takes several days. Many people are discouraged by this (I am one of them; the main problems are the sheer size and the very slow download speed in China).

In 2016, the Google DeepMind team extracted a small subset (about 3 GB) from the ImageNet dataset to create the Mini-ImageNet dataset. It has 100 categories with 600 images per category, for a total of 60,000 images (all of them .jpg files), and the image sizes are not fixed.

The structure of the dataset is:

├── mini-imagenet: root directory of the dataset
     ├── images: all images are stored in this folder
     ├── train.csv: label file for the training set
     ├── val.csv: label file for the validation set
     └── test.csv: label file for the test set

The Mini-ImageNet dataset therefore includes the three label files train.csv, val.csv and test.csv. Note that the author originally built this dataset for the field of few-shot learning, so the provided label files are not sampled from every category. I analyzed each label file myself with the pandas package.

  • train.csv contains 38,400 images in 64 categories.
  • val.csv contains 9,600 images in 16 categories.
  • test.csv contains 12,000 images in 20 categories.

The images and categories in the three csv files are disjoint from one another, so together they cover 60,000 images and 100 categories.

Reading one of the csv files with pandas shows the following format, where each row gives an image's filename and its category:

                filename      label
0  n0153282900000005.jpg  n01532829
1  n0153282900000006.jpg  n01532829
2  n0153282900000007.jpg  n01532829
3  n0153282900000010.jpg  n01532829
4  n0153282900000014.jpg  n01532829
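
For reference, a minimal pandas snippet along the following lines (assuming the three csv files are in the current working directory) reproduces the per-file counts listed above:

import pandas as pd

# Summarize each Mini-ImageNet label file: number of images and number of distinct classes.
for csv_name in ["train.csv", "val.csv", "test.csv"]:
    data = pd.read_csv(csv_name)              # columns: filename, label
    num_images = data.shape[0]                # one row per image
    num_classes = data["label"].nunique()     # distinct wnid labels
    print("{}: {} images, {} classes".format(csv_name, num_images, num_classes))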

As for the actual object name corresponding to each category, you can look it up in the imagenet_class_index.json file (the same file the script below reads), which is the label file for the ImageNet-1000 class data:

{
    "0": ["n01440764", "tench"],
    "1": ["n01443537", "goldfish"],
    "2": ["n01484850", "great_white_shark"],
    ...
}
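
As a small illustration (assuming the file is saved locally as imagenet_class_index.json, the same name the script below expects), the wnid-to-readable-name mapping can be built like this:

import json

# Load the ImageNet-1000 class index: {"0": ["n01440764", "tench"], ...}
with open("imagenet_class_index.json", "r") as f:
    class_index = json.load(f)

# Map wnid -> human-readable name, e.g. for the labels that appear in the csv files.
wnid_to_name = {wnid: name for wnid, name in class_index.values()}
print(wnid_to_name["n01532829"])  # wnid taken from the csv preview above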

Make new train and val files

As the above analysis shows, it is not practical to train your own classification network directly on the Mini-ImageNet dataset, because train.csv and val.csv are not sampled from every category, so we need to build new train and val label files ourselves. Below is a script I wrote that builds new_train.csv and new_val.csv label files by splitting all 100 categories into a training set and a validation set according to a given ratio.

import os
import json

import pandas as pd
from PIL import Image
import matplotlib.pyplot as plt


def read_csv_classes(csv_dir: str, csv_name: str):
    data = pd.read_csv(os.path.join(csv_dir, csv_name))
    # print(data.head(1))  # filename, label

    label_set = set(data["label"].drop_duplicates().values)

    print("{} have {} images and {} classes.".format(csv_name,
                                                     data.shape[0],
                                                     len(label_set)))
    return data, label_set


def calculate_split_info(path: str, label_dict: dict, rate: float = 0.2):
    # read all images
    image_dir = os.path.join(path, "images")
    images_list = [i for i in os.listdir(image_dir) if i.endswith(".jpg")]
    print("find {} images in dataset.".format(len(images_list)))

    train_data, train_label = read_csv_classes(path, "train.csv")
    val_data, val_label = read_csv_classes(path, "val.csv")
    test_data, test_label = read_csv_classes(path, "test.csv")

    # Union operation
    labels = (train_label | val_label | test_label)
    labels = list(labels)
    labels.sort()
    print("all classes: {}".format(len(labels)))

    # create classes_name.json
    classes_label = dict([(label, [index, label_dict[label]]) for index, label in enumerate(labels)])
    json_str = json.dumps(classes_label, indent=4)
    with open('classes_name.json', 'w') as json_file:
        json_file.write(json_str)

    # concat csv data
    data = pd.concat([train_data, val_data, test_data], axis=0)
    print("total data shape: {}".format(data.shape))

    # split data on every classes
    num_every_classes = []
    split_train_data = []
    split_val_data = []
    for label in labels:
        class_data = data[data["label"] == label]
        num_every_classes.append(class_data.shape[0])

        # shuffle
        shuffle_data = class_data.sample(frac=1, random_state=1)
        num_train_sample = int(class_data.shape[0] * (1 - rate))
        split_train_data.append(shuffle_data[:num_train_sample])
        split_val_data.append(shuffle_data[num_train_sample:])

        # imshow
        imshow_flag = False
        if imshow_flag:
            img_name, img_label = shuffle_data.iloc[0].values
            img = Image.open(os.path.join(image_dir, img_name))
            plt.imshow(img)
            plt.title("class: " + classes_label[img_label][1])
            plt.show()

    # plot classes distribution
    plot_flag = False
    if plot_flag:
        plt.bar(range(1, 101), num_every_classes, align='center')
        plt.show()

    # concatenate data
    new_train_data = pd.concat(split_train_data, axis=0)
    new_val_data = pd.concat(split_val_data, axis=0)

    # save new csv data
    new_train_data.to_csv(os.path.join(path, "new_train.csv"))
    new_val_data.to_csv(os.path.join(path, "new_val.csv"))


def main():
    data_dir = "/home/wz/mini-imagenet/"  # 指向数据集的根目录
    json_path = "./imagenet_class_index.json"  # 指向imagenet的索引标签文件

    # load imagenet labels
    label_dict = json.load(open(json_path, "r"))
    label_dict = dict([(v[0], v[1]) for k, v in label_dict.items()])

    calculate_split_info(data_dir, label_dict)


if __name__ == '__main__':
    main()

Train your own network

Project address: https://github.com/WZMIAOMIAO/deep-learning-for-image-processing
In the pytorch_classification -> mini-imagenet folder, two training scripts are provided, one for a single GPU and one for multiple GPUs. The project uses ShuffleNetV2 as the example network; trained for 100 epochs, it reached 78% accuracy.
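
The repository's training scripts handle data loading themselves; purely as an illustration of how the generated new_train.csv / new_val.csv and classes_name.json could be consumed, a minimal PyTorch Dataset might look like the sketch below (the class and argument names are my own, not taken from the project):

import os
import json

import pandas as pd
from PIL import Image
from torch.utils.data import Dataset


class MiniImageNetDataset(Dataset):
    """Loads images listed in new_train.csv / new_val.csv produced by the split script above."""

    def __init__(self, root: str, csv_name: str, transform=None):
        self.image_dir = os.path.join(root, "images")
        self.data = pd.read_csv(os.path.join(root, csv_name))
        # classes_name.json maps wnid -> [class_index, readable_name]
        with open("classes_name.json", "r") as f:
            self.label_map = json.load(f)
        self.transform = transform

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        row = self.data.iloc[idx]
        img = Image.open(os.path.join(self.image_dir, row["filename"])).convert("RGB")
        if self.transform is not None:
            img = self.transform(img)
        label = self.label_map[row["label"]][0]  # integer class index in [0, 99]
        return img, label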

[Figure: ShuffleNetV2 training results]

I then used these pre-trained weights for transfer learning on other small datasets, and they did help. In my tests, training on my own dataset reached about 80% accuracy without pre-trained weights and about 90% accuracy with the Mini-ImageNet pre-trained weights. Of course, there is still a gap compared to weights pre-trained on the full ImageNet, simply because of the difference in data volume: with ImageNet-based pre-trained weights I previously reached 94% accuracy.
If you just want to quickly validate a new network, Mini-ImageNet is also a good choice.
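
As a rough sketch of that transfer-learning step (the weight file path and the number of target classes below are placeholders, and torchvision's shufflenet_v2_x1_0 is used as a stand-in for the project's own model definition):

import torch
import torch.nn as nn
from torchvision.models import shufflenet_v2_x1_0

# Build the model with 100 outputs to match the Mini-ImageNet pre-training.
model = shufflenet_v2_x1_0(num_classes=100)

# Load the Mini-ImageNet pre-trained weights (file name is a placeholder).
state_dict = torch.load("shufflenetv2_mini_imagenet.pth", map_location="cpu")
model.load_state_dict(state_dict)

# Replace the classification head for the new, smaller dataset and fine-tune.
num_target_classes = 5  # placeholder: number of classes in your own dataset
model.fc = nn.Linear(model.fc.in_features, num_target_classes)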

Original post: blog.csdn.net/qq_37541097/article/details/113027489