Dataset download link
Baidu Netdisk download:
Link: https://pan.baidu.com/s/1Uro6RuEbRGGCQ8iXvF2SAQ Password: hl31
Introduction to the dataset
When it comes to ImageNet, everyone knows it is a very large and well-known open-source dataset. New classification networks are generally trained and validated on the 1000-class ImageNet data, and the backbones used by common object detection networks are usually pre-trained on ImageNet as well. But for ordinary researchers and developers this dataset is simply too large (the full download is about 100 GB), and training on it demands serious hardware: usually many high-end GPUs running in parallel, and even with such a setup training can still take days. This discourages a lot of people (myself included; the main problems are the sheer size and the very slow download speed in China).
In 2016, the Google DeepMind team extracted a small subset (about 3 GB) from the ImageNet dataset to create the Mini-ImageNet dataset. It has 100 categories with 600 images per category, 60,000 images in total (all files end in .jpg), and the image sizes are not fixed.
The structure of the dataset is:
├── mini-imagenet: the dataset root directory
     ├── images: all images are stored in this folder
     ├── train.csv: label file for the training set
     ├── val.csv: label file for the validation set
     └── test.csv: label file for the test set
The Mini-ImageNet dataset therefore also includes the three files train.csv, val.csv, and test.csv. Note that the author originally built this dataset for the field of few-shot learning, so the provided label files are not sampled from every category. I analyzed each label file myself with the pandas package.
train.csv contains 38,400 images in 64 categories.
val.csv contains 9,600 images in 16 categories.
test.csv contains 12,000 images in 20 categories.
The images and categories in the three csv files do not overlap with each other; together they cover all 60,000 images and 100 categories.
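These statistics can be reproduced with a few lines of pandas; the sketch below assumes the three label files sit in the current working directory.
import pandas as pd

# count the images and distinct categories in each label file
for name in ["train.csv", "val.csv", "test.csv"]:
    df = pd.read_csv(name)
    print("{}: {} images, {} classes".format(name, df.shape[0], df["label"].nunique()))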
Reading one of the csv files with pandas gives the following format, where each row corresponds to an image's filename and its category:
filename label
0 n0153282900000005.jpg n01532829
1 n0153282900000006.jpg n01532829
2 n0153282900000007.jpg n01532829
3 n0153282900000010.jpg n01532829
4 n0153282900000014.jpg n01532829
As for the actual object name corresponding to each category, you can check this json file, which is the label index file for the 1000-class ImageNet data:
{
"0": ["n01440764", "tench"],
"1": ["n01443537", "goldfish"],
"2": ["n01484850", "great_white_shark"],
...
}
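As a small sketch, assuming the file has been saved locally as imagenet_class_index.json, a lookup from a WordNet id to its readable name can be built like this:
import json

# build a wordnet-id -> readable-name mapping from the index file
with open("imagenet_class_index.json", "r") as f:
    class_index = json.load(f)

wnid_to_name = {wnid: name for wnid, name in class_index.values()}
print(wnid_to_name["n01532829"])  # prints the readable name, e.g. "house_finch"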
Make new train and val files
Based on the above analysis, the Mini-ImageNet dataset cannot be used directly to train your own classification network, because train.csv and val.csv are not sampled from every category. We therefore need to build new train.csv and val.csv files ourselves. Below is a script I wrote that builds the train and val label files by splitting each of the 100 categories into a training set and a validation set at a given ratio.
import os
import json

import pandas as pd
from PIL import Image
import matplotlib.pyplot as plt


def read_csv_classes(csv_dir: str, csv_name: str):
    data = pd.read_csv(os.path.join(csv_dir, csv_name))
    # print(data.head(1))  # filename, label

    label_set = set(data["label"].drop_duplicates().values)

    print("{} have {} images and {} classes.".format(csv_name,
                                                     data.shape[0],
                                                     len(label_set)))
    return data, label_set


def calculate_split_info(path: str, label_dict: dict, rate: float = 0.2):
    # read all images
    image_dir = os.path.join(path, "images")
    images_list = [i for i in os.listdir(image_dir) if i.endswith(".jpg")]
    print("find {} images in dataset.".format(len(images_list)))

    train_data, train_label = read_csv_classes(path, "train.csv")
    val_data, val_label = read_csv_classes(path, "val.csv")
    test_data, test_label = read_csv_classes(path, "test.csv")

    # union operation over the three label sets
    labels = (train_label | val_label | test_label)
    labels = list(labels)
    labels.sort()
    print("all classes: {}".format(len(labels)))

    # create classes_name.json: label -> [index, readable name]
    classes_label = dict([(label, [index, label_dict[label]]) for index, label in enumerate(labels)])
    json_str = json.dumps(classes_label, indent=4)
    with open('classes_name.json', 'w') as json_file:
        json_file.write(json_str)

    # concat csv data
    data = pd.concat([train_data, val_data, test_data], axis=0)
    print("total data shape: {}".format(data.shape))

    # split the data of every class into train and val at the given rate
    num_every_classes = []
    split_train_data = []
    split_val_data = []
    for label in labels:
        class_data = data[data["label"] == label]
        num_every_classes.append(class_data.shape[0])

        # shuffle (fixed random_state for a reproducible split)
        shuffle_data = class_data.sample(frac=1, random_state=1)
        num_train_sample = int(class_data.shape[0] * (1 - rate))
        split_train_data.append(shuffle_data[:num_train_sample])
        split_val_data.append(shuffle_data[num_train_sample:])

        # imshow: optionally display the first image of each class
        imshow_flag = False
        if imshow_flag:
            img_name, img_label = shuffle_data.iloc[0].values
            img = Image.open(os.path.join(image_dir, img_name))
            plt.imshow(img)
            plt.title("class: " + classes_label[img_label][1])
            plt.show()

    # plot classes distribution
    plot_flag = False
    if plot_flag:
        plt.bar(range(1, 101), num_every_classes, align='center')
        plt.show()

    # concatenate the per-class splits
    new_train_data = pd.concat(split_train_data, axis=0)
    new_val_data = pd.concat(split_val_data, axis=0)

    # save new csv data
    new_train_data.to_csv(os.path.join(path, "new_train.csv"))
    new_val_data.to_csv(os.path.join(path, "new_val.csv"))


def main():
    data_dir = "/home/wz/mini-imagenet/"  # path to the dataset root directory
    json_path = "./imagenet_class_index.json"  # path to the ImageNet class index file

    # load imagenet labels: wordnet id -> readable name
    label_dict = json.load(open(json_path, "r"))
    label_dict = dict([(v[0], v[1]) for k, v in label_dict.items()])

    calculate_split_info(data_dir, label_dict)


if __name__ == '__main__':
    main()
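With the default rate of 0.2, each of the 100 categories contributes 480 of its 600 images to new_train.csv and the remaining 120 to new_val.csv, and the fixed random_state makes the split reproducible. Note that classes_name.json is written to the working directory, while the two new csv files are written to the dataset root.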
Train your own network
Project address: https://github.com/WZMIAOMIAO/deep-learning-for-image-processing
In the pytorch_classification -> mini-imagenet folder, two training scripts are provided, one for single-GPU and one for multi-GPU training. The project uses training ShuffleNetV2 as its example. Trained for 100 epochs, it reaches 78% accuracy.
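For reference, here is a minimal sketch (not the repository's exact implementation) of a PyTorch Dataset that consumes the generated new_train.csv and classes_name.json files; the class name and paths are placeholders.
import os
import json

import pandas as pd
from PIL import Image
from torch.utils.data import Dataset


class MiniImageNetDataset(Dataset):
    def __init__(self, root: str, csv_name: str, transform=None):
        self.image_dir = os.path.join(root, "images")
        self.data = pd.read_csv(os.path.join(root, csv_name))
        # classes_name.json maps each label to [index, readable name]
        with open("classes_name.json", "r") as f:
            classes = json.load(f)
        self.label_to_index = {label: info[0] for label, info in classes.items()}
        self.transform = transform

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        row = self.data.iloc[idx]
        img = Image.open(os.path.join(self.image_dir, row["filename"])).convert("RGB")
        if self.transform is not None:
            img = self.transform(img)
        return img, self.label_to_index[row["label"]]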
I then used these pre-trained weights for transfer learning on other small datasets, and it does help. In my tests, training on my own dataset reached 80% accuracy without pre-trained weights and 90% with them. Of course, there is still a gap between weights pre-trained on Mini-ImageNet and weights pre-trained on the full ImageNet; the amount of data simply differs. Using ImageNet-based pre-trained weights, I previously reached 94% accuracy.
Of course, if you just want to quickly validate a new network, Mini-ImageNet is also a good choice.