1. Oxford 102 Flower (102 Category Flower Dataset)
Oxford 102 Flower is an image classification dataset of 102 flower categories, chosen from flowers that commonly occur in the United Kingdom. Each category contains between 40 and 258 images.
The images vary widely in scale, pose, and lighting. Some categories show large intra-class variation, and several categories are very similar to one another.
2. Tiny ImageNet
Tiny ImageNet contains 200 categories of color images downsized to 64×64. Each class has 500 training images, 50 validation images, and 50 test images, so the training set holds 100,000 images.
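The per-class counts above determine the size of each split. The following is a minimal arithmetic sketch in Python; the variable names are illustrative and not part of any official loader.

```python
# Per-class image counts as stated in the text above.
NUM_CLASSES = 200
PER_CLASS = {"train": 500, "val": 50, "test": 50}

# Images in a split = number of classes x images per class in that split.
split_sizes = {split: NUM_CLASSES * n for split, n in PER_CLASS.items()}
total_images = sum(split_sizes.values())

print(split_sizes)
print(total_images)
```

Under these counts, the 100,000 figure describes the training split alone; the validation and test splits add 10,000 images each.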
3. Stanford Cars
The Stanford Cars dataset contains 16,185 images of 196 car categories, taken from the rear. The data is split almost evenly into 8,144 training images and 8,041 test images. Categories are typically defined at the level of make, model, and year. Image size is 360×240.
4. Places205
The Places205 dataset is a large-scale scene-centric dataset containing 205 common scene categories. The training dataset contains approximately 2,500,000 images from these categories. In the training set, each scene category has a minimum of 5,000 and a maximum of 15,000 images. The validation set contains 100 images per category (20,500 images in total), and the test set contains 200 images per category (41,000 images in total).
5. DTD (Describable Textures Dataset)
The Describable Textures Dataset (DTD) contains 5,640 texture images collected in the wild. The images are annotated with human-centric attributes inspired by the perceptual properties of textures.
6. Food-101
The Food-101 dataset contains 101 food categories. Each category has 750 training images and 250 test images, for a total of 101,000 images. The labels of the test images have been manually cleaned, while the training set contains some noise.
7. iNaturalist
The iNaturalist 2017 dataset (iNat) contains 675,170 training and validation images from 5,089 natural fine-grained categories. These categories belong to 13 supercategories, including Plantae (plants), Insecta (insects), Aves (birds), Mammalia (mammals), and more. The iNat dataset is highly imbalanced, with the number of images per category varying greatly. For example, the largest supercategory "Plantae" has 196,613 images from 2,101 categories; while the smallest supercategory "Protozoa" has only 381 images from 4 categories.
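To put the quoted supercategory figures side by side, the sketch below computes each supercategory's share of all images and its mean number of images per category. It uses only the numbers given above; the function and variable names are illustrative.

```python
# Figures quoted in the text for iNaturalist 2017.
TOTAL_IMAGES = 675_170
supercategories = {
    "Plantae": {"images": 196_613, "categories": 2_101},
    "Protozoa": {"images": 381, "categories": 4},
}


def summarize(stats, total=TOTAL_IMAGES):
    """Return the share of all images and the mean images per category."""
    return {
        "share": stats["images"] / total,
        "mean_per_category": stats["images"] / stats["categories"],
    }


summary = {name: summarize(stats) for name, stats in supercategories.items()}

# How many times more images the largest supercategory has than the smallest.
ratio = supercategories["Plantae"]["images"] / supercategories["Protozoa"]["images"]

for name, s in summary.items():
    print(f"{name}: {s['share']:.2%} of images, "
          f"{s['mean_per_category']:.1f} images per category on average")
print(f"Plantae / Protozoa image ratio: {ratio:.0f}")
```

These are averages only. They say nothing about the spread of image counts across individual categories, which the quoted figures do not expose.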
8. Caltech-256
Caltech-256 is an object recognition dataset containing 30,607 real-world images of different sizes covering 257 categories (256 object categories and an additional clutter category). Each category is represented by at least 80 images. This dataset is a superset of the Caltech-101 dataset.
9. PASCAL VOC (PASCAL Visual Object Classes Challenge)
The PASCAL Visual Object Classes (VOC) 2012 dataset contains 20 object classes spanning vehicles, household objects, animals, and people: aeroplane, bicycle, boat, bus, car, motorbike, train, bottle, chair, dining table, potted plant, sofa, TV/monitor, bird, cat, cow, dog, horse, sheep, and person. Each image in this dataset has pixel-level segmentation annotations, bounding box annotations, and object class annotations. The dataset has been widely used as a benchmark for object detection, semantic segmentation, and classification tasks. It is divided into three subsets: 1,464 images for training, 1,449 images for validation, and a private test set.
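The 20 classes and their four groups can be written down as a small lookup table. The sketch below is one possible encoding, not an official API. The class identifiers use the spellings commonly found in the dataset's annotation files (for example `aeroplane` and `motorbike`). The index assignment (alphabetical order, with 0 reserved for background) is an assumption based on the usual segmentation convention.

```python
# The 20 VOC classes grouped as in the text: vehicles, household objects,
# animals, and people.
VOC_GROUPS = {
    "vehicle": ["aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train"],
    "household": ["bottle", "chair", "diningtable", "pottedplant", "sofa", "tvmonitor"],
    "animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "person": ["person"],
}

# Flat, alphabetically sorted class list.
VOC_CLASSES = sorted(name for names in VOC_GROUPS.values() for name in names)

# Reverse lookup: class name -> group name.
CLASS_TO_GROUP = {
    name: group for group, names in VOC_GROUPS.items() for name in names
}

# Assumption: index 0 is background, so object classes start at 1.
CLASS_TO_INDEX = {name: i for i, name in enumerate(VOC_CLASSES, start=1)}

print(len(VOC_CLASSES), CLASS_TO_GROUP["sofa"], CLASS_TO_INDEX["person"])
```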
10. FGVC-Aircraft
FGVC-Aircraft contains 10,200 aircraft images, with 100 images for each of 102 different aircraft model variants, most of which are airplanes. The main aircraft in each image is annotated with a tight bounding box and a hierarchical aircraft model label. Aircraft models are organized in a four-level hierarchy. The four levels, from fine to coarse, are:
Model, such as Boeing 737-76J. Since some models are almost visually indistinguishable, this level is not used in the evaluation.
Variant, such as Boeing 737-700. A variant merges all visually indistinguishable models into one class. The dataset contains 102 different variants.
Family, such as Boeing 737. The dataset contains 70 different families.
Manufacturer, such as Boeing. The dataset contains 41 different manufacturers.
The data is divided into three equal-sized training, validation, and test subsets.
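One way to represent the four-level label in code is a small record type with one field per level. This is a sketch, not the dataset's own annotation format: the class name `AircraftLabel` and the helper `at_level` are hypothetical, and the example values are the ones quoted above.

```python
from dataclasses import dataclass

# Levels ordered from fine to coarse, as in the text.
LEVELS = ("model", "variant", "family", "manufacturer")


@dataclass(frozen=True)
class AircraftLabel:
    model: str          # finest level; not used in the evaluation
    variant: str        # merges visually indistinguishable models
    family: str
    manufacturer: str   # coarsest level

    def at_level(self, level: str) -> str:
        """Return the label string at the requested hierarchy level."""
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level!r}")
        return getattr(self, level)


label = AircraftLabel(
    model="Boeing 737-76J",
    variant="Boeing 737-700",
    family="Boeing 737",
    manufacturer="Boeing",
)

# Three equal-sized subsets of the 10,200 images.
TOTAL_IMAGES = 10_200
images_per_split = TOTAL_IMAGES // 3

print(label.at_level("family"), images_per_split)
```

Keeping all four levels on each label lets the same annotation be evaluated at the variant, family, or manufacturer level without re-reading the data.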
11. tieredImageNet
The tieredImageNet dataset is a larger subset of ILSVRC-12, containing 608 classes (779,165 images) grouped into 34 higher-level nodes in the human-curated ImageNet hierarchy. This set of nodes is partitioned into 20, 6, and 8 disjoint sets of training, validation, and test nodes, and the corresponding classes form the respective meta-sets. As argued by Ren et al. (2018), splitting near the root of the ImageNet hierarchy leads to a more challenging but more realistic regime in which the test classes are less similar to the training classes.
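The key property of this split is that classes follow their higher-level node, so no node contributes classes to more than one meta-set. The sketch below illustrates that idea with placeholder data. The node ids, the class-to-node mapping, and the seeded shuffle are all assumptions for illustration; the actual dataset uses a fixed assignment of nodes to splits.

```python
import random


def split_nodes(node_ids, sizes=(20, 6, 8), seed=0):
    """Partition high-level node ids into disjoint train/val/test sets."""
    if sum(sizes) != len(node_ids):
        raise ValueError("split sizes must cover every node exactly once")
    shuffled = list(node_ids)
    # Stand-in for the dataset's fixed node assignment.
    random.Random(seed).shuffle(shuffled)
    n_train, n_val, _ = sizes
    return {
        "train": set(shuffled[:n_train]),
        "val": set(shuffled[n_train:n_train + n_val]),
        "test": set(shuffled[n_train + n_val:]),
    }


def classes_for(nodes, class_to_node):
    """Collect every class whose parent node belongs to the given node set."""
    return {cls for cls, node in class_to_node.items() if node in nodes}


# Hypothetical placeholders: 34 nodes and 608 classes spread across them.
nodes = [f"node_{i:02d}" for i in range(34)]
class_to_node = {f"class_{i:03d}": nodes[i % 34] for i in range(608)}

node_splits = split_nodes(nodes)
class_splits = {
    name: classes_for(split, class_to_node) for name, split in node_splits.items()
}

print({name: len(classes) for name, classes in class_splits.items()})
```

Because the partition is made over nodes rather than over individual classes, test classes never share a high-level parent with training classes, which is what makes the regime harder than a random class split.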
12. EuroSAT
EuroSAT is a dataset and deep learning benchmark for land use and land cover classification. The dataset is based on Sentinel-2 satellite imagery covering 13 spectral bands and consists of 10 classes, with a total of 27,000 labeled and geo-referenced images.