[Computer Vision | Image Classification] Commonly Used Image Classification Datasets: An Introduction (2)

1. Oxford 102 Flower (102 Category Flower Dataset)

Oxford 102 Flower is an image classification dataset consisting of 102 flower categories. The categories were chosen from flowers that commonly occur in the United Kingdom. Each category contains between 40 and 258 images.

The images vary widely in scale, pose, and lighting. In addition, some categories show large variation within the category, and several categories are very similar to one another.

2. Tiny ImageNet

Tiny ImageNet contains 200 categories of color images downsized to 64×64. Each class has 500 training images, 50 validation images, and 50 test images, which gives 100,000 training images in total.
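
As a quick sanity check, the per-class figures above can be multiplied out to get the size of each split. This is only an arithmetic sketch; the variable names are illustrative and not part of any dataset tooling.

```python
# Per-class image counts for Tiny ImageNet, as quoted in the text above.
NUM_CLASSES = 200
IMAGES_PER_CLASS = {"train": 500, "val": 50, "test": 50}

# Total images in each split = number of classes x images per class.
split_totals = {split: NUM_CLASSES * n for split, n in IMAGES_PER_CLASS.items()}
total_images = sum(split_totals.values())

for split, count in split_totals.items():
    print(f"{split}: {count:,} images")
print(f"all splits: {total_images:,} images")
```

The 100,000 figure corresponds to the training split alone; counting the validation and test images as well gives 120,000.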

3. Stanford Cars

The Stanford Cars dataset contains 196 categories of cars and a total of 16,185 images, taken from the rear. The data is split almost evenly into 8,144 training images and 8,041 test images. Categories are typically at the level of make, model, and year. The images are 360×240.

4. Places205

The Places205 dataset is a large-scale scene-centric dataset containing 205 common scene categories. The training dataset contains approximately 2,500,000 images from these categories. In the training set, each scene category has a minimum of 5,000 and a maximum of 15,000 images. The validation set contains 100 images per category (20,500 images in total), and the test set contains 200 images per category (41,000 images in total).
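
The figures above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the validation and test totals and checks that the approximate training total is consistent with the stated per-category minimum and maximum. All names are illustrative.

```python
# Figures for Places205 as quoted in the text above.
NUM_CATEGORIES = 205
VAL_PER_CATEGORY = 100
TEST_PER_CATEGORY = 200
TRAIN_MIN_PER_CATEGORY = 5_000
TRAIN_MAX_PER_CATEGORY = 15_000
APPROX_TRAIN_TOTAL = 2_500_000

val_total = NUM_CATEGORIES * VAL_PER_CATEGORY    # 20,500
test_total = NUM_CATEGORIES * TEST_PER_CATEGORY  # 41,000

# The training total must lie between the all-minimum and all-maximum cases.
train_lower = NUM_CATEGORIES * TRAIN_MIN_PER_CATEGORY
train_upper = NUM_CATEGORIES * TRAIN_MAX_PER_CATEGORY
train_total_is_plausible = train_lower <= APPROX_TRAIN_TOTAL <= train_upper

# Average number of training images per category implied by the total.
mean_train_per_category = APPROX_TRAIN_TOTAL / NUM_CATEGORIES

print(val_total, test_total, train_total_is_plausible)
print(f"mean training images per category: {mean_train_per_category:.0f}")
```

The implied average of roughly 12,000 training images per category sits toward the upper end of the 5,000 to 15,000 range.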

5. DTD (Describable Textures Dataset)

The Describable Textures Dataset (DTD) contains 5,640 texture images collected in the wild. They are annotated with human-centric attributes inspired by the perceptual properties of textures.

6. Food-101

The Food-101 dataset contains 101 food categories. Each category has 750 training images and 250 test images, for a total of 101,000 images. The labels of the test images have been manually cleaned, while the training set contains some noise.

7. iNaturalist

The iNaturalist 2017 dataset (iNat) contains 675,170 training and validation images from 5,089 fine-grained natural categories. These categories belong to 13 supercategories, including Plantae (plants), Insecta (insects), Aves (birds), Mammalia (mammals), and more. The iNat dataset is highly imbalanced, with the number of images per category varying greatly. For example, the largest supercategory, "Plantae", has 196,613 images from 2,101 categories, while the smallest supercategory, "Protozoa", has only 381 images from 4 categories.
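
To put the two quoted supercategories side by side, the sketch below computes the ratio of their sizes and the average number of images per category in each. It uses only the numbers given above; the dictionary layout and names are illustrative.

```python
# Image and category counts for two iNat 2017 supercategories, from the text above.
supercategories = {
    "Plantae": {"images": 196_613, "categories": 2_101},
    "Protozoa": {"images": 381, "categories": 4},
}

# Average images per category inside each supercategory.
mean_per_category = {
    name: stats["images"] / stats["categories"]
    for name, stats in supercategories.items()
}

# How many times more images the largest supercategory has than the smallest.
size_ratio = (
    supercategories["Plantae"]["images"] / supercategories["Protozoa"]["images"]
)

for name, mean in mean_per_category.items():
    print(f"{name}: {mean:.2f} images per category on average")
print(f"Plantae has about {size_ratio:.0f}x as many images as Protozoa")
```

These two figures mainly illustrate imbalance between supercategories: the averages per category come out close to each other, so the variation in images per individual category is not visible from these totals alone.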

8. Caltech-256

Caltech-256 is an object recognition dataset containing 30,607 real-world images of different sizes covering 257 categories (256 object categories and an additional clutter category). Each category is represented by at least 80 images. This dataset is a superset of the Caltech-101 dataset.

9. PASCAL VOC (PASCAL Visual Object Classes Challenge)

The PASCAL Visual Object Classes (VOC) 2012 dataset contains 20 object classes covering vehicles, household objects, animals, and people: airplane, bicycle, boat, bus, car, motorcycle, train, bottle, chair, dining table, potted plant, sofa, TV/monitor, bird, cat, cow, dog, horse, sheep, and person. Each image in this dataset has pixel-level segmentation annotations, bounding box annotations, and object class annotations. The dataset has been widely used as a benchmark for object detection, semantic segmentation, and classification tasks. It is divided into three subsets: 1,464 images for training, 1,449 images for validation, and a private test set.
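
The class list above can be written down as a small label map. The sketch below groups the 20 classes into the four broad groups and assigns each class an integer index. The group names and the alphabetical, zero-based index order are assumptions made for illustration; a given toolkit may use different identifiers or ordering.

```python
# The 20 PASCAL VOC classes, grouped as in the text above.
# Group names and class spellings here are illustrative, not official identifiers.
VOC_GROUPS = {
    "vehicle": ["airplane", "bicycle", "boat", "bus", "car", "motorcycle", "train"],
    "household": ["bottle", "chair", "dining table", "potted plant", "sofa", "tv/monitor"],
    "animal": ["bird", "cat", "cow", "dog", "horse", "sheep"],
    "person": ["person"],
}

# Flatten to one alphabetically sorted class list (assumed ordering).
all_classes = sorted(name for names in VOC_GROUPS.values() for name in names)

# Map each class name to an integer label and to its broad group.
class_to_index = {name: i for i, name in enumerate(all_classes)}
class_to_group = {
    name: group for group, names in VOC_GROUPS.items() for name in names
}

print(len(all_classes), "classes")
print(class_to_index["dog"], class_to_group["dog"])
```

A classifier would typically predict the integer index, with the group lookup available for coarser error analysis.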

10. FGVC-Aircraft

FGVC-Aircraft contains 10,200 aircraft images, with 100 images for each of 102 different aircraft model variants, most of which are airplanes. The (main) aircraft in each image is annotated with a tight bounding box and a hierarchical aircraft model label. Aircraft models are organized in a four-level hierarchy. The four levels, from finer to coarser, are:

Model, e.g. Boeing 737-76J. Since some models are almost visually indistinguishable, this level is not used in the evaluation.
Variant, e.g. Boeing 737-700. A variant merges all visually indistinguishable models into one class. The dataset contains 102 different variants.
Family, e.g. Boeing 737. The dataset contains 70 different families.
Manufacturer, e.g. Boeing. The dataset contains 41 different manufacturers.

The data is divided into three equal-sized training, validation, and test subsets.
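
One way to picture the four-level label is as a small record with one field per level. The sketch below is a hypothetical representation, not the dataset's actual annotation format: the class name `AircraftLabel` and the method `at_level` are invented for illustration. It also works out the size of each of the three equal splits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AircraftLabel:
    """Hypothetical container for the four hierarchy levels, fine to coarse."""

    model: str
    variant: str
    family: str
    manufacturer: str

    def at_level(self, level: str) -> str:
        # The model level is excluded because it is not used in the evaluation.
        if level not in ("variant", "family", "manufacturer"):
            raise ValueError(f"unsupported evaluation level: {level}")
        return getattr(self, level)


# The example names used in the text above.
example = AircraftLabel(
    model="Boeing 737-76J",
    variant="Boeing 737-700",
    family="Boeing 737",
    manufacturer="Boeing",
)

# Three equal-sized subsets of the 10,200 images.
TOTAL_IMAGES = 10_200
images_per_split, remainder = divmod(TOTAL_IMAGES, 3)

print(example.at_level("variant"), "|", example.at_level("family"))
print(images_per_split, "images per split, remainder", remainder)
```

Choosing a coarser level (family or manufacturer) turns the same images into an easier classification task with fewer classes.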

11. tieredImageNet

The tieredImageNet dataset is a large subset of ILSVRC-12, containing 608 classes (779,165 images) grouped into 34 higher-level nodes in the human-curated ImageNet hierarchy. This set of nodes is partitioned into 20, 6, and 8 disjoint sets of training, validation, and test nodes, and the corresponding classes form the respective meta-sets. As argued by Ren et al. (2018), splitting near the root of the ImageNet hierarchy leads to a more challenging but more realistic regime, in which the test classes are less similar to the training classes.
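
The key property of this split is that it is made at the level of the 34 higher-level nodes, so no node contributes classes to more than one meta-set. The sketch below illustrates that property with placeholder node names and a seeded random shuffle. This is an assumption for illustration only: the real tieredImageNet split is a fixed assignment, not a random one.

```python
import random

NUM_NODES = 34
SPLIT_SIZES = {"train": 20, "val": 6, "test": 8}

# Placeholder names; the real nodes are ImageNet hierarchy categories.
nodes = [f"node_{i:02d}" for i in range(NUM_NODES)]

# Shuffle a copy with a fixed seed so the illustration is reproducible.
rng = random.Random(0)
shuffled = nodes[:]
rng.shuffle(shuffled)

# Carve the shuffled list into consecutive, non-overlapping chunks.
splits = {}
start = 0
for name, size in SPLIT_SIZES.items():
    splits[name] = set(shuffled[start:start + size])
    start += size

# No node may appear in more than one split.
is_disjoint = (
    splits["train"].isdisjoint(splits["val"])
    and splits["train"].isdisjoint(splits["test"])
    and splits["val"].isdisjoint(splits["test"])
)
print({name: len(members) for name, members in splits.items()}, is_disjoint)
```

Because every class inherits the split of its parent node, classes in the test meta-set never share a higher-level node with a training class.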

12. EuroSAT

EuroSAT is a dataset and deep learning benchmark for land use and land cover classification. It is based on Sentinel-2 satellite imagery covering 13 spectral bands and consists of 10 categories, with a total of 27,000 labeled and geo-referenced images.

Origin blog.csdn.net/wzk4869/article/details/133106504