Summary of high-quality open source data sets related to image classification (with download links)

Edit: Extreme Market Platform

Flower Dataset

Dataset download address: http://m6z.cn/6rTT7n

The dataset contains 4,242 images of flowers collected from Flickr, Google Images, and Yandex Images. It can be used to identify plants from photos. The images are grouped into five categories: chamomile, tulip, rose, sunflower, and dandelion, with about 800 photos per category. The photos are low resolution, about 320x240 pixels, and are not resized to a common size, so their aspect ratios differ.
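
Because the photos come in differing aspect ratios, a classifier that expects square inputs needs some resizing policy. The sketch below shows one possible policy, letterboxing (scale the longer side to the target, then pad the shorter side). The function name and the 224-pixel target are assumptions for illustration and are not part of the dataset.

```python
def letterbox_geometry(width, height, target=224):
    """Return the scaled size and the padding needed to fit an image
    into a target x target square without distorting its aspect ratio.

    The name and the default target of 224 are illustrative assumptions.
    """
    longest = max(width, height)
    # Scale so that the longer side becomes exactly `target` pixels.
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    pad_w = target - new_w
    pad_h = target - new_h
    # Split the padding between both sides; an odd pixel goes right/bottom.
    left, top = pad_w // 2, pad_h // 2
    return (new_w, new_h), (left, top, pad_w - left, pad_h - top)


# A typical 320x240 photo from the description above:
print(letterbox_geometry(320, 240))  # ((224, 168), (0, 28, 0, 28))
```

The returned tuple is (left, top, right, bottom) padding, which maps directly onto the padding arguments of most image libraries.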

Comprehensive Car Dataset

Dataset download address: http://m6z.cn/6rTTar

This dataset was introduced in the CVPR 2015 paper "A Large-Scale Car Dataset for Fine-Grained Categorization and Verification". The Comprehensive Cars (CompCars) dataset contains data from two scenarios: web-nature and surveillance-nature. The web-nature data covers 163 car makes and 1,716 car models, with 136,726 images of entire cars and 27,618 images of car parts. The full-car images are labeled with bounding boxes and viewpoints. Each car model is annotated with five attributes: maximum speed, displacement, number of doors, number of seats, and car type. The surveillance-nature data consists of 50,000 car images captured from the front view.

Indoor Scene Recognition

Dataset download address: http://m6z.cn/5PCpJ5

This dataset is the original data provided by MIT. Indoor scene recognition is a challenging open problem in high-level vision. Most scene recognition models for outdoor scenes perform poorly in the indoor domain. The main difficulty is that while some indoor scenes (e.g. corridors) are well characterized by global spatial attributes, others (e.g. bookstores) are better characterized by the objects they contain. More generally, to solve the indoor scene recognition problem, we need a model that can exploit both local and global discriminative information. The database contains 67 indoor categories with a total of 15,620 images. The number of images varies by category, but each category has at least 100 images. All images are in JPG format.

Animal Image Dataset (90 Categories)

Dataset download address: http://m6z.cn/6rTTbJ

This dataset contains 5,400 animal images in 90 categories, collected from Google Images: https://images.google.com/. The photos are stored in folders named after their categories. Animal categories include antelope, badger, bat, bear, bee, beetle, bison, boar, butterfly, cat, caterpillar, chimpanzee, and more. The images are not a fixed size and may need preprocessing before use.
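
A folder-per-category layout like this one can be turned into a labeled sample list with a few lines of code. The following is a minimal sketch, assuming one subfolder per category directly under a root directory. The function name and the accepted file suffixes are assumptions, not something the dataset specifies.

```python
from pathlib import Path

# Assumed set of image suffixes; adjust to what the download actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_image_folders(root):
    """Return (samples, class_names) for a folder-per-category layout.

    samples is a list of (path, label_index) pairs, and class_names[i]
    is the folder name that label index i refers to.
    """
    root = Path(root)
    # Sort the folder names so label indices are stable between runs.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_index = {name: i for i, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        for path in sorted((root / name).iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, class_to_index[name]))
    return samples, class_names
```

Calling `index_image_folders("animals")` on a hypothetical extracted folder would return one `(path, label)` pair per image, ready to be fed to whatever loading and resizing step follows.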

Aircraft Dataset

Dataset download address: http://m6z.cn/5X8CPy

The dataset contains 10,000 images of airplanes, divided into 3,334 training images, 3,333 validation images, and 3,333 test images.
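
The split sizes follow from dividing 10,000 images into three near-equal parts, with the one leftover image going to the training set. The sketch below only illustrates that arithmetic with a seeded shuffle. It is not the split shipped with the dataset, which should be preferred when results must be comparable, and the function name is an assumption.

```python
import random


def three_way_split(items, seed=0):
    """Shuffle items and split them into three near-equal parts.

    Any remainder is handed out one item at a time, first part first,
    so 10,000 items yield parts of 3334, 3333, and 3333.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    base, extra = divmod(len(items), 3)
    sizes = [base + (1 if i < extra else 0) for i in range(3)]
    train_end = sizes[0]
    val_end = train_end + sizes[1]
    return items[:train_end], items[train_end:val_end], items[val_end:]


train, val, test = three_way_split(range(10_000))
print(len(train), len(val), len(test))  # 3334 3333 3333
```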

Clothes Dataset

Dataset download address: http://m6z.cn/64EPUp

The clothes dataset contains 5,000 images of 20 kinds of clothes and is released into the public domain under CC0. The images were collected in three ways: through Toloka, a crowdsourcing platform; through a crowdsourcing initiative on social media; and through Tagias, a company specialized in data collection. Labeling was done manually with an IPython widget, and labeling errors were then corrected with the help of a simple neural network.

The 20 classes include T-shirts (1,011 items), long sleeves (699 items), pants (692 items), shoes (431 items), shirts (378 items), dresses (357 items), coats (312 items), shorts (308 items), hats (171 items), skirts (155 items), blazers (109 items), and others.
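
The class counts are clearly imbalanced: T-shirts outnumber blazers by roughly nine to one. One common way to compensate during training is inverse-frequency class weighting. The sketch below applies it to the counts quoted above. Only the classes whose counts are listed are included, and both the weighting formula and the function name are illustrative choices rather than anything the dataset authors prescribe.

```python
# Counts quoted in the text; the remaining classes are not listed there.
CLASS_COUNTS = {
    "T-shirts": 1011, "long sleeves": 699, "pants": 692, "shoes": 431,
    "shirts": 378, "dresses": 357, "coats": 312, "shorts": 308,
    "hats": 171, "skirts": 155, "blazers": 109,
}


def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * count).

    A class with exactly the average number of images gets weight 1.0;
    rarer classes get proportionally larger weights.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


weights = inverse_frequency_weights(CLASS_COUNTS)
for name, w in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{name:>12}: {w:.2f}")
```

Such weights can be passed to a weighted loss function, or used as sampling probabilities, so that the rare classes are not drowned out.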

Logo (Trademark) Dataset

Dataset download address: http://m6z.cn/6cb2HG

Logo-2K+ is a large-scale logo dataset that covers a wide range of logo categories drawn from real-world logo images. It contains 167,140 images in 10 root categories and 2,341 classes.

Office-Home Dataset

Dataset download address: http://m6z.cn/5I6cFG

Office-Home is a benchmark dataset for domain adaptation. It contains four domains, each consisting of 65 categories. The four domains are: Art, artistic images in the form of drawings, paintings, decorations, etc.; Clip Art, a collection of clip art images; Product, images of objects without a background; and Real World, images of objects captured with a regular camera. The dataset contains 15,500 images, with an average of about 70 images per class and a maximum of 99 images per class.
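
In a domain adaptation experiment, a model is usually trained on one domain (the source) and evaluated on a different one (the target). With four domains this gives twelve ordered source-to-target tasks. The sketch below simply enumerates them. The domain strings follow the names used above and may not match the folder names in the download.

```python
from itertools import permutations

# Domain names as written in the description; folder names may differ.
DOMAINS = ["Art", "Clip Art", "Product", "Real World"]


def transfer_tasks(domains):
    """Return every ordered (source, target) pair of distinct domains."""
    return list(permutations(domains, 2))


tasks = transfer_tasks(DOMAINS)
for source, target in tasks:
    print(f"{source} -> {target}")
print(len(tasks), "tasks")  # 12 tasks
```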

Food Image Dataset

Dataset download address: http://m6z.cn/6rdsSw

This dataset contains a number of different subsets of the full Food-101 data. The aim is a simple training set for image analysis that is more interesting than CIFAR10 or MNIST, so the data includes heavily downscaled versions of the images for quick tests. The data has been reformatted as HDF5, specifically for Keras HDF5Matrix, so that it can be read easily. Each filename describes the content of the file. For example:

food_c101_n1000_r384x384x3.h5 indicates that there are 101 categories, n=1000 images, and the resolution is 384x384x3 (RGB, uint8)
food_test_c101_n1000_r32x32x1.h5 indicates that the data is part of the validation set, representing 101 categories, n=1000 images, and the resolution is 32x32x1 (float32 from -1 to 1)
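
Since the filename encodes the category count, image count, and resolution, it can be parsed before a file is opened. The following is a minimal sketch based only on the two example names above. The function name and the returned keys are assumptions, and the pattern accepts names with or without underscore separators.

```python
import re

# Pattern inferred from the two example filenames; underscores are optional.
_FOOD_NAME = re.compile(
    r"^food_?(?P<test>test_?)?"
    r"c(?P<classes>\d+)_?"
    r"n(?P<images>\d+)_?"
    r"r(?P<d0>\d+)x(?P<d1>\d+)x(?P<channels>\d+)\.h5$"
)


def parse_food_filename(name):
    """Extract the metadata encoded in a food HDF5 filename."""
    match = _FOOD_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized filename: {name!r}")
    return {
        "is_test": match.group("test") is not None,
        "num_classes": int(match.group("classes")),
        "num_images": int(match.group("images")),
        # Resolution as written in the name: two spatial sizes, then channels.
        "shape": (
            int(match.group("d0")),
            int(match.group("d1")),
            int(match.group("channels")),
        ),
    }


print(parse_food_filename("food_c101_n1000_r384x384x3.h5"))
```

A check like this makes it easy to pick the smallest file for a quick experiment, or to confirm that the array shape read from the file matches what the name promises.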

The first goal with this dataset is to classify unknown images. Beyond that, it can be used to see which regions or image components matter for classification, to identify new types of food as combinations of existing labels, and to build object detectors that find similar objects across a full scene.

Origin blog.csdn.net/Extremevision/article/details/126470702