Summary of common data sets: image data sets and speech data sets

1. Image data sets:

(1) MNIST: a data set of handwritten digits (0-9) compiled by Yann LeCun of New York University. Each image is 28*28 pixels. It contains 60,000 training images and 10,000 test images, and is widely used for testing and training machine learning models.

(2) CIFAR: small-image data sets collected by Alex Krizhevsky and others, named after the Canadian Institute for Advanced Research. There are two versions, CIFAR-10 and CIFAR-100, both with 32*32 images. CIFAR-10 has 10 categories, with 50,000 training images and 10,000 test images. CIFAR-100 has 100 categories of 600 images each, of which 500 are used for training and 100 for testing. The 100 categories are grouped into 20 large categories, so each image carries two labels: a small (fine) category and a large (coarse) category.
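The CIFAR-100 numbers above can be cross-checked with a few lines of arithmetic. This is only a sketch that uses the figures quoted in this list; the variable names are made up for illustration and are not part of any CIFAR loader.

```python
# Figures quoted in the CIFAR-100 description above.
FINE_CLASSES = 100      # small categories
COARSE_CLASSES = 20     # large categories
TRAIN_PER_CLASS = 500   # training images per small category
TEST_PER_CLASS = 100    # test images per small category

images_per_class = TRAIN_PER_CLASS + TEST_PER_CLASS
train_total = FINE_CLASSES * TRAIN_PER_CLASS
test_total = FINE_CLASSES * TEST_PER_CLASS
# Small categories per large category, if the grouping is even.
fine_per_coarse = FINE_CLASSES // COARSE_CLASSES

print(images_per_class, train_total, test_total, fine_per_coarse)
# prints: 600 50000 10000 5
```

So CIFAR-100 has the same overall size as CIFAR-10 (50,000 training and 10,000 test images), only split across ten times as many categories.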

(3) ImageNet: an image recognition database established by Fei-Fei Li's group at Stanford University in the United States, built with the human recognition system in mind. It currently contains 14,197,122 images. The subset most often used for classification benchmarks covers 1,000 categories. It is one of the largest known image databases, and classic image recognition models such as AlexNet, VGGNet, GoogLeNet, and ResNet were trained and evaluated on it.

Image captioning data sets:

(4) COCO: an image recognition, segmentation, and captioning data set released by a Microsoft team. Its characteristics are: object segmentation, recognition in context, multiple objects per image, more than 300,000 images, more than 2,000,000 instances, 80 object categories, 5 captions per image, and keypoint annotations for 100,000 people. It is commonly used for image captioning, and can also be used for multi-label training.

(5) Chinese image captioning data set: the data set of the AI Challenger competition held by Sogou, Toutiao, and others. Each picture has five Chinese descriptions. The training set has 210,000 pictures and the validation set has 30,000 pictures.

Landscape image multi-label data set: collected by Nanjing University. It contains 2,000 images, each annotated against five labels: desert, mountains, sea, sunset, and trees. It can be used for transfer learning and for multi-label image training.
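For multi-label training, the label set of one image is usually turned into a 0/1 vector with one position per label. Below is a minimal sketch using the five landscape labels listed above; the function names `to_multi_hot` and `from_multi_hot` are hypothetical helpers written for this example, not part of the data set or of any library.

```python
# The five landscape labels named above, in a fixed order.
LABELS = ["desert", "mountains", "sea", "sunset", "trees"]


def to_multi_hot(tags):
    """Encode a collection of label names as a 0/1 vector over LABELS."""
    unknown = set(tags) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in tags else 0 for label in LABELS]


def from_multi_hot(vector):
    """Decode a 0/1 vector back into the list of label names."""
    return [label for label, flag in zip(LABELS, vector) if flag]


encoded = to_multi_hot(["sea", "sunset"])
print(encoded)                  # prints: [0, 0, 1, 1, 0]
print(from_multi_hot(encoded))  # prints: ['sea', 'sunset']
```

Unlike single-label classification, more than one position can be 1 at the same time, which is why a picture of a sunset over the sea gets two active labels here.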

2. Speech data sets:

(1) clesent: an English corpus recorded by native speakers of Chinese. The clesent speech library is divided into two sets, accent adaptation and testing, with 3 hours in total.

(2) CET: audio samples taken from the CET-4 and CET-6 exams. The corpus is about 800 hours.

(3) TIMIT: a continuous English speech database recorded by native speakers from various dialect regions of the United States. The data set is divided into a training set and a test set, about 5.5 hours in total.

(4) WSJCAM0: an English speech corpus released by the University of Cambridge, UK. The corpus is approximately 24 hours.

(5) WSJ1: a Wall Street Journal spoken corpus, mainly audio recorded by announcers. The size is about 162 hours.

(6) WSJ0: the Wall Street Journal corpus provided by the US Department of Defense Spoken Language Project, mainly used for research on large-vocabulary continuous speech recognition systems. The corpus is approximately 42.5 hours.

(7) TM: audio of some English textbooks, about 43 hours.

(8) LibriSpeech: a corpus of read speech taken from public-domain LibriVox audiobooks, mainly used to train and test automatic speech recognition systems. The training data consists of a 100-hour and a 360-hour subset of clean speech and a 500-hour subset of noisier speech; development and test sets are provided separately.

Hardware platform: in the early stage, while learning, it is enough to run neural networks on the CPU of your own laptop. In the later stage, get GPU resources if at all possible to save time.
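Returning to the LibriSpeech entry in the list above: the published training subsets are named `train-clean-100`, `train-clean-360`, and `train-other-500`, and the trailing number is the nominal size in hours. The sketch below adds them up from the names alone; the helper functions are hypothetical and written only for this example, and the exact durations of the real subsets differ slightly from the nominal figures.

```python
# Published names of the LibriSpeech training subsets.
TRAIN_SUBSETS = ["train-clean-100", "train-clean-360", "train-other-500"]


def nominal_hours(subset_name):
    """Read the nominal hour count from the number at the end of the name."""
    return int(subset_name.rsplit("-", 1)[1])


def is_clean(subset_name):
    """True for the subsets labelled as clean speech."""
    return "-clean-" in subset_name


total_hours = sum(nominal_hours(name) for name in TRAIN_SUBSETS)
clean_hours = sum(nominal_hours(name) for name in TRAIN_SUBSETS if is_clean(name))

print(total_hours, clean_hours)  # prints: 960 460
```

This is where the commonly quoted figure of roughly 960 hours of LibriSpeech training audio comes from.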



Author: microchip
Link: https://www.jianshu.com/p/d0baf4326ff2
Source: Jianshu
Copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please indicate the source.

