Introduction to the MNIST dataset: reading it and saving it as images
1. Introduction to the MNIST dataset
- The MNIST dataset is a dataset of handwritten digits.
- According to the official MNIST website, the dataset consists of four parts: a training image set, a training label set, a test image set, and a test label set. These are not ordinary text or image files but compressed files; after downloading and decompressing one, what we get is a binary file.
- How should this binary data be interpreted? Here we only walk through the training image set and the training label set as described on the official website; the test set files have the same layout.
- Training label set. The official website states that the training set has 60,000 examples, so this file contains 60,000 labels, and each label is a number between 0 and 9. The description on the website has four columns: offset is the byte offset at which a field starts; type is the data type of the field; value is the value of the field; description explains what the field is. Read this way, the description says: "starting at byte 0 there is a 32-bit integer whose value is 0x00000801, the magic number; starting at byte 4 there is a 32-bit integer whose value is 60000, the number of items; starting at byte 8 there is an unsigned byte whose value is ??, a label value; ..."
- Now compare this with the actual file. The decompressed file was opened in Sublime, which displays it in hexadecimal: each hex digit represents four bits, and two digits represent one byte. At offset 0 we see 0000 0801, the magic number. A magic number is a check value used to confirm that this file really is the train-labels.idx1-ubyte file from MNIST. At offset 4 we see 0000 ea60. According to the description this should be the number of items, 60000, and 60000 in hexadecimal is ea60, so it matches. At offset 8 we see 05, which is a label value: the label of the first image is 5. The following bytes are the labels of the remaining images, one byte each.
- Training image set. The description on the official website is similar to that of the label file, with one addition: all images in MNIST are 28×28, that is, each image has 28×28 pixels. In the train-images-idx3-ubyte file, the 4 bytes at offset 0 are 0000 0803, the magic number; the next 4 bytes, 0000 ea60, are 60000, the number of images; the 4 bytes starting at byte 8 are 0000 001c, that is 28, the number of rows per image; the 4 bytes starting at byte 12 are also 0000 001c, that is 28, the number of columns per image. The pixel values start at byte 16, and every 784 bytes (28×28) represent one image. The binary content of the file matches this analysis.
- Supplementary note: the official description mentions "MSB first", which stands for "Most Significant Bit first"; its counterpart is "LSB first", "Least Significant Bit first". MSB first means the most significant part is stored first, which is big-endian storage, and LSB first corresponds to little-endian storage.
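The header fields described above can be decoded with a few lines of C++. This is only a minimal sketch, not the code from ReadMnistData.cpp: the names `read_be32`, `LabelFile`, and `parse_label_file` are hypothetical, and the function assumes the decompressed file has already been loaded into a byte buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Read a 32-bit unsigned integer stored MSB first (big-endian) at `offset`.
std::uint32_t read_be32(const std::vector<std::uint8_t>& bytes, std::size_t offset) {
    assert(offset + 4 <= bytes.size());
    return (static_cast<std::uint32_t>(bytes[offset]) << 24) |
           (static_cast<std::uint32_t>(bytes[offset + 1]) << 16) |
           (static_cast<std::uint32_t>(bytes[offset + 2]) << 8) |
           static_cast<std::uint32_t>(bytes[offset + 3]);
}

// Hypothetical container for the parsed contents of a label file.
struct LabelFile {
    std::uint32_t magic = 0;  // expected to be 0x00000801
    std::uint32_t count = 0;  // number of labels
    std::vector<int> labels;  // one value in 0..9 per image
};

// Parse an in-memory copy of a label file (assumption: already decompressed).
LabelFile parse_label_file(const std::vector<std::uint8_t>& bytes) {
    if (bytes.size() < 8) {
        throw std::runtime_error("label file too short for its header");
    }
    LabelFile out;
    out.magic = read_be32(bytes, 0);  // bytes 0..3
    if (out.magic != 0x00000801u) {
        throw std::runtime_error("unexpected magic number for a label file");
    }
    out.count = read_be32(bytes, 4);  // bytes 4..7
    if (bytes.size() - 8 < out.count) {
        throw std::runtime_error("label file shorter than its header claims");
    }
    out.labels.reserve(out.count);
    for (std::size_t i = 0; i < out.count; ++i) {
        out.labels.push_back(bytes[8 + i]);  // labels start at byte 8
    }
    return out;
}
```

Shifting the four bytes into place by hand, rather than copying them into an integer, gives the same result on both little-endian and big-endian machines.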
2. Reading the MNIST dataset and saving it as images
The code consists of three files:
ReadMnistData.h
ReadMnistData.cpp
Main.cpp
ReadMnistData.h and ReadMnistData.cpp define the functions that read the files and save them as images; Main.cpp sets the paths of the files to read and the directories to save to. Set those paths, then run the program.
The code of Main.cpp is given below:
- #include "ReadMnistData.h"
- int main()
- {
- ReadMnistData rmd;
- /* Read the training files and save them as images */
- string filename_train_images = "C:\\Users\\lyf\\Desktop\\mnist\\train-images-idx3-ubyte"; // path of the train images file
- string filename_train_labels = "C:\\Users\\lyf\\Desktop\\mnist\\train-labels-idx1-ubyte"; // path of the train labels file
- string save_train_image_path = "C:\\Users\\lyf\\Desktop\\mnist\\train_images\\"; // directory where the train images are saved
- vector<cv::Mat> vec_train_images; // holds the train images that are read
- vector<int> vec_train_labels; // holds the train labels that are read
- rmd.Read_Mnist_Images(filename_train_images, vec_train_images); // read the train images
- cout << "-----------------------------" << endl;
- rmd.Read_Mnist_Labels(filename_train_labels, vec_train_labels); // read the train labels
- cout << "-----------------------------" << endl;
- rmd.Save_Mnist_Images(save_train_image_path, vec_train_images, vec_train_labels); // save the train images
- //==================================================================================
- /* Read the test files and save them as images */
- string filename_test_images = "C:\\Users\\lyf\\Desktop\\mnist\\t10k-images-idx3-ubyte"; // path of the test images file
- string filename_test_labels = "C:\\Users\\lyf\\Desktop\\mnist\\t10k-labels-idx1-ubyte"; // path of the test labels file
- string save_test_image_path = "C:\\Users\\lyf\\Desktop\\mnist\\test_images\\"; // directory where the test images are saved
- vector<cv::Mat> vec_test_images; // holds the test images that are read
- vector<int> vec_test_labels; // holds the test labels that are read
- rmd.Read_Mnist_Images(filename_test_images, vec_test_images); // read the test images
- cout << "-----------------------------" << endl;
- rmd.Read_Mnist_Labels(filename_test_labels, vec_test_labels); // read the test labels
- cout << "-----------------------------" << endl;
- rmd.Save_Mnist_Images(save_test_image_path, vec_test_images, vec_test_labels); // save the test images
- return 0;
- }
The code of the other two files is not posted here. If you need it, you can download it yourself:
read the mnist dataset and save it as a picture
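Since ReadMnistData.cpp is not reproduced here, the following is a rough sketch of how an image-reading function could work, based only on the file layout from section 1. It is not the author's implementation: `ImageFile` and `parse_image_file` are hypothetical names, the input is assumed to be an in-memory copy of the decompressed file, and pixels are kept as raw bytes instead of `cv::Mat` so the sketch has no dependencies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Read a 32-bit unsigned integer stored MSB first (big-endian) at `offset`.
std::uint32_t read_be32(const std::vector<std::uint8_t>& bytes, std::size_t offset) {
    assert(offset + 4 <= bytes.size());
    return (static_cast<std::uint32_t>(bytes[offset]) << 24) |
           (static_cast<std::uint32_t>(bytes[offset + 1]) << 16) |
           (static_cast<std::uint32_t>(bytes[offset + 2]) << 8) |
           static_cast<std::uint32_t>(bytes[offset + 3]);
}

// Hypothetical container for the parsed contents of an image file.
struct ImageFile {
    std::uint32_t count = 0;  // number of images
    std::uint32_t rows = 0;   // 28 for MNIST
    std::uint32_t cols = 0;   // 28 for MNIST
    std::vector<std::vector<std::uint8_t>> images;  // rows * cols bytes each
};

// Parse an in-memory copy of an image file (assumption: already decompressed).
ImageFile parse_image_file(const std::vector<std::uint8_t>& bytes) {
    if (bytes.size() < 16) {
        throw std::runtime_error("image file too short for its header");
    }
    if (read_be32(bytes, 0) != 0x00000803u) {
        throw std::runtime_error("unexpected magic number for an image file");
    }
    ImageFile out;
    out.count = read_be32(bytes, 4);   // bytes 4..7
    out.rows = read_be32(bytes, 8);    // bytes 8..11
    out.cols = read_be32(bytes, 12);   // bytes 12..15
    const std::size_t image_size = static_cast<std::size_t>(out.rows) * out.cols;
    if (image_size == 0) {
        throw std::runtime_error("image dimensions must be non-zero");
    }
    if ((bytes.size() - 16) / image_size < out.count) {
        throw std::runtime_error("image file shorter than its header claims");
    }
    out.images.reserve(out.count);
    for (std::size_t i = 0; i < out.count; ++i) {
        // Pixel data starts at byte 16; each image occupies image_size bytes.
        const auto first = bytes.begin() + static_cast<std::ptrdiff_t>(16 + i * image_size);
        out.images.emplace_back(first, first + static_cast<std::ptrdiff_t>(image_size));
    }
    return out;
}
```

With OpenCV, each 784-byte image could then be wrapped as a 28×28 single-channel 8-bit `cv::Mat` (type `CV_8UC1`) and written out with `cv::imwrite`; that step is left out of the sketch.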
The code runs as follows:
File naming format:
For example, in 0_00001.jpg the 0 is the content of the picture, that is, its label; 00001 means this is the first picture with the label 0, 00002 is the second, and so on. …
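The naming scheme can be reproduced by keeping one counter per digit. The sketch below is an illustration under that assumption, not the code of `Save_Mnist_Images`; the function name `make_file_names` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Build names such as "0_00001.jpg": the label, an underscore, then the
// 1-based position of the image among images with that label, padded to
// five digits.
std::vector<std::string> make_file_names(const std::vector<int>& labels) {
    std::array<int, 10> seen{};  // images seen so far for each digit 0..9
    std::vector<std::string> names;
    names.reserve(labels.size());
    for (int label : labels) {
        assert(label >= 0 && label <= 9);
        char buf[32];
        std::snprintf(buf, sizeof(buf), "%d_%05d.jpg", label, ++seen[label]);
        names.emplace_back(buf);
    }
    return names;
}
```

Given the labels 5, 0, 4, 0 in that order, the second image with label 0 is named 0_00002.jpg, regardless of its position in the overall file.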