Parsing the CIFAR-10 dataset starts from its storage format: each 32×32×3 image is flattened into a one-dimensional array of 3072 values. The first 1024 values are the R channel of every point, the next 1024 are the G channel, and the last 1024 are the B channel.
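The layout can be made concrete with a small sketch. It assumes NumPy, uses a synthetic 3072-value array in place of a real record, and relies on the row-major order within each channel given in the CIFAR-10 dataset description. The names `flat` and `flat_index` are illustrative and not part of any library.

```python
import numpy as np

# Synthetic stand-in for one flattened image: 3072 values, not real pixel data.
flat = np.arange(3072)

def flat_index(channel, row, col, size=32):
    """Offset of one channel value in the flattened, channel-first array."""
    return channel * size * size + row * size + col

# The three consecutive 1024-value blocks are the R, G, and B channels.
red, green, blue = flat[:1024], flat[1024:2048], flat[2048:]
```

For the point at row 0, column 0, the R, G, and B values sit at offsets 0, 1024, and 2048, so the three values of any one point are always 1024 positions apart.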
Parsing therefore takes two steps: reshape each flattened array, then transpose its axes.
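A minimal sketch of such parsing commands, assuming NumPy and a synthetic 3072-value array in place of a record loaded from the dataset files. The name `row` and the fill pattern are hypothetical; `img_a` follows the name used in the text.

```python
import numpy as np

# Synthetic stand-in for one CIFAR-10 record (assumption: real data would be
# read from the dataset files). Values rise from 0 to 255 across the array.
row = (np.arange(3072) // 12).astype(np.uint8)

img_a = row.reshape(3, 32, 32)        # (channel, row, column)
img = np.transpose(img_a, (1, 2, 0))  # (row, column, channel)
```

Here `img` has shape (32, 32, 3), and `img[0, 0]` collects the values at offsets 0, 1024, and 2048 of `row`.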
Here, reshape(3,32,32) produces an array with 3 layers, 32 rows, and 32 columns. The first layer holds 1024 values, all of them R channel data; the second layer is G and the third layer is B. The effect of transpose(img_a,(1,2,0)) can be understood with the following figure as an example:
The matrix marked with red circle 1 is the original matrix. For ease of understanding, the (3,32,32) matrix is simplified to a (3,2,2) matrix with 3 layers, 2 rows, and 2 columns. The first layer is [0 1] [2 3], where [0 1] is its first row and [2 3] is its second row, each with 2 columns; the second layer holds 4 to 7 and the third layer holds 8 to 11 in the same arrangement. The RGB data for the four points is therefore [0,4,8], [1,5,9], [2,6,10], [3,7,11].
According to the CIFAR-10 dataset description, the first layer contains only R data. After the matrix is transposed with (1,2,0), its shape becomes (2,2,3): 2 rows, 2 columns, and 3 layers. Viewed by row and column, each point now holds three values, one taken from each layer of the original data, which together form its RGB triple. This matches the layout of image data, so the result can be displayed directly with imshow.
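The simplified (3,2,2) case can be checked with a short sketch, assuming NumPy. The names `a`, `b`, and `pixels` are illustrative.

```python
import numpy as np

# 3 layers, 2 rows, 2 columns: layer 0 is [[0, 1], [2, 3]], and so on.
a = np.arange(12).reshape(3, 2, 2)

# Move the layer axis to the end: 2 rows, 2 columns, 3 layers.
b = np.transpose(a, (1, 2, 0))

# List the three values of each point, row by row.
pixels = b.reshape(-1, 3).tolist()
print(pixels)  # [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]
```

Each entry of `pixels` takes one value from each original layer, which is the per-point RGB grouping described above.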