np.transpose(img_a,(1,2,0)) during CIFAR-10 parsing

  • When parsing the CIFAR-10 dataset, note how the data is stored:
    [Figure: CIFAR-10 storage format]
    Each 32×32×3 image is flattened into a one-dimensional array of 3072 values. The first 1024 values are the R channel of every pixel, the next 1024 are the G channel, and the last 1024 are the B channel. Within each channel the values are stored in row-major order.
  • Therefore, we parse the data with the following code:
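    import cv2
    import numpy as np

    # unpickle() is assumed to be the loader helper given on the CIFAR-10 dataset page;
    # file_i is the path to one batch file
    mydict = unpickle(file_i)
    print(mydict[b'data'][0])
    # Each row stores 1024 R values, then 1024 G values, then 1024 B values,
    # so reshape(3,32,32) yields 3 layers: the first all R, the second all G, the third all B
    img_a = mydict[b'data'][0].reshape(3,32,32) / 255
    img_a = np.transpose(img_a,(1,2,0))  # move the channel axis last: (3,32,32) -> (32,32,3)
    print(img_a)
    # cv2.imshow expects BGR channel order, so reverse the last axis before displaying
    cv2.imshow("wcc", np.ascontiguousarray(img_a[:, :, ::-1]))
    userkey = cv2.waitKey()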
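
The parsing code above needs a real batch file. As a minimal sketch that runs without the dataset, the following builds a synthetic 3072-value row in the same layout (1024 R, then 1024 G, then 1024 B) and applies the same reshape and transpose. The function name `row_to_image` and the fill values 10, 20, 30 are hypothetical, chosen only for illustration.

```python
import numpy as np

def row_to_image(row):
    """Convert one CIFAR-10 style row (3072 values: R plane, G plane, B plane)
    into a (32, 32, 3) float image scaled to [0, 1]."""
    planes = np.asarray(row).reshape(3, 32, 32) / 255  # (channel, row, col)
    return np.transpose(planes, (1, 2, 0))             # (row, col, channel)

# Synthetic row (an assumption, not real data): every R value is 10, G is 20, B is 30.
row = np.concatenate([
    np.full(1024, 10, dtype=np.uint8),
    np.full(1024, 20, dtype=np.uint8),
    np.full(1024, 30, dtype=np.uint8),
])

img = row_to_image(row)
print(img.shape)        # shape after moving the channel axis last
print(img[0, 0] * 255)  # R, G, B values of the top-left pixel
```

Because every pixel in the synthetic row has the same colour, any pixel of `img` should come back as the (R, G, B) triple that went in, which is a quick way to confirm the axis order.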

  • Here, reshape(3,32,32) produces an array with 3 layers, 32 rows, and 32 columns. Each layer holds 1024 values: the first layer is all R channel data, the second layer is G, and the third layer is B.
    The effect of transpose(img_a,(1,2,0)) can be understood with the following example:
    [Figure: example matrices, with the original matrix marked by red circle 1]
    The matrix in red circle 1 is the original matrix. For ease of understanding, the (3,32,32) matrix is simplified to a (3,2,2) matrix with 3 layers, 2 rows, and 2 columns. The first layer is [0 1] [2 3], where [0 1] is its first row and [2 3] is its second row, and each row has 2 columns. The second layer is [4 5] [6 7] and the third is [8 9] [10 11]. Therefore, the RGB data for the four points should be: [0,4,8], [1,5,9], [2,6,10], [3,7,11].
    According to the CIFAR-10 dataset description, the first layer is all R data. After transposing with (1,2,0), the shape becomes (2,2,3): 2 rows, 2 columns, and 3 layers. Viewed by row and column, each point now contains three values, one taken from each layer of the original data, which together form that point's RGB triple. This matches the usual image data layout (height, width, channels), so the array can be passed to imshow. Note that cv2.imshow interprets the last axis as BGR, so the red and blue channels appear swapped unless the channel order is reversed first.
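
The simplified (3,2,2) case can be checked directly in NumPy. The sketch below uses `np.arange(12)` as a stand-in for the pixel data (an assumption for illustration), confirms the per-point values, and contrasts the result with reshaping straight to (2,2,3), which groups the wrong values together.

```python
import numpy as np

a = np.arange(12).reshape(3, 2, 2)   # 3 layers, 2 rows, 2 columns
t = np.transpose(a, (1, 2, 0))       # 2 rows, 2 columns, 3 layers

print(t.shape)                    # (2, 2, 3)
print(t.reshape(-1, 3).tolist())
# [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]

# transpose only relabels the axes: t[i, j, k] is a[k, i, j]
assert all(t[i, j, k] == a[k, i, j]
           for i in range(2) for j in range(2) for k in range(3))

# Reshaping directly to (2, 2, 3) keeps the flat storage order, so each "pixel"
# would hold three consecutive stored values rather than one value per layer.
wrong = np.arange(12).reshape(2, 2, 3)
print(wrong.reshape(-1, 3).tolist())
# [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
```

This is why the data is first reshaped to (3,32,32) and then transposed, instead of being reshaped to (32,32,3) in one step.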

Origin blog.csdn.net/wcc243588569/article/details/129666101