"Digital Image Processing-OpenCV/Python" serial (10) Image attributes and data types


This book’s JD discount purchase link: https://item.jd.com/14098452.html
This book’s CSDN exclusive serial column: https://blog.csdn.net/youcans/category_12418787.html


Chapter 2 Image Data Format

In Python, OpenCV stores images as Numpy arrays, so images are accessed and processed through Numpy array operations.


Summary of this chapter

  • Introduce the data structures OpenCV uses in Python and learn to obtain the basic attributes of an image.
  • Learn how to use Numpy arrays to create, copy, crop, splice, split and merge images.
  • Learn to use lookup tables (LUTs) to quickly replace pixel values.

2.1 Image attributes and data types


2.1.1 Image color classification

By color, images can be divided into binary images, grayscale images and color images.

  • Binary image: An image with only two colors, black and white. Each pixel value is 0/1 or 0/255, where 0 represents black and 1 or 255 represents white.
  • Grayscale image: An image with gray levels only. Each pixel value is an 8-bit number in [0, 255] giving the gray level, with 0 representing pure black and 255 representing pure white.
  • Color image: An image represented by a combination of three color channels: blue (B), green (G), and red (R). Each pixel uses three 8-bit numbers in [0, 255], one per color component. Written in B/G/R order, for example, (0,0,0) represents black, (0,0,255) represents red, and (255,255,255) represents white.

OpenCV reads and decodes images in BGR format and stores them as multi-dimensional Numpy arrays in B/G/R order, whereas libraries such as PIL, PyQt, and Matplotlib use RGB order.

In digital image processing, the color channel order of an image can be converted as needed, and a color image can be converted into a grayscale image or a binary image.


2.1.2 Representing digital images with Numpy arrays

Digital images are described by matrices composed of pixels, and are represented and processed as multi-dimensional Numpy arrays.

The Mat class defined by OpenCV in C++ is its most basic image storage format. The Python API relies on the Numpy library instead: images are stored and processed as multi-dimensional Numpy arrays. In Python, therefore, any operation OpenCV performs on an image is essentially an operation on a multi-dimensional Numpy array.

Binary images and grayscale images in OpenCV are represented by two-dimensional arrays. The shape of the array is (h, w), where the numbers of rows and columns are the height and width of the image respectively. The value of each element is the gray level of the pixel at the corresponding row and column. Binary images are special grayscale images with pixel values of 0/1 or 0/255.

A color image in OpenCV is represented by a three-dimensional array of shape (h, w, ch), where ch=3 is the number of channels; the data organization is shown in Figure 2-1. Each element of the array is one color component of one pixel.

The order of OpenCV color channels is B/G/R, so img[:,:,0] is the B channel of the color image img, img[:,:,1] is the G channel, and img[:,:,2] is the R channel.
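
As a small sketch of this indexing, the following builds a pure red image with Numpy alone; the image size (4×6) is an arbitrary assumption.

```python
import numpy as np

h, w = 4, 6
img = np.zeros((h, w, 3), dtype=np.uint8)  # all-black image, channels in B/G/R order
img[:, :, 2] = 255                          # fill the R channel: every pixel becomes red

# Each channel slice is itself a two-dimensional (h, w) array
b, g, r = img[:, :, 0], img[:, :, 1], img[:, :, 2]
print(b.shape, g.shape, r.shape)
print(img[0, 0])  # one pixel: its B, G and R components
```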

In OpenCV, the data structure of an image is a Numpy array, so all properties and operation methods of Numpy arrays are applicable to OpenCV image objects. For example:

  • img.ndim: The number of dimensions of the image array, which is 3 for a color image and 2 for a grayscale image.
  • img.shape: The shape of the image array, that is, the number of rows (height), the number of columns (width) and the number of channels. It is (h, w, ch) for a color image and (h, w) for a grayscale image.
  • img.size: The total number of array elements, which is the number of pixels multiplied by the number of channels.
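
These attributes can be checked without an image file. The sketch below uses blank arrays with an assumed non-square size of 480 rows by 640 columns, which makes it easy to see that shape lists the height before the width.

```python
import numpy as np

color = np.zeros((480, 640, 3), dtype=np.uint8)  # 480 rows (height), 640 columns (width)
gray = np.zeros((480, 640), dtype=np.uint8)

print(color.ndim, gray.ndim)    # 3 2
print(color.shape, gray.shape)  # (480, 640, 3) (480, 640)
print(color.size, gray.size)    # 921600 307200
```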

2.1.3 Image data type

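OpenCV functions have strict requirements for data types. Passing an array with an unsupported data type usually raises an error at run time (not a syntax error), and in some cases produces unexpected results.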

The names of image data types in OpenCV follow this format.

CV_{bit depth}{data type}C{number of channels}

The data type is U (unsigned integer), S (signed integer) or F (floating point). For example, CV_8UC3 denotes a three-channel matrix of 8-bit unsigned integers.

The comparison between OpenCV data types and Numpy data types is shown in Table 2-1. In image processing, the most commonly used data type is the 8-bit unsigned integer data CV_8U, and the corresponding Numpy data type is uint8.
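
The correspondence can also be expressed as a small lookup. The sketch below covers the seven commonly used depths; DEPTH_NAMES and cv_type_name are hypothetical helper names introduced here for illustration and are not part of OpenCV or Numpy.

```python
import numpy as np

# Numpy dtype -> OpenCV depth name (hypothetical helper table)
DEPTH_NAMES = {
    np.dtype(np.uint8): "8U",
    np.dtype(np.int8): "8S",
    np.dtype(np.uint16): "16U",
    np.dtype(np.int16): "16S",
    np.dtype(np.int32): "32S",
    np.dtype(np.float32): "32F",
    np.dtype(np.float64): "64F",
}

def cv_type_name(arr):
    """Return the OpenCV-style type name, e.g. 'CV_8UC3', for an image array."""
    depth = DEPTH_NAMES[arr.dtype]  # raises KeyError for dtypes not in the table
    channels = 1 if arr.ndim == 2 else arr.shape[2]
    return "CV_{}C{}".format(depth, channels)

print(cv_type_name(np.zeros((4, 4, 3), dtype=np.uint8)))  # CV_8UC3
print(cv_type_name(np.zeros((4, 4), dtype=np.float32)))   # CV_32FC1
```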

It is recommended to use the name of the Numpy data type when calling Numpy library functions, and use the name of the OpenCV data type when calling OpenCV functions to avoid errors.

Use img.dtype to get the data type of the Numpy array, and use img.astype to convert the image data type into the specified Numpy data type.
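
Note that astype only changes the storage type: it does not rescale values, and converting a float to an integer discards the fractional part. A minimal sketch of a uint8 to float32 round trip, using an assumed three-pixel test array:

```python
import numpy as np

img = np.array([[0, 128, 255]], dtype=np.uint8)

img_float = img.astype(np.float32) / 255  # float32 values in [0, 1]

# Going back: scale, round and clip explicitly before converting to uint8
restored = np.clip(np.round(img_float * 255), 0, 255).astype(np.uint8)

# astype on its own simply drops the fractional part
truncated = np.array([1.9, 254.7], dtype=np.float32).astype(np.uint8)

print(img_float.dtype, restored.dtype)
print(np.array_equal(img, restored))
print(truncated)
```

Rounding and clipping before the conversion keeps the values inside [0, 255], so the round trip reproduces the original pixels.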


[Routine 0201] Image attributes and data type conversion

This routine uses the Numpy array operation method to obtain the image attributes and data format.


# [0201] Image attributes and data type conversion
import cv2 as cv
import numpy as np

if __name__ == '__main__':
    # Read the image; common formats such as BMP, JPG, PNG and TIFF are supported.
    # Note: cv.imread returns None if the file cannot be read.
    filepath = "../images/imgLena.tif"  # path of the image file
    img = cv.imread(filepath, flags=1)  # flags=1: read as a color image (BGR)
    gray = cv.imread(filepath, flags=0)  # flags=0: read as a grayscale image

    # Number of dimensions (ndim), shape, total number of elements (size), data type (dtype)
    print("Ndim of img(BGR): {}, gray: {}".format(img.ndim, gray.ndim))
    print("Shape of img(BGR): {}, gray: {}".format(img.shape, gray.shape))  # number of rows, columns and channels
    print("Size of img(BGR): {}, gray: {}".format(img.size, gray.size))  # size = rows × columns × channels

    imgFloat = img.astype(np.float32) / 255  # convert to float32 and scale to [0, 1]
    print("Dtype of img(BGR): {}, gray: {}".format(img.dtype, gray.dtype))  # uint8
    print("Dtype of imgFloat: {}".format(imgFloat.dtype))  # float32


Output:

Ndim of img(BGR): 3, gray: 2
Shape of img(BGR): (512, 512, 3), gray: (512, 512)
Size of img(BGR): 786432, gray: 262144
Dtype of img(BGR): uint8, gray: uint8
Dtype of imgFloat: float32

Program description:
(1) The color image is a three-dimensional Numpy array, and the grayscale image is a two-dimensional Numpy array. Therefore, a color image of the same size has the same number of pixels as a grayscale image, but a different number of array elements.
(2) The shape of the color image is (h, w, 3), and the shape of the grayscale image is (h, w). To get the image height and width, use h, w=img.shape[:2], which works for both. Using h, w=img.shape is not recommended, because it fails for a color image, whose shape has three values.
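
Point (2) can be checked with a short sketch. Blank 512×512 arrays stand in for the image here, which is an assumption made so that the example runs without the image file.

```python
import numpy as np

color = np.zeros((512, 512, 3), dtype=np.uint8)
gray = np.zeros((512, 512), dtype=np.uint8)

# Slicing the first two entries works for both color and grayscale images
for name, img in (("color", color), ("gray", gray)):
    h, w = img.shape[:2]
    print(name, h, w)

# Unpacking the full shape fails for a color image: three values, two names
try:
    h, w = color.shape
    unpack_ok = True
except ValueError as err:
    unpack_ok = False
    print("h, w = color.shape failed:", err)
```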



Copyright statement:
youcans@xupt original work, reprints must be marked with the original link: (https://blog.csdn.net/youcans/article/details/133561857) Created: 2023-10-05
Copyright 2023 youcans, XUPT
