Chapter 3 - OpenCV Basics - 2&3 - Image Processing

Pixel processing

The pixel is the basic unit of an image, and pixel processing is the most basic image-processing operation. Individual pixels can be read and modified through their index.

An image has two intuitive attributes, width and height. Together they define a coordinate system with the origin at the upper-left corner, the x-axis running along the width (columns) and the y-axis running along the height (rows). Each coordinate is the index of one pixel, so the whole image is a two-dimensional array. Note that the array itself is indexed as [row, column], that is, [y, x].
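
The row-first indexing order is easy to get backwards, so here is a minimal sketch using a small synthetic array instead of a real image. The array size and the names x, y, and img are chosen only for illustration.

```python
import numpy as np

# A hypothetical grayscale image with 3 rows (height) and 5 columns (width).
img = np.zeros((3, 5), np.uint8)

# Mark the pixel at x=4 (column) and y=1 (row).
# The array is indexed as [row, column], so y comes first.
x, y = 4, 1
img[y, x] = 255

height, width = img.shape
print(height, width)  # 3 5
print(img[1, 4])      # 255
```

Swapping the two indices here (img[4, 1]) would raise an IndexError, because the array only has 3 rows.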

In a binary image or a grayscale image, the value stored at each index is the brightness of that pixel. In a binary image every pixel is either 0 or 255; in a grayscale image every pixel is an integer in the range [0, 255]. A binary image is therefore a special case of a grayscale image. In memory, a grayscale image is a two-dimensional array of integers.
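
To make the relationship between the two image types concrete, the sketch below turns a tiny grayscale array into a binary one. The threshold of 127 and the sample values are arbitrary assumptions for the example, and plain NumPy is used rather than any OpenCV thresholding function.

```python
import numpy as np

# A hypothetical 2x3 grayscale image: any integer from 0 to 255 is allowed.
gray = np.array([[0, 64, 128],
                 [192, 255, 100]], dtype=np.uint8)

# Map every pixel to 0 or 255. The threshold 127 is an arbitrary choice.
binary = np.where(gray > 127, 255, 0).astype(np.uint8)

print(binary.tolist())             # [[0, 0, 255], [255, 255, 0]]
print(np.unique(binary).tolist())  # [0, 255]
```

The result has the same shape and dtype as the grayscale input; it simply uses only two of the 256 possible values.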

In a color image, the value at each index is a list-like group of three color values, written here as [aa bb cc], taken from the B, G, and R channels at that position. When the B, G, and R values are equal at every position, the image looks like a grayscale image, but it is still a three-channel color image. A color image can be seen as a two-dimensional array whose elements have the form [aa bb cc], or equivalently as a three-dimensional array: picture a box whose length and width are the height and width of the image and whose depth is the number of channels, usually 3.
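
The difference between a true grayscale array and a color array that merely looks gray shows up in the array shape. The following is a small sketch with made-up pixel values; np.dstack is used here only to build the three-channel array.

```python
import numpy as np

# A hypothetical 2x2 grayscale image.
gray = np.array([[10, 200],
                 [90, 255]], dtype=np.uint8)

# Copy the same values into the B, G, and R channels.
# The result looks gray on screen but is a three-channel array.
color = np.dstack([gray, gray, gray])

print(gray.shape)   # (2, 2)
print(color.shape)  # (2, 2, 3)
print(color[0, 1])  # [200 200 200]
```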

Grayscale

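import cv2 as cv
import numpy as np

# Create a 10x10 two-dimensional array filled with 0
pic = np.zeros((10, 10), np.uint8)
print(pic)
# Displays as pure black
cv.imshow("np zero", pic)
# Create a 10x10 two-dimensional array filled with 1
pic = np.ones((10, 10), np.uint8)
# Displays as near-black (1 cannot be told apart from 0 by eye)
cv.imshow("np one", pic)
print(pic)
# Scale every element of the array from 1 to 255
pic = pic[:, :] * 255
print(pic)
# Change the value of one pixel: index (5, 5) is row 6, column 6
pic[5, 5] = 0
print(pic)
# Displays as pure white, except pixel (5, 5), which is black
cv.imshow("np 255", pic)
cv.waitKey()
cv.destroyAllWindows()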

Notice:

  1. numpy.zeros((h, w), numpy.uint8) and numpy.ones((h, w), numpy.uint8) create two-dimensional arrays with h rows and w columns, filled with 0 and 1 respectively. The shape is given as (rows, columns), and the element type is numpy.uint8, an 8-bit unsigned integer.
  2. pic[:, :] is a slice that selects the whole two-dimensional array. To select only part of it, give row and column ranges, exactly as with ordinary slices.
  3. The image data can also come from reading a grayscale image with imread.
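
As a small illustration of notes 1 and 2, the sketch below builds a non-square array so the (rows, columns) order is visible, then selects only part of it with a slice. The sizes and ranges are arbitrary.

```python
import numpy as np

# Shape is (rows, columns): 4 rows and 6 columns.
pic = np.zeros((4, 6), np.uint8)

# Partial selection: rows 1 to 2 and columns 2 to 4 (the end index is excluded).
pic[1:3, 2:5] = 255

print(pic.shape)  # (4, 6)
print(pic)
```

Only the 2x3 block in the middle becomes 255; every other element stays 0.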

Color image

import cv2 as cv

lena = cv.imread("lenacolor.png")
print(lena)  # Print the color values of every pixel and channel
print("---------------------")
print(lena[0])  # Print all channels of every pixel in the first row
print("---------------------")
print(lena[0, 0])  # Print the three channel values of a single pixel
print("---------------------")
print(lena[0, 0, 0])  # B channel value at the origin
print(lena[0, 0][0])  # Same as the line above
print("---------------------")
print(lena[0, 0, 1])  # G channel value at the origin
print(lena[0, 0][1])  # Same as the line above
print("---------------------")
print(lena[0, 0, 2])  # R channel value at the origin
print(lena[0, 0][2])  # Same as the line above

# Use slicing to set the B and G channels (0:2) of a region to 255
lena[10:100, 20:120, 0:2] = 255

# Show only rows 0 to 255 of the image
cv.imshow("half", lena[0:256, :])
cv.waitKey()
cv.destroyAllWindows()

Notice:

  1. OpenCV reads a color image with its channels in B, G, R order, row by row, and stores the pixel values in an ndarray in that order.
  2. A single channel value can be read as lena[row, col, ch] or lena[row, col][ch].
  3. The data in the ndarray can be modified through slices. Keep the row, column, and channel indices within the bounds of the array: an out-of-range integer index raises IndexError, while an out-of-range slice is silently clipped.
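
The bounds behavior in note 3 is standard NumPy behavior and can be checked without loading an image. This is a minimal sketch on a synthetic 4x4 three-channel array; the variable name raised is only there to record the outcome.

```python
import numpy as np

# A hypothetical 4x4 image with 3 channels, all zeros.
img = np.zeros((4, 4, 3), np.uint8)

# A slice that runs past the last row is clipped without an error:
# only rows 2 and 3 exist, so only those are written.
img[2:10, 0:2, 0] = 255

# An integer index past the end raises IndexError instead.
try:
    img[4, 0, 0] = 255
    raised = False
except IndexError:
    raised = True

print(raised)                     # True
print(int(img[:, :, 0].sum()))    # 1020, i.e. four pixels set to 255
```

Because clipping is silent, a slice with wrong bounds can modify a smaller region than intended without any warning.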

Channel split

An RGB image is composed of an R channel, a G channel, and a B channel. After OpenCV reads it, the channel order is B, G, R. Besides the slicing approach shown earlier, OpenCV provides the function split() to extract the individual channels.

import cv2 as cv

lena = cv.imread("face1.jpg")
# Get each single-channel array by slicing (index 0 = B, 1 = G, 2 = R)
b1 = lena[:, :, 0]
g1 = lena[:, :, 1]
r1 = lena[:, :, 2]
print(b1)
print(g1)
print(r1)

# Get all single-channel arrays at once with cv2.split()
b2, g2, r2 = cv.split(lena)
# The statement above is equivalent to the three statements below
b2 = cv.split(lena)[0]
g2 = cv.split(lena)[1]
r2 = cv.split(lena)[2]
print(b2)
print(g2)
print(r2)
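
One detail of the slicing approach is worth a sketch: a NumPy slice is a view of the original array, so writing to a sliced channel also changes the image. The example below uses NumPy only, with a synthetic 2x2 array standing in for the image file; cv.split() is not called here, and the pixel values are made up.

```python
import numpy as np

# A hypothetical 2x2 image in B, G, R order.
img = np.zeros((2, 2, 3), np.uint8)
img[:, :, 0] = 10  # B
img[:, :, 1] = 20  # G
img[:, :, 2] = 30  # R

# Slicing returns a view: b shares memory with img.
b = img[:, :, 0]
b[0, 0] = 99
print(img[0, 0])     # [99 20 30]

# An explicit copy is independent of the original image.
g_copy = img[:, :, 1].copy()
g_copy[0, 0] = 77
print(img[0, 0, 1])  # 20
```

If a channel needs to be edited without touching the source image, taking a copy first, as in the second half of the sketch, avoids the side effect.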

Channel merge

import cv2 as cv

lena = cv.imread("face1.jpg")
# Get all single-channel arrays at once with cv2.split()
b1, g1, r1 = cv.split(lena)

cv.imshow("lena", lena)

# Merge the channels back into a color image with cv2.merge().
# The channel order must stay B, G, R.
new_lena = cv.merge([b1, g1, r1])
cv.imshow("new_lena", new_lena)
cv.waitKey()
cv.destroyAllWindows()
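
Why the channel order matters when merging can be shown with a tiny synthetic example. The sketch below uses np.dstack as a NumPy stand-in for stacking three single-channel arrays; it does not call cv.merge(), and the 2x2 size and pure-blue values are assumptions for illustration.

```python
import numpy as np

# Three hypothetical single-channel arrays describing a pure blue image.
b = np.full((2, 2), 255, np.uint8)
g = np.zeros((2, 2), np.uint8)
r = np.zeros((2, 2), np.uint8)

# Stacked in B, G, R order, the first value of each pixel is blue.
bgr = np.dstack([b, g, r])
# Stacked in R, G, B order, the same data puts 255 in the last position.
rgb = np.dstack([r, g, b])

print(bgr[0, 0].tolist())  # [255, 0, 0]
print(rgb[0, 0].tolist())  # [0, 0, 255]
```

A function that expects B, G, R order would interpret the second array as pure red, which is why the order must not be shuffled.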

Resize image

OpenCV provides cv2.resize() to change the size of an image, either by scale factors or to a specific target size.

import cv2 as cv

# dst = cv2.resize(src, dsize)                dsize = (width, height)
# dst = cv2.resize(src, None, fx=..., fy=...)   fx scales the width, fy scales the height
lena = cv.imread("face1.jpg")
cv.imshow("lena", lena)
# Resize to a fixed size
size = (300, 300)
resized_300 = cv.resize(lena, size)
cv.imshow("resize", resized_300)
# Resize by ratio: compute the target size from the original shape
rows, cols = lena.shape[:2]
resize1 = cv.resize(lena, (round(cols * 0.5), round(rows * 0.5)))
cv.imshow("resize1 0.5 0.5", resize1)

# Resize by ratio with fx and fy: width (columns) x0.5, height (rows) x0.3
resize2 = cv.resize(lena, None, fx=0.5, fy=0.3)
cv.imshow("resize2 0.5 0.3", resize2)

cv.waitKey()
cv.destroyAllWindows()
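
The target size passed to cv2.resize() is (width, height), which is the reverse of the (rows, columns) order in shape. The helper below, scaled_dsize, is a hypothetical name introduced only for this sketch; it computes the target size from a shape and two scale factors in plain Python and does not call OpenCV.

```python
def scaled_dsize(shape, fx, fy):
    """Return (width, height) for an image of the given shape scaled by fx and fy."""
    rows, cols = shape[:2]
    # Width comes from the columns, height from the rows.
    return (round(cols * fx), round(rows * fy))

# A hypothetical 512x512 color image scaled by fx=0.5, fy=0.3.
print(scaled_dsize((512, 512, 3), 0.5, 0.3))  # (256, 154)
# A hypothetical 480-row by 640-column image scaled to half size.
print(scaled_dsize((480, 640), 0.5, 0.5))     # (320, 240)
```

Computing the size explicitly like this makes it easier to see which factor applies to which axis before the value is handed to a resize call.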

ROI

In image processing, the part of the image that you want to work on is called the region of interest (ROI).

Working with an ROI simply means processing one region of the image rather than the whole image. The region is selected with the same slicing shown in the color image example.
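
A minimal sketch of ROI handling with NumPy slicing follows. It uses a synthetic 6x8 array instead of a real image, and the region coordinates are arbitrary; the point is only that selecting, modifying, and pasting a region are all slice operations.

```python
import numpy as np

# A hypothetical 6-row by 8-column black image in B, G, R order.
img = np.zeros((6, 8, 3), np.uint8)

# Select the ROI: rows 1 to 3 and columns 2 to 5.
roi = img[1:4, 2:6]

# Paint the ROI red (B=0, G=0, R=255). roi is a view, so img changes too.
roi[:, :] = (0, 0, 255)

# Copy the ROI and paste it elsewhere. The target slice must have the same shape.
patch = roi.copy()
img[3:6, 4:8] = patch

print(roi.shape)                          # (3, 4, 3)
print(int((img[:, :, 2] == 255).sum()))   # 22 red pixels in total
```

The two regions overlap by two pixels, which is why the total is 22 rather than 24.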

Origin blog.csdn.net/sunguanyong/article/details/129188852