Application of OpenCV's HSV color space in color recognition in unmanned vehicles

RGB is the three-primary color space and the one most people know: any visible color can be produced by mixing red, green and blue. For image processing, however, color work is usually done in HSV space. HSV (Hue, Saturation, Value) is a color space built around the intuitive properties of color, also known as the hexcone model.

 

In OpenCV, 8-bit HSV values fall in these ranges: H in [0, 179] (the hue angle in degrees divided by 2), S in [0, 255], V in [0, 255]. Hue wraps around the color wheel: red sits at both ends (near 0 and near 179), green is around 60 and blue around 120. Describing a color by a hue interval is also more precise than simply calling it "red". The smaller S (saturation) is, the paler the color, and the larger it is, the more vivid; the smaller V (value, or brightness) is, the darker the color, and the larger it is, the brighter. Notice the color change in the picture above!
The reason for choosing HSV is that H alone largely determines which color a pixel is; saturation and value then only need to exceed a threshold to rule out grayish or dark pixels. In RGB, by contrast, all three components contribute to the color, so you would have to judge the ratio between them. A color is therefore easier to describe as a range in HSV, which makes HSV more convenient for recognition.
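To make the value ranges concrete, here is a minimal sketch that converts a few pure colors to OpenCV-style HSV using only the standard library. The helper name bgr_to_opencv_hsv is made up for this illustration, and the scaling (H to 0..179, S and V to 0..255) mimics OpenCV's 8-bit convention rather than calling OpenCV itself, so rounding may differ slightly from cv2.cvtColor on some inputs.

```python
import colorsys

def bgr_to_opencv_hsv(b, g, r):
    """Convert one 8-bit BGR pixel to OpenCV-style 8-bit HSV (hypothetical helper)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)  # all in 0..1
    # OpenCV stores hue as degrees / 2 so that it fits in one byte
    return round(h * 180) % 180, round(s * 255), round(v * 255)

for name, bgr in [('red', (0, 0, 255)), ('green', (0, 255, 0)), ('blue', (255, 0, 0))]:
    print(name, bgr_to_opencv_hsv(*bgr))
# red (0, 255, 255)
# green (60, 255, 255)
# blue (120, 255, 255)
```

Blue lands at H = 120, which is why a range such as 100 to 124 is a reasonable hue window for it.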

1. Example demonstration

Let's look at an example: take a pack of "Kuanzhai" (wide and narrow) brand cigarettes with blue outer packaging, then identify and track its color.

1.1. Color recognition and tracking

Since my desktop has no camera installed, I use the camera on the unmanned vehicle here. Getting the camera video works a little differently from plain OpenCV, but it is similar, since the vehicle's library is itself based on OpenCV.

from jetbotmini import Camera
from jetbotmini import bgr8_to_jpeg
import cv2
import numpy as np
import traitlets
import ipywidgets.widgets as widgets
from IPython.display import display

# Target color: lower and upper HSV bounds, set to blue here
color_lower = np.array([100,43,46])
color_upper = np.array([124, 255, 255])
# Camera instance
camera = Camera.instance(width=720, height=720)
# Display widget (a video is just a sequence of image frames)
color_image = widgets.Image(format='jpeg', width=500, height=400)
display(color_image)

# Recognize the color in real time and feed the result to the widget above
while 1:
    frame = camera.value # (720, 720, 3) (H,W,C)
    frame = cv2.resize(frame, (400, 400)) # (400, 400, 3)
    frame = cv2.GaussianBlur(frame,(5,5),0) # Gaussian filter: (5, 5) is the kernel width and height; sigma 0 means it is derived from the kernel size
    hsv = cv2.cvtColor(frame,cv2.COLOR_BGR2HSV) # convert BGR to HSV
    mask=cv2.inRange(hsv,color_lower,color_upper)
    mask=cv2.erode(mask,None,iterations=2) # erosion, removes ragged edges and small specks
    mask=cv2.dilate(mask,None,iterations=2) # dilation
    mask=cv2.GaussianBlur(mask,(3,3),0)
    cnts=cv2.findContours(mask.copy(),cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)[-2] # contours
    if len(cnts)>0:
        cnt = max(cnts,key=cv2.contourArea) # contour with the largest area
        (color_x,color_y),color_radius=cv2.minEnclosingCircle(cnt) # center and radius of the minimum enclosing circle
        if color_radius > 10:
            # mark it with a circle
            cv2.circle(frame,(int(color_x),int(color_y)),int(color_radius),(255,0,255),2)
    color_image.value = bgr8_to_jpeg(frame) # encode as JPEG and pass to the Image widget

In the tracking view, the pink circle follows the blue object. The camera and widgets usage involves a number of web interactive components; if you are interested, see: Real-time captured pictures of the camera of the unmanned vehicle and related operations of the widgets

The code is fairly straightforward. Instantiate the camera, then process every frame: first convert it from BGR (the frames read here are BGR, not RGB) to HSV with cv2.cvtColor, then use the HSV range of the target color to segment the frame and mark the result. If recognition is poor under the current ambient light, adjust the parameters used inside the endless loop by hand, for example the color bounds or the radius threshold.

1.2. cv2.inRange

Next, some of the functions used in the code above are explained. When making a mask, mask=cv2.inRange(hsv,color_lower,color_upper) sets a pixel to 255 (white) if every channel lies between the lower and upper bound arrays, inclusive, and to 0 (black) otherwise. The result is a single-channel image. Let's look at the code to understand it intuitively:

import cv2
import numpy as np
import matplotlib.pyplot as plt

img_cv2 = cv2.imread('test.jpg')
hsv = cv2.cvtColor(img_cv2, cv2.COLOR_BGR2HSV)
lowerb = np.array([20, 20, 20])
upperb = np.array([200, 200, 200])

# Single-channel black-and-white mask, shape (H, W)
mask = cv2.inRange(hsv, lowerb, upperb)
cv2.imshow('Display', mask)
cv2.waitKey(0)
cv2.destroyAllWindows()

# matplotlib expects RGB, so the BGR array is shown with red and blue swapped
plt.subplot(1,2,1); plt.imshow(img_cv2,aspect='auto');plt.axis('off');plt.title('BGR')
plt.subplot(1,2,2); plt.imshow(mask,cmap='gray',aspect='auto');plt.axis('off');plt.title('mask')
plt.show()

 BGR and mask as shown below:
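The thresholding rule itself can also be checked on a tiny hand-made array. The sketch below re-implements the rule in NumPy purely for illustration (in_range is a made-up name, not an OpenCV function) and applies the blue bounds from section 1.1 to four HSV pixels.

```python
import numpy as np

def in_range(hsv, lower, upper):
    """255 where every channel is within [lower, upper] (inclusive), else 0."""
    inside = np.all((hsv >= lower) & (hsv <= upper), axis=-1)
    return inside.astype(np.uint8) * 255

# A 1x4 "image" of HSV pixels
hsv = np.array([[[110, 200, 200],    # blue, inside the range
                 [110,  30, 200],    # blue hue but too pale (S below 43)
                 [ 60, 200, 200],    # green hue, outside the H range
                 [124, 255, 255]]],  # exactly on the upper bound
               dtype=np.uint8)

mask = in_range(hsv, np.array([100, 43, 46]), np.array([124, 255, 255]))
print(mask)  # [[255   0   0 255]]
```

A pixel passes only if all three channels pass, which is why the pale pixel is rejected even though its hue is blue.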

 

1.3. cv2.erode and cv2.dilate

Erosion is an image morphology operation and does what the name suggests: it erodes the bright regions of an image. The function's help text gives its signature: erode(src, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) -> dst. The output image dst has the same size as the source image src. The kernel determines how much is eroded; passing None selects a default 3x3 rectangular kernel. A quick test:

import cv2
import numpy as np

image = cv2.imread('test.jpg')
kernel = np.ones((5, 5), np.uint8)
image = cv2.erode(image, None)
cv2.imshow('erode', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

In image = cv2.erode(image, None) the second argument, the kernel, is passed as None, so the default is used. Try image = cv2.erode(image, kernel) instead, and change the size of the kernel to see the effect:

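In essence, erosion slides the kernel over the image and takes the minimum of the pixels it covers. On a binary image, an output pixel stays 1 only if every pixel under the kernel is 1; otherwise it becomes 0. As the comment in the tracking code says, this removes ragged edges and small noise specks. cv2.dilate does the opposite and takes the maximum under the kernel. After erosion the black regions grow and the white regions shrink; after dilation the white regions grow and the black regions shrink. Dilating after eroding restores the size of the regions that survived, but specks that erosion removed do not come back.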
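The minimum/maximum view of erosion and dilation can be sketched in a few lines of NumPy. This is an illustration of the idea with a fixed 3x3 square kernel, not OpenCV's implementation; the names erode3x3 and dilate3x3 are made up here. The border is padded so that the image edge itself does not erode, which I believe matches OpenCV's default behavior.

```python
import numpy as np

def _neighborhoods(mask, pad_value):
    """The nine shifted copies of mask that make up each 3x3 neighborhood."""
    padded = np.pad(mask, 1, mode="constant", constant_values=pad_value)
    h, w = mask.shape
    return [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]

def erode3x3(mask):
    """Each output pixel is the minimum over its 3x3 neighborhood."""
    return np.min(_neighborhoods(mask, 255), axis=0)

def dilate3x3(mask):
    """Each output pixel is the maximum over its 3x3 neighborhood."""
    return np.max(_neighborhoods(mask, 0), axis=0)

mask = np.zeros((9, 9), np.uint8)
mask[2:7, 2:7] = 255   # a 5x5 white object
mask[0, 8] = 255       # an isolated white noise pixel

eroded = erode3x3(mask)       # object shrinks to 3x3, noise pixel disappears
opened = dilate3x3(eroded)    # object grows back to 5x5, noise stays gone

print(int(eroded.sum() // 255))  # 9 white pixels left after erosion
print(int(opened[0, 8]))         # 0
```

This erode-then-dilate sequence is the same cleanup that the tracking loop applies to its mask with iterations=2.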

1.4. cv2.GaussianBlur

Gaussian blur is also used for denoising. Let's add Gaussian noise to a picture and then use this function to reduce it; the filter is mainly effective against Gaussian noise:

import cv2  as cv
import numpy as np
 
def myShow(name,img):
    cv.imshow(name,img)
    cv.waitKey(0)
    cv.destroyAllWindows()
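# Add Gaussian noise; val is the noise variance on a 0..1 intensity scale
def addGauss(img,mean=0,val=0.01):
    img = img / 255
    gauss = np.random.normal(mean,val**0.5,img.shape) # standard deviation = sqrt(variance)
    img = img + gauss
    return np.clip(img,0,1) # keep values in the displayable 0..1 range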

img = cv.imread('gauss.png')
img1 = addGauss(img)
myShow('img1',img1)

img2 = cv.GaussianBlur(img1,(3,3),0)
myShow('img2',img2)

Here the three images are put together: the original, the one with Gaussian noise added, and the one after noise reduction, as shown below:
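To see why the blur reduces Gaussian noise, the following NumPy-only sketch builds a small Gaussian kernel and applies it as two 1-D passes to a flat gray image with added noise. The function names are made up for this illustration. The formula used when sigma is 0 is the one given in OpenCV's documentation for deriving sigma from the kernel size; the exact weights OpenCV uses for very small kernels may differ slightly, so treat the numbers as approximate.

```python
import numpy as np

def gaussian_kernel_1d(ksize, sigma=0):
    """Normalized 1-D Gaussian weights; sigma <= 0 derives it from ksize."""
    if sigma <= 0:
        sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8
    x = np.arange(ksize) - (ksize - 1) / 2
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, ksize=3, sigma=0):
    """Separable Gaussian blur of a 2-D float image with reflected borders."""
    k = gaussian_kernel_1d(ksize, sigma)
    r = ksize // 2
    h, w = img.shape
    padded = np.pad(img, r, mode="reflect")
    horizontal = sum(k[i] * padded[:, i:i + w] for i in range(ksize))  # blur along x
    return sum(k[i] * horizontal[i:i + h, :] for i in range(ksize))    # blur along y

rng = np.random.default_rng(0)
clean = np.full((64, 64), 0.5)                     # flat gray image, 0..1 scale
noisy = clean + rng.normal(0, 0.1, clean.shape)    # Gaussian noise, std 0.1
denoised = gaussian_blur(noisy, ksize=3)

print(np.std(noisy - clean), np.std(denoised - clean))  # the second value is clearly smaller
```

Each output pixel is a weighted average of its neighbors, so independent noise partly cancels out, at the cost of softening real edges as well.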

1.5. cv2.findContours and drawContours

The contour-finding function findContours(image, mode, method[, contours[, hierarchy[, offset]]]) -> contours, hierarchy works on a binary image, so we first convert the picture to grayscale and then to a binary image with a threshold.
Finally, we draw the contours with drawContours(image, contours, contourIdx, color[, thickness[, lineType[, hierarchy[, maxLevel[, offset]]]]]) -> image. Let's take a look at the code:

import cv2
import numpy as np

image = cv2.imread('test.jpg')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
ret, binary = cv2.threshold(gray,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
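# [-2:] keeps this working on OpenCV 3, where findContours returns three values
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
image = cv2.drawContours(image,contours,-1,(255,0,0),2) # -1 draws every contour, in blue (BGR)

#cv2.imshow('binary',binary)
cv2.imshow('contours',image)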
cv2.waitKey(0)
cv2.destroyAllWindows()

 

cv2.RETR_EXTERNAL : Retrieve only the outermost contours, ignoring any contours nested inside them.
cv2.CHAIN_APPROX_SIMPLE : Compress horizontal, vertical, and diagonal segments and keep only their end points. For example, a rectangular contour needs only 4 points to store its outline.
cv2.CHAIN_APPROX_NONE : Store all contour points; the pixel positions of two adjacent points differ by at most 1.
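The difference between the two approximation modes can be illustrated in plain Python. The sketch below is only a model of the idea, not OpenCV's algorithm, and both function names are made up: it lists every boundary pixel of a rectangle (what CHAIN_APPROX_NONE would store) and then drops the points that lie in the middle of a straight run (the effect of CHAIN_APPROX_SIMPLE).

```python
def rectangle_boundary(x0, y0, w, h):
    """All boundary pixels of a w x h rectangle, in order around the contour."""
    top = [(x, y0) for x in range(x0, x0 + w - 1)]
    right = [(x0 + w - 1, y) for y in range(y0, y0 + h - 1)]
    bottom = [(x, y0 + h - 1) for x in range(x0 + w - 1, x0, -1)]
    left = [(x0, y) for y in range(y0 + h - 1, y0, -1)]
    return top + right + bottom + left

def compress_straight_runs(points):
    """Keep only the points where the direction of travel changes."""
    n = len(points)
    kept = []
    for i in range(n):
        px, py = points[i - 1]          # previous point (wraps around at i = 0)
        cx, cy = points[i]
        nx, ny = points[(i + 1) % n]    # next point (wraps around at the end)
        if (cx - px, cy - py) != (nx - cx, ny - cy):
            kept.append(points[i])
    return kept

full = rectangle_boundary(5, 5, 20, 10)
simple = compress_straight_runs(full)
print(len(full), len(simple))  # 56 4
print(simple)                  # [(5, 5), (24, 5), (24, 14), (5, 14)]
```

For tracking, the compressed form is usually enough, since functions such as contourArea and minEnclosingCircle only need the shape of the outline.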

1.6. HSV color values

Blue is used as the example here. What if you want another color? The corresponding HSV values are shown in the figure below:
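One way to use such a table in code is sketched below. Only the blue row is taken from this article's code; the other hue ranges are rough, commonly quoted approximations and should be treated as assumptions to tune against your own camera and lighting. The names HUE_RANGES and classify_hsv are made up for this illustration.

```python
# Approximate OpenCV hue ranges as (h_min, h_max) pairs.
# Only 'blue' comes from this article's code; the rest are assumed values.
HUE_RANGES = {
    'red': [(0, 10), (156, 180)],   # red wraps around both ends of the hue axis
    'yellow': [(26, 34)],
    'green': [(35, 77)],
    'blue': [(100, 124)],
}
S_MIN, V_MIN = 43, 46  # same saturation/value floor as color_lower in section 1.1

def classify_hsv(h, s, v):
    """Return the color name for one OpenCV HSV pixel, or None if no range fits."""
    if s < S_MIN or v < V_MIN:
        return None  # too gray or too dark to call it a color
    for name, ranges in HUE_RANGES.items():
        if any(lo <= h <= hi for lo, hi in ranges):
            return name
    return None

print(classify_hsv(120, 255, 255))  # blue
print(classify_hsv(170, 200, 200))  # red
print(classify_hsv(120, 10, 255))   # None
```

Because red occupies two separate hue intervals, a mask for red needs two cv2.inRange calls whose results are combined, for example with cv2.bitwise_or.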

 

2. OpenCV knowledge points

A lot of OpenCV knowledge is used here. Let's get familiar with the most common operations, such as reading a picture and converting it to grayscale.

2.1. Read and display pictures

import cv2

img = cv2.imread('test.jpg', 0)
cv2.imshow("image",img)
# If the key wait and window cleanup below are commented out, the window shows "not responding" and the picture is not displayed properly
cv2.waitKey()
cv2.destroyAllWindows()

 Of course, if this visual library is not installed, an error will be reported: ModuleNotFoundError: No module named 'cv2'

Installation command: pip install opencv-python -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com 

When reading an image with cv2.imread() , a second argument of 0 selects grayscale mode. You can also write cv2.IMREAD_GRAYSCALE instead of 0.
The other common values are:
1 : load a color image, cv2.IMREAD_COLOR (the default)
-1 : load the image unchanged, including any alpha channel, cv2.IMREAD_UNCHANGED

2.2. Read and save pictures

img = cv2.imread('test.jpg', 0)
cv2.imwrite('new.jpg', img)

In this way, the image read in grayscale mode is saved as a new image by cv2.imwrite .

2.3. Read and display video

Video works in much the same way. Put simply, playing a video means reading its frames (pictures) one after another.

import cv2

cap = cv2.VideoCapture('test.mp4')
print(cap.read()[1].shape) # (1080, 1920, 3)

This reads a single frame of the video; cap.read() returns a tuple (ret, frame). What about reading the entire video? Let's take a look:

import cv2

cap = cv2.VideoCapture('test.mp4')
if (cap.isOpened() == False):
    print('Cannot open the video file')
else:
    fps = cap.get(cv2.CAP_PROP_FPS) # the constant equals 5, which can be passed directly
    print('Frame rate:', fps,'FPS') # Frame rate: 29.97002997002997 FPS
    f_count = cap.get(cv2.CAP_PROP_FRAME_COUNT) # the constant equals 7
    print('Total frames: ', f_count) # Total frames:  3394.0

while(cap.isOpened()):
    ret, frame = cap.read()
    if ret == True:
        cv2.imshow('Frame',frame)
        key = cv2.waitKey(20)
        # press q to quit
        if key == ord('q'):break
    else:
        break

# remember to release resources
cap.release()
cv2.destroyAllWindows()

This displays the video the same way a picture is displayed. cv2.waitKey(20) waits 20 milliseconds between consecutive frames. The larger the value, the longer the wait, and the slower the video plays.
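As a small worked example, the delay that corresponds to real-time playback can be derived from the frame rate, and the clip length from the frame count. The two helper names are made up for this sketch; the numbers are the ones printed in the comments of the code above.

```python
def frame_delay_ms(fps):
    """Milliseconds to wait per frame for roughly real-time playback."""
    # waitKey(0) would block until a key is pressed, so never go below 1
    return max(1, round(1000 / fps))

def duration_seconds(frame_count, fps):
    """Length of the clip in seconds."""
    return frame_count / fps

fps = 29.97002997002997
frame_count = 3394.0

print(frame_delay_ms(fps))                            # 33
print(round(duration_seconds(frame_count, fps), 1))   # 113.2
```

With cv2.waitKey(20) the example therefore plays somewhat faster than real time, as long as decoding and display keep up; the time spent in cap.read() and cv2.imshow adds to the wait.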
More important in practice is getting video from a live camera.

# 0 selects the device's default camera; change it to pick another camera when the device has several
cap = cv2.VideoCapture(0)

Of course, the ID of some external cameras may not be 0, we can use traversal to get it:

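import cv2
ID = 0
MAX_ID = 10 # stop probing eventually instead of looping forever when no camera exists
while ID < MAX_ID:
    cap = cv2.VideoCapture(ID)
    ret, frame = cap.read()
    cap.release() # free the device before trying the next ID
    if ret == False:
        ID += 1
    else:
        print(ID)
        break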

2.4. Combine multiple pictures into a video

Writing the pictures in a directory to a video is similar to the previous methods, but note that every frame written to the video must have the same size. If the pictures in the directory differ in size, writing them directly will fail, so each picture is first resized to a common size, using an interpolation method:

import cv2
import os
path ='imgs'
size = (600,400) # (W,H)
fps = 1
#fourcc = cv2.VideoWriter_fourcc('X','V','I','D')
fourcc = cv2.VideoWriter_fourcc(*'XVID')
video = cv2.VideoWriter('hi.avi',fourcc,fps,size)

for item in sorted(os.listdir(path)): # sort so the frames are written in a predictable order
    if item.lower().endswith('.jpg'):
        img = cv2.imread(os.path.join(path,item))
        img1 = cv2.resize(img, size, interpolation=cv2.INTER_CUBIC)
        print(img1.shape) # (H,W,C)
        video.write(img1)
video.release()
cv2.destroyAllWindows()

The interpolation method specified by the interpolation parameter:

INTER_NEAREST : nearest neighbor interpolation
INTER_LINEAR : bilinear interpolation, default
INTER_CUBIC : bicubic interpolation within a 4x4 pixel neighborhood
INTER_LANCZOS4 : Lanczos interpolation within an 8x8 pixel neighborhood
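To show what the simplest of these methods does, here is a NumPy-only sketch of nearest-neighbor resizing. It is an illustration, not OpenCV's implementation (OpenCV's pixel-center conventions may pick different source pixels in some cases), and resize_nearest is a made-up name. It also shows the detail noted in the code comments above: the target size is given as (W, H), while the array shape is (H, W, C).

```python
import numpy as np

def resize_nearest(img, size):
    """Nearest-neighbor resize; size is (width, height), as in cv2.resize."""
    width, height = size
    src_h, src_w = img.shape[:2]
    rows = np.arange(height) * src_h // height   # source row for each output row
    cols = np.arange(width) * src_w // width     # source column for each output column
    return img[rows[:, None], cols]

small = np.array([[1, 2],
                  [3, 4]], dtype=np.uint8)
big = resize_nearest(small, (4, 4))
print(big)
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]

photo = np.zeros((400, 300, 3), np.uint8)        # H=400, W=300
print(resize_nearest(photo, (600, 400)).shape)   # (400, 600, 3)
```

Nearest-neighbor copies pixels and so looks blocky when enlarging; the bilinear and bicubic methods blend neighboring pixels instead, which is why the example above uses INTER_CUBIC.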

2.5. Straight lines, rectangles, circles and other shapes

Some drawing functions that come up often in practice are listed here:

2.5.1. Straight line

cv2.line(img, startPoint, endPoint, color, thickness)
img: the image to draw on
startPoint: pixel coordinates of the start point
endPoint: pixel coordinates of the end point
color: drawing color
thickness: line width

2.5.2. Circle

cv2.circle(img, centerPoint, radius, color, thickness)
img: the image to draw on
centerPoint: pixel coordinates of the circle's center
radius: radius of the circle
color: drawing color
thickness: line width (a negative thickness fills the circle)

2.5.3. Rectangle

cv2.rectangle(img, point1, point2, color, thickness)
img: the image to draw on
point1: pixel coordinates of the top-left corner
point2: pixel coordinates of the bottom-right corner
color: drawing color
thickness: line width

2.5.4. Text

cv2.putText(img, text, point, font, size, color, thickness)
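img: the image to draw on
text: the text to draw
point: pixel coordinates of the bottom-left corner of the text
font: font face, for example cv2.FONT_HERSHEY_SIMPLEX
size: font scale factor
color: drawing color
thickness: line width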

2.5.5. Image scaling

cv2.resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]) -> dst
src: input image
dsize: output image size as (width, height)
dst: output image (optional)
fx, fy: scale factors along the x and y axes, used when dsize is None
interpolation: interpolation method

3. bgr8_to_jpeg

Finally, to show the result, every frame has to be fed to the Image widget. The frames are BGR arrays, while the widget expects JPEG data, so a conversion is needed.
The function bgr8_to_jpeg encodes the image into an in-memory buffer. It is essentially a wrapper around imencode, which compresses the image.
The source code of the function is:

def bgr8_to_jpeg(value, quality=75):
    return bytes(cv2.imencode('.jpg', value)[1])

The imencode function inside is as follows:

imencode(ext, img[, params]) -> retval, buf
ext: file extension that defines the output format
img: the image to encode
buf: output buffer, resized to fit the compressed image

A quick way to view a function's source code when opening the source file is inconvenient:

import inspect
print(inspect.getsource(bgr8_to_jpeg))

Origin blog.csdn.net/weixin_41896770/article/details/131746841