Computer Vision for Beginners (1): Getting Started with OpenCV in 10 Minutes


I've recently become interested in computer vision, so I've been teaching myself. It turns out to be quite fun and not too hard to learn!
What is computer vision? In a nutshell, it means giving a computer the ability to see, recognize, and reason the way people do; a computer with that ability can be said to have vision. Put more plainly, the computer recognizes images and video, processes them with algorithms much as a human brain would, extracts the information it needs, and makes judgments about the pictures and video.

Computer Vision Applications

From what I've learned so far, computer vision has five main applications:
image classification: the typical architecture is the convolutional neural network (CNN)
object detection: R-CNN
object tracking: the mean shift algorithm (meanshift)
semantic segmentation: FCN
instance segmentation: Mask R-CNN

Understanding OpenCV

When computer vision comes up, the first thing most of us think of is the OpenCV library. OpenCV is written in C++ and is a classic image processing library. It covers many areas of CV, from the most classic image algorithms to cutting-edge pre-trained deep learning (DL) models, and many large image processing libraries are built on top of it. But OpenCV has one major drawback: it is not differentiable. That led to Kornia, a new open source, differentiable computer vision library built on PyTorch. If you're interested, take a look at the project on GitHub: https://github.com/arraiyopensource/kornia
Let's briefly look at some basic operations of the OpenCV image processing library.

Reading an Image

OpenCV is a cross-platform library, and you can call it from whichever language you're comfortable with. In this article I use Python, where you only need to import the cv2 module.
Installing the OpenCV module for Python is very simple; one command on the command line does it:

pip install opencv-python

Note that when you read an image with the cv2 module, the path must not contain Chinese characters, or the image will not be read.
Let's look at what cv2 returns after reading a picture:

import cv2
img = cv2.imread("data.jpg")
print(img)
# output
[[[238 239 243]
  [239 240 244]
  [240 241 245]
  ...
  [247 247 247]
  [247 247 247]
  [247 247 247]]
    ...
  [238 224 206]
  [238 224 206]
  [241 228 212]]]

We all know that a picture is made up of pixels, and each pixel is made up of three color values (OpenCV stores them in B, G, R order), so reading a picture returns a three-dimensional array like the one above.
Let me also explain the imread() function. This is its prototype:

imread(filename, flags)

The flags parameter selects the reading mode. Its default value is IMREAD_COLOR, which always converts the image to a 3-channel BGR color image. Other flag values determine the color format the picture is read in; you can find the full list of modes in the OpenCV documentation, so I won't go through them here.

Converting to Grayscale

A color picture is three-dimensional, so each picture holds a lot of data. To make processing easier, pictures are often converted to grayscale first. What is grayscale? In the RGB model, a color with R = G = B is a shade of gray, and that shared value is called the gray value. Each pixel of a grayscale image therefore takes only one byte, which saves storage space and also makes processing easier.

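import cv2
img = cv2.imread("1.jpg")
print(img)
img_gray = cv2.cvtColor(img, code=cv2.COLOR_BGR2GRAY)
print(img_gray)
cv2.imshow('gray', img_gray)
# Wait for a key press; the argument is in milliseconds, and 0 waits forever.
# Without this call the window closes right after imshow, before you can see the picture.
cv2.waitKey(0)
# output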
[[240 241 242 ... 247 247 247]
 [240 241 242 ... 247 247 247]
 [240 241 242 ... 247 247 247]
 ...
 [204 204 204 ... 225 220 219]
 [204 204 204 ... 222 218 219]
 [200 200 200 ... 220 220 225]]

We can see that after conversion to grayscale, the picture's pixel data becomes a two-dimensional array.
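The gray values can also be worked out by hand. The sketch below assumes the weighted sum that the OpenCV documentation gives for COLOR_BGR2GRAY, Y = 0.299 R + 0.587 G + 0.114 B; the helper name bgr_to_gray is made up for illustration. The pixels are taken from the imread output printed earlier, and the results agree with the first and last gray values printed above, which suggests both listings show the same picture.

```python
def bgr_to_gray(pixel):
    """Convert one (B, G, R) pixel to a gray value using the documented weights."""
    b, g, r = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


# Pixels taken from the imread output earlier in the article (BGR order)
print(bgr_to_gray((238, 239, 243)))  # 240
print(bgr_to_gray((241, 228, 212)))  # 225
print(bgr_to_gray((247, 247, 247)))  # 247
```

Green carries the largest weight because the eye is most sensitive to it, so the result is not a plain average of the three channels.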

Face Recognition

Face recognition is a biometric technology that identifies a person based on facial feature information, and it is currently a very popular area of computer vision research. Face recognition requires a lot of training data, so to try it out we'll use a pre-trained classifier that ships with the cv2 module we already installed. People online say haarcascade_frontalface_alt.xml is relatively accurate. For ease of use, we simply copy this file into the project directory.

One thing to note here: this face detector can only detect frontal faces, and the image must be upright. The person in the picture cannot tilt their head, or the face will not be detected. For example, no face is detected in the picture used earlier in this article.

Here is the complete code:

import cv2

img = cv2.imread('data.jpg')
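# Load the face data: the cascade classifier takes face feature data and returns an object that can detect faces
detector = cv2.CascadeClassifier('haarcascade_frontalface_alt.xml')
# Convert to grayscale
gray = cv2.cvtColor(img, code=cv2.COLOR_BGR2GRAY)
# Use the trained detector to find the face regions
# scaleFactor and minNeighbors can be changed to tune detection accuracy
face_zone = detector.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=3)

for x, y, w, h in face_zone:
    # Draw a rectangle on each face; only the top-left and bottom-right corners are needed
    cv2.rectangle(img, pt1=(x, y), pt2=(x+w, y+h), color=[0, 255, 0], thickness=2)

# Detection ran on the grayscale image; the boxes are drawn on the color picture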
cv2.imshow('img', img)
cv2.waitKey(0)

Here I'll mainly explain the detectMultiScale() function. Its prototype is as follows:

    def detectMultiScale(self, image, scaleFactor=None, minNeighbors=None, flags=None, minSize=None, maxSize=None)

image: the image to search, generally converted to grayscale first

scaleFactor: how much the image is shrunk at each scale step. Values from 1.1 to 1.5 generally work well; try changing it to see the effect

minNeighbors: how many neighboring candidate rectangles a detection needs before it is kept

minSize: the minimum object size; anything smaller is ignored

flags: a little abstract; it is only used by old-format cascades, so you can leave it at the default
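To get a feel for what scaleFactor trades off, here is a simplified model of the scale pyramid, not OpenCV's actual implementation. It assumes the detector starts from a 20x20 pixel window (my understanding of the base size of haarcascade_frontalface_alt.xml) and enlarges it by scaleFactor at each step until it no longer fits in the image. The function name window_sizes and its parameters are made up for this sketch.

```python
def window_sizes(image_size, scale_factor, base_window=20, max_steps=100):
    """List the square search-window sizes a scale pyramid would visit.

    Simplified model: the window starts at base_window pixels and grows by
    scale_factor each step until it no longer fits inside the image.
    """
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    width, height = image_size
    sizes = []
    size = float(base_window)
    while size <= min(width, height) and len(sizes) < max_steps:
        sizes.append(round(size))
        size *= scale_factor
    return sizes


# Compare how many scales are searched in a 640x480 image
for factor in (1.1, 1.2, 1.5):
    sizes = window_sizes((640, 480), factor)
    print(factor, len(sizes), sizes[:5])
```

A smaller scaleFactor visits many more window sizes, so it is less likely to skip over a face of an in-between size, but detection takes longer.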
Without further ado, let's look at the results.

Next, let's try a picture with several people in it.

After all, this isn't a model I trained myself, so the result is only so-so, haha! Never mind the details.
Now let's see whether it can tell humans and animals apart.
It can still tell human faces from animal faces, hahaha!

Video Processing

A video is made up of many pictures, just as a picture is made up of pixels; the idea is the same, except that a video also carries audio. We won't cover how to process the audio here, because I don't know how!
Here is the whole code:

import cv2

# Open the local video file
cap = cv2.VideoCapture("data.mp4")

# The writer's frame size must match the frames written to it, and must be ints
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
video_writer = cv2.VideoWriter('data2.mp4', cv2.VideoWriter_fourcc('m', 'p', '4', 'v'), 24, (width, height))

detector = cv2.CascadeClassifier('haarcascade_frontalface_alt.xml')

while cap.isOpened():
    flag, frame = cap.read()
    if not flag:
        # No more frames can be read, so the video has ended; leave the loop
        break

    gray = cv2.cvtColor(frame, code=cv2.COLOR_BGR2GRAY)
    face_zone = detector.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=3)
    for x, y, w, h in face_zone:
        # Draw a rectangle on each face; only the top-left and bottom-right corners are needed
        cv2.rectangle(frame, pt1=(x, y), pt2=(x + w, y + h), color=[0, 255, 0], thickness=2)

    # The writer was created above; write every frame once, with or without faces
    video_writer.write(frame)

    cv2.imshow('frame', frame)
    cv2.waitKey(20)

# Release the video handles and close the preview window
cap.release()
video_writer.release()
cv2.destroyAllWindows()

The VideoWriter() function
To open a video file or a camera, OpenCV provides the VideoCapture class. To save a video or a camera stream to local disk, it provides the VideoWriter class, which is generally constructed as follows:

VideoWriter(filename, fourcc, fps, frameSize, isColor=true);

filename: the name of the local file the video is saved to
fourcc: the four-character code of the encoder to use (written cv2.VideoWriter_fourcc in Python). Commonly used ones are:

CV_FOURCC('P','I','M','1') = MPEG-1 codec
CV_FOURCC('M','J','P','G') = motion-jpeg codec
CV_FOURCC('M', 'P', '4', '2') = MPEG-4.2 codec
CV_FOURCC('D', 'I', 'V', '3') = MPEG-4.3 codec
CV_FOURCC('D', 'I', 'V', 'X') = MPEG-4 codec
CV_FOURCC('U', '2', '6', '3') = H263 codec
CV_FOURCC('I', '2', '6', '3') = H263I codec
CV_FOURCC('F', 'L', 'V', '1') = FLV1 codec

Of these, CV_FOURCC('M', 'P', '4', '2') produces the smallest files
fps: the frame rate, usually 20 to 30
frameSize: the size of each frame, passed as a tuple (W, H)
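The fourcc value itself is just four characters packed into one 32-bit integer, with the first character in the lowest byte. The pure-Python sketch below shows the packing; the names fourcc and fourcc_to_text are made up here, and cv2.VideoWriter_fourcc('M', 'J', 'P', 'G') should give the same integer as the first call.

```python
def fourcc(code):
    """Pack a four-character codec tag into a 32-bit integer, first character lowest."""
    if len(code) != 4:
        raise ValueError("a FOURCC tag has exactly four characters")
    c1, c2, c3, c4 = (ord(ch) & 0xFF for ch in code)
    return c1 | (c2 << 8) | (c3 << 16) | (c4 << 24)


def fourcc_to_text(value):
    """Recover the four characters from a packed FOURCC integer."""
    return "".join(chr((value >> shift) & 0xFF) for shift in (0, 8, 16, 24))


print(fourcc("MJPG"))                  # 1196444237
print(fourcc_to_text(fourcc("mp4v")))  # mp4v
```

The reverse function is handy for checking which codec an existing video uses, since cap.get(cv2.CAP_PROP_FOURCC) reports the tag as a number (cast it to int first).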
The read() function
flag, frame = cap.read()

read() returns two values, received here as flag and frame. flag is a Boolean indicating whether a frame was read, and frame is the data of the current frame.
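The contract of read() can be wrapped in a small generator so that the check on flag always happens before the frame is used. This is only a sketch: iter_frames and FakeCapture are hypothetical names, and FakeCapture is a stand-in that mimics cap.read() so the loop can be tried without a video file.

```python
def iter_frames(cap):
    """Yield frames from any object with a read() method until reading fails."""
    while True:
        flag, frame = cap.read()
        if not flag:
            # flag is False once no frame can be read, e.g. at the end of the video
            break
        yield frame


class FakeCapture:
    """Stand-in for cv2.VideoCapture that serves frames from a list."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames = list(iter_frames(FakeCapture(["frame0", "frame1", "frame2"])))
print(frames)  # ['frame0', 'frame1', 'frame2']
```

With a real capture object the loop would look the same: for frame in iter_frames(cap).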
Once you understand these two functions, face detection in video is easy to follow: grab each frame, use face detection to draw a box on it, and wait 20 ms between frames, so that together they play as a video. Let's take a look at the output.

Haha, there still seem to be a few mistakes. Video processing in this article really just breaks the video down into images and processes those.
As for the algorithms and principles behind image and video processing, I'll share them with everyone once I've learned them. If you're interested, feel free to get in touch and learn together.

Origin blog.csdn.net/LPJCSY/article/details/104916028