OpenCV-Python: basic knowledge of graphics and image processing


OpenCV is a classic, dedicated computer vision library. It is cross-platform, supports multiple programming languages, and is rich in features. OpenCV-Python provides a Python interface to OpenCV, so that users can call the underlying C/C++ implementation from Python and get the functionality they need while keeping the code readable and efficient.

1. Installation

There are many ways to install OpenCV-Python. The author's machine has no C++ environment, so pip is used to install it directly. The package is named opencv-python (the name is not case-sensitive on Windows; other operating systems were not verified). The installation command is as follows:

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple opencv-python

2. Loading OpenCV

Importing the OpenCV module is very simple:
import cv2 as cv

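Most OpenCV functions live in the cv2 module. The name cv2 does not mean version 2.X; it reflects that this module re-implements the original cv interface in an object-oriented way and provides a better API. Note that the import above binds the module to the alias cv, while the code examples in this article refer to it as cv2 (import cv2); both names denote the same module.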

3. Reading an image file

3.1. Syntax: `imread(filename, mode)`

3.2. Parameter description

filename: the image file name. The supported file types differ between operating systems, but bmp files are always supported, and jpeg, png, tiff and other formats are usually supported as well
mode: the file reading mode (called flags in the OpenCV documentation). There are three commonly used values:

cv.IMREAD_COLOR: value 1. Loads a color image; any transparency in the image is ignored. This is the default flag.
cv.IMREAD_GRAYSCALE: value 0. Loads the image in grayscale mode.
cv.IMREAD_UNCHANGED: value -1. Loads the image as is, including the alpha channel.

Note: In addition to these three commonly used values, other flags exist (for example cv.IMREAD_ANYDEPTH and cv.IMREAD_ANYCOLOR). For the full list of values and their meanings, see the ImreadModes enumeration in the OpenCV documentation.

3.3. Return value description

imread returns the image as a numpy array with the channels in BGR order. If the file cannot be read, it returns None instead of raising an exception.

3.4. Example

img = cv2.imread(r'F:\screenpic\redflower.jpg')

Notes:

The image file name should not contain Chinese (non-ASCII) characters; otherwise imread may report an error or fail to read the file.
With the default mode, imread discards the alpha channel of the image; use cv.IMREAD_UNCHANGED to keep it.

4. Displaying an image

4.1. Calling syntax: `imshow(title, img)`

4.2. Parameter description:

title: title and name of the image display window
img: opencv image object

An image read by imread can be displayed with imshow. The title passed to imshow is also the name of the display window: calls with different titles open different windows, and calls with the same title reuse the same window. The title should be an English (ASCII) string. OpenCV provides mouse and keyboard event handling mechanisms for these windows.

A window opened by imshow can be closed with destroyWindow or destroyAllWindows. The former takes the title of the window to close; the latter closes all windows created by the current program.

4.3. Example

img = cv2.imread(r'F:\screenpic\redflower.jpg')
cv2.imshow('img', img)
cv2.waitKey(0)  # keep the window open until a key is pressed
cv2.destroyAllWindows()

5. Reading a camera, image file, or video stream with VideoCapture

VideoCapture supports reading from video files (for example .avi or .mpg files) as well as directly from a camera (such as the built-in camera of a computer). VideoCapture is a class, so to capture video you need to create a VideoCapture object. There are three ways to create one:

Calling syntax:

VideoCapture(int deviceIndex, int apiPreference = CAP_ANY): opens a camera for video capture. deviceIndex is the camera index; pass 0 to open the default camera. apiPreference identifies the VideoCapture API backend; the author has not studied it in detail and normally uses the default value
VideoCapture(filename, int apiPreference = CAP_ANY): opens the file specified by filename
VideoCapture(): creates an object without a capture source; the source has to be set afterwards through other VideoCapture methods

For more VideoCapture content, please refer to " opencv learning—Basic knowledge of VideoCapture class ".

The following code opens the default camera, shows the captured frames in a window, and writes them to a video file. Press q to stop and exit:

import cv2

def captureVideoFromCamera():
    # cv2.CAP_DSHOW selects the DirectShow backend, which is Windows-only
    cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)
    if not cap.isOpened():
        print("Cannot open camera")
        return
    FILENAME = r'f:\video\myvideo.avi'
    FPS = 24
    cap.set(cv2.CAP_PROP_FPS, FPS)
    # The writer's frame size must match the size of the frames the camera
    # actually delivers; otherwise the frames are not written to the file
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    # XVID is recommended as a compromise between image quality and file size
    fourcc = cv2.VideoWriter_fourcc(*'XVID')
    out = cv2.VideoWriter(FILENAME, fourcc, FPS, (width, height))

    while True:
        # Capture frame by frame
        ret, frame = cap.read()
        # ret is True if the frame was read correctly
        if not ret:
            print("Can't receive frame (stream end?). Exiting ...")
            break
        frame = cv2.flip(frame, 1)  # flip horizontally
        out.write(frame)
        # Display the resulting frame
        cv2.imshow('frame', frame)
        if cv2.waitKey(1) == ord('q'):
            break
    # Release the writer and the capture when everything is done
    out.release()
    cap.release()
    cv2.destroyAllWindows()

captureVideoFromCamera()

6. OpenCV-Python mouse event capture

OpenCV handles mouse events through a callback function that is registered for a window. The callback is set as follows:
cv2.setMouseCallback(winName, OnMouseFunction, param)
where winName is the name of the window for which the mouse callback is set, OnMouseFunction is the callback function that handles the mouse events, and param is an application-specific parameter that is passed on to the callback. param is optional, but it is very useful when the callback needs to read or modify the state of another object. The callback is called as OnMouseFunction(event, x, y, flags, param).

Example (draw_circle is a user-defined callback function):

cv2.namedWindow('image')
cv2.setMouseCallback('image', draw_circle)

7. Keyboard event processing with waitKey

OpenCV provides the function waitKey for quick keyboard handling. Calling syntax:
retval = cv.waitKey([delay])
where:

delay: how long to wait for a key press, in milliseconds. If it is 0, the call waits until a key is pressed; otherwise it returns after the given time if no key was pressed
retval: -1 if the wait timed out, otherwise the ASCII code of the key that was pressed. Note that some function keys such as F1 to F10 return 0; the other function keys were not tested one by one. The ESC key returns its normal value (27), and Ctrl+C returns 3
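The return-value rules above can be wrapped in a small helper. The sketch below is only an illustration: decode_key is a hypothetical name, and masking with 0xFF is a common idiom based on the assumption that some platforms return extra bits above the low byte.

```python
def decode_key(retval):
    """Map a waitKey return value to a key code, or None on timeout."""
    if retval == -1:      # no key was pressed within the delay
        return None
    return retval & 0xFF  # keep only the low byte (the ASCII code)


ESC = 27
quit_requested = decode_key(ord('q')) == ord('q')
escape_pressed = decode_key(27) == ESC
timed_out = decode_key(-1) is None
```

In a display loop this would typically be used as key = decode_key(cv2.waitKey(1)), followed by a comparison with ord('q') or ESC.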

8. Drawing rectangles with OpenCV

OpenCV provides methods for drawing geometric shapes on images, including rectangles, ellipses, sectors, arcs, and so on. This article mainly covers rectangles. The calling syntax is as follows:

rectangle(img, pt1, pt2, color, thickness=None, lineType=None, shift=None)

The parameters:

img: the image to draw on, a numpy array in BGR format
pt1: the coordinates of the upper-left corner
pt2: the coordinates of the lower-right corner
color: the drawing color, a triplet in BGR order, such as (255,0,0) for blue
thickness: the thickness of the border. If it is negative, the rectangle is filled; otherwise it is hollow
lineType: the line type, one of 4-connected, 8-connected and anti-aliased; the default value is usually fine
shift: the number of fractional bits in the point coordinates, which are treated as fixed-point numbers; the default is 0

This method also has a variant:
rectangle(img, rec, color[, thickness[, lineType[, shift]]])
where rec is the rectangle spanned by pt1 and pt2 above (in Python, a tuple of the form (x, y, width, height)).

In addition to rectangles, OpenCV also supports drawing points, lines, circles, ellipses, text (Chinese text is not supported), and so on. For details, please refer to "Drawing of Lines, Circles, Rectangles and Texts in Detailed Explanation of Image Processing with Python OpenCV".

The following sample code plays a video file. A mouse click pauses playback and draws a circle at the clicked position; another click resumes playback:

import cv2

def mouseEvent(event, x, y, flags, param):
    # param is the shared list [paused, click position]
    if event == cv2.EVENT_LBUTTONDOWN:
        param[0] = not param[0]
        param[1] = (x, y)

def playVideoFile():
    cap = cv2.VideoCapture(r'f:\video\zbl1.mp4')
    if not cap.isOpened():
        print("Cannot open video file")
        return
    # Use the frame rate stored in the file; fall back to 25 if none is reported
    fps = cap.get(cv2.CAP_PROP_FPS) or 25
    eventInf = [False, None]
    frame = None
    cv2.namedWindow('image')
    cv2.setMouseCallback('image', mouseEvent, eventInf)
    while True:
        pause, mousePos = eventInf
        if not pause:
            # Capture frame by frame
            ret, frame = cap.read()
            if not ret:
                print("The video has ended or a frame could not be read.")
                break
        elif mousePos:
            # Paused: mark the clicked position on the current frame
            cv2.circle(frame, mousePos, 60, (255, 0, 0), 2)
        cv2.imshow('image', frame)
        ch = cv2.waitKey(max(1, int(1000 / fps)))
        if ch == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()

playVideoFile()

9. OpenCV color space conversion

cv2.cvtColor is the color space conversion function provided by OpenCV. The calling syntax is as follows:
cvtColor(src, code, dstCn=None)

src: the image to be converted
code: the conversion code, which states which color space is converted into which. For example, cv2.COLOR_BGR2GRAY, which is used below, converts a BGR color image into a grayscale image. For the full list of conversion codes, please refer to the official documentation
dstCn: the number of channels of the target image. If it is 0, it is derived automatically from the number of channels of the source image and the conversion code

For more information, please refer to "Learning Opencv's cvtColor". For examples, please refer to the image threshold processing section below.

10. Image threshold processing

Threshold processing of images in OpenCV is also called binarization, because it splits an image into a part of interest (foreground) and a part of no interest (background). A certain value (the threshold) is used as the criterion for the split, and pixels that exceed the threshold are usually treated as foreground.

There are two ways of thresholding. One uses a fixed threshold and comes in several processing modes. The other uses a non-fixed threshold: the program computes a suitable threshold for the image with an algorithm and then uses that threshold for binarization. Non-fixed thresholding is requested by combining the corresponding flag with one of the fixed-threshold modes.

Calling syntax:
retval, dst = cv2.threshold(src, thresh, maxval, type)
where:

src: source image, a numpy array holding an 8-bit or 32-bit floating-point image
thresh: the threshold, for 8-bit images a number between 0 and 255; the output of each pixel depends on how it compares with this value. It is ignored when cv2.THRESH_OTSU or cv2.THRESH_TRIANGLE is set, because the threshold is then computed from the image
maxval: the value assigned to the pixels that meet the condition in the cv2.THRESH_BINARY and cv2.THRESH_BINARY_INV modes
type: the processing mode. The values and their meanings are as follows:
    cv2.THRESH_BINARY: pixels above thresh become maxval, all others become 0
    cv2.THRESH_BINARY_INV: pixels above thresh become 0, all others become maxval
    cv2.THRESH_TRUNC: pixels above thresh are set to thresh, all others are unchanged
    cv2.THRESH_TOZERO: pixels above thresh are unchanged, all others become 0
    cv2.THRESH_TOZERO_INV: pixels above thresh become 0, all others are unchanged
    cv2.THRESH_OTSU, cv2.THRESH_TRIANGLE: flags that are combined with one of the modes above to compute the threshold automatically (8-bit single-channel images only)
dst: numpy array of the result image after thresholding; its size and number of channels are the same as those of the source image
retval: the threshold that was actually used; when cv2.THRESH_OTSU or cv2.THRESH_TRIANGLE is set, this is the computed threshold

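Example (img must be an 8-bit single-channel image here; because cv2.THRESH_OTSU is set, the value 35 is ignored and ret receives the threshold that was actually used):

ret, mask = cv2.threshold(img, 35, 255, cv2.THRESH_BINARY|cv2.THRESH_OTSU)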

Supplementary notes:

When a pixel is compared with the threshold, the two cases are "less than or equal to the threshold" and "greater than the threshold"; a pixel that equals the threshold is therefore treated as not exceeding it.
If the source is a multi-channel image such as a 32-bit color image, the value of each channel is compared with the threshold separately. The thresholding is applied per channel, and the result holds the thresholded value of each channel. Please refer to "OpenCV Threshold Processing Function Threshold Processing 32-bit Color Image Case".

The following code generates a mask image for an image:

import cv2

def createImgMask(img):
    # Create a mask for img
    if img is None:
        return None
    if len(img.shape) >= 3:
        img2gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    else:
        img2gray = img
    ret, mask = cv2.threshold(img2gray, 35, 255, cv2.THRESH_BINARY)

    return mask

11. Adaptive thresholding with adaptiveThreshold

The threshold function described above works poorly on images with uneven illumination. When thresholding, the goal is to separate the target area from the background in the binarized image, and a single fixed threshold rarely gives an ideal segmentation when the gray levels are not uniform across the picture, because different areas then need different thresholds. What is needed is a method that computes a local threshold from the brightness or gray-level distribution of each area of the image. This is adaptive thresholding, which could also be called a local threshold method. adaptiveThreshold in OpenCV implements it.

Calling syntax:
adaptiveThreshold(src, maxValue, adaptiveMethod, thresholdType, blockSize, C, dst=None)

Description:

src: source image, must be an 8-bit grayscale image
dst: the processed target image; its size and type are the same as those of the source image
maxValue: the gray value assigned to the pixels that meet the condition
adaptiveMethod: the adaptive threshold algorithm to use, either ADAPTIVE_THRESH_MEAN_C (mean of the local neighborhood block) or ADAPTIVE_THRESH_GAUSSIAN_C (Gaussian-weighted sum of the local neighborhood block). ADAPTIVE_THRESH_MEAN_C computes the mean of the neighborhood and subtracts the sixth parameter C; ADAPTIVE_THRESH_GAUSSIAN_C computes the Gaussian-weighted mean of the neighborhood and subtracts the sixth parameter C. Borders are handled in BORDER_REPLICATE | BORDER_ISOLATED mode
thresholdType: the threshold type, which can only be THRESH_BINARY or THRESH_BINARY_INV. For details, please refer to the "Image threshold processing" section above
blockSize: the size of the neighborhood block used to compute the local threshold, generally 3, 5, 7...
C: a constant that is subtracted from the mean or weighted mean. It is usually positive, but it can also be zero or negative
Return value: the processed image

Supplementary notes:

Brighter image areas usually get a higher binarization threshold, and darker areas a lower one.
In a grayscale image, areas where the gray value changes sharply are often object contours, so computing the threshold over small blocks often yields the contours of the image. adaptiveThreshold can therefore be used not only to binarize a grayscale image but also to extract edges.
Edges are extracted because with a very small block, such as blockSize=3, 5 or 7, the degree of "adaptation" is very high: the pixel values inside a block are often almost identical, so the block cannot be binarized meaningfully, and only places with a large gradient, that is the edges, are separated. The function then behaves like an edge extractor.
When blockSize is set to a relatively large value, such as 21, 31 or 41, adaptiveThreshold behaves like an ordinary binarization function.
blockSize must be an odd number greater than 1, so that the block has a center pixel.
Example with the mean method: if the mean of a block is 180, the constant C is 10 and maxValue is 255, the local threshold is 170. Pixels with a gray level of at most 170 become 0 and pixels above 170 become 255. With inverse binarization (THRESH_BINARY_INV), pixels above 170 become 0 and all others become 255.

Example:

import cv2

img = cv2.imread(r'F:\screenpic\1.jpg',cv2.IMREAD_GRAYSCALE)
newImg = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 3, 5)
cv2.imshow('img',img)
cv2.imshow('newImg',newImg)
cv2.waitKey(60000)

Results:
[Figure: source image]
[Figure: result images for two block sizes; left: blockSize 31, right: blockSize 3]

With a small blockSize the contours stand out clearly, while with a large blockSize the result is an ordinary binary image.

12. OpenCV image repair

The cv2.inpaint() function in OpenCV repairs an image by interpolating from the surrounding pixels. The calling syntax is as follows:
dst = cv2.inpaint(src, inpaintMask, inpaintRadius, flags)

The meaning of the parameters is as follows:

src: input image, 8-bit, 1-channel or 3-channel
inpaintMask: the repair mask, an 8-bit 1-channel image. Non-zero pixels mark the areas that need to be repaired
dst: output image with the same size and type as src
inpaintRadius: the radius of the circular neighborhood of each point that the algorithm considers
flags: the repair algorithm, where INPAINT_NS selects the Navier-Stokes based method and INPAINT_TELEA selects the method by Alexandru Telea. The details of the methods are not covered here

The following code removes the logo from a picture that contains a bright logo:

import cv2


def createImgMask(img):
    # Create a mask for img: pixels brighter than 35 are marked for repair
    if img is None:
        return None
    if len(img.shape) >= 3:
        img2gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    else:
        img2gray = img
    ret, mask = cv2.threshold(img2gray, 35, 255, cv2.THRESH_BINARY)

    return mask

def delLogFromImg(img):
    # Repair the masked (bright) area from the surrounding pixels
    imgMask = createImgMask(img)
    newImg = cv2.inpaint(img, imgMask, 3, cv2.INPAINT_TELEA)
    return newImg

img1 = cv2.imread(r'F:\temp\logo.jpg')
img2 = cv2.imread(r'F:\temp\logo.jpg', cv2.IMREAD_GRAYSCALE)

newImg1 = delLogFromImg(img1)
newImg2 = delLogFromImg(img2)

cv2.imshow('img1', img1)
cv2.imshow('newImg1', newImg1)

cv2.imshow('img2', img2)
cv2.imshow('newImg2', newImg2)
cv2.waitKey(60000)


Summary

This article covers the basics of OpenCV-Python in detail: installation, loading image files, camera capture, image display, mouse event capture, keyboard event processing, drawing geometric shapes, color space conversion, image threshold processing, and image repair. It is suitable for beginners who are starting to learn OpenCV-Python.
