OpenCV-Python basic operation guide

Personal study notes; corrections are welcome!

Versions used

Python version: 3.6.13

opencv version: 2.4.9

1. Introduction to opencv

official website

  • http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_tutorials.html
  • https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_tutorials.html
  • http://docs.opencv.org/2.4/genindex.html

opencv installation

# Environment setup (add to PATH)
D:\python;
D:\python\Lib\site-packages;
D:\python\Scripts;
# Install the modules
pip install opencv-python
pip install numpy  # numpy is required as well
# Test that the install succeeded
import cv2
print(cv2.__version__)
# 2.4.9

2. Image read, display, save

Basic process: image reading, window creation, image display, image saving, resource release

image reading

path = r'image\gezi.jpg'  # image path (use a raw string or forward slashes)
# read the image
img = cv2.imread(path,flags=1)  # flags=0 reads a single-channel grayscale image, 1 reads a 3-channel BGR image
print(type(img)) # <class 'numpy.ndarray'>
print(img.shape) # (1005, 1200, 3)  h*w*c

path: use \ or / as the separator, and avoid non-ASCII (e.g. Chinese) characters in the path.
If the path is wrong, OpenCV raises no error; the returned image is simply None, so handle it explicitly:

if img is None: # 'img == None' is incorrect; use 'is None'
    print('read img error')

The flags parameter in cv2.imread(path,flags=1)

  1. cv2.IMREAD_COLOR: the default, reads a color image and ignores the alpha channel. Macro value 1
  2. cv2.IMREAD_GRAYSCALE: reads the image as grayscale. Macro value 0
  3. cv2.IMREAD_UNCHANGED: reads the image as-is, including the alpha channel. Macro value -1

Create window
cv2.namedWindow(winname: Any, flags: int = ...)

cv2.namedWindow('image',cv2.WINDOW_GUI_NORMAL)
  1. The first parameter indicates the window name, just pass in a string
  2. The second parameter, the window display mode, takes the following values

flags parameter

  1. cv2.WINDOW_AUTOSIZE: The window size cannot be changed
  2. cv2.WINDOW_FREERATIO: window size adaptive ratio
  3. cv2.WINDOW_KEEPRATIO: keep the window size proportional
  4. cv2.WINDOW_GUI_EXPANDED: the enhanced GUI (status bar and toolbar)

Image display

# display the image
cv2.imshow('image',img)
  1. The first parameter, set the name of the window to be displayed
  2. The second parameter, fill in the image to be displayed

If the image window was not created with cv2.namedWindow before, it will automatically call cv2.namedWindow to create the window.

image save

# save the image
cv2.imwrite('save.png',img)
  • The first parameter is the file name to save to; it must include the extension, e.g. "1.bmp"
  • The second parameter is the image data (Mat / numpy array) to save
  • The third parameter encodes format-specific save options; it is usually left at the default

Saving to an existing path overwrites the file.

resource release

# release resources
cv2.destroyAllWindows()
cv2.destroyWindow() # destroy a specific window; pass the window name

Window wait
Keeps the window on screen; without it the window flashes by and the program ends.

key = cv2.waitKey(delay=100)
if key == ord('s'):
	print('save image')

End conditions: the delay expires, or a key is pressed (key receives the ASCII code of the pressed key). delay=0 waits indefinitely.

full code

import cv2
import numpy as np

print(cv2.__version__)

path = r'image\gezi.jpg'  # image path
# read the image
img = cv2.imread(path,flags=1)  # flags=0 reads single-channel grayscale, 1 reads a 3-channel BGR image
print(type(img)) # <class 'numpy.ndarray'>
print(img.shape) # (1005, 1200, 3)  h*w*c
# create the window
cv2.namedWindow('image',flags=0) # flags controls how the window can be resized
# display the image
cv2.imshow('image',img)
# save the image
cv2.imwrite('save.png',img)
key = cv2.waitKey(delay=0)
# release resources
cv2.destroyAllWindows()

Other notes
Images read by cv2 are numpy arrays, so numpy operations apply.

  • img.shape[0] Get the number of image lines (height)
  • img.shape[1] Get the number of image columns (width)
  • img.shape[2] Get the number of image channels
  • img.size Get the total number of pixels (width x height x number of channels)
  • img.dtype Gets the data type of the image. uint8 (0-255)
    Note: after numpy operations, check whether the dtype is still uint8; if not, convert back with astype(np.uint8). See the sketch below.
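A minimal sketch of the uint8 pitfall (the toy array stands in for an image):

import numpy as np

img = np.full((2, 2), 200, dtype=np.uint8)   # toy "image"
wrapped = img + 100                          # uint8 wraps around: 200 + 100 -> 44
safe = np.clip(img.astype(np.int16) + 100, 0, 255).astype(np.uint8)  # -> 255
print(wrapped.dtype, safe.dtype)             # uint8 uint8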

3. Camera video reading and writing

The VideoCapture class provides an interface for capturing video from a camera or a video file (the Python binding wraps the C++ class).
Basic process: create video interface, capture video frames, play video, save video, release resources

Create a video interface
Method A:

filename = 'gamevideo.mp4'
cap = cv2.VideoCapture()
cap.open(filename)

Method B:

filename = 'gamevideo.mp4'
cap = cv2.VideoCapture(filename)

Both methods accept a local file or a network stream, e.g. filename = "http://www.laganiere.name/bike.avi"

Method C: capture video from a camera

dev = 0
cap = cv2.VideoCapture(dev)

Capture video frames
Add error checks when opening and reading the video.

Method A:

if not cap.isOpened():
    print('failed to open')

Method B: check while reading

ret,frame = cap.read()
if not ret:
    print('ret is false')

In ret, frame = cap.read(), ret indicates success and frame is the image of the current frame.

video playback

while True:
    ret,frame = cap.read()
    if not ret:
        print('ret is false')
        break
    cv2.imshow('frame',frame)
    cv2.waitKey(1000//24)

Note that a delay is needed when displaying video: at 24 frames per second, wait 1000//24 ≈ 41 ms per frame.

Video Save
Save each frame that makes up the video.

fourcc = cv2.VideoWriter_fourcc(*'mp4v')   # choose the video codec
out = cv2.VideoWriter('output.mp4',fourcc,20.0,(w,h)) # create the writer
out.write(frame) # write one frame

cv2.VideoWriter()

  1. The first parameter: the save path
  2. The codec; different video formats require different codecs
  3. The frame rate, here 20 fps
  4. The frame size (w, h), which must match the frames being written. Note the order: VideoWriter takes (w, h), while shape returns (h, w, c), height first.

resource release

cap.release()  # close the video file

full code

import cv2
import numpy as np

print(cv2.__version__)


# open a video file (or a camera: see the commented lines)
filename = 'gamevideo.mp4'
cap = cv2.VideoCapture(filename)
# dev = 0
# cap = cv2.VideoCapture(dev)

if not cap.isOpened():
    print('failed to open')
ret,frame = cap.read()
if not ret:
    print('ret is false')

fourcc = cv2.VideoWriter_fourcc(*'mp4v')   # choose the video codec
h,w = frame.shape[:-1]                     # shape is (h, w, c); VideoWriter wants (w, h)
out = cv2.VideoWriter('output.mp4',fourcc,20.0,(w,h))

while True:
    ret,frame = cap.read()
    if not ret:
        print('ret is false')
        break
    cv2.imshow('frame',frame)
    out.write(frame)                       # write the BGR frame as-is
    cv2.waitKey(1000//24)

# release resources
cap.release()  # close the video file
cv2.destroyAllWindows()

Other functions

  1. open(): opens a video file or camera
  2. isOpened(): returns True if the video was opened correctly
  3. release(): closes the video stream/file
  4. grab(): grabs the next frame from the file or device
  5. retrieve(): decodes and returns a grabbed video frame
  6. get(): returns a property of the video (see the constants below)
  7. set(): sets a property
  8. cv2.CAP_PROP_POS_FRAMES  # current frame position
  9. cv2.CAP_PROP_FRAME_COUNT  # total number of frames
  10. cv2.CAP_PROP_FPS  # frame rate
  11. fourcc = vd.get(cv2.CAP_PROP_FOURCC)  # get the video codec
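A small sketch of get()/set(), reusing the file name from the example above (the frame index 100 is arbitrary):

import cv2

cap = cv2.VideoCapture('gamevideo.mp4')
fps = cap.get(cv2.CAP_PROP_FPS)              # frame rate
total = cap.get(cv2.CAP_PROP_FRAME_COUNT)    # total frame count (returned as float)
cap.set(cv2.CAP_PROP_POS_FRAMES, 100)        # jump to frame 100
ret, frame = cap.read()                      # the next read returns frame 100
print(fps, total, ret)
cap.release()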

4. Color space

Colors are represented by tuples in OpenCV. For example, in BGR, (0, 255, 255) is yellow; an optional fourth value, alpha, represents transparency and is usually omitted.

5. Basic drawing functions

create background

# create a black background
bg = np.zeros((500,500,3),dtype=np.uint8)
cv2.imshow('bg',bg)
cv2.waitKey()

Three channels are needed here; otherwise the drawing functions can only produce grayscale, not color.

draw straight line

bg = np.zeros((500,500,3),dtype=np.uint8)
l = cv2.line(bg,(100,100),(200,200),color=(0,255,0),thickness=2)
cv2.imshow('bg',bg)
cv2.waitKey()

cv2.line(img, pt1, pt2, color[, thickness[, lineType[, shift]]]) → img

  1. img, background image
  2. pt1, the coordinates of the starting point of the line
  3. pt2, the coordinates of the end point of the line
  4. color, the color of the current painting. As in BGR mode, pass (255,0,0) for blue brush. In the grayscale image, only the brightness value needs to be passed.
  5. thickness, the thickness of the brush, the line width. If it is -1, it means to draw a closed image, such as a filled circle. The default value is 1.
  6. lineType, the type of line
  7. shift: the number of fractional bits in the point coordinates

draw circle

bg = np.zeros((500,500,3),dtype=np.uint8)
cir = cv2.circle(bg,(250,250),100,(0,255,0),thickness=-1)
cv2.imshow('bg',bg)
cv2.waitKey()

cv2.circle(img, center, radius, color[, thickness[, lineType[, shift]]])

  1. img, background image
  2. center: the position of the center of the circle
  3. radius: the radius of the circle
  4. color: the color of the circle
  5. thickness: Thickness of the circle outline (if positive). A negative thickness indicates that a solid circle is to be drawn.
  6. lineType: The type of circle border.
  7. shift: Number of decimal places in center coordinates and radius values.

thickness=-1: filled circle (figure omitted)
thickness>0: outline only (figure omitted)

draw rectangle

bg = np.zeros((500,500,3),dtype=np.uint8)
rec = cv2.rectangle(bg,(100,100),(200,200),(0,255,0))
cv2.imshow('bg',bg)
cv2.waitKey()

(100,100), (200,200) are the coordinates of the upper left corner and the lower right corner respectively.

add text

bg = np.zeros((500,500,3),dtype=np.uint8)
cv2.putText(bg,'Opencv',(100,100),cv2.FONT_HERSHEY_SIMPLEX,1,(0,255,0),1)
cv2.imshow('bg',bg)
cv2.waitKey()

cv2.putText(bg,'Opencv',(100,100),cv2.FONT_HERSHEY_SIMPLEX,1,(0,255,0),1)

  1. the background image
  2. the text to draw. Non-ASCII text such as Chinese is not rendered; you have to implement that yourself (e.g. by drawing the text with PIL)
  3. the position of the text: the coordinates of its bottom-left corner
  4. the font type
  5. the font scale; larger values give larger text
  6. the font color
  7. the font thickness; larger is bolder

6. OpenCV interface event operation

mouse

cv2.setMouseCallback(windowName, onMouse [, param]) -> None

  1. windowName: the window the callback is attached to
  2. onMouse: the callback for mouse events; it is called automatically whenever a mouse event occurs

def onmouse(event,x,y,flags,*param):
    print(x,y,param)
# create the background
bg = np.full((500,500),fill_value=255,dtype=np.uint8)
cv2.imshow('bg',bg)
# register the mouse callback
cv2.setMouseCallback('bg',onmouse,0)
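The snippet above registers the callback but exits immediately; a runnable sketch needs an event loop (Esc-to-quit is my own choice here):

import cv2
import numpy as np

def onmouse(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:        # left button pressed
        print('clicked at', x, y, 'param =', param)

bg = np.full((500, 500), 255, dtype=np.uint8)
cv2.namedWindow('bg')
cv2.setMouseCallback('bg', onmouse, 0)
while True:                                   # event loop so the callback can fire
    cv2.imshow('bg', bg)
    if cv2.waitKey(30) == 27:                 # Esc quits
        break
cv2.destroyAllWindows()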

The callback should have the signature onmouse(event, x, y, flags, param):

  • event: the macro value of the mouse action
  • x, y: the mouse position
  • flags: modifier state (keyboard + mouse combinations)
  • param: the extra user data passed to setMouseCallback (can carry multiple values)
Event:
#define CV_EVENT_MOUSEMOVE 0             // move
#define CV_EVENT_LBUTTONDOWN 1           // left button down
#define CV_EVENT_RBUTTONDOWN 2           // right button down
#define CV_EVENT_MBUTTONDOWN 3           // middle button down
#define CV_EVENT_LBUTTONUP 4             // left button up
#define CV_EVENT_RBUTTONUP 5             // right button up
#define CV_EVENT_MBUTTONUP 6             // middle button up
#define CV_EVENT_LBUTTONDBLCLK 7         // left button double-click
#define CV_EVENT_RBUTTONDBLCLK 8         // right button double-click
#define CV_EVENT_MBUTTONDBLCLK 9         // middle button double-click
flags:
#define CV_EVENT_FLAG_LBUTTON 1       // left-button drag
#define CV_EVENT_FLAG_RBUTTON 2       // right-button drag
#define CV_EVENT_FLAG_MBUTTON 4       // middle-button drag
#define CV_EVENT_FLAG_CTRLKEY 8       // (8~15) Ctrl held down
#define CV_EVENT_FLAG_SHIFTKEY 16     // (16~31) Shift held down
#define CV_EVENT_FLAG_ALTKEY 32       // (32~39) Alt held down

Reference: Opencv function setMouseCallback mouse event response

slider operation

createTrackbar(trackbarName, windowName, value, count, onChange) -> None

  1. trackbarName: the name of the trackbar
  2. windowName: the window the trackbar is attached to (created by namedWindow)
  3. value: the initial slider position
  4. count: the maximum slider position (the minimum is 0)
  5. onChange: the callback; its prototype must be func(x), where x is the slider position
def onChange(x):
    print(x)
cv2.createTrackbar('woshihuakuai','bg',10,255,onChange)
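Like the mouse callback, the trackbar only responds inside an event loop. A minimal sketch (the window name 'bg' and Esc-to-quit are my own choices):

import cv2
import numpy as np

def on_change(x):
    print('slider:', x)

cv2.namedWindow('bg')                          # the trackbar needs an existing window
cv2.createTrackbar('woshihuakuai', 'bg', 10, 255, on_change)
bg = np.zeros((200, 400, 3), dtype=np.uint8)
while True:
    cv2.imshow('bg', bg)
    if cv2.waitKey(30) == 27:                  # Esc quits
        break
    pos = cv2.getTrackbarPos('woshihuakuai', 'bg')  # poll the slider without the callback
cv2.destroyAllWindows()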

7. Contrast brightness adjustment and channel separation merged

Contrast Brightness Adjustment:

Mathematical formula:

g(x) = a * f(x) + b

  • The parameter f(x) represents the original image pixel
  • The parameter g(x) represents the output image pixel
  • The parameter a (a>0), known as gain, is usually used to control the contrast of the image
  • The parameter b is usually called the bias, and is usually used to control the brightness of the image
path = r'datas\testimage.png'
contrast = 100*0.1   # gain a
b = 50               # bias
img = cv2.imread(path,0)
for i in range(0,img.shape[0]):
    for j in range(0,img.shape[1]):
        bright = img[i,j]*contrast+b
        if bright>255:
            bright = 255
        img[i,j] = bright

Overflow protection: keep pixel values integral, and clamp gray values above 255 to 255.
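The per-pixel loop above is slow in Python. A vectorized sketch of the same g(x) = a*f(x) + b (same image path; the gain value is arbitrary):

import cv2
import numpy as np

img = cv2.imread(r'datas\testimage.png', 0)
a, b = 1.2, 50
out = np.clip(img.astype(np.float32) * a + b, 0, 255).astype(np.uint8)
# or let OpenCV apply the scaling with built-in saturation:
out2 = cv2.convertScaleAbs(img, alpha=a, beta=b)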

Channel separation and merging

separate

b,g,r = cv2.split(img)

merge

dst = cv2.merge([b,g,r])
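A short usage sketch (image path from the earlier examples; zeroing the blue channel is just for illustration):

import cv2
import numpy as np

img = cv2.imread(r'image\gezi.jpg')            # BGR image
b, g, r = cv2.split(img)                       # three single-channel arrays
no_blue = cv2.merge([np.zeros_like(b), g, r])  # merge back with blue zeroed out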

8. Basic operations on images

mask operation

ROI region of interest

roi = img[200:500,200:500]

Note the index order: img[y1:y2, x1:x2], rows (y) first.

mask

  • mask is an 8-bit single-channel image (grayscale/binary)
  • where the mask is 0, the operation has no effect at that position
  • where the mask is non-zero, the operation takes effect
  • masks can be used to extract irregular ROIs, as in the sketch below
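A minimal sketch of an irregular ROI via a mask (the circle's position and size are arbitrary):

import cv2
import numpy as np

img = cv2.imread(r'image\gezi.jpg')
mask = np.zeros(img.shape[:2], dtype=np.uint8)  # 8-bit single channel
cv2.circle(mask, (250, 250), 100, 255, -1)      # white disc marks the region to keep
roi = cv2.bitwise_and(img, img, mask=mask)      # everything outside the disc is zeroed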

image arithmetic

The size and type of the operand are required to be the same

image addition

add(src1, src2[, dst[, mask[, dtype]]]) -> dst

  1. src1: the first image
  2. src2: the second image
  3. mask: optional mask

dst = cv2.add(img1,img2)

img1 and img2 must have the same shape and the same dtype.

addWeighted(src1, alpha, src2, beta, gamma[, dst[, dtype]]) -> dst

dst = cv2.addWeighted(img1,0.7,img2,0.3,0)

Weighted addition: dst = src1*alpha + src2*beta + gamma.

image subtraction

cv2.subtract()
cv2.absdiff()
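Both compute differences; a tiny sketch of how they differ on uint8 (toy 1x1 arrays):

import cv2
import numpy as np

a = np.array([[10]], dtype=np.uint8)
b = np.array([[60]], dtype=np.uint8)
print(cv2.subtract(a, b))   # [[0]]  : saturates at 0 instead of wrapping
print(cv2.absdiff(a, b))    # [[50]] : absolute difference, order-independent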

multiply and divide

cv2.multiply()
cv2.divide()

Image logical operation

image with

bitwise_and(src1, src2[, dst[, mask]]) -> dst

  1. src1, the first image
  2. src2 second image
  3. dst: the output image; usually omitted in favor of receiving the return value
bg = np.zeros((500,500,3),dtype=np.uint8)
rect = cv2.rectangle(bg.copy(),(100,100),(400,400),(255,255,255),thickness=-1)
cv2.imshow('rect',rect)
cir = cv2.circle(bg.copy(),(250,250),180,(255,255,255),thickness=-1)
cv2.imshow('cir',cir)
dst = cv2.bitwise_and(rect,cir)
cv2.imshow('dst',dst)

Note the copy() calls: drawing functions modify the image in place.
src1 (filled rectangle), src2 (filled circle), and dst (their intersection) figures omitted.

The result keeps the pixels that are (255, 255, 255) in both images, i.e. the intersection.

logical OR

cv2.bitwise_or

XOR

cv2.bitwise_xor

9. Image geometric transformation

image scaling

resize(src, dsize[, dst[, fx[, fy[, interpolation]]]]) -> dst

  • src: input image
  • dst: output image
  • dsize: the output size; if 0, it is computed as dsize = Size(round(fx*src.cols), round(fy*src.rows))
  • fx: the horizontal scale factor; when 0 it is computed as (double)dsize.width/src.cols (src.cols is the number of columns)
  • fy: the vertical scale factor; when 0 it is computed as (double)dsize.height/src.rows (src.rows is the number of rows)
  • interpolation: the interpolation method, default cv2.INTER_LINEAR (bilinear)

interpolation

  • INTER_NEAREST: nearest-neighbor interpolation
  • INTER_LINEAR: bilinear interpolation (the default; fast, suited to enlarging)
  • INTER_AREA: resampling using the pixel area relation. Preferred for image decimation, as it gives moiré-free results; when enlarging it behaves like INTER_NEAREST (recommended for shrinking)
  • INTER_CUBIC: bicubic interpolation over a 4x4 pixel neighborhood (slow, good quality for enlarging)
  • INTER_LANCZOS4: Lanczos interpolation over an 8x8 pixel neighborhood

resize_img = cv2.resize(img,dsize=(img.shape[1]//2,img.shape[0]//2),interpolation=cv2.INTER_AREA)  # dsize is (w, h)

affine

  1. Construct the transformation matrix m; the matrix dtype must be np.float32.
  2. Apply it with the cv2.warpAffine() function

move

Translation changes an object's position. To move by (tx, ty) in the (x, y) direction, build the matrix

m = [[1, 0, tx],
     [0, 1, ty]]

Build m with numpy:

m  = np.array([[1,0,50],[0,1,50]],dtype=np.float32)

warpAffine(src, M, dsize[, dst[, flags[, borderMode[, borderValue]]]]) -> dst

  1. src: the input image
  2. M: the transformation matrix
  3. dsize: the size of the output image, as (w, h)
  4. flags: the interpolation method (int; same values as for resize)
  5. borderMode: the border pixel mode (int)
  6. borderValue: the border fill value; 0 by default

dst = cv2.warpAffine(img,m,dsize=(img.shape[1],img.shape[0]))  # dsize is (w, h), not shape order

rotation

  • OpenCV has no function that rotates an image directly; rotation is done via an affine transform
  • Use cv2.getRotationMatrix2D() to construct m. For a rotation by angle θ about center (cx, cy) with scale s, with α = s·cosθ and β = s·sinθ, OpenCV builds:

    m = [[ α, β, (1-α)·cx - β·cy],
         [-β, α, β·cx + (1-α)·cy]]

    getRotationMatrix2D(center, angle, scale) -> retval
  • center: the rotation center (cx, cy)
  • angle: the rotation angle in degrees
  • scale: the scale factor applied after rotation
M = cv2.getRotationMatrix2D((w/2, h/2), 90, 1) # center, rotate 90 degrees, no scaling
print(M.shape)  #(2, 3)

The full homogeneous matrix is 3x3; its last row (0, 0, 1) exists only so transforms can be chained. warpAffine needs just the first two rows, so the printed shape of M is 2*3.

rotation affine

dst = cv2.warpAffine(img1, M, (w, h))

perspective

Figure A (omitted) shows a perspective distortion; the goal is to transform it into Figure B (omitted).
As with affine transforms, the key is constructing M. Here we look for 4 point correspondences: 4 points before the transform, labeled by some method, and the 4 points they should map to.

src_p = np.float32([(13,66),(574,66),(55,380),(554,380)])  # x,y
dst_p = np.float32([(13,66),(574,66),(10,380),(576,380)])  # x',y'

Use getPerspectiveTransform(src, dst[, solveMethod]) -> retval to solve for M, then apply the warp with warpPerspective.

m = cv2.getPerspectiveTransform(src_p,dst_p)
dst = cv2.warpPerspective(scard,m,(w,h))

10. Image filtering

  1. Filtering is actually a concept of signal processing. An image can be regarded as a two-dimensional signal, where the gray value of pixels represents the strength of the signal.
  2. High frequency: the part of the image that changes drastically
  3. Low frequencies: slowly changing, flat parts of the image
  4. According to the high and low frequency characteristics of the image, set the high-pass and low-pass filters. High-pass filtering can detect sharp and obvious changes (edges) in the image, and low-pass filtering can smooth the image and eliminate noise interference.
  5. Image filtering is an important part of OpenCV image processing. It is widely used in image preprocessing. The quality of image filtering determines the result of subsequent processing.

Image filtering methods:
6. Linear filters: box filter, mean filter, Gaussian filter
7. Non-linear filters: median filter, bilateral filter
Neighborhood operator: an operator that uses the pixel values around a given pixel to determine that pixel's final output value

linear filtering

A commonly used neighborhood operator; the output pixel is a weighted sum of input pixels:

g(i, j) = Σ_{k,l} f(i + k, j + l) * h(k, l)

The anchor point is at the middle of the kernel, and h(k, l), called the kernel, holds the filter's weighting coefficients. This is essentially the convolution operation used in CNNs.

Box filter cv2.boxFilter()

The kernel is an all-ones matrix, optionally divided by its area. When normalize is true, the box filter becomes the mean filter; that is, mean filtering is the normalized special case of box filtering.
boxFilter(src, ddepth, ksize[, dst[, anchor[, normalize[, borderType]]]]) -> dst

  • src: the image to process
  • ddepth: the depth of the output image; -1 means the same depth as the source, i.e. src.depth() (see: image depth vs. pixel depth vs. bit depth)
  • ksize: the kernel size, e.g. (3,3) for a 3x3 kernel
  • anchor: the anchor point (the pixel being smoothed); the default (-1, -1) puts the anchor at the kernel center
  • normalize: default true; whether the kernel is normalized
  • borderType: the border mode; usually left at the default
dst = cv2.boxFilter(img,-1,(5,5),normalize=True)

Effect (figure omitted): brightness becomes uniform, i.e. blurred.

mean filter cv2.blur

The mean filter is a special case of box filter normalization, which is to replace the pixel value of the point with the mean value of the pixels in the neighborhood. The mean filter also destroys the image details while denoising.

blur(src, ksize[, dst[, anchor[, borderType]]]) -> dst

  • ksize: the kernel size, e.g. (3,3) for a 3x3 kernel
  • anchor: the anchor point (the pixel being smoothed); the default (-1, -1) puts the anchor at the kernel center
  • borderType: the border mode; usually left at the default
dst = cv2.blur(img,(3,3))

Effect comparison (figure omitted)

Gaussian filter cv2.GaussianBlur

The Gaussian filter is often called the most useful filter. Each output pixel is a weighted average of the pixel itself and its neighborhood, with weights largest at the center and decreasing with distance, which filters noise well. (Gaussian kernel figure omitted)
GaussianBlur(src, ksize, sigmaX[, dst[, sigmaY[, borderType]]]) -> dst

  • ksize: the Gaussian kernel size (w, h); w and h may differ but must each be positive and odd, or 0 to have them computed from sigma
  • sigmaX: the standard deviation of the Gaussian in the X direction
  • sigmaY: the standard deviation in the Y direction; if 0, it is set equal to sigmaX
  • borderType: border mode, usually the default
dst = cv2.GaussianBlur(img,(3,3),0)

Effect comparison (figure omitted)

Median filter cv2.medianBlur()

Median filtering is a kind of nonlinear filtering, which replaces the gray value of the point with the median value of the gray value of the neighborhood of the pixel point, and can remove impulse noise and salt and pepper noise.
What is the median value:
median({1,2,3,3,7,5,1,8})=3 The middle value after sorting. If there are two numbers in the middle, you can take the mean, or take any value on the left or right.

medianBlur(src, ksize[, dst]) -> dst

  • ksize here is an int (an odd number greater than 1), not a tuple.
dst = cv2.medianBlur(img,3)

Effect comparison (figure omitted)

Bilateral filtering cv2.bilateralFilter()

Bilateral filtering is a kind of non-linear filtering, which is a compromise between image spatial proximity and pixel value similarity, trying to preserve edges while denoising.

bilateralFilter(src, d, sigmaColor, sigmaSpace[, dst[, borderType]]) -> dst

  • d: Indicates the neighborhood diameter of each pixel in the filtering process
  • sigmaColor: the sigma of the filter in color space; larger values mean colors farther apart within the neighborhood get mixed together, producing larger areas of semi-equal color
  • sigmaSpace: the sigma of the filter in coordinate space (its standard deviation)
  • borderType: Image pixel border mode, generally use the default value
dst = cv2.bilateralFilter(img,20,500,500)

Effect comparison (figure omitted)

11. Image Thresholding

  1. Image thresholding is an important basic part of image processing. It is widely used and can be used to segment different parts of the image according to the difference in gray level.
  2. The thresholded image is generally a single-channel image (grayscale image)
  3. Thresholding parameter setting can use the slider to debug
  4. Thresholding processing is easy to be affected by light, so care should be taken when processing

There are two main approaches to image thresholding functions:

  1. Global fixed threshold: cv2.threshold()
  2. Local adaptive threshold: cv2.adaptiveThreshold()

Global fixed threshold: cv2.threshold()

threshold(src, thresh, maxval, type[, dst]) -> retval, dst

  1. src: a single-channel image (grayscale or binary; a 3-channel BGR image also works without error)
  2. dst: the output image, same size and type as src
  3. thresh: the threshold value
  4. maxval: the value used with the THRESH_BINARY and THRESH_BINARY_INV types
  5. type: the thresholding type, usually THRESH_BINARY or THRESH_BINARY_INV
    Return values:
  6. retval: the threshold actually used, i.e. the given thresh or the one computed dynamically. A common mistake is discarding this return value when it is needed
  7. dst: the binary image

Thresholding types (the type parameter)

macro (value)               | pixel value > thresh | otherwise
cv2.THRESH_BINARY (=0)      | maxval               | 0
cv2.THRESH_BINARY_INV (=1)  | 0                    | maxval
cv2.THRESH_TRUNC (=2)       | thresh               | current gray value
cv2.THRESH_TOZERO (=3)      | current gray value   | 0
cv2.THRESH_TOZERO_INV (=4)  | 0                    | current gray value

(figure omitted)

Auxiliary flags that compute thresh automatically

macro                | function
cv2.THRESH_OTSU      | Otsu's method (optimizes between-class variance), suited to bimodal histograms
cv2.THRESH_TRIANGLE  | the triangle algorithm, suited to unimodal histograms
cv2.THRESH_MASK      | /

A unimodal or bimodal image refers to the shape of its grayscale histogram (figure omitted).
Reference: CV2 simple threshold function: cv2.threshold()
cv2.THRESH_OTSU and cv2.THRESH_TRIANGLE are flag bits that can be combined with the other types, e.g. cv2.THRESH_BINARY. With these flags the threshold is computed dynamically, and the value actually used is returned in retval.

_,bingary = cv2.threshold(gray,120,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

Effect comparison (figure omitted)

Adaptive Threshold cv2.adaptiveThreshold()

Applies an adaptive threshold to the image: the binarization threshold at each pixel is determined from the pixel value distribution of that pixel's neighborhood block.

adaptiveThreshold(src, maxValue, adaptiveMethod, thresholdType, blockSize, C[, dst]) -> dst
The parameters largely mirror cv2.threshold(); note there is only one return value.

  • adaptiveMethod: Specifies the adaptive threshold algorithm, the possible value is cv2.ADAPTIVE_THRESH_MEAN_C or cv2.ADAPTIVE_THRESH_GAUSSIAN_C
  • blockSize: The size of the neighborhood used to calculate the threshold 3, 5, 7,…must be odd
  • C: constant value after subtracting average or weighted average

Each local threshold T(x, y) is computed as:
if adaptiveMethod == cv2.ADAPTIVE_THRESH_MEAN_C: the mean of the block, minus C
if adaptiveMethod == cv2.ADAPTIVE_THRESH_GAUSSIAN_C: the Gaussian-weighted sum of the block, minus C

adaptiveThreshold converts the grayscale image to binary as follows:
thresholdType = cv2.THRESH_BINARY:     dst(x, y) = maxValue if src(x, y) > T(x, y), else 0
thresholdType = cv2.THRESH_BINARY_INV: dst(x, y) = 0 if src(x, y) > T(x, y), else maxValue

the code

binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)

Effect comparison (figure omitted)
cv2.adaptiveThreshold() has certain contour extraction capabilities.

12. Image Processing - Morphology

Mathematically, dilation and erosion are convolutions of an image (or region) A with a kernel B, analogous to max pooling and min pooling respectively.

convolution kernel

Building a kernel is a prerequisite for dilation and erosion. The kernel can be of any size and shape, with an independently defined reference point (anchor). In most cases it is a small solid square or disk with the reference point at its center, and can be regarded as a template or mask.
A kernel of specified shape and size can be obtained using cv2.getStructuringElement.

getStructuringElement(shape, ksize[, anchor]) -> retval

  1. shape is the kernel shape: cv2.MORPH_RECT is a rectangle, cv2.MORPH_CROSS a cross, cv2.MORPH_ELLIPSE an ellipse.
  2. ksize convolution kernel size.
k1 = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
k1:
 [[0 1 0]
 [1 1 1]
 [0 1 0]]

Of course, you can use numpy to build directly, but pay attention to the data type and experiment on your own.

dilation

Dilation finds the local maximum: as kernel B slides over the image, the maximum pixel value in the region covered by B is written to the pixel at the reference point, so the bright regions of the image gradually grow. (figure omitted)

cv2.dilate(src, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) -> dst

  1. src: the input image (a binary image is recommended)
  2. dst: the output image, same size and type as src; usually received via the return value
  3. kernel: the dilation kernel, typically built with getStructuringElement; if None, a 3x3 kernel anchored at the center is used
  4. iterations: the number of times dilation is applied
  5. anchor: the anchor position, default (-1, -1), i.e. the kernel center
  6. borderType: border mode, usually the default
  7. borderValue: border value, usually the default
# dilation
k1 = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))  # build the kernel
dst_dilate = cv2.dilate(img_2v,k1,iterations=1)  # img_2v is a binary image

Effect comparison (figure omitted)

erosion

Erosion is the opposite of dilation: it takes the local minimum, so the bright regions gradually shrink. (figure omitted)

erode(src, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) -> dst

The parameters are the same as for dilation.

k2 = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
dst_erosion = cv2.erode(img_2v,k2,iterations=1)

Effect comparison (figure omitted)

Other Morphological Operations

Opening, closing, top hat, black hat, and the morphological gradient are all built from dilation and erosion.
They are performed with cv2.morphologyEx(src, op, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) -> dst.

  1. src: the input image
  2. op: the type of morphological operation
  3. kernel: the kernel, normally built with getStructuringElement; if None, a 3x3 kernel anchored at the center is used
  4. anchor: the anchor position, default (-1, -1), the kernel center
  5. iterations: how many times to apply the operation, default 1
  6. borderType: border mode, usually the default
  7. borderValue: border value, usually the default.
    The parameters are basically the same as dilation and erosion; choose op according to the operation.

open operation

Opening is erosion followed by dilation. It can remove small bright objects, separate objects at thin connections, and smooth the boundaries of larger objects without noticeably changing their area. op is cv2.MORPH_OPEN

k = cv2.getStructuringElement(cv2.MORPH_CROSS,(4,4))
dst_open=cv2.morphologyEx(img_2v,cv2.MORPH_OPEN,k)

The kernel size needs tuning to the amount of noise. Note that cv2.morphologyEx uses the same kernel for both the erosion and the dilation; in my experience, calling cv2.erode and cv2.dilate separately with different kernels can work better.
Effect comparison (figure omitted)

Close operation

Closing is dilation followed by erosion; it can be used to eliminate small black holes (dark regions).
op selects cv2.MORPH_CLOSE

k = cv2.getStructuringElement(cv2.MORPH_CROSS,(4,4))
dst_close=cv2.morphologyEx(img_2v,cv2.MORPH_CLOSE,k)

Effect comparison (figure omitted)

Morphological Gradient

The morphological gradient is the difference between the dilated image and the eroded image; it highlights blob edges and can be used to preserve edge contours.
op selects cv2.MORPH_GRADIENT

k = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
dst_gd = cv2.morphologyEx(img_2v,cv2.MORPH_GRADIENT,k)

Effect comparison (figure omitted)

top hat

The top-hat operation is the difference between the original image and the result of opening; it can be used to isolate patches brighter than their surroundings.
op is cv2.MORPH_TOPHAT

k = cv2.getStructuringElement(cv2.MORPH_CROSS,(4,4))
dst_th = cv2.morphologyEx(img_2v,cv2.MORPH_TOPHAT,k)

Effect comparison (figure omitted)

black hat

The black-hat operation is the difference between the result of closing and the original image; it can be used to isolate patches darker than their surroundings.

k = cv2.getStructuringElement(cv2.MORPH_CROSS,(4,4))
dst_b = cv2.morphologyEx(img_2v,cv2.MORPH_BLACKHAT,k)

Effect comparison (figure omitted)

edge detection

The edge information changes greatly, and the gradient is large, so the place with a large gradient is the edge.
Edge detection can extract important image contour information, reduce image content, and can be used for image segmentation, feature extraction, etc.
General steps of edge detection:

  • Filtering ---- (filter out the influence of noise on the detection edge)
  • Enhancement ---- (can highlight changes in the intensity of the pixel neighborhood - gradient operator)
  • Detection ---- (threshold method to determine the edge)

Commonly used edge detection operators:

  • Sobel operator
  • Canny operator
  • Scharr operator
  • Laplacian operator

Sobel operator

The Sobel operator is a discrete differentiation operator used mainly for edge detection. It combines Gaussian smoothing and differentiation to compute an approximation of the gradient of the image intensity function. The standard 3x3 kernels are:

Gx = [[-1, 0, 1],       Gy = [[-1, -2, -1],
      [-2, 0, 2],             [ 0,  0,  0],
      [-1, 0, 1]]             [ 1,  2,  1]]
Sobel edge detection function—cv2.Sobel()

Sobel(src, ddepth, dx, dy[, dst[, ksize[, scale[, delta[, borderType]]]]]) -> dst

  • src: input original image
  • ddepth: the depth of the output image
  • dx: order of difference in X direction
  • dy: The order of difference in the Y direction
  • ksize: the Sobel kernel size, default 3; must be 1, 3, 5, or 7 (an odd int)
  • scale: The scaling factor when calculating the derivative value, the default value is 1, which means no scaling
  • delta(δ): Indicates the optional delta value before the result is stored in the target graph, the default value is 0
  • borderType: border mode, generally use the default
  • dst: The output image requires the same size and type as src

Available ddepth values are listed in the documentation (table omitted). A larger bit depth keeps more digits of precision, so the computation is more accurate but slower.

img_sb = cv2.Sobel(binary,cv2.CV_16S,1,0,ksize=3).astype(np.uint8)  # note: astype wraps negative values; convertScaleAbs below is safer

or

img_sb = cv2.Sobel(binary,cv2.CV_16S,1,0,ksize=3)
img_sb = cv2.convertScaleAbs(img_sb)

With dx = 1, dy = 0 the result emphasizes vertical edges (figure omitted).

Note that type conversion back to uint8 is required here.

G = 0.5*|Gx| + 0.5*|Gy|

img_sb_x = cv2.Sobel(binary,cv2.CV_16S,1,0,ksize=3)
img_sb_y = cv2.Sobel(binary,cv2.CV_16S,0,1,ksize=3)
img_sb_x = cv2.convertScaleAbs(img_sb_x)
img_sb_y = cv2.convertScaleAbs(img_sb_y)
img_sb = cv2.addWeighted(img_sb_x,0.5,img_sb_y,0.5,0)

Result (figure omitted)

Canny

The biggest difference between Canny and Sobel is the use of dual thresholds.

  • If the magnitude of a pixel is greater than the high threshold, the pixel is considered an edge pixel and should not be retained.
  • If the magnitude of a certain pixel is less than the low threshold, the pixel is considered not to be an edge pixel and will not be retained.
  • If the magnitude of a pixel is between the high and low thresholds, the pixel is only kept if it is connected to an edge point.

Canny(image, threshold1, threshold2[, edges[, apertureSize[, L2gradient]]]) -> edges

  • image: the input image (usually a single-channel 8-bit image)
  • edges: the output edge map, same size as the input (single channel)
  • threshold1: the low threshold (used for edge linking)
  • threshold2: the high threshold (selects the strong edges). A high:low ratio between 2:1 and 3:1 is recommended
  • apertureSize: the aperture size of the Sobel operator, default 3
  • L2gradient: flag selecting how the gradient magnitude is computed (L2 norm if True)
canny = cv2.Canny(img,180,90)

Result (figure omitted)

Laplacian operator

The Laplacian operator is a second-order differential operator in n-dimensional Euclidean space.
Mathematical definition (2D):

Laplace(f) = ∂²f/∂x² + ∂²f/∂y²

Laplacian(src, ddepth[, dst[, ksize[, scale[, delta[, borderType]]]]]) -> dst

  • src: input original image (single-channel 8-bit image)
  • dst: The output edge image requires the same size and number of channels as src
  • ddepth: the depth of the target image
  • ksize: the aperture size used to compute the second derivatives; must be a positive odd number, default 1
  • scale: optional scale factor, default value 1
  • delta: optional parameter δ, default value 0
  • borderType: border mode, generally use the default value

the code

lap = cv2.Laplacian(img,-1,ksize=3)

The Laplacian edges appear somewhat finer in the result (figure omitted).

summary

(summary figure omitted)
No type conversion is required after Laplacian and Canny edge detection.

Hough transform and its applications

The Hough transform is a feature-extraction technique in image processing. It accumulates votes in a parameter space, and the local maxima of the accumulator yield the set of shapes of the given form.

There are two main types of Hough transform in OpenCV:

  • Hough line transform - detect straight lines (line segments)
  • Hough circle transform - detect circles

Mainly used functions:

  • cv2.HoughLines(): the standard and multi-scale Hough transforms
  • cv2.HoughLinesP(): the probabilistic Hough transform
  • cv2.HoughCircles(): the Hough circle transform

Hough line transform

The Hough line transform finds straight lines. The image is usually edge-detected first; the input to the Hough transform is generally a binary edge image.
OpenCV supports three different Hough line transforms, including:

  • Standard Hough Transform (SHT): the cv2.HoughLines() function
  • Multi-Scale Hough Transform (MSHT): a multi-scale variant of SHT, also cv2.HoughLines()
  • Progressive Probabilistic Hough Transform (PPHT): an improvement of SHT that performs the transform on a subset of points, reducing computation time and cost: the cv2.HoughLinesP() function

cv2.HoughLines()

  • image: the input image (generally an 8-bit single-channel binary image)
  • lines: the output vector of detected lines; each line is a two-element vector (ρ, θ), where ρ is the distance from the origin and θ is the line's angle in radians (0 means a vertical line, π/2 a horizontal line)
  • rho: the distance resolution of the accumulator, in pixels
  • theta: the angle resolution of the accumulator, in radians
  • threshold: the accumulator threshold; only candidates with more votes than this are returned as lines
  • srn: default 0; for the multi-scale transform, a divisor for the distance resolution rho
  • stn: default 0; for the multi-scale transform, a divisor for the angle resolution theta
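The list above describes cv2.HoughLines(); a minimal end-to-end sketch with the probabilistic variant cv2.HoughLinesP() (the file name and parameter values are my own choices):

import cv2
import numpy as np

img = cv2.imread(r'datas\road.jpg')           # hypothetical image path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)              # the Hough input is usually an edge map
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=100,
                        minLineLength=50, maxLineGap=10)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:        # each entry is one detected segment
        cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)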

Histogram calculation and drawing

A histogram is a way of summarizing data; it visually shows the frequency distribution of some attribute of an image. Related concepts (grayscale histogram, RGB histogram, etc.):

  • dims: the number of features counted, e.g. gray value only: dims=1; RGB values: dims=3
  • bins: the number of sub-ranges per feature (informally, how many bars the histogram has)
  • range: the value range of each feature, e.g. gray values [0, 255]
  • Histogram computation function: cv2.calcHist()

cv2.calcHist()

  • images: the source image(s); the input arrays must share depth and size
  • channels: the indices of the channels to count (which channel(s) to use)
  • mask: optional mask; if not None it must be 8-bit and the same size as the images
  • hist: the output histogram
  • dims: the histogram dimensionality, must be positive
  • histSize: the number of bins per dimension
  • ranges: the range of pixel values; [0, 256] covers 0~255 (the upper bound is exclusive)
  • uniform: whether the histogram is uniform, default True
  • accumulate: accumulation flag, default False; if True, the histogram is not cleared first, so counts accumulate across calls
    In Python, images, channels, histSize, and ranges are passed as lists ([ ]); mask is not.

the code

import cv2
import numpy as np
import matplotlib.pyplot as plt

filename = r'datas\chepai.jpg'
img = cv2.imread(filename,0)
hist = cv2.calcHist([img],[0],None,[256],[0,256])  # ranges is [low, high)
# hist has shape (256, 1): the pixel count for each value 0-255; the index is the pixel value
# plt.plot normally takes x and y; with one argument, x defaults to range(n) with n = len(y)
plt.plot(hist,color = 'b')
plt.hist(img.ravel(),256,[0,256])
plt.show()
cv2.waitKey()

Effect comparison (figure omitted)

Template matching and application

principle

Template matching is a technique for finding the part of an image that best matches (is most similar to) a template image. It is not histogram-based: the template slides over the input image and a similarity score is computed at each position.
(input image and template figures omitted)

In practice the template must be at the same scale as the target region in the input image; plain template matching is not scale-invariant.

application

  • Target Finding and Positioning
  • moving object tracking

code:

cv2.matchTemplate()

  • image: image to be searched (big image)
  • templ: search template, which needs to be of the same data type as the original image and whose size cannot be larger than the source image
  • result: the map of comparison results; a single-channel 32-bit floating-point image. If the input image is W x H and the template is w x h, result must be (W-w+1) x (H-h+1)
  • method: the specified matching method
method                 | meaning
cv2.TM_SQDIFF          | squared difference (best match 0; smaller is better)
cv2.TM_SQDIFF_NORMED   | normalized squared difference (best match 0)
cv2.TM_CCORR           | cross-correlation (worst match 0; larger is better)
cv2.TM_CCORR_NORMED    | normalized cross-correlation (worst match 0)
cv2.TM_CCOEFF          | correlation coefficient (larger is better)
cv2.TM_CCOEFF_NORMED   | normalized correlation coefficient (best match 1)

The methods grow more complex (and slower) down the list, and in general more accurate; choose according to your needs.

import sys
import cv2
import numpy as np

filename = r'data\01.bmp'
templfile = r'testdata\templ0.bmp'
src  = cv2.imread(filename)
templ = cv2.imread(templfile)
res = cv2.matchTemplate(src,templ,cv2.TM_SQDIFF)
print(res)
# cv2.imshow('src',src)
# cv2.imshow('templ',templ)
# cv2.imshow('res',res)
cv2.waitKey()

Output (figure omitted): the template is moved one pixel at a time, a similarity score is computed at each placement, and the scores are stored in the result matrix at the top-left corner of each placement, not its center.
With a method where values closer to 0 mean more similar (e.g. TM_SQDIFF), we look for the position of the smallest element of res and map it back to the position in the original image src.

Code:
minMaxLoc(src[, mask]) -> minVal, maxVal, minLoc, maxLoc

  • src: the input single-channel image
  • minVal: the minimum value found
  • maxVal: the maximum value found
  • minLoc: the location of the minimum
  • maxLoc: the location of the maximum
  • mask: optional mask
import sys
import cv2
import numpy as np

filename = r'data\01.bmp'
templfile = r'testdata\templ0.bmp'
src  = cv2.imread(filename)
templ = cv2.imread(templfile)
res = cv2.matchTemplate(src,templ,cv2.TM_SQDIFF)
min_value,max_value,min_loc,max_loc = cv2.minMaxLoc(res)
print(min_value,max_value,min_loc,max_loc)
cv2.waitKey()

output:

print(min_value,max_value,min_loc,max_loc)
72.0 1287406848.0 (365, 483) (1319, 1891)

So (365, 483) is roughly the matched position in the original image; precisely, it is the top-left corner of the match, and the matched region extends from there by the template's width and height.
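A sketch that draws the matched region back onto the source (same files as above; with TM_SQDIFF the minimum location is the best match):

import cv2

src = cv2.imread(r'data\01.bmp')
templ = cv2.imread(r'testdata\templ0.bmp')
res = cv2.matchTemplate(src, templ, cv2.TM_SQDIFF)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
h, w = templ.shape[:2]                        # template height and width
cv2.rectangle(src, min_loc, (min_loc[0] + w, min_loc[1] + h), (0, 0, 255), 2)
cv2.imshow('match', src)
cv2.waitKey()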

Contour finding and drawing

principle

A contour can be thought of as a curve joining the consecutive points along a boundary that share the same color or gray level. Extracting contours means extracting these curves (or connected regions); contours are very useful in shape analysis and in object detection and recognition.
How are these curves stored? As collections of contour points, which approximate the contour curve.
Notice:

  • For accuracy, use a binarized image: apply thresholding or Canny edge detection before finding contours
  • In OpenCV, finding contours means finding white objects on a black background. Remember: the objects you are looking for should be white and the background black.

the code

cv2.findContours()-----find the contour

  • image: the input image, 8-bit single-channel (usually binary)
  • contours: the detected contours; each contour is stored as a vector of points (a list; len(contours) is the number of contours). Usually received via the return value rather than passed in
  • hierarchy: optional output containing the image topology, with one entry per contour. Each contour contours[i] has four hierarchy elements hierarchy[i][0]~hierarchy[i][3]: the index of the next contour, the previous contour, the first child contour, and the parent contour; entries with no such contour are negative
  • mode: the contour retrieval mode
  • method: the contour approximation method, see the second table below
  • offset: optional offset applied to every contour point, default Point()

mode                   | meaning
cv2.RETR_EXTERNAL (=0) | retrieve only the outermost contours (the usual choice)
cv2.RETR_LIST (=1)     | retrieve all contours into a list, without hierarchy
cv2.RETR_CCOMP (=2)    | retrieve all contours and organize them into a two-level structure
cv2.RETR_TREE (=3)     | retrieve all contours and rebuild the full nested hierarchy

method                  | meaning
cv2.CHAIN_APPROX_NONE   | store all contour points
cv2.CHAIN_APPROX_SIMPLE | compressed storage: for straight segments only the endpoints are kept; a quadrilateral needs just 4 points


cv2.drawContours(): drawing contours

  • image: Target image, Mat type object
  • contours: all input contours, each contour is stored as a point vector
  • contourIdx: contour drawing indicator variable (index), if it is a negative value, it means drawing all contours
  • color: the color to draw the outline
  • thickness: the thickness of the contour line, the default value is 1, if it is a negative value, the interior of the contour is drawn, the optional macro CV_FILLED
  • lineType: line type, default value 8
  • hierarchy: Optional hierarchy information, default value noArray()
  • maxLevel: Indicates the maximum level for drawing contours, the default value is INT_MAX
  • offset: optional contour offset parameter, default value Point()
filename = r'data\01.bmp'
src = cv2.imread(filename) 
if src is None:
    print('check the filename') 
    sys.exit()
gray = cv2.cvtColor(src,cv2.COLOR_BGR2GRAY)  # convert after the None check
# binarize
_,binary = cv2.threshold(gray,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# find contours
cnts ,hir = cv2.findContours(binary,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE)
cv2.namedWindow('binary',cv2.WINDOW_NORMAL)
cv2.imshow('binary',binary)
# draw contours
cv2.drawContours(src,cnts,-1,(0,0,255),2)  # -1 draws all contours; 0 would draw only the first contour in cnts, etc.
cv2.namedWindow('src',cv2.WINDOW_NORMAL)
cv2.imshow('src',src)
cv2.waitKey()

Effect (figure omitted): red marks the drawn contours.

# print the number of contours
print(hir.shape)
(1, 541, 4)

This shows 541 contours were extracted from the image above; there are many noise points.

# print the number of points in each contour
for cnt in cnts:
    print(cnt.shape)
(1, 1, 2)
(15, 1, 2)
(1, 1, 2)
(4, 1, 2)
(2, 1, 2)
(1, 1, 2)
(1, 1, 2)
(5, 1, 2)
(12, 1, 2)
(6, 1, 2)
(36, 1, 2)
...

Filtering by the number of points is one way to select the contours you need.
Notice:

  • Binarized images often need filtering first, to avoid the burrs seen in the image above.
  • The contour drawing function modifies src in place; if you need src later, pass in src.copy()
  • Extract contours from the binary image, but draw them on the original three-channel image, where the color stands out; binary images show only black and white.

Surround with an outline of a specific shape

In practical applications, there is often a need to represent the detected contours with polygons. Extracting the polygons surrounding the contours is also convenient for us to do further analysis. There are mainly the following types of contour surrounds:

  • Contour bounding rectangle
  • Contour minimum circumscribed rectangle (rotation)
  • contour minimal enclosing circle
  • contour fit ellipse
  • Contour approximation to polygonal curves

Contour bounding rectangle—cv2.boundingRect()

boundingRect(array) -> retval

  • points: The input 2D point set. It is an element in the outline list in our outline check above
  • Return value: Rect class rectangle object (x, y, w, h). This way we can draw the rectangle
dst = src.copy()
# bounding rectangle per contour
for cnt in cnts:
    if cnt.shape[0]>500:
        x,y,w,h = cv2.boundingRect(cnt)
        cv2.rectangle(dst,(x,y),(x+w,y+h),(0,0,255),3)

Targets are filtered by point count (figure omitted); set a suitable point-count threshold until the contour is found accurately.

Contour minimum circumscribed rectangle—cv2.minAreaRect()

minAreaRect(points) -> retval

  • points: the input 2D point set
  • Return value: a RotatedRect object; its main members are center, size, and angle (the angle convention may need a 90-angle or similar conversion)

Drawing a rectangle directly from center, size, and angle is awkward; instead, use a helper function to get the rectangle's four corner points:

boxPoints(box[, points]) -> points

  • box: the (center, size, angle) tuple returned by minAreaRect

for cnt in cnts:
    if cnt.shape[0]>500:
        res = cv2.minAreaRect(cnt)
        box = cv2.boxPoints(res)   # the four corner points
        box = np.int0(box)         # to integer coordinates
        cv2.drawContours(dst,[box],-1,(0,0,255),3)

Since the obtained box is a set of points, we can use cv2.drawContours to draw the minimum bounding rectangle.
Result (figure omitted)

Minimum circumscribed circle

Contour minimum circumscribed circle—cv2.minEnclosingCircle()

minEnclosingCircle(points) -> center, radius

  • points: the input 2D point set
  • center: the output circle center
  • radius: the output circle radius
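A short sketch in the same style as the bounding-rect example (cnts and dst come from the contour-finding code above; the point-count filter is reused):

# cnts, dst: from the contour-finding example above
for cnt in cnts:
    if cnt.shape[0] > 500:
        (cx, cy), radius = cv2.minEnclosingCircle(cnt)
        cv2.circle(dst, (int(cx), int(cy)), int(radius), (0, 0, 255), 3)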

Ellipse fitting

Contour ellipse fitting—cv2.fitEllipse()

fitEllipse(points) -> retval

  • points: the input 2D point set (a contour; at least 5 points are required)
  • Return value: a RotatedRect object describing the fitted ellipse

Approximate Polygonal Curves

cv2.approxPolyDP

  • curve: the input 2D point set or contour
  • approxCurve: the approximation result, same type as the input point set
  • epsilon: the approximation accuracy, i.e. the maximum distance between the original curve and its approximation
  • closed: if True, the approximated curve is closed; otherwise it is open
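A sketch combining approxPolyDP with arcLength from the next subsection (cnts and dst as above; the 2% tolerance and the quadrilateral filter are my own choices):

# cnts, dst: from the contour-finding example above
for cnt in cnts:
    peri = cv2.arcLength(cnt, True)                     # contour perimeter
    approx = cv2.approxPolyDP(cnt, 0.02 * peri, True)   # tolerance = 2% of the perimeter
    if len(approx) == 4:                                # keep quadrilaterals only
        cv2.drawContours(dst, [approx], -1, (0, 255, 0), 2)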

Profile properties

Calculate the contour area—cv2.contourArea()

contourArea(contour[, oriented]) -> retval

  • contour: input 2D point set or contour
  • oriented: The default value is false, indicating that the returned area is an absolute value, otherwise it is signed
  • Return value: Double type returns the contour area
# filter contours by area
for cnt in cnts:
    box_area = cv2.contourArea(cnt)
    if box_area>2000:
        cv2.drawContours(dst,cnt,-1,(0,0,255),3)


Calculate contour length (perimeter or curve length)

arcLength(curve, closed) -> retval

  • curve: the input 2D point set or contour
  • closed: whether the curve is closed; default True
  • Return value: the contour length (double)

color space

HSV color space

The HSV color space is closer to how the human eye perceives color, so it is often used for color detection and recognition. H is hue, S saturation, V value (brightness):
H: the color itself (red/green/blue), range 0~360
S: the color depth (light red vs. dark red), range 0.0~1.0
V: the brightness (dark red vs. bright red), range 0.0~1.0
OpenCV's default HSV ranges are:
H: 0~180, S: 0~255, V: 0~255

In OpenCV, H is scaled to 0~180 while S and V are scaled to 0~255; the resulting HSV values of common colors follow the usual OpenCV HSV color table (figure omitted).

Color space conversion—cv2.cvtColor()

Color range selection: cv2.inRange()

inRange(src, lowerb, upperb[, dst]) -> dst

  • src: the input image or array
  • lowerb: the lower bound (color threshold)
  • upperb: the upper bound (color threshold)
  • dst: the output image, same size as the input, type CV_8U

the code

import sys
import cv2
import numpy as np

filename = r'datas\chepai.jpg'
src = cv2.imread(filename) 
if src is None:
    print('check the filename') 
    sys.exit()
# filter by color range
hsv = cv2.cvtColor(src,cv2.COLOR_BGR2HSV)  # BGR to HSV
l_v = np.array([100,43,46])   # HSV lower threshold
h_v = np.array([124,255,255]) # HSV upper threshold
mask = cv2.inRange(hsv,l_v,h_v)
cv2.namedWindow('mask',cv2.WINDOW_NORMAL)
cv2.imshow('mask',mask)
cv2.waitKey()

Effect comparison (figures omitted): the blue pixels are set to 255.
Better results come from tuning the thresholds. The figure above used:

l_v = np.array([100,43,46])   # HSV lower threshold
h_v = np.array([124,255,255]) # HSV upper threshold

In the original image, the blue around the license plate is relatively dark, so raise the V of the lower threshold:

l_v = np.array([100,43,170])  # HSV lower threshold
h_v = np.array([124,255,255]) # HSV upper threshold

Result (figure omitted): a clear improvement.


Origin blog.csdn.net/qq_42911863/article/details/125429361