1. What is OpenCV?
OpenCV is a library for fast image processing and computer vision. It supports development in several languages, including C++, Python, and Java. All examples in this tutorial use opencv-python, that is, the Python language, to process and study digital images.
2. Load/display/save image
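1. imread: read an image
img = cv2.imread(filename, flags)  # the image is returned as an ndarray; if it cannot be read, None is returned
filename: path to the image
flags: how the image is read; choose one of the following:
· cv2.IMREAD_COLOR: default mode; the image is loaded as 3-channel BGR and any transparency is ignored. The number 1 can be passed instead
· cv2.IMREAD_GRAYSCALE: load the image in grayscale mode. The number 0 can be passed instead
· cv2.IMREAD_UNCHANGED: keep the image's original color channels. The number -1 can be passed instead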
2. imshow: display an image
The imshow function creates a window and displays the image in it. The window automatically fits the size of the image. To show the image at a different size, resize the image first, for example with the imutils module.
cv2.imshow(winname, mat)
winname: window name (string)
mat: image object, a numpy ndarray
------
# Usually two arguments are passed: the image to resize and either width or height. The new image keeps the aspect ratio of the original
imutils.resize(image, width=None, height=None)
3. imwrite: save an image
cv2.imwrite(filename, image)
filename: name of the file to save (string); the file extension determines the image format
image: image object, a numpy ndarray
4. waitKey() & destroyAllWindows()
waitKey controls how long imshow keeps the image on screen. If imshow is not followed by waitKey, the window gets no time to display the image. waitKey waits for a keyboard event. When delay <= 0, it waits indefinitely; when delay is a positive integer n, it waits at least n milliseconds. If a key is pressed while waiting, the function returns immediately with the key value (ASCII code) of that key. If no key is pressed before the wait ends, it returns -1. The ord() function gives the ASCII code of a character, so to check whether the A key was pressed during the wait, compare the return value: cv2.waitKey() == ord("a")
There are two ways to decide when to destroy the window:
(1) Let the window stay for a while and then destroy it automatically;
(2) Wait for a specific command, such as a given key press, and then close the window
retval = cv2.waitKey(delay=None)  # delay is how long to wait for a key press, in ms; the default is 0
- delay > 0: if no key is pressed within delay milliseconds, waitKey returns -1 and the program continues, for example to destroy the window
- delay <= 0: wait indefinitely; any key press ends the wait, after which the window can be destroyed
After displaying images with imshow, the program should destroy the display windows before it exits; otherwise it may not terminate normally. Two functions are commonly used to destroy windows:
- cv2.destroyWindow(winname)  # destroy one specific window; winname: name of the window to destroy
- cv2.destroyAllWindows()  # destroy all windows; takes no arguments
3. Simple drawing
Common parameters:
img: the image (ndarray) to draw on
color: the color of the drawn shape, in BGR order
thickness: the line thickness of the drawn shape; the default is 1. For closed shapes such as circles and ellipses, -1 fills the inside of the shape
lineType: the type of line used to draw the shape. The default is an 8-connected line; cv2.LINE_AA (anti-aliased) gives a smoother line
- Straight line
cv2.line(img, pt1, pt2, color, thickness=None, lineType=None, shift=None)
pt1, pt2 are the pixel coordinates of the start point and end point of the line
- Rectangle
cv2.rectangle(img, pt1, pt2, color, thickness=None, lineType=None, shift=None)
pt1, pt2 are the coordinates of the upper-left and lower-right corners of the rectangle
- Circle
cv2.circle(img, center, radius, color, thickness=None, lineType=None, shift=None)
center and radius are the coordinates of the circle's center and its radius
- Add text
cv2.putText(img, text, org, fontFace, fontScale, color, thickness=None, lineType=None)
text: the text content to draw
org: the position of the text; the lower-left corner of the text is the starting point
fontFace: the font type, for example cv2.FONT_HERSHEY_SIMPLEX
fontScale: the font size
- Mouse interaction
Create a callback function and write the operation to perform inside it
def OnMouseAction(event, x, y, flags, param)
OnMouseAction is the name of the callback function; you can choose any name
event indicates which event was triggered, such as cv2.EVENT_LBUTTONDOWN for pressing the left button
x, y are the coordinates (x, y) of the mouse in the window when the mouse event is triggered
flags indicates mouse drag events, that is, which buttons or modifier keys are held down
param is optional user data passed through to the callback
After defining the callback function, bind it to a specific window so that it is found and executed when the mouse triggers an event in that window.
Use the function cv2.setMouseCallback(winname, onMouse) to bind the window and the callback function
winname is the window name
onMouse is the name of the callback function
# Example: click the left mouse button to print the mouse coordinates
import cv2

def OnMouseAction(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        print(x, y)

img = cv2.imread("lena.jpg")
cv2.namedWindow("x")
cv2.setMouseCallback("x", OnMouseAction)
cv2.imshow("x", img)
cv2.waitKey()
cv2.destroyAllWindows()
- Trackbar (scrollbar)
cv2.createTrackbar(trackbarname, winname, value, count, onChange)
trackbarname: trackbar name
winname: window name
value: initial value, that is, the initial slider position
count: maximum value of the trackbar; the minimum value is usually 0
onChange: callback function; the operation to perform after the slider moves is written in this function
Get the current slider position with retval = cv2.getTrackbarPos(trackbarname, winname)
4. Basics of image processing
- When a color image is read with OpenCV, the pixel values are stored in B, G, R channel order, and the values in the image array can be accessed by index. For example, image[0,0,0] accesses the pixel value at row 0, column 0, channel 0 (the B channel)
- Channel operations
Channel splitting
by index: b=img[:,:,0] g=img[:,:,1] r=img[:,:,2]
by function: b,g,r=cv2.split(img)
Channel merging
cv2.merge([b,g,r])
- Get image attributes
img.shape rows, columns, channels -> H, W, C
img.dtype data type
img.size total number of array elements (H * W * C)
5. Color space
- GRAY (grayscale image) color space
Usually refers to an 8-bit grayscale image with 256 gray levels; the range of pixel values is [0,255]
- RGB->GRAY
Gray=0.299*R+0.587*G+0.114*B
- GRAY->RGB
R=Gray, G=Gray, B=Gray
- HSV color space
Hue: the color tone, range [0,360]
Saturation: the depth of the color, range [0,1]
Value: the brightness of the light, range [0,1]
For 8-bit images, OpenCV stores H in [0,179] (degrees divided by 2) and S, V in [0,255]
- Type conversion function
dst = cv2.cvtColor(src, code, dst=None, dstCn=None)
dst is the output image, which has the same data type and depth as the input image
src is the input image, which can be uint8, uint16, or float32
code is the color space conversion code, for example cv2.COLOR_BGR2GRAY
dstCn is the number of channels of the target image
6. Geometric transformation
- Scaling
dst = cv2.resize(src,dsize,dst=None,fx=None,fy=None,interpolation=None)
dst: output image; its size is dsize (when dsize is not 0)
src: original image
dsize: output image size
fx: horizontal scale factor
fy: vertical scale factor
interpolation: interpolation method; the default is cv2.INTER_LINEAR, bilinear interpolation
The size of the target image dst is specified by either dsize or fx/fy
- Specify by the parameter dsize
If dsize is specified, the size of dst is determined by dsize, regardless of whether fx/fy are specified
dsize is (w, h), where w and h are the width and height of the scaled image; w relates to fx and h relates to fy
When dsize is specified, the scale factor in the x direction is fx=double(dsize.w / src.w)
Similarly, the scale factor in the y direction is fy=double(dsize.h / src.h)
- Specify by the parameters fx and fy
If the value of dsize is None, the size of the target image is
w=round(fx * src.w), h=round(fy * src.h)
- Flip
dst = cv2.flip(src, flipCode)
dst: the target image, with the same size and data type as the original image
src: original image
flipCode: flip type
flipCode = 0: flip around the x-axis
flipCode > 0 (positive number): flip around the y-axis (most commonly used)
flipCode < 0 (negative number): flip around both the x-axis and the y-axis
- Affine
An affine transformation applies a series of geometric transformations to an image, such as translation and rotation
dst = cv2.warpAffine(src, M, dsize, dst=None, flags=None)
dst: target image
src: original image
M: transformation matrix, 2 rows and 3 columns
dsize: output image size, in the order (w, h), that is, columns first and then rows
dst(x', y') = src(x, y), where x' = M11*x + M12*y + M13 and y' = M21*x + M22*y + M23
- Translation
M=np.float32([[1,0,detx],[0,1,dety]]), where detx and dety are the translation distances in the x and y directions
- Rotation
The transformation matrix M is obtained with the function retval = cv2.getRotationMatrix2D(center, angle, scale)
center is the center point of the rotation, (x, y)
angle is the rotation angle in degrees; a positive number means counterclockwise, and a negative number means clockwise
scale is the scale factor applied to the image
7. Video processing
A video is made up of a series of images called frames. The speed at which frames are played is called the frame rate, measured in frames per second (FPS)
- VideoCapture class
- Initialization
capture object = cv2.VideoCapture(camera ID number)
If the camera ID number is -1, a camera is selected automatically. If there are multiple cameras, use the numbers 0, 1, 2, and so on to identify them in turn
"capture object" is the return value, an instance of the VideoCapture class
To read from a video file, pass the file name as the parameter
capture object = cv2.VideoCapture("file name")
- cv2.VideoCapture.isOpened() checks whether the initialization succeeded; it returns True on success and False on failure
- Capture frame
retval, image = cv2.VideoCapture.read()
retval indicates whether the frame was captured successfully, True/False
image is the returned frame; if there is no frame, it is None
- cv2.VideoCapture.release() closes the camera
- Property setting
retval = cv2.VideoCapture.get(propId) gets a property of the VideoCapture object
Common propId values:
cv2.CAP_PROP_FRAME_WIDTH
cv2.CAP_PROP_FRAME_HEIGHT
retval = cv2.VideoCapture.set(propId, value) changes a property of the VideoCapture object
- VideoWriter class
Saves images as a video file or changes the properties of a video, including converting the video type
- Initialization
writer object = cv2.VideoWriter(filename, fourcc, fps, frameSize)
filename: output file name; if the file already exists, it is overwritten
fourcc: the codec; cv2.VideoWriter_fourcc('X','V','I','D') is often used, which selects MPEG-4 encoding, with .avi as the file extension
fps: the frame rate
frameSize: the width and height of each frame
- write function
None = cv2.VideoWriter.write(image)
image is the video frame to write; color images must be in BGR format
- Release
cv2.VideoWriter.release()
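import cv2

# Open camera 0; cv2.CAP_DSHOW selects the DirectShow backend (Windows)
cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)
fourcc = cv2.VideoWriter_fourcc(*"XVID")
# 25 fps; frames passed to write() should be 640x480 to match frameSize
out = cv2.VideoWriter("xx.avi", fourcc, 25, (640, 480))
while cap.isOpened():
    retval, frame = cap.read()
    if retval:
        out.write(frame)
        cv2.imshow("xx", frame)
        k = cv2.waitKey(1)
        if k == 27:  # Esc key: stop recording
            break
    else:
        break
# Release the camera and the writer, then close the window
cap.release()
out.release()
cv2.destroyAllWindows()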