Python-opencv study notes 1 GUI features in opencv

Compatibility: OpenCV >= 3.4.4

1.1 Getting Started with Images

Important functions

cv::imread()

Read an image from a file.

  • Parameter 1: The path of the file to read (required)
  • Parameter 2: A flag that selects how the image is loaded (optional)
    • IMREAD_COLOR loads the image in 8-bit BGR format. This is the default.
    • IMREAD_UNCHANGED loads the image as is (including the alpha channel, if present).
    • IMREAD_GRAYSCALE loads the image as a single-channel grayscale image.

Return value:
The image data is returned as a cv::Mat object (a numpy.ndarray in Python). If the file cannot be read, imread() returns an empty matrix (None in Python).
Description:
OpenCV supports image formats such as Windows bitmap (bmp), portable image formats (pbm, pgm, ppm) and Sun raster (sr, ras). With the help of plugins it can also load JPEG (jpeg, jpg, jpe), JPEG 2000 (jp2, codenamed Jasper in CMake), TIFF (tiff, tif) and Portable Network Graphics (png) files; OpenEXR is supported as well. If you build the library yourself you need to enable these plugins, while the prebuilt packages include them by default.

cv::imshow()

Display an image in an OpenCV window.

  • Parameter 1: The title of the window
  • Parameter 2: cv::Mat object to be displayed

Notes:

  1. The title is also the identifier of the window. If multiple windows need to be opened at the same time, the title of the window cannot be repeated.
  2. Displaying images is computer resource intensive and can seriously slow things down.

cv.destroyAllWindows()

Destroy all windows
Note: After using the imshow() method to display images, you should routinely call cv.destroyAllWindows() to destroy the windows.

cv::imwrite()

Write the image to a file.

  • Parameter 1: file path
  • Parameter 2: cv::Mat object

cv.samples.findFile("starry_night.jpg")

Locate starry_night.jpg in OpenCV's sample data.

  • Parameter 1: Sample file name

Return value:

  • The full path of the file

cv.waitKey(0)

Wait for the user to press a key.

  • Parameter 1: Timeout, that is, how long to wait for user input
    • Unit: milliseconds
    • 0 means wait forever

Return value:

  • The code of the key pressed, or -1 if no key was pressed before the timeout

ord()

Python's built-in function ord() returns the Unicode code point of a single character. For ASCII characters this is the ASCII value.

  • Parameter 1: a string of length 1

Return value:

  • Integer code point of the character
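Because cv.waitKey() returns a plain integer, a key test is just an integer comparison against ord(). The sketch below shows this without opening a window; is_key is a hypothetical helper written for this note, and the & 0xFF mask is the same one that appears in later listings in these notes.

```python
# Key codes are plain integers, so they can be compared with ord().
ESC = 27  # key code of the Esc key

def is_key(code, char):
    """Return True if a waitKey()-style return value matches the given character."""
    if code < 0:                        # -1 means no key was pressed before the timeout
        return False
    return (code & 0xFF) == ord(char)   # keep only the lowest 8 bits

print(ord("s"), ord("q"))       # 115 113
print(is_key(115, "s"))         # True
print(is_key(-1, "s"))          # False
print(is_key(0x100073, "s"))    # True: extra high bits are masked away
```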

sys.exit("any things")

A function from the sys module. It exits the interpreter; if a string is passed, it is printed to stderr and the exit status is 1.

  • Parameter 1: exit status or message (optional)

source code

  • Load, display, save images
import cv2 as cv    # import OpenCV's Python bindings under the alias cv
import sys          # sys provides sys.exit()

img = cv.imread(cv.samples.findFile("starry_night.jpg"))    # find starry_night.jpg in the sample data and load it
if img is None:                                             # check whether the image was loaded
    sys.exit("Could not read the image.")                   # exit and print the message
cv.imshow("Display window", img)                            # show img in a window titled "Display window"
k = cv.waitKey(0)                                           # wait for any key; 0 means no timeout
if k == ord("s"):                                           # ord() returns the character's code point
    cv.imwrite("starry_night.png", img)                     # write the image to disk as starry_night.png

1.2 Getting Started with Videos

Important functions

cv.VideoCapture()

Create a video capture object

  • Parameter 1 (required), one of:
    • device index
    • video file name

Return value:
A video capture object
Explanation:
The device index is a number that specifies which camera to use. Usually only one camera is connected, so you simply pass 0 (or -1). You can select a second camera by passing 1, and so on. After that, you can capture frame by frame.
Finally, don't forget to release the capture.

cap.isOpened()

Determine whether the video capture object (corresponding camera) has been opened

cap.open()

Open the video capture object (corresponding camera)

Explanation: Creating the object with cap = cv.VideoCapture(0) normally opens the camera already, so calling cap.open() again is usually unnecessary. Sometimes, however, the capture has not been initialized; in that case cap.isOpened() returns False and you need to call cap.open().

ret, frame = cap.read()

Capture one frame (from a camera or video file)
Return values: 2

  1. ret
    bool (True/False). Indicates whether the frame was read successfully
  2. frame
    the frame image

cap.get(propId)

Read a property of the video

  1. Parameter propId
    propId is a number from 0 to 18; each number denotes a property of the video (if it applies to that video)

cap.set(propId, value)

Modify a property of the video

  1. Parameter propId
    propId is a number from 0 to 18; each number denotes a property of the video (if it applies to that video)
  2. Parameter value
    The new value of the property

For example, the frame width and height can be read with cap.get(cv.CAP_PROP_FRAME_WIDTH) and cap.get(cv.CAP_PROP_FRAME_HEIGHT). The default here is 640x480. To change it to 320x240, use ret = cap.set(cv.CAP_PROP_FRAME_WIDTH,320) and ret = cap.set(cv.CAP_PROP_FRAME_HEIGHT,240).

cap.release()

Release the capture object

cv.cvtColor(frame, cv.COLOR_BGR2GRAY)

Convert an image from one color space to another

  1. Parameter 1
    the input image (cv::Mat; numpy.ndarray in Python)
  2. Parameter 2
    the color conversion code
    • cv.COLOR_BGR2GRAY converts BGR color to grayscale

cv.VideoWriter_fourcc(*'XVID')

Create a FourCC code
FourCC is a 4-byte code used to specify the video codec. A list of available codes can be found at fourcc.org. Which codecs are available depends on the platform.
The following codecs are commonly used:

  • In Fedora: DIVX, XVID, MJPG, X264, WMV1, WMV2. (XVID is preferable. MJPG results in large video files. X264 gives very small video files.)
  • In Windows: DIVX (more to be tested and added)
  • In OSX: MJPG (.mp4), DIVX (.avi), X264 (.mkv).

The FourCC code is passed as cv.VideoWriter_fourcc('M','J','P','G') or cv.VideoWriter_fourcc(*'MJPG') for MJPG.
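The "4-byte code" is simply the four characters packed into one integer, first character in the lowest byte. The pure-Python sketch below illustrates the packing; fourcc and fourcc_to_str are hypothetical helpers written for this note, and the result is expected to equal what cv.VideoWriter_fourcc(*'MJPG') returns.

```python
def fourcc(code):
    """Pack a 4-character codec code into a 32-bit integer, first character in the lowest byte."""
    if len(code) != 4:
        raise ValueError("a FourCC code has exactly four characters")
    value = 0
    for i, ch in enumerate(code):
        value |= (ord(ch) & 0xFF) << (8 * i)
    return value

def fourcc_to_str(value):
    """Inverse of fourcc(): recover the four characters from the integer."""
    return "".join(chr((value >> (8 * i)) & 0xFF) for i in range(4))

mjpg = fourcc("MJPG")
print(mjpg)                  # 1196444237
print(hex(mjpg))             # 0x47504a4d
print(fourcc_to_str(mjpg))   # MJPG
```

The inverse helper is handy for printing the codec of an existing file, since a capture object reports the codec as a number rather than as text.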

cv.VideoWriter()

Purpose: save video

  • Create a VideoWriter object. First specify the output file name (e.g. output.avi).
  • Then specify the FourCC code (see cv.VideoWriter_fourcc() above).
  • Then pass the frames per second (fps) and the frame size.
  • The last argument is the isColor flag. If it is True, the encoder expects color frames; otherwise it works with grayscale frames.

For example:

out = cv.VideoWriter('output.avi', fourcc, 20.0, (640,  480))  # output file name, FourCC code, frame rate, frame size (width, height)
...
out.write(frame)	# write one frame
...
out.release()

cv.flip()

Flip a 2D array around the vertical axis, the horizontal axis, or both
Parameters:

  1. src: input array
  2. dst: output array with the same size and type as src (in Python the result is returned, so this argument is normally omitted)
  3. flipCode: a flag specifying how to flip the array
    • 0 means flipping around the x-axis, that is, a vertical flip.
    • A positive value (such as 1) means flipping around the y-axis, that is, a horizontal flip.
    • A negative value (for example -1) means flipping around both axes, that is, both horizontally and vertically.

exit()

exit() is provided by Python's site module for interactive use. Like sys.exit(), it raises SystemExit; in scripts sys.exit() is the preferred form.

Source code to read camera video

  • Read video, display video and save video
  • Capture video from camera and display
import numpy as np
import cv2 as cv

cap = cv.VideoCapture(0)						# open a camera for live capture; 0 is the index of the capture device on this machine
if not cap.isOpened():							# check whether the camera was opened
    print("Cannot open camera")
    exit()
while True:										# capture frames in a loop
    # Capture frame-by-frame
    ret, frame = cap.read()
    # if frame is read correctly ret is True
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break
    # Our operations on the frame come here
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    # Display the resulting frame
    cv.imshow('frame', gray)
    if cv.waitKey(1) == ord('q'):
        break
# When everything done, release the capture
cap.release()
cv.destroyAllWindows()

Source code to play video from file

import numpy as np
import cv2 as cv

cap = cv.VideoCapture('vtest.avi')
while cap.isOpened():
    ret, frame = cap.read()
    # if frame is read correctly ret is True
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    cv.imshow('frame', gray)
    if cv.waitKey(1) == ord('q'):
        break
cap.release()
cv.destroyAllWindows()

Source code to save video

import numpy as np
import cv2 as cv

cap = cv.VideoCapture(0)									# create the video capture object
# Define the codec and create VideoWriter object
fourcc = cv.VideoWriter_fourcc(*'XVID')						# step 1: create the FourCC code, which selects the video codec
out = cv.VideoWriter('output.avi', fourcc, 20.0, (640,  480))   # step 2: create the video writer with file name, FourCC code, frame rate and frame size (640, 480)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break
    frame = cv.flip(frame, 0)								# flip the frame; flipCode=0 flips around the x-axis (vertical flip)
    # write the flipped frame
    out.write(frame)										# step 3: write the frame
    cv.imshow('frame', frame)
    if cv.waitKey(1) == ord('q'):
        break
# Release everything if job is finished
cap.release()
out.release()
cv.destroyAllWindows()

1.3 Drawing Functions in OpenCV

Learn to draw different geometric shapes with OpenCV

Important functions

cv.line()

Draw a straight line in the image img

Parameters:

  • img : the image on which to draw the shape
  • start point
    • In tuple form, e.g. (0,0)
  • end point
  • color : the color of the shape.
    • For BGR, pass a tuple, e.g. (255,0,0) means blue.
    • For grayscale, just pass a scalar value.
  • thickness : the thickness of the line, circle, etc.
    • If you pass -1 for a closed shape like a circle, the shape is filled.
    • Default thickness = 1
  • lineType : the type of line, e.g. 8-connected or anti-aliased.
    • cv.LINE_AA gives an anti-aliased line, which looks good for curves.

cv.circle()

Parameters:
img, color, thickness, lineType parameters are the same as above.

cv.rectangle()

Parameters:
img, color, thickness, lineType parameters are the same as above.

cv.ellipse()

Parameters:
img, color, thickness, lineType parameters are the same as above.
Specific parameters:

  • center : center position (x, y)
  • axes : axis lengths (major axis length, minor axis length), given as half-lengths
  • angle : the rotation angle of the ellipse in the counterclockwise direction
  • startAngle : the starting angle of the elliptic arc, measured clockwise from the major axis
  • endAngle : the ending angle of the elliptic arc, measured clockwise from the major axis

cv.putText()

Parameters:
img, color, thickness, lineType parameters are the same as above.

img = np.zeros((512,512,3), np.uint8)

Create a pure black color image (a three-dimensional array) whose elements are of type uint8.
The shape argument (512,512,3) means the image has 512 rows and 512 columns with 3 channels (in OpenCV each pixel of a color image holds three 8-bit values in B, G, R order).

Create a grayscale image as follows:

img = np.zeros((512,512,1), np.uint8)

Note that the shape argument can be (512,512,1) or (512,512). Both give a grayscale image.
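A short numpy-only sketch of these arrays; the variable names are chosen for this example. It also shows that the first index is the row (y) and the second the column (x), which is the reverse of the (x, y) order used by the drawing functions.

```python
import numpy as np

color = np.zeros((512, 512, 3), np.uint8)
gray_3d = np.zeros((512, 512, 1), np.uint8)
gray_2d = np.zeros((512, 512), np.uint8)

print(color.shape, color.dtype, color.ndim)
print(gray_3d.shape, gray_2d.shape)

# The first axis is the row (y), the second the column (x), the third the channel.
color[10, 20] = (255, 0, 0)      # pixel at x=20, y=10 becomes blue in BGR order
print(color[10, 20])             # [255   0   0]
print(int(color.max()), int(gray_2d.max()))   # 255 0
```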

Source code to draw a straight line

To draw a line, you need to pass the start and end coordinates of the line. We will create a black image and draw a blue line on it from the upper left corner to the lower right corner.

import numpy as np
import cv2 as cv

# Create a black image
img = np.zeros((512,512,3), np.uint8)                # create a color image (pure black)
# Draw a diagonal blue line with thickness of 5 px
cv.line(img,(0,0),(511,511),(255,0,0),5)

draw a rectangle

To draw a rectangle, you need the upper left and lower right corners of the rectangle. This time we will draw a green rectangle in the upper right corner of the image.

cv.rectangle(img,(384,0),(510,128),(0,255,0),3)  

draw circle

To draw a circle, you need its center coordinates and radius. We will draw a circle inside the rectangle drawn above.

cv.circle(img,(447,63), 63, (0,0,255), -1)

Note: a thickness of -1 for a closed shape such as this circle means that the shape is filled.

draw an ellipse

To draw an ellipse, we need to pass several arguments. The first is the center position (x,y).
The next argument gives the axis lengths (major axis length, minor axis length).
angle is the rotation angle of the ellipse in the counterclockwise direction.
startAngle and endAngle denote the start and end of the elliptic arc, measured clockwise from the major axis. See the documentation of cv.ellipse() for more details.
The example below draws a half ellipse in the center of the image.

cv.ellipse(img,(256,256),(100,50),0,0,180,255,-1)
"""
param1: img, the image; cv::Mat type (an ndarray in Python)
param2: (256,256), the center point, as a tuple
param3: (100,50), major and minor axis lengths, as a tuple
param4: 0, the rotation angle
param5: 0, startAngle, where the elliptic arc starts
param6: 180, endAngle, where the elliptic arc ends
param7: color; normally written (255,0,0), but the scalar 255 also works (blue)
param8: thickness; -1 fills the shape
"""

draw polygon

To draw a polygon, you first need the coordinates of its vertices. Put these points into an array of shape ROWSx1x2, where ROWS is the number of vertices; the array should be of type int32. Here we draw a small yellow polygon with four vertices.

pts = np.array([[10,5],[20,30],[70,20],[50,10]], np.int32)   	# 2D array of vertex coordinates, int32 type
pts = pts.reshape((-1,1,2))										# reshape into a ROWSx1x2 array
cv.polylines(img,[pts],True,(0,255,255))						# note that pts is passed inside a list: [pts]

If the third parameter is False, you will get a polyline connecting all the points instead of a closed shape.
cv.polylines() can be used to draw multiple lines. Just create a list of all the lines you want to draw and pass it to the function. All lines will be drawn individually.
This is a better and faster way to draw a set of lines than calling cv.line() for each line.

add text to image

To add text to an image, you need to specify the following:

  • The text you want to write
  • The coordinates of where you want to put it (the bottom-left corner where the text starts)
  • Font type (check the cv.putText() documentation for supported fonts)
  • Font scale (specifies the size of the font)
  • General parameters like color, thickness, lineType, etc. For a better appearance, lineType = cv.LINE_AA is recommended.

We will write OpenCV in white on our image.

font = cv.FONT_HERSHEY_SIMPLEX			# font
cv.putText(img,'OpenCV',(10,500), font, 4,(255,255,255),2,cv.LINE_AA)
"""
param1: img
param2: the text to draw
param3: position; note that this is the bottom-left corner of the text
param4: font
param5: font scale
param6: color
param7: thickness, the line width
param8: line type
"""

1.4 Mouse as a Brush

Learn how to handle mouse events in OpenCV

Important functions

cv.setMouseCallback()

Bind a mouse callback function to an OpenCV window
Function prototype (C++):

  void setMouseCallback(const String& winname, MouseCallback onMouse, void* userdata = 0)
  • param1: name of the OpenCV window (the window must be created beforehand, e.g. with cv.namedWindow('image'))
  • param2: mouse callback function
  • param3: user-defined data passed to the callback function

def callback_function(event,x,y,flags,param): …

Mouse callback function
Note: the mouse callback function has a fixed signature. The function name is arbitrary, but it must accept the 5 parameters above.

  • param1: event is the OpenCV mouse event, for example: if event == cv.EVENT_LBUTTONDBLCLK: …

    • EVENT_MOUSEMOVE mouse moved
    • EVENT_LBUTTONDOWN left button pressed
    • EVENT_RBUTTONDOWN right button pressed
    • EVENT_MBUTTONDOWN middle button pressed
    • EVENT_LBUTTONUP left button released
    • EVENT_RBUTTONUP right button released
    • EVENT_LBUTTONDBLCLK left button double-clicked
    • EVENT_RBUTTONDBLCLK right button double-clicked
    • EVENT_MBUTTONDBLCLK middle button double-clicked
  • param2: x The x coordinate of the mouse pointer

  • param3: y The y coordinate of the mouse pointer

  • param4: flags is a combination of EVENT_FLAG_* values describing the state during the event:

    • EVENT_FLAG_LBUTTON left button held down (dragging)
    • EVENT_FLAG_RBUTTON right button held down (dragging)
    • EVENT_FLAG_MBUTTON middle button held down (dragging)
    • EVENT_FLAG_CTRLKEY Ctrl held down
    • EVENT_FLAG_SHIFTKEY Shift held down
    • EVENT_FLAG_ALTKEY Alt held down
  • param5: param is the user-defined data passed in the setMouseCallback() call.

  1. simple demo

Here we create a simple application that draws a circle where we double click on an image.

First, we create a mouse callback function that is executed when a mouse event occurs. The mouse event can be any event related to the mouse, such as left button down, left button up, left button double click, etc. It gives us the coordinates (x,y) of each mouse event. With this event and location, we can do whatever we want. To list all available events, run the following code in the Python terminal.

Display all variables of cv containing "EVENT"

import cv2 as cv
events = [i for i in dir(cv) if 'EVENT' in i]    # collect every name in cv that contains "EVENT"
print( events )

The mouse callback function has a fixed format that is the same everywhere; only what the function does differs. Our mouse callback function does just one thing: it draws a circle where we double-click. See the code below; the comments explain each step.

import numpy as np
import cv2 as cv

# mouse callback function with the fixed signature
def draw_circle(event,x,y,flags,param):				# event: mouse event; x, y: mouse coordinates; flags: button/key state; param: user data from setMouseCallback
    if event == cv.EVENT_LBUTTONDBLCLK:
        cv.circle(img,(x,y),100,(255,0,0),-1)
        
# Create a black image, a window and bind the function to window
img = np.zeros((512,512,3), np.uint8)				# all-black color image; the shape of a color image is (height, width, 3); (0,0,0) is black, (255,255,255) is white
cv.namedWindow('image')								# create a named window
cv.setMouseCallback('image',draw_circle)			# register draw_circle as the mouse callback of the window "image"
while(1):
    cv.imshow('image',img)
    if cv.waitKey(20) & 0xFF == 27:					# wait 20 ms for a key; 27 is Esc
        break
cv.destroyAllWindows()								# destroy all windows
  2. more advanced demo

Now we are going to make a better application. Here we draw rectangles or circles (depending on the mode we choose) by dragging the mouse, just like we do in the Paint application. So our mouse callback function has two parts, one part is used to draw a rectangle, and the other part is used to draw a circle. This specific example will be very helpful for creating and understanding some interactive applications, such as object tracking, image segmentation, etc.

import numpy as np
import cv2 as cv

drawing = False 			# true if mouse is pressed
mode = True 				# if True, draw rectangle. Press 'm' to toggle to curve
ix,iy = -1,-1

# mouse callback function with the fixed signature (any name; parameters must be event,x,y,flags,param)
def draw_circle(event,x,y,flags,param):
    global ix,iy,drawing,mode						# use the module-level variables
    if event == cv.EVENT_LBUTTONDOWN:
        drawing = True
        ix,iy = x,y
    elif event == cv.EVENT_MOUSEMOVE:
        if drawing == True:
            if mode == True:
                cv.rectangle(img,(ix,iy),(x,y),(0,0,0),-1)     	# crude erase, not ideal; only imitates the rubber band effect
                cv.rectangle(img,(ix,iy),(x,y),(0,255,0),-1)
            else:
                cv.circle(img,(x,y),5,(0,0,255),-1)
    elif event == cv.EVENT_LBUTTONUP:
        drawing = False
        if mode == True:
            cv.rectangle(img,(ix,iy),(x,y),(0,0,0),-1)    		# crude erase, not ideal; only imitates the rubber band effect
            cv.rectangle(img,(ix,iy),(x,y),(0,255,0),-1)
        else:
            cv.circle(img,(x,y),5,(0,0,255),-1)

Next we have to bind this mouse callback function to the OpenCV window. In the main loop, we should set up a keybinding for the 'm' key to toggle between rectangle and circle.

img = np.zeros((512,512,3), np.uint8)					# all-black color image
cv.namedWindow('image')									# create a named window
cv.setMouseCallback('image',draw_circle)				# bind the mouse callback function to the OpenCV window
while(1):
    cv.imshow('image',img)
    k = cv.waitKey(1) & 0xFF
    if k == ord('m'):
        mode = not mode
    elif k == 27:
        break
cv.destroyAllWindows()

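Note:
While the mouse is dragged to draw a rectangle, the rectangles drawn earlier in the same drag are not cleared (the black fill only covers the current area), so this example does not achieve a true rubber band effect.
To achieve the rubber band effect, the result of each completed drawing action must be stored, so that the preview can be redrawn from it on every mouse move.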

1.5 Trackbar as Palette

Learn to bind the trackbar to the OpenCV window

Important functions

cv.getTrackbarPos()

Get the current position of the slider of a trackbar

param1: the name of the trackbar
param2: the name of the window to which the trackbar belongs

cv.createTrackbar()

Function prototype (C++):

CV_EXPORTS int createTrackbar(const String& trackbarname, const String& winname, int* value, int count, TrackbarCallback onChange = 0, void* userdata = 0);
  • param1: trackbarname, the name of the trackbar;
  • param2: winname, the name of the window to which the trackbar is attached;
  • param3: value, the initial position of the slider; in C++ the variable also tracks the current position;
  • param4: count, the maximum value of the trackbar (the slider at its rightmost position);
  • param5: onChange, the callback function, default 0;
  • param6: userdata, user data passed to the callback function, default 0.

  1. code demo

Here we create a simple application that displays a color you specify. A window shows the color, and three trackbars set the B, G and R values. Slide a trackbar and the color of the window changes accordingly. The initial color is black.

For the cv.createTrackbar() function, the first argument is the trackbar name, the second is the name of the window it is attached to, the third is the default value, the fourth is the maximum value, and the fifth is the callback function that is executed every time the trackbar value changes. The callback always receives one argument, the trackbar position. In our case the callback does nothing, so its body is just pass.

Another important use of the trackbar is as a button or switch. OpenCV has no button functionality by default, so a trackbar can stand in for one. Our application has such a switch: the color is shown only while the switch is ON; otherwise the window stays black.

import numpy as np
import cv2 as cv

def nothing(x):				
    pass
    
# Create a black image, a window
img = np.zeros((300,512,3), np.uint8)    		# create a color image with numpy
cv.namedWindow('image')							# create a named window
# create trackbars for color change
cv.createTrackbar('R','image',0,255,nothing)	# create the trackbars R, G, B
cv.createTrackbar('G','image',0,255,nothing)
cv.createTrackbar('B','image',0,255,nothing)
# create switch for ON/OFF functionality
switch = '0 : OFF \n1 : ON'
cv.createTrackbar(switch, 'image',0,1,nothing)		# create the switch trackbar
while(1):
    cv.imshow('image',img)
    k = cv.waitKey(1) & 0xFF
    if k == 27:
        break
    # get current positions of four trackbars
    r = cv.getTrackbarPos('R','image')
    g = cv.getTrackbarPos('G','image')
    b = cv.getTrackbarPos('B','image')
    s = cv.getTrackbarPos(switch,'image')
    if s == 0:
        img[:] = 0
    else:
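        img[:] = [b,g,r]				# assign [b,g,r] to every pixel of img
"""
img[:] selects all elements along the first dimension, i.e. the whole array
img = [b,g,r] would be wrong: it rebinds the name img to a list instead of filling the array
"""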
cv.destroyAllWindows()

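Notes: Revisiting numpy.ndarray slices

img = np.zeros((5,5,3), np.uint8)    		# create a color image with numpy
img[:] is a slice of the ndarray that happens to select all of its elements
img[1:2] selects the second of the 5 elements along the first dimension. The result is:
print("img", img[1:2])
img [[[0 0 0]
[0 0 0]
[0 0 0]
[0 0 0]
[0 0 0]]]

img[1:2, 1:3] selects the second element along the first dimension, and within it the elements with index 1 to 2 along the second dimension. The result is:
img[1:2, 1:3] = 
[[[0 0 0]
[0 0 0]]]

img[1:2, 1:3, 0:2] selects the second element along the first dimension, within it the elements with index 1 to 2 along the second dimension, and of those the elements with index 0 to 1 along the third dimension. The result is:
img[1:2, 1:3, 0:2] = 
[[[0 0]
[0 0]]]

Spelled out like this, slicing is easy to understand.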
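With an all-zero array every slice prints the same zeros, so the sketch below repeats the slices on an array filled with distinct values (np.arange is used only for that purpose) and prints the shapes.

```python
import numpy as np

# arange gives every element a distinct value, so the slices are easy to tell apart.
img = np.arange(5 * 5 * 3, dtype=np.uint8).reshape((5, 5, 3))

print(img[:].shape)               # (5, 5, 3)
print(img[1:2].shape)             # (1, 5, 3)
print(img[1:2, 1:3].shape)        # (1, 2, 3)
print(img[1:2, 1:3, 0:2].shape)   # (1, 2, 2)
print(img[1:2, 1:3, 0:2])

# img[:] = value writes into the existing array (broadcast over all pixels);
# img = value would only rebind the name.
img[:] = [10, 20, 30]
print(img[4, 4])                  # [10 20 30]
```

A slice such as 1:2 keeps the dimension (length 1), which is why img[1:2] still has three dimensions, while the plain index img[1] would drop it.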

Origin blog.csdn.net/wu_zhiyuan/article/details/126765853