OpenCV-Python Basics


1. What is OpenCV?

OpenCV is a library for fast image processing and computer vision, with bindings for several languages, including C++, Python, and Java. All examples in this tutorial use opencv-python, that is, OpenCV driven from Python, to process and study digital images.

2. Load/display/save image

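1. imread: read an image

img = cv2.imread(filename, flags)              # The image is returned as an ndarray; if it cannot be read, None is returned
filename: path to the image
flags: how the image is read; one of the following:
· cv2.IMREAD_COLOR: default mode; the image is converted to BGR and any transparency is ignored. The number 1 can be passed instead
· cv2.IMREAD_GRAYSCALE: load the image in grayscale. The number 0 can be passed instead
· cv2.IMREAD_UNCHANGED: keep the image's original channels, including the alpha channel. The number -1 can be passed instead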

2. imshow: display an image

imshow creates a window and displays the image in it. The window automatically fits the size of the image. To show the image at a different size, resize the image itself first, for example with imutils.resize.

cv2.imshow(winname, mat)
winname: window name (string)
mat: the image, a NumPy ndarray
------
# Usually only two arguments are passed: the image to resize and either width or height. The aspect ratio of the original image is preserved.
imutils.resize(image, width=None, height=None)

3. imwrite: save an image

cv2.imwrite(filename, image)
filename: name of the file to save (string); the format is chosen from the extension
image: the image, a NumPy ndarray

4. waitKey() & destroyAllWindows()

waitKey controls how long imshow keeps the image on screen. If imshow is not followed by waitKey, the window gets no time to draw the image. waitKey waits for a keyboard event. When delay <= 0 it waits indefinitely; when delay is a positive integer n it waits at least n milliseconds. If a key is pressed while waiting, the function returns immediately with the key code (ASCII) of that key. If no key is pressed before the time runs out, it returns -1. ord() gives the ASCII code of a character, so to check whether the A key was pressed during the wait, test whether the return value of waitKey == ord("a").

There are two ways to close the window:
(1) Let the window stay for a while and then destroy it automatically;
(2) Wait for a specific command, such as a particular key press, and then close the window

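retval = cv2.waitKey(delay=None)                # delay is how long to wait for a key press, in ms; the default is 0
- delay > 0 : if no key is pressed within delay milliseconds, waitKey returns -1 and the program moves on (typically to destroy the window)
- delay <= 0 : wait indefinitely; any key press ends the wait, after which the window can be destroyed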

After displaying images with imshow, the program should destroy the display windows at the end; otherwise it may not terminate cleanly. Two functions are commonly used to destroy windows:

- cv2.destroyWindow(winname)            # destroy one specific window; winname is the name of the window to destroy

- cv2.destroyAllWindows()               # destroy all windows; takes no arguments

3. Simple drawing

Common parameters:
img: the image (ndarray) to draw on
color: the color of the shape, in BGR order
thickness: the line thickness of the shape; the default is 1. For closed shapes such as circles and ellipses, -1 fills the inside of the shape.
lineType: the type of line used to draw the shape. The default is an 8-connected line; cv2.LINE_AA draws an anti-aliased, smoother line

  1. Straight line
    cv2.line(img, pt1, pt2, color, thickness=None, lineType=None, shift=None)
    pt1, pt2 represent the pixel coordinates of the starting point and the ending point of the line respectively
  2. Rectangle
    cv2.rectangle(img, pt1, pt2, color, thickness=None, lineType=None, shift=None)
    pt1, pt2 represent the coordinates of the upper left corner and lower right corner of the rectangle respectively
  3. Circle
    cv2.circle(img, center, radius, color, thickness=None, lineType=None, shift=None)
    center and radius represent the coordinates of the center of the circle and the radius of the circle respectively
  4. Add text
    cv2.putText(img,text,org,fontFace,fontScale,color,thickness=None,lineType=None)
    text is the text to draw
    org is the position of the text; it is the bottom-left corner of the text
    fontFace is the font type, for example cv2.FONT_HERSHEY_SIMPLEX
    fontScale is the font size, as a scale factor applied to the font's base size

  5. Mouse interaction
    Create a response (callback) function and write the operation to perform inside it
    def OnMouseAction(event, x, y, flags, param):
    OnMouseAction is the name of the response function; any name can be used
    event indicates which event was triggered, such as cv2.EVENT_LBUTTONDOWN (left button pressed)
    x, y are the coordinates of the mouse in the window when the event is triggered
    flags indicates the state of the mouse during the event, such as a button held down while dragging
    param is optional user data passed through from cv2.setMouseCallback
    After defining the response function, bind it to a specific window so that when the mouse triggers an event in that window, the response function is found and executed.
    Use cv2.setMouseCallback(winname, onMouse) to bind the window and the response function
    winname is the window name
    onMouse is the name of the response function
# Example: click the left mouse button to print the mouse coordinates
import cv2

def OnMouseAction(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        print(x, y)

img = cv2.imread("lena.jpg")
cv2.namedWindow("x")
cv2.setMouseCallback("x", OnMouseAction)
cv2.imshow("x", img)
cv2.waitKey()
cv2.destroyAllWindows()
  6. Trackbar (scrollbar)
    cv2.createTrackbar(trackbarname, winname, value, count, onChange)
    trackbarname is the trackbar name
    winname is the window name
    value is the initial value, that is, the initial slider position
    count is the maximum value of the trackbar; the minimum value is normally 0
    onChange is the callback function; the operation to perform after the slider is moved is written in this function
    The current position of the trackbar is read with retval = cv2.getTrackbarPos(trackbarname, winname)

4. Basics of image processing

  1. When OpenCV reads a color image, the pixel values are stored in B, G, R channel order, and the values in the image array can be accessed by index. For example, image[0,0,0] accesses the pixel value at row 0, column 0, channel 0 (the B channel)
  2. Channel operations
    Channel splitting
    by index: b = img[:,:,0], g = img[:,:,1], r = img[:,:,2]
    by function: b, g, r = cv2.split(img)
    Channel merging
    cv2.merge([b,g,r])
  3. Image attributes
    img.shape rows, columns, channels -> H, W, C
    img.dtype data type
    img.size total number of array elements (H * W * C), not just the number of pixels

5. Color space

  1. GRAY (grayscale) color space
    usually refers to an 8-bit grayscale image with 256 gray levels; pixel values lie in [0,255]
  • RGB->GRAY
    Gray = 0.299*R + 0.587*G + 0.114*B
  • GRAY->RGB
    R=Gray, G=Gray, B=Gray
  2. HSV color space
    Hue, the color tone, range [0,360]
    Saturation, the purity (depth) of the color, range [0,1]
    Value, the brightness, range [0,1]
    For 8-bit images, OpenCV stores H as H/2 in [0,179] and scales S and V to [0,255]
  3. Color conversion function
    dst = cv2.cvtColor(src, code, dst=None, dstCn=None)
    dst is the output image, with the same size and depth as the input image
    src is the input image; it can be uint8, uint16, or float32
    code is the color space conversion code, such as cv2.COLOR_BGR2GRAY
    dstCn is the number of channels in the target image

6. Geometric transformation

  1. Scaling
    dst = cv2.resize(src, dsize, dst=None, fx=None, fy=None, interpolation=None)
    dst is the output image; its size is dsize when dsize is given (non-zero)
    src is the original image
    dsize is the output image size
    fx is the horizontal scale factor
    fy is the vertical scale factor
    interpolation is the interpolation method; the default is cv2.INTER_LINEAR (bilinear interpolation)
    The size of the target image dst is specified either by dsize or by fx and fy
  • Specify by dsize
    If dsize is given, the size of dst is determined by dsize, regardless of whether fx and fy are given
    dsize is (w, h), where w and h are the width and height of the scaled image; w is related to fx and h to fy
    When dsize is given, the scale factor in the x direction is fx = double(dsize.w / src.w)
    Similarly, the scale factor in the y direction is fy = double(dsize.h / src.h)
  • Specify by the parameters fx and fy
    If dsize is None, the size of the target image is
    w = round(fx * src.w), h = round(fy * src.h)
  2. Flip
    dst = cv2.flip(src, flipCode)
    dst is the target image, with the same size and data type as the original image
    src is the original image
    flipCode is the flip type

    flipCode          effect
    0                 Flip around the x-axis (vertical flip)
    positive number   Flip around the y-axis (horizontal flip, most commonly used)
    negative number   Flip around both the x-axis and the y-axis
  3. Affine
    An affine transformation maps the image through a combination of geometric operations such as translation, rotation, and scaling, keeping straight lines straight and parallel lines parallel
    dst = cv2.warpAffine(src, M, dsize, dst=None, flags=None)
    dst is the target image
    src is the original image
    M is the transformation matrix, 2 rows by 3 columns
    dsize is the output image size, in the order (w, h), that is, columns first and then rows
    dst(x, y) = src(M11*x + M12*y + M13, M21*x + M22*y + M23)
    This formula applies literally when flags includes cv2.WARP_INVERSE_MAP; otherwise M is treated as the forward (source to destination) transform and OpenCV inverts it internally

  • Translation: M = np.float32([[1,0,detx],[0,1,dety]]), where detx and dety are the translation distances in the x and y directions
  • Rotation: the transformation matrix M is obtained from retval = cv2.getRotationMatrix2D(center, angle, scale)
    center is the center point of the rotation, (x, y)
    angle is the rotation angle in degrees; a positive number means counterclockwise and a negative number means clockwise
    scale is the scale factor applied along with the rotation

7. Video processing

A video is made up of a series of images called frames. The speed at which frames are played is the frame rate, measured in frames per second (FPS).

  • VideoCapture class
  1. Initialization
    capture object = cv2.VideoCapture(camera ID)
    A camera ID of -1 lets OpenCV choose a camera automatically. If there are multiple cameras, the numbers 0, 1, 2, ... identify them in turn
    The "capture object" is the return value, an instance of the VideoCapture class
    To open a video file, pass the file name instead
    capture object = cv2.VideoCapture("file name")
  2. cv2.VideoCapture.isOpened() checks whether the initialization succeeded; it returns True on success and False on failure
  3. Capture a frame
    retval, image = cv2.VideoCapture.read()
    retval indicates whether a frame was captured, True/False
    image is the returned frame; if there is no frame, it is None
  4. cv2.VideoCapture.release() closes the camera (or the file)
  5. Properties
    retval = cv2.VideoCapture.get(propId) gets a property of the VideoCapture object
    Examples of propId values:
    cv2.CAP_PROP_FRAME_WIDTH
    cv2.CAP_PROP_FRAME_HEIGHT
    retval = cv2.VideoCapture.set(propId, value) changes a property of the VideoCapture object
  • VideoWriter class
    Saves images as a video file, or changes the properties of a video, including converting it to a different video type
  1. Initialization
    writer object = cv2.VideoWriter(filename, fourcc, fps, frameSize)
    filename is the output file name; if the file already exists, it is overwritten
    fourcc is the codec code; cv2.VideoWriter_fourcc('X','V','I','D') is often used, which selects MPEG-4 (Xvid) encoding, with .avi as the file extension
    fps is the frame rate
    frameSize is the (width, height) of each frame
  2. write function
    cv2.VideoWriter.write(image)          # returns None
    image is the video frame to write; color frames must be in BGR format
  3. Release
    cv2.VideoWriter.release()
import cv2

cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)      # CAP_DSHOW is the DirectShow backend (Windows only)
fourcc = cv2.VideoWriter_fourcc(*"XVID")
# The frame size must match the size of the frames actually written
out = cv2.VideoWriter("xx.avi", fourcc, 25, (640, 480))
while cap.isOpened():
    retval, frame = cap.read()
    if retval:
        out.write(frame)
        cv2.imshow("xx", frame)
        k = cv2.waitKey(1)
        if k == 27:                           # Esc key
            break
    else:
        break
cap.release()
out.release()
cv2.destroyAllWindows()

Origin blog.csdn.net/goodlmoney/article/details/126830538