python-opencv study notes 2 core operations

2.1 Basic operation of images

Basic operations:

  • Access pixel values and modify them
  • Access image properties
  • Set a region of interest (ROI)
  • Split and merge images

Note: Almost all operations in this section are NumPy operations rather than OpenCV ones.

Important functions

  1. numpy.ndarray.item()
    Function: Copies an element of the array to a standard Python scalar and returns it.
    Prototype: ndarray.item(*args)
    Parameters: *args: the index of the element (variable number and type)
    - When the argument is a single integer, it is treated as an index into the flattened array.
    - When the argument is a tuple, it is treated as an N-dimensional index into the array.
>>> np.random.seed(123)                      # set the random seed
>>> x = np.random.randint(9, size=(3, 3))    # generate a random 3x3 matrix
>>> x
array([[2, 2, 6],
       [1, 3, 6],
       [1, 0, 1]])

>>> x.item(3)         # a single integer: index 3 of the flattened array (indices start at 0)
1
>>> x.item(7)         # same as above
0
>>> x.item((0, 1))    # a tuple: row 0 is [2, 2, 6], and its element at index 1 is 2
2
>>> x.item((2, 2))    # the third element of [1, 0, 1], which is 1
1

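Note: item() reads the value at a given position of a NumPy array, and itemset() assigns a new value to a given position; for single elements this is faster than ordinary indexing. (ndarray.itemset() was removed in NumPy 2.0; on recent versions, assign with x[index] = value instead.)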
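The two index forms accepted by item() can be checked against each other. The sketch below uses a fixed array (the same values as in the example above, hard-coded here so that no random seed is needed) and np.unravel_index to convert a flat index into the matching tuple index.

```python
import numpy as np

# Fixed 3x3 array with the same values as the seeded example
x = np.array([[2, 2, 6],
              [1, 3, 6],
              [1, 0, 1]])

flat_value = x.item(3)     # integer argument: index into the flattened array

# Convert flat index 3 into the equivalent (row, column) index
tuple_index = tuple(int(i) for i in np.unravel_index(3, x.shape))
tuple_value = x.item(tuple_index)   # tuple argument: N-dimensional index

print(flat_value, tuple_index, tuple_value)
print(type(flat_value))    # a plain Python int, not a NumPy scalar
```

Both calls reach the same element, which is why the two forms are interchangeable.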

  2. numpy.ndarray.itemset()
    Function: Changes a single element (value) of an ndarray.
    Prototype: ndarray.itemset(*args, newValue)
    Parameter 1: *args: the index of the element (variable number and type)
    - When the argument is a single integer, it is treated as an index into the flattened array.
    - When the argument is a tuple, it is treated as an N-dimensional index into the array.
    Parameter 2: newValue: the new value
>>> x = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]], np.int32)
>>> x                                   # a 3-D array with shape (2, 2, 3)
array([[[ 1,  2,  3],
        [ 4,  5,  6]],
       [[ 7,  8,  9],
        [10, 11, 12]]], dtype=int32)
# Change the value at flat index 1 to 999
>>> x.itemset(1, 999)                   # the array is flattened to [1, 2, 3, ..., 11, 12] and the 2 at index 1 becomes 999
>>> x
array([[[  1, 999,   3],
        [  4,   5,   6]],
       [[  7,   8,   9],
        [ 10,  11,  12]]], dtype=int32)
# Change the value at index (1, 1, 2) to 888
>>> x.itemset((1, 1, 2), 888)
# The tuple (1, 1, 2) is an N-dimensional index: index 1 of the first dimension is [[7, 8, 9], [10, 11, 12]],
# index 1 of the second dimension is [10, 11, 12], and index 2 of the third dimension is 12, which becomes 888
>>> x
array([[[  1, 999,   3],
        [  4,   5,   6]],
       [[  7,   8,   9],
        [ 10,  11, 888]]], dtype=int32)
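Because ndarray.itemset() is not available in NumPy 2.0 and later, the same two updates can also be written with plain index assignment. This is a minimal sketch of that alternative; x.flat is used for the flat-index case.

```python
import numpy as np

x = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]], np.int32)

x.flat[1] = 999      # same effect as x.itemset(1, 999): index 1 of the flattened array
x[1, 1, 2] = 888     # same effect as x.itemset((1, 1, 2), 888)

print(x)
```

Index assignment works on every NumPy version, so it is the safer choice when the installed version is unknown.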
  3. cv.split()
    Function: Splits an image into separate channels.
    Parameters: the image, a numpy.ndarray object
    Return value: the three channels b, g, r, each a numpy.ndarray object
>>> b,g,r = cv.split(img)       # split the image into channels

'''
Equivalent NumPy indexing (recommended);
it is faster than cv.split()
b == img[:,:,0]
g == img[:,:,1]
r == img[:,:,2]
'''

Warning: cv.split() is a time-consuming operation, so use it only when necessary.

  4. cv.merge()
    Function: Merges channels into one image.
    Parameters: the three color channels b, g, r
    Return value: a 3-D image array

Note: b, g and r must have the same size.

>>> img = cv.merge((b,g,r))     # merge the channels
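The NumPy counterparts of the two calls can be sketched on a small synthetic array (the 2x2 image below is made up for illustration). Indexing a channel returns a view of the image rather than a copy, and np.dstack is used here as a stand-in for cv.merge().

```python
import numpy as np

# A made-up 2x2 BGR image: the values 0..11 laid out as (rows, cols, channels)
img = np.arange(12, dtype=np.uint8).reshape(2, 2, 3)

# "Split" by indexing: each result is a view that shares memory with img
b = img[:, :, 0]
g = img[:, :, 1]
r = img[:, :, 2]
shares = np.shares_memory(b, img)

# "Merge" by stacking the channels along the last axis (this makes a copy)
merged = np.dstack((b, g, r))
same = np.array_equal(merged, img)

print(b)
print(shares, same)
```

Because the indexed channels are views, no pixel data is copied until the channels are stacked again.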
  5. cv.copyMakeBorder()
    Function: Adds a border around an image; also used for padding in convolution operations, zero padding, etc.
    Parameters:
    • src - the input image
    • top, bottom, left, right - border widths in the corresponding directions, in pixels
    • borderType - a flag defining the type of border to add. It can be one of the following:
      • cv.BORDER_CONSTANT - adds a constant-color border. The color is given by the value parameter;
      • cv.BORDER_REFLECT - the border is a mirror reflection of the edge elements, like this: fedcba|abcdefgh|hgfedcb;
      • cv.BORDER_REFLECT_101 or cv.BORDER_DEFAULT - same as above, except that the edge element itself is not repeated, like this: gfedcb|abcdefgh|gfedcba;
      • cv.BORDER_REPLICATE - the outermost element is replicated throughout, like this: aaaaa|abcdefgh|hhhhh;
      • cv.BORDER_WRAP - the image wraps around, like this: cdefgh|abcdefgh|abcdefg;
    • value - the color of the border when the border type is cv.BORDER_CONSTANT.
    Return value: a new image with the border added

Access and modify pixel values

Let's load a color image first.

>>> import numpy as np
>>> import cv2 as cv
>>> img = cv.imread('messi5.jpg')     # cv.imread reads the image; note that OpenCV stores pixels in BGR order, whereas most color images use RGB

You can access a pixel value by its row and column coordinates.

  • For BGR images, it returns an array of blue, green, red values.
  • For grayscale images, only the corresponding intensity is returned.
# img is a numpy.ndarray and, for a color image, img.shape has three dimensions.
# The first dimension is the row (image height), the second is the column (image width), and the third is the channel (3 for color, in B, G, R order; a grayscale image has a single intensity value)

>>> px = img[100,100]    # get one pixel; [100,100] selects the row and the column
>>> print( px )
# Output:
[157 166 200]            # a color pixel has 3 channels, one byte each: B, G, R

# accessing only blue pixel
>>> blue = img[100,100,0]           # access only the blue component of the pixel
>>> print( blue )
# Output:
157

Note: The first dimension is the row index (image height), the second is the column index (image width), and the third is the channel (3 for color images, in B, G, R order; grayscale images have a single intensity value).

You can modify pixel values in the same way.

>>> img[100,100] = [255,255,255]
>>> print( img[100,100] )
[255 255 255]
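The same indexing can be tried without an image file. The sketch below uses small synthetic arrays in place of messi5.jpg (the arrays and their values are assumptions made for illustration).

```python
import numpy as np

# Synthetic 4x5 "BGR image", all black
img = np.zeros((4, 5, 3), dtype=np.uint8)

img[1, 2] = [157, 166, 200]     # write one pixel: row 1, column 2, in B, G, R order
px = img[1, 2]                  # read the whole pixel: an array of three values
blue = img[1, 2, 0]             # read only the blue component

# Synthetic grayscale image: one intensity value per pixel
gray = np.zeros((4, 5), dtype=np.uint8)
gray[1, 2] = 90
intensity = gray[1, 2]

print(px, blue, intensity)
```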

Warning: NumPy is a library optimized for fast array calculations. Accessing and modifying pixel values one at a time is therefore very slow and is discouraged.

Note: The indexing shown above is normally used to select a region of an array (that is, an array slice), such as the first 5 rows and the last 3 columns.
For individual pixel access, the NumPy array methods array.item() and array.itemset() are considered better. However, they always return a scalar, so to access all of the B, G, R values you need to call array.item() once per value.

Better pixel access and editing methods:

# accessing RED value
>>> img.item(10,10,2)      # the numpy.ndarray.item() method
59
# modifying RED value
>>> img.itemset((10,10,2),100)   # the numpy.ndarray.itemset() method
>>> img.item(10,10,2)
100

Access image properties

Image properties include the number of rows, columns and channels, the image data type, the number of pixels, etc.

  • Image shape (rows, columns, channels)
  • Data type
  • Number of pixels

Image shape: img.shape

The shape of the image is accessed by img.shape. It returns a tuple containing the number of rows, columns and channels (if the image is color).

>>> print( img.shape )   # color image: (342, 548, 3); grayscale image: (342, 548), or possibly (342, 548, 1)
(342, 548, 3)

If an image is grayscale, the returned tuple contains only the number of rows (height) and columns (width), so this is a good way to check whether the loaded image is grayscale or color.

Total number of elements: img.size

img.size gives the total number of array elements, that is, rows × columns × channels (for a grayscale image this equals the number of pixels).

>>> print( img.size )  # number of elements in the flattened ndarray
562248

Image data type: img.dtype

The image data type is obtained from img.dtype.

>>> print( img.dtype )    # data type of each component; for a color image each of B, G and R is stored in one 8-bit byte
uint8

Note: img.dtype is very important when debugging, because a large number of errors in OpenCV-Python code are caused by invalid data types.
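A small check of the three properties, again on synthetic arrays with the same dimensions as the example image (the arrays are assumptions for illustration). The helper is_grayscale is a hypothetical name, not an OpenCV function.

```python
import numpy as np

color = np.zeros((342, 548, 3), dtype=np.uint8)
gray = np.zeros((342, 548), dtype=np.uint8)

def is_grayscale(img):
    """Treat 2-D arrays, and 3-D arrays with a single channel, as grayscale."""
    return img.ndim == 2 or (img.ndim == 3 and img.shape[2] == 1)

print(color.shape, color.size, color.dtype)   # size = 342 * 548 * 3
print(gray.shape, gray.size, gray.dtype)      # size = 342 * 548
print(is_grayscale(color), is_grayscale(gray))
```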

Image ROI

Sometimes you have to work on certain regions of an image. For eye detection, for example, face detection is first performed on the entire image. Once a face is found, we select the face region alone and search for eyes inside it instead of searching the whole image. This improves accuracy (since eyes are always on faces) and performance (since we search a small area).

The ROI is again obtained with NumPy indexing. Here I select the ball and copy it to another region of the image.

>>> ball = img[280:340, 330:390]    # slice; note that a slice does not allocate new memory but refers to the corresponding part of img
>>> img[273:333, 100:160] = ball    # assign the slice to another slice

Note: Slicing does not allocate new memory; the slice refers to the corresponding part of the img array.
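The consequence of this can be seen in a short sketch on a synthetic array (an assumption for illustration): writing to a slice changes the original image, and .copy() is needed when an independent ROI is wanted.

```python
import numpy as np

img = np.zeros((6, 6), dtype=np.uint8)

view = img[0:2, 0:2]          # a slice: shares memory with img
view[:] = 255                 # writing through the view changes img as well

copy = img[0:2, 0:2].copy()   # an independent copy of the same region
copy[:] = 7                   # img is not affected

view_shares = np.shares_memory(view, img)
copy_shares = np.shares_memory(copy, img)
print(img[0, 0], view_shares, copy_shares)
```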

Split and merge image channels

Sometimes you need to process the B, G, R channels of an image separately. In this case, you need to split the BGR image into individual channels. In other cases, you may need to join these individual channels to create a BGR image. You can do this simply in the following way.

>>> b,g,r = cv.split(img)       # split the image into channels
>>> img = cv.merge((b,g,r))     # merge the channels

or:

>>> b = img[:,:,0]             # get channel 0 (b) by slicing

Setting a whole channel to one value: NumPy indexing is faster

Let's say you want to set all red pixels to zero. You don't need to split the channels first; NumPy indexing is faster.

>>> img[:,:,2] = 0

Warning: cv.split() is a time-consuming operation, so use it only when necessary. Otherwise, use NumPy indexing.

Make borders (padding) for images

If you want to create a border around the image, similar to a photo frame, you can use cv.copyMakeBorder(). It has more applications, however, in convolution operations, zero padding, etc. This function takes the following parameters:

  • src - the input image
  • top, bottom, left, right - border widths in the corresponding directions, in pixels
  • borderType - a flag defining the type of border to add. It can be one of the following:
    • cv.BORDER_CONSTANT - adds a constant-color border. The color is given by the value parameter;
    • cv.BORDER_REFLECT - the border is a mirror reflection of the edge elements, like this: fedcba|abcdefgh|hgfedcb;
    • cv.BORDER_REFLECT_101 or cv.BORDER_DEFAULT - same as above, except that the edge element itself is not repeated, like this: gfedcb|abcdefgh|gfedcba;
    • cv.BORDER_REPLICATE - the outermost element is replicated throughout, like this: aaaaa|abcdefgh|hhhhh;
    • cv.BORDER_WRAP - the image wraps around, like this: cdefgh|abcdefgh|abcdefg;
  • value - the color of the border when the border type is cv.BORDER_CONSTANT.

Here is a sample code demonstrating all these border types for better understanding:

import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt

BLUE = [255,0,0]   # a pure blue pixel (BGR)
img1 = cv.imread('opencv-logo.png')     # read an image
# Adding a border returns a new image (new memory is allocated for each result array)
replicate = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_REPLICATE)
reflect = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_REFLECT)
reflect101 = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_REFLECT_101)
wrap = cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_WRAP)
constant= cv.copyMakeBorder(img1,10,10,10,10,cv.BORDER_CONSTANT,value=BLUE)     # constant-color border
plt.subplot(231),plt.imshow(img1,'gray'),plt.title('ORIGINAL')
plt.subplot(232),plt.imshow(replicate,'gray'),plt.title('REPLICATE')
plt.subplot(233),plt.imshow(reflect,'gray'),plt.title('REFLECT')
plt.subplot(234),plt.imshow(reflect101,'gray'),plt.title('REFLECT_101')
plt.subplot(235),plt.imshow(wrap,'gray'),plt.title('WRAP')
plt.subplot(236),plt.imshow(constant,'gray'),plt.title('CONSTANT')
plt.show()
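The letter patterns in the border type list are easier to see on a one-dimensional array. The sketch below uses np.pad, whose modes produce the same patterns; the pairing of np.pad modes with OpenCV border types is an analogy drawn here for illustration, not something taken from the OpenCV code above.

```python
import numpy as np

a = np.arange(1, 9)          # 1..8 stands in for "abcdefgh"
n = 3                        # border width on each side

replicate  = np.pad(a, n, mode='edge')        # like BORDER_REPLICATE:   aaa|abcdefgh|hhh
reflect    = np.pad(a, n, mode='symmetric')   # like BORDER_REFLECT:     cba|abcdefgh|hgf
reflect101 = np.pad(a, n, mode='reflect')     # like BORDER_REFLECT_101: dcb|abcdefgh|gfe
wrap       = np.pad(a, n, mode='wrap')        # like BORDER_WRAP:        fgh|abcdefgh|abc
constant   = np.pad(a, n, mode='constant', constant_values=0)  # like BORDER_CONSTANT

for name, arr in [('replicate', replicate), ('reflect', reflect),
                  ('reflect101', reflect101), ('wrap', wrap), ('constant', constant)]:
    print(f'{name:10s}', arr)
```

Comparing the reflect and reflect101 rows shows the one-element difference between the two mirror modes: the first repeats the edge element, the second does not.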

2.2 Arithmetic operations on images

Learn several arithmetic operations on images such as addition, subtraction, bitwise operations, etc.

Important functions

  1. cv.add()
    Function: image addition
    Parameters:

    1. img1
    2. img2
      Return value: a new image (new memory)
  2. cv.addWeighted()
    Function: weighted image addition, that is, image blending

  3. cv.bitwise_and()
    Function: bitwise AND

  4. cv.bitwise_or()
    Function: bitwise OR

  5. cv.bitwise_not()
    Function: bitwise NOT (inversion)

  6. cv.bitwise_xor()
    Function: bitwise exclusive OR (XOR)

  7. The mask parameter

    1. Some image processing functions take a mask parameter, which means that they support masked operation. What a mask is and what it is used for is explained below.
      The concept of a mask in digital image processing is borrowed from PCB plate making. In semiconductor manufacturing, many chip process steps use photolithography, and the graphic "negative" used in these steps is called a mask. An opaque graphic template covers the selected area, so that the subsequent etching or diffusion only affects the area outside the selected area.
      An image mask is similar: a selected image, figure or object is used to cover the image being processed (all or part of it) in order to control the area or the process of image processing.
      In digital image processing, a mask is a two-dimensional matrix, and sometimes a multi-valued image is used. Image masks are mainly used for:
      ① Extracting a region of interest (ROI): multiplying a pre-made ROI mask with the image to be processed gives the ROI image, in which values inside the region are unchanged and values outside it are all 0.
      ② Shielding: masking certain areas of the image so that they do not take part in processing or in the calculation of processing parameters, or so that only the masked areas are processed or counted.
      ③ Structural feature extraction: using similarity variables or image matching to detect and extract structural features in the image that are similar to the mask.
      ④ Producing images with special shapes.
    2. In the basic image operation functions, every function that has a mask parameter applies the mask to its result (after the operation on the input images is complete, the result is combined with the mask image or matrix).

The following example uses a picture of a face mask on a white background; the outline of the face mask is cut out cleanly with bitwise operations and masking.
[Figure: face mask on a white background]

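// C++
// Convert the target (a face mask) to a grayscale image
cvtColor(faceMaskSmall, grayMaskSmall, CV_BGR2GRAY);
// Isolate the pixels that belong to the face mask itself (that is, drop the white background); the call below sets pixels greater than 230 to 0 and the rest to 255
threshold(grayMaskSmall, grayMaskSmallThresh, 230, 255, CV_THRESH_BINARY_INV);
// Create the inverse mask by inverting the image above (so that the background does not affect the overlay)
bitwise_not(grayMaskSmallThresh, grayMaskSmallThreshInv);
// Use bitwise AND to extract the exact outline of the face mask
bitwise_and(faceMaskSmall, faceMaskSmall, maskedFace, grayMaskSmallThresh);
// Use bitwise AND to overlay the face mask
bitwise_and(frameROI, frameROI, maskedFrame, grayMaskSmallThreshInv);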

In an AND operation with a mask, the white area of the mask image keeps the corresponding image pixels and the black area removes them. The other bitwise operations work on the same principle but have different effects.
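What a mask does to an image can be sketched in plain NumPy. This is an illustrative equivalent on synthetic data, not the OpenCV implementation: pixels where the mask is non-zero are kept and all others are set to 0.

```python
import numpy as np

# Synthetic 3x3 BGR image with every value set to 200
img = np.full((3, 3, 3), 200, dtype=np.uint8)

# Single-channel mask: white (255) keeps a pixel, black (0) removes it
mask = np.zeros((3, 3), dtype=np.uint8)
mask[1, :] = 255                       # keep only the middle row

# Broadcast the mask over the channel axis and select pixel by pixel
masked = np.where(mask[:, :, None] > 0, img, 0).astype(np.uint8)

print(masked[:, :, 0])
```

Only the middle row survives, which is the behavior described for the white and black areas of the mask.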

All arithmetic operation functions

// Note: the following are the C++ versions
/*
Two images can be added, subtracted, multiplied and divided, and bitwise operations, square roots, logarithms, absolute values, etc. can be applied;
images can also be enlarged, shrunk and rotated, a part can be cut out as an ROI (region of interest), and each color channel can be extracted and operated on separately
*/
// Addition
void add(InputArray src1, InputArray src2, OutputArray dst,InputArray mask=noArray(), int dtype=-1);//dst = src1 + src2
// Subtraction
void subtract(InputArray src1, InputArray src2, OutputArray dst,InputArray mask=noArray(), int dtype=-1);//dst = src1 - src2
// Multiplication
void multiply(InputArray src1, InputArray src2,OutputArray dst, double scale=1, int dtype=-1);//dst = scale*src1*src2
// Division
void divide(InputArray src1, InputArray src2, OutputArray dst,double scale=1, int dtype=-1);//dst = scale*src1/src2
void divide(double scale, InputArray src2,OutputArray dst, int dtype=-1);//dst = scale/src2

void scaleAdd(InputArray src1, double alpha, InputArray src2, OutputArray dst);//dst = alpha*src1 + src2
// Weighted addition, that is, image blending
void addWeighted(InputArray src1, double alpha, InputArray src2,double beta, double gamma, OutputArray dst, int dtype=-1);//dst = alpha*src1 + beta*src2 + gamma
// Square root
void sqrt(InputArray src, OutputArray dst);// computes the square root of each matrix element
// Power
void pow(InputArray src, double power, OutputArray dst);// src raised to the given power
// Exponential
void exp(InputArray src, OutputArray dst);//dst = e**src (** denotes exponentiation)
// Logarithm (the inverse of the exponential)
void log(InputArray src, OutputArray dst);//dst = log(abs(src))


/* Bitwise operations */
// The four bitwise functions: bitwise_and, bitwise_or, bitwise_xor, bitwise_not.
// Bitwise AND
void bitwise_and(InputArray src1, InputArray src2,OutputArray dst, InputArray mask=noArray());//dst = src1 & src2
// Bitwise OR
void bitwise_or(InputArray src1, InputArray src2,OutputArray dst, InputArray mask=noArray());//dst = src1 | src2
// Bitwise XOR
void bitwise_xor(InputArray src1, InputArray src2,OutputArray dst, InputArray mask=noArray());//dst = src1 ^ src2
// Bitwise NOT
void bitwise_not(InputArray src, OutputArray dst,InputArray mask=noArray());//dst = ~src

Image addition

You can add two images with the OpenCV function cv.add(), or simply with the NumPy operation res = img1 + img2.
Requirements for image addition:

  • Both images should have the same depth and type,
  • or the second image can just be a scalar value.

Note: OpenCV addition and NumPy addition differ. OpenCV addition is a saturating operation (the result is clamped to the valid range), while NumPy addition is a modulo operation (the result wraps around).

  • Saturating operation (OpenCV)
    Examples:

    • If the result should be 256 but the output matrix has type CV_8U, whose range is 0~255, the value is set to 255 (the upper limit).
    • If the result should be -252 but the output matrix has type CV_8U, whose range is 0~255, the value is set to 0.
  • Modulo operation (NumPy)
    A modulo operation takes the remainder of a division.
    The algorithm: target image = image 1 + image 2, with the result taken modulo 256 for 8-bit images.

    • When the sum is <= 255, the result is simply "image1 + image2", e.g. 128 + 26 = 154.
    • When the sum is > 255, the result is the sum modulo 256, e.g. (255 + 64) % 256 = 63.

Recommended: cv.add()

For example, consider the following:

>>> x = np.uint8([250])
>>> y = np.uint8([10])
>>> print( cv.add(x,y) ) # 250+10 = 260 => 255
[[255]]
>>> print( x+y )          # 250+10 = 260 % 256 = 4
[4]

This will be more apparent when you add two images.
Prefer the OpenCV function, as it gives a better result.
Note: The wrap-around behavior of the NumPy + operator is clearly unsuitable for images, so use cv.add() for image addition.
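The two behaviors can be reproduced in NumPy alone. In the sketch below, saturation is emulated by adding in a wider integer type and clipping; this only imitates the result of cv.add() for uint8 data and is not how OpenCV implements it.

```python
import numpy as np

x = np.array([250, 128, 255], dtype=np.uint8)
y = np.array([10, 26, 64], dtype=np.uint8)

wrapped = x + y     # uint8 arithmetic wraps around modulo 256

# Emulated saturation: add as int16, clamp to 0..255, convert back to uint8
saturated = np.clip(x.astype(np.int16) + y.astype(np.int16), 0, 255).astype(np.uint8)

print(wrapped)      # 260 % 256 = 4, 154, 319 % 256 = 63
print(saturated)    # 255, 154, 255
```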

Image blending

This is also image addition, but the images are given different weights, which creates an impression of blending or transparency. Images are added according to the following formula:

g(x) = (1 - α)·f0(x) + α·f1(x)

By changing α from 0 → 1, you can make a cool transition from one image to the other.
Here I took two images to blend. The first image has a weight of 0.7 and the second image has a weight of 0.3. cv.addWeighted() applies the following formula to the images:

dst = α·img1 + β·img2 + γ

Here γ is 0.
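Written out in NumPy, the weighted sum looks roughly as follows. The helper blend is a hypothetical name; it computes in floating point, rounds and clamps to 0..255, which approximates what cv.addWeighted() returns for uint8 images but may differ from it in rounding details.

```python
import numpy as np

def blend(img1, alpha, img2, beta, gamma=0.0):
    """dst = alpha*img1 + beta*img2 + gamma, rounded and clamped to uint8."""
    dst = alpha * img1.astype(np.float64) + beta * img2.astype(np.float64) + gamma
    return np.clip(np.rint(dst), 0, 255).astype(np.uint8)

# Two synthetic 2x2 single-channel images (values chosen for illustration)
img1 = np.full((2, 2), 100, dtype=np.uint8)
img2 = np.full((2, 2), 200, dtype=np.uint8)

dst = blend(img1, 0.7, img2, 0.3)     # 0.7*100 + 0.3*200 = 130
print(dst)
```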

img1 = cv.imread('ml.png')
img2 = cv.imread('opencv-logo.png')
dst = cv.addWeighted(img1,0.7,img2,0.3,0)
cv.imshow('dst',dst)
cv.waitKey(0)
cv.destroyAllWindows()

[Figure: result of blending the two images]

Bitwise operations

This includes the bitwise AND, OR, NOT and XOR operations.
Note: XOR stands for "exclusive OR".
They are very useful for extracting any part of an image (as we will see in the next chapters), for defining and working with non-rectangular ROIs, etc. Below we will see an example of how to change a particular region of an image.

I want to put the OpenCV logo on top of an image.

  • If I add the two images with cv.add(), the colors change (note: corresponding pixels are added one by one, with saturation).
  • If I blend them with cv.addWeighted(), I get a transparent effect (note: corresponding pixels are added with their weights).

But I want it to be opaque. If it were a rectangular region, I could use an ROI as we did in the last chapter. But the OpenCV logo is not rectangular (it is a cutout), so you can do it with bitwise operations as shown below.

# Load two images
img1 = cv.imread('messi5.jpg')
img2 = cv.imread('opencv-logo-white.png')

# I want to put logo on top-left corner, So I create a ROI
rows,cols,channels = img2.shape
# Note: shape returns (rows, columns, channels); for an array created with shape [512,256,3], 512 is the number of rows, 256 the number of columns, and 3 the number of channels
roi = img1[0:rows, 0:cols]                                      # ROI slice: the top-left region of img1, the same size as the logo
# Now create a mask of logo and create its inverse mask also
img2gray = cv.cvtColor(img2,cv.COLOR_BGR2GRAY)                  # convert to grayscale
ret, mask = cv.threshold(img2gray, 10, 255, cv.THRESH_BINARY)   # threshold to get a binary image
mask_inv = cv.bitwise_not(mask)                                 # inverse mask, computed from the binary image

# Now black-out the area of logo in ROI
img1_bg = cv.bitwise_and(roi,roi,mask = mask_inv)
# Take only region of logo from logo image.
img2_fg = cv.bitwise_and(img2,img2,mask = mask)
# Put logo in ROI and modify the main image
dst = cv.add(img1_bg,img2_fg)
img1[0:rows, 0:cols ] = dst
cv.imshow('res',img1)
cv.waitKey(0)
cv.destroyAllWindows()

Note: img.shape returns (rows, columns, channels). Conversely, for an array created with shape [512,256,3], the first dimension (512) is the number of rows, the second (256) is the number of columns, and the third (3) is the number of channels.

Please see the results below. The image on the left is the mask we created. The image on the right is the final result. For better understanding, display all the intermediate images in the above code, especially img1_bg and img2_fg.
[Figure: left, the mask created from the logo; right, the final result]

2.3 Performance measurement and improvement techniques

In image processing you are dealing with a huge number of operations per second, so your code must not only produce the correct result but also produce it as fast as possible. In this chapter you will learn how to measure the performance of your code and some techniques to improve it.

Important functions:

  • cv.getTickCount
    Works the same as the getTickCount() function in C++.
    Function:
    Returns the number of clock ticks between a reference event (such as the moment the machine was turned on) and the moment this function is called.

  • cv.getTickFrequency
    Function:
    Returns the frequency of clock ticks, that is, the number of clock ticks per second.

  • time module

  • profile module
    Function: Python code performance analysis
    Usage:
    To check the performance of a function, pass a call to it as a string to profile.run().
import profile    # import the profile module

def a():
    total = 0
    for i in range(1,10001):
        total += i
    return total

def b():
    total = 0
    for i in range(1,100):
        total += a()
    return total

if __name__ == "__main__":
    profile.run("b()")       # how to use the profile module
  • The memory_profiler module
    Must be installed separately, e.g. with pip install memory_profiler.
    • Method 1: add the @profile decorator for profiling runs and remove it for normal runs
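import time
# Note: remove @profile for normal runs; the name is only defined when the script is run through memory_profiler
@profile     # put the @profile decorator in front of the function to be profiled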
def my_func():
    a = [1] * (10 ** 6)
    b = [2] * (2 * 10 ** 7)
    time.sleep(10)
    del b
    del a
    print("+++++++++")

if __name__ == '__main__':
    my_func()

Then open a command-line window in the directory containing the script and run python -m memory_profiler xxx.py

This prints the memory profile of the decorated functions in xxx.py.

The columns of the result are:

Mem usage: memory in use after the line has executed

Increment: memory added by executing the line

Occurrences: number of times the line was executed

Line Contents: the source line
    • Method 2: import the decorator from the module, for use both when profiling and in normal runs
from memory_profiler import profile

# precision: number of decimal places in the report
# stream: the report is written to the log file 'memory_profiler.log'; without this argument it is printed to the console
# When the function no longer needs profiling, just comment out the @profile line
@profile(precision=4,stream=open('memory_profiler.log','w+'))
def test1():
    c=0
    for item in range(100000):
        c+=1
    print(c)

if __name__=='__main__':
    test1()

Note: Profiling takes at least as long as running the code itself, because the profiler effectively runs the whole program.

  • IPython
    IPython is an interactive Python shell that is much easier to use than the default Python shell. It supports variable auto-completion, auto-indentation and bash shell commands, and has many useful built-in functions.
    Installation:
    On Ubuntu, install it with sudo apt-get install ipython and start it with ipython.

  • The Cython library
    The Cython language makes writing C extensions for Python as easy as writing Python itself. Cython is a source code translator based on Pyrex, but supports more cutting-edge functionality and optimizations. The Cython language is a superset of the Python language (almost all Python code is also valid Cython code) that additionally supports optional static typing in order to call C functions, use C++ classes, and declare C types on variables and class attributes. This allows the compiler to generate very efficient C code from Cython code.

This makes Cython an ideal language for wrapping external C/C++ libraries and for writing fast C modules that speed up the execution of Python code.

  • SSE2, AVX, SIMD optimization
    The SIMD, SSE and AVX instruction sets
    An instruction set is the set of all instructions that a CPU can execute. Each instruction corresponds to one operation, and any program must be compiled into individual instructions before the CPU can recognize and execute it. The CPU relies on instructions to compute and to control the system, so the power of its instructions is an important indicator of CPU performance, and the instruction set has become an effective tool for improving CPU efficiency.

CPUs have a basic instruction set. For example, most Intel and AMD processors currently use the x86 instruction set, because they all derive from the x86 architecture. But no matter how fast the CPU is, a basic x86 instruction can only process one data item at a time, which is inefficient, since in many applications data comes in groups, such as the coordinates (XYZ) of a point, a color (RGB), multi-channel audio, etc. To improve CPU performance in such cases, special instructions were added, and these new instructions make up the extended instruction sets. These instruction sets use single instruction, multiple data (SIMD) extension technology.

History of evolution:
MMX
Intel first introduced the MMX (Multi Media eXtensions) multimedia extension instruction set in 1996, which also pioneered SIMD (Single Instruction, Multiple Data) instruction sets.
SSE
The SSE (Streaming SIMD Extensions) instruction set was first introduced by Intel in the Pentium III processor in 1999 and extended the vector width from 64 bits to 128 bits.
AVX
In August 2007, AMD preemptively announced the SSE5 instruction set (SSE through SSE4 had all come from Intel). Intel immediately said that it would not support SSE5, and in March 2008 announced that the Sandy Bridge microarchitecture would introduce the new AVX instructions. In April of the same year, Intel published the AVX instruction set specification and has kept updating it since. The industry generally regards AVX support as the single most important advance of Sandy Bridge.
The AVX (Advanced Vector Extensions) instruction set borrows some design ideas from AMD's SSE5, extends and strengthens them, and forms a new generation of complete SIMD instruction set specifications.

  • cv.useOptimized()

Checks whether OpenCV's optimized code paths are enabled.

  • cv.setUseOptimized()

Enables or disables OpenCV's optimized code paths.

Besides OpenCV, Python also provides the time module, which is helpful for measuring execution time. Another module, profile, gives a detailed report on the code, such as how much time is spent in each function and how many times each function is called. If you are using IPython, all of these features are integrated in a user-friendly way. We'll look at some important ones; for more details, check out the links in the Additional Resources section.
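As a sketch of such a report, the example below profiles a small function with cProfile, the C implementation of the same interface in the standard library, and prints the per-function statistics. The function slow_sum is a made-up workload.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    """A deliberately simple workload: sum the integers 1..n in a Python loop."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(10000)
profiler.disable()

# Format the collected statistics, sorted by cumulative time
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats('cumulative').print_stats(5)
report = buffer.getvalue()

print(result)
print(report)
```

The report lists, for each function, the number of calls and the time spent in it, which is the information described above.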

Measuring performance with OpenCV

The cv.getTickCount function returns the number of clock ticks between a reference event (such as the moment the machine was turned on) and the moment the function is called. So if you call it before and after a function executes, you get the number of clock ticks the function took.

The cv.getTickFrequency function returns the frequency of clock ticks, or the number of clock ticks per second.

So to find the execution time in seconds you can do the following.

e1 = cv.getTickCount()   # number of clock ticks before the code runs
# your code execution
e2 = cv.getTickCount()
time = (e2 - e1)/ cv.getTickFrequency()   # ticks / frequency (ticks per second) = time in seconds

We will use the following example to demonstrate. It applies median filtering with odd kernel sizes from 5 to 47 (range(5,49,2) stops before 49). Don't worry about what the result looks like; that is not our goal:

img1 = cv.imread('messi5.jpg')
e1 = cv.getTickCount()
for i in range(5,49,2):
    img1 = cv.medianBlur(img1,i)
e2 = cv.getTickCount()
t = (e2 - e1)/cv.getTickFrequency()
print( t )
# Result I got is 0.521107655 seconds

You can do the same with the time module. Instead of using cv.getTickCount, use the time.time() function. Then take the difference between these two times.
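A minimal sketch of the time-module variant follows. It uses time.perf_counter() rather than time.time(), a substitution made here because perf_counter is a high-resolution clock intended for measuring intervals; the pattern is otherwise the same. The function work is a made-up workload standing in for the image processing code.

```python
import time

def work():
    """Made-up workload: sum of squares in a Python loop."""
    return sum(i * i for i in range(100000))

start = time.perf_counter()       # reading of the clock before the code runs
result = work()
elapsed = time.perf_counter() - start   # difference of the two readings, in seconds

print(f'{elapsed:.6f} seconds')
```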

Default optimizations in OpenCV

Many OpenCV functions are optimized using SSE2, AVX, etc. OpenCV also contains the unoptimized code. We should therefore take advantage of these features if our system supports them (almost all modern processors do). Optimization is enabled by default at compile time. So OpenCV runs the optimized code if it is enabled, and otherwise runs the unoptimized code. You can use cv.useOptimized() to check whether it is enabled and cv.setUseOptimized() to enable or disable it. Let's look at a simple example.

# check if optimization is enabled
In [5]: cv.useOptimized()
Out[5]: True
In [6]: %timeit res = cv.medianBlur(img,49)
10 loops, best of 3: 34.9 ms per loop
# Disable it
In [7]: cv.setUseOptimized(False)
In [8]: cv.useOptimized()
Out[8]: False
In [9]: %timeit res = cv.medianBlur(img,49)
10 loops, best of 3: 64.1 ms per loop

As you can see, the optimized median filter is about 2x faster than the unoptimized version. If you check its source code, you can see that the median filter is SIMD optimized. So you can use cv.setUseOptimized() to enable optimization at the top of your code (remember that it is enabled by default).

Measuring performance in IPython

Sometimes you may need to compare the performance of two similar operations. IPython provides the magic command %timeit for this. It runs the code several times to get more accurate results. It is, however, best suited to measuring single lines of code.

For example, do you know which of the following operations is faster: x = 5; y = x**2, x = 5; y = x*x, x = np.uint8([5]); y = x*x, or y = np.square(x)? We'll find out with %timeit in the IPython shell.

In [10]: x = 5
In [11]: %timeit y=x**2
10000000 loops, best of 3: 73 ns per loop
In [12]: %timeit y=x*x
10000000 loops, best of 3: 58.3 ns per loop
In [15]: z = np.uint8([5])
In [17]: %timeit y=z*z
1000000 loops, best of 3: 1.25 us per loop
In [19]: %timeit y=np.square(z)
1000000 loops, best of 3: 1.16 us per loop

You can see that x = 5; y = x*x is the fastest, about 20 times faster than the NumPy versions. If you also take the creation of the array into account, it could be up to 100 times faster. (NumPy developers are working on this.)

Note: Python scalar operations are faster than NumPy scalar operations. So for operations involving one or two elements, Python scalars are better than NumPy arrays. NumPy has the advantage when the array is slightly larger.
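Outside IPython, the same comparison can be made with the standard timeit module. The sketch prints whatever timings your machine produces; no particular numbers are implied, and wrapping each expression in a lambda adds a small constant overhead to every measurement.

```python
import timeit
import numpy as np

x = 5
z = np.uint8([5])
n = 100000     # number of repetitions per measurement

timings = {
    'x**2':         timeit.timeit(lambda: x ** 2, number=n),
    'x*x':          timeit.timeit(lambda: x * x, number=n),
    'z*z':          timeit.timeit(lambda: z * z, number=n),
    'np.square(z)': timeit.timeit(lambda: np.square(z), number=n),
}

for label, seconds in timings.items():
    # Convert the total time for n runs into nanoseconds per run
    print(f'{label:14s} {seconds / n * 1e9:10.1f} ns per loop')
```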

We'll try another example. This time, we will compare the performance of cv.countNonZero() and np.count_nonzero() on the same image:

In [35]: %timeit z = cv.countNonZero(img)
100000 loops, best of 3: 15.8 us per loop
In [36]: %timeit z = np.count_nonzero(img)
1000 loops, best of 3: 370 us per loop

See, the OpenCV function is almost 25 times faster than the NumPy function.

Note: In general, OpenCV functions are faster than NumPy functions. So for the same operation, OpenCV functions are preferred. However, there may be exceptions, especially when NumPy works with views instead of copies.

More IPython magic commands

There are several other magic commands for measuring performance, profiling, line profiling, memory measurement and more. They are all well documented, so interested readers are encouraged to look them up and try them.

Performance optimization techniques

There are several techniques and coding approaches for getting the most out of Python and NumPy. Only the relevant ones are pointed out here. The main advice is to first implement the algorithm in a simple way. Once it works, profile it, find the bottlenecks, and optimize them.

  1. Avoid loops in Python as much as possible, especially double/triple loops. They are inherently slow.
  2. Vectorize the algorithm/code as much as possible, since NumPy and OpenCV are optimized for vector operations. Exploit cache coherence.
  3. Never copy arrays unless necessary. Use views (note: slices) instead. Copying an array is an expensive operation.
  4. If your code is still slow after all this, or if large loops are unavoidable, use an additional library such as Cython to make it faster.
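Points 1 and 2 can be illustrated with a threshold applied first in a double Python loop and then as a single vectorized expression. The function names and the random test image are assumptions made for this sketch.

```python
import numpy as np

def threshold_loop(img, t):
    """Set pixels above t to 255 and the rest to 0, one pixel at a time (slow)."""
    out = np.zeros_like(img)
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            if img[r, c] > t:
                out[r, c] = 255
    return out

def threshold_vectorized(img, t):
    """The same operation as one NumPy expression over the whole array."""
    return np.where(img > t, 255, 0).astype(img.dtype)

# Random 64x64 grayscale test image with a fixed seed
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

same = np.array_equal(threshold_loop(img, 127), threshold_vectorized(img, 127))
print(same)
```

Both functions give identical results; timing them with one of the methods shown earlier in this chapter shows how much the Python loops cost.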


Origin blog.csdn.net/wu_zhiyuan/article/details/126893961