Getting Started with OpenCV Basics

This article covers:

  • OpenCV download and environment configuration
  • The OpenCV directory layout
  • The highgui module
  • The core module
  • The imgproc module
  • The features2d module
  • Video operations

1. Introduction to OpenCV

Images are the basis of human vision and an objective reflection of natural scenes.

  • An analog image records information through continuous variation in the strength of some physical quantity. Because analog signals are susceptible to interference, analog images have been largely replaced by digital images.
  • A digital image represents brightness with discrete numerical values.
  • Bit depth: an 8-bit grayscale image takes values from 0 to 255, where 0 is the darkest (black) and 255 is the brightest (white).

Image classification

  • Binary image: every pixel is either 0 or 1.
  • Grayscale image: each pixel is stored on an 8-bit non-linear scale with 256 gray levels; with 16 bits there are 65,536 gray levels.
  • Color image: usually represented by three components, R, G, and B, each an 8-bit unsigned integer in the range 0~255.
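As a rough illustration of the three image types above, here is a minimal sketch using NumPy arrays, which is how OpenCV-Python holds images. The array sizes and variable names are arbitrary choices for the example.

```python
import numpy as np

# Binary image: every pixel is 0 or 1
binary = np.array([[0, 1], [1, 0]], dtype=np.uint8)

# 8-bit grayscale image: one value per pixel, 0 (black) to 255 (white)
gray8 = np.zeros((2, 2), dtype=np.uint8)
gray8[0, 0] = 255

# 16-bit grayscale image: 65,536 possible levels
gray16 = np.zeros((2, 2), dtype=np.uint16)
gray16[0, 0] = 65535

# Color image: three 8-bit components per pixel
color = np.zeros((2, 2, 3), dtype=np.uint8)

print(gray8.dtype, np.iinfo(gray8.dtype).max)
print(gray16.dtype, np.iinfo(gray16.dtype).max)
print(color.shape)
```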

1.1 Installation Tutorial

For C++, you can refer to this tutorial:

"Opencv+vs studio environment configuration", addict_jun's blog on CSDN

Alternatively, go to the Releases page on the official OpenCV website, select a version, and download it.

In Python, installation is very simple.

Install with pip:

pip install opencv-python==3.4.2.17

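Test the installation:

import cv2

# imread returns None if the file cannot be read
lena = cv2.imread("1.jpg")
cv2.imshow("image", lena)
# Wait for a key press so the window has time to draw
cv2.waitKey(0)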

To get the extra (contrib) modules as well:

pip install opencv-contrib-python==3.4.2.17

1.2 Directory analysis

After downloading OpenCV 3.x, you will find two header folders, opencv and opencv2.

  • The opencv folder contains the old header files.
  • The opencv2 folder contains the header files of the newer OpenCV 2 series, which was a landmark release.

The opencv folder holds the core header files of OpenCV 1.0; they can be treated as a single large component.


And our main concern is the opencv2 folder.


Here are a few commonly used modules.

  • [calib3d]: short for "calibration" and "3D". It deals with camera calibration and 3D reconstruction, including basic multi-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and reconstruction of 3D information.
  • [core]: core functionality, such as basic data structures, dynamic data structures, drawing functions, array operations, auxiliary functions, system functions and macros, and OpenGL interoperability.
  • [imgproc]: short for "image" and "process"; the image processing module.
  • [features2d]: the 2D features framework; as the name suggests, it deals with image features.
  • [flann]: Fast Library for Approximate Nearest Neighbors, a library for fast approximate nearest neighbor search in high-dimensional spaces.
  • [gpu]: computer vision module using GPU acceleration.
  • [highgui]: high-level graphical user interface, including media input and output, video capture, encoding and decoding of images and videos, and the interactive GUI interfaces.
  • [ml]: machine learning module. Impressively, OpenCV can do this too.
  • [objdetect]: object detection module.
  • [ocl]: computer vision module accelerated with OpenCL.
  • [photo]: image restoration and image denoising.
  • [stitching]: image stitching module.
  • [video]: video analysis.

Just from this directory listing, you can get a rough picture of what the image processing field covers.

My personal view is that the core module is the foundation, that imgproc, gpu, ocl, video, and so on are auxiliary functions, and that the top layer is the specific applications. 3D stereo matching, image calibration, feature selection, machine learning, object detection, denoising, and stitching are all application areas.

That concludes the directory analysis.

highgui is the part that interacts with the user; strictly speaking, the display interface is an upper-level application. For learning OpenCV, I suggest starting with core and then moving on to the various applications, picking up the auxiliary modules as you need them along the way.

1.3 Advantages of OpenCV

OpenCV

  • Implemented in C++, with interfaces for Python, Ruby, Matlab, and other languages.
  • Cross-platform: runs on Windows, Linux, OS X, Android, and iOS. High-speed GPU interfaces based on CUDA and OpenCL are also under active development.
  • A rich API that covers the traditional computer vision algorithms and the mainstream machine learning algorithms, with added support for deep learning.

OpenCV-Python

  • Python is a mainstream language with high readability.
  • Python can easily be extended with C++: you can write computationally intensive code in C++ and create Python wrappers that are used as Python modules. The code runs about as fast as C++, and it integrates easily with NumPy, SciPy, and Matplotlib.

You can also install numpy and matplotlib before installing OpenCV.

In OpenCV 3.4.3 and above, some classic algorithms cannot be used because of patent restrictions, so the newer versions come with certain limitations.

2 highgui high-level GUI graphical user interface

This module is mainly used for loading images, displaying them, and writing them to files.

2.1 Read image

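Python

cv.imread(img, flag)
'''
Parameters
	- img: the image file to read
	- flag: how the image is read
		+ cv.IMREAD_COLOR: load in color; any transparency in the image is ignored. This is the default.
		+ cv.IMREAD_GRAYSCALE: load in grayscale
		+ cv.IMREAD_UNCHANGED: load the image as is, including the alpha channel
		1, 0, or -1 can be used in place of these three flags
'''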

C++

imread(const string& filename, int flags=1)

/*
Parameters:
	+ filename: the file name
	+ flags: the color mode used to read the image
*/

2.2 Display image

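Python

cv.imshow(name, img)
'''
Parameters
	- name: name of the window that displays the image, as a string
	- img: the image to display

Note: after calling the display API, call cv.waitKey() to give the window time to draw. Otherwise the window becomes unresponsive and the image is not shown.
'''

# Display with OpenCV
cv.imshow("image", img)
cv.waitKey(0)

# Display with matplotlib
'''OpenCV stores images in BGR order, so convert to RGB here'''
plt.imshow(img[:,:,::-1])
plt.show()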

C++

imshow(const string& winname, InputArray mat)

/*
Parameters:
	+ winname: name of the display window
	+ mat: the image to display
*/

// Creating a window with namedWindow()
// For simple use, imread and imshow are enough. To add controls such as a trackbar, create the window first with namedWindow.
void namedWindow(const string& winname, int flags=WINDOW_AUTOSIZE)

// Create a trackbar
createTrackbar()

// Read the trackbar position
int getTrackbarPos(const string& trackbarname, const string& winname)

// Mouse handling in OpenCV
void setMouseCallback(const string& winname, MouseCallback onMouse, void* userdata=0)

2.3 Save image

Python

cv.imwrite(name, img)

'''
Parameters:
	- name: the file name, that is, where to save
	- img: the image to save
'''

C++

bool imwrite(const string& filename, InputArray img, const vector<int>& params=vector<int>());

3 core: core module

Mat, the basic data structure, has been the centerpiece of OpenCV since the 2.0 era. With a Mat-style data structure at its center, OpenCV has become more and more like Matlab: it demands little programming skill and is very easy to get started with.

  • OpenCV functions allocate the memory for output images automatically.
  • With the C++ interface there is no need to worry about releasing memory.
  • The assignment operator and the copy constructor copy only the header.
  • Use clone() or copyTo() to copy an image's underlying matrix.
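The bullets above describe the C++ Mat class. In OpenCV-Python, images are NumPy arrays, and a roughly analogous distinction applies: assignment and slicing share the pixel data, while copy() duplicates it. The sketch below illustrates that analogy only; it is not the C++ API.

```python
import numpy as np

img = np.zeros((4, 4), dtype=np.uint8)

alias = img          # same object, same pixel data
roi = img[0:2, 0:2]  # a view: new array object, shared pixel data
deep = img.copy()    # independent copy of the pixel data

img[0, 0] = 255
print(alias[0, 0], roi[0, 0], deep[0, 0])  # 255 255 0
```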

3.1 Storage method of pixel value

  • RGB is the most common and works in a similar way to the human eye.
  • HSV and HLS decompose a color into hue, saturation, and value (brightness) or lightness.
  • YCrCb is widely used in the JPEG image format.
  • CIE L*a*b* is a perceptually uniform color space, suitable for measuring the distance between two colors.

3.2 Common Data Structures

  • Image representation: Mat class
  • Point representation: Point class
  • Color representation: Scalar class
  • Size representation: Size class
  • Rectangle representation: Rect class
  • Color space conversion: cvtColor() function

3.3 Drawing functions

Drawing basic shapes

  • The line function draws straight lines
cv.line(img,start,end,color,thickness)

'''
Parameters
	- img: the image to draw the line on
	- start, end: start and end points of the line
	- color: color of the line
	- thickness: width of the line
'''
  • The ellipse function draws ellipses
  • The rectangle function draws rectangles
cv.rectangle(img, leftupper, rightdown, color, thickness)

'''
Parameters
	- img: the image to draw the rectangle on
	- leftupper, rightdown: coordinates of the top-left and bottom-right corners
	- color: color of the outline
	- thickness: width of the outline
'''
  • The circle function draws circles
cv.circle(img, centerpoint, r, color, thickness)

'''
Parameters
	- img: the image to draw the circle on
	- centerpoint, r: center and radius of the circle
	- color: color of the outline
	- thickness: width of the outline; -1 fills the circle with the color
'''
  • The fillPoly function draws filled polygons

To add text to an image:

cv.putText(img, text, station, font, fontsize, color, thickness, cv.LINE_AA)

'''
Parameters
	- img: the image
	- text: the text to write
	- station: position of the text
	- font: font
	- fontsize: font size
'''

3.4 Array operation related functions

Accessing image elements

  • The LUT function: lookup-table operations
  • Timing functions: getTickCount(), getTickFrequency()
  • Accessing the pixels of an image
    • pointer access
    • iterator access
    • dynamic address calculation
  • ROI (region of interest): defined with a Rect rectangle, or with a Range from a start index to an end index.
  • Linear blending: the cross-dissolve effect used when switching between two slides
  • Weighted sum of two arrays: the addWeighted() function
  • Channel separation: the split() function
  • Channel merging: the merge() function
  • Adjusting image contrast and brightness
  • Discrete Fourier transform: simply put, the image is decomposed into sine and cosine components. It also involves operations such as the magnitude of 2D vectors, natural logarithms, and matrix normalization.
  • Reading and writing XML and YAML files
Here are some of these operations in Python.

The properties of an image include the number of rows, columns, and channels, the image data type, the number of pixels, and so on.

img.shape
img.dtype
img.size   # total number of elements (rows * columns * channels)

Splitting and merging image channels

Sometimes you need to work on the B, G, and R channels separately. In that case the BGR image has to be split into individual channels. In other cases you may need to merge individual channels back into a BGR image. You can do both as follows.

# Split the channels
b,g,r = cv.split(img)

# Merge the channels
img = cv.merge((b,g,r))

Color space conversion

There are more than 150 color space conversion methods in OpenCV. The two most widely used are BGR ↔ Gray and BGR ↔ HSV.

cv.cvtColor(input_image, flag)

'''
Parameters
	- input_image: the image to convert
	- flag: the conversion type
		+ cv.COLOR_BGR2GRAY: BGR to grayscale
		+ cv.COLOR_BGR2HSV: BGR to HSV
'''

Image addition

You can add two images with the OpenCV function cv.add(), or simply with a numpy operation such as res = img1 + img2. Both images should have the same size and type, or the second image can be a scalar value.

Note: OpenCV addition and Numpy addition behave differently. OpenCV addition is a saturating operation, while Numpy addition is a modulo operation.

Image blending

This is also addition, but the two images are given different weights, which produces a sense of blending or transparency.

cv.addWeighted(img1, 0.7, img2, 0.3, 0)

I think the overall structure is layered, as the figure here originally showed: the data forms the bottom layer, and the operations on that data sit on top of it. It has always been this way.

4 imgproc image processing

4.1 Traditional image processing

Traditional image processing mainly includes:

  • Three linear filters: box filter, mean filter, Gaussian filter.
  • Two nonlinear filters: median filter and bilateral filter.
  • Seven morphological operations: erosion, dilation, opening, closing, morphological gradient, top hat, black hat.
  • Flood fill
  • Image scaling
  • Image pyramid
  • Thresholding

4.1.1 Three kinds of linear filters

  • Box filter
  • Mean filter
  • Gaussian filter

4.1.2 Nonlinear filtering

  • Median filter
  • Bilateral filter

4.1.3 Morphology

The term comes from the branch of biology that studies the form and structure of plants and animals. In image processing, morphology usually means mathematical morphology.

  • Dilation: expands the white (highlighted) regions.
  • Erosion: shrinks the white (highlighted) regions.
  • Opening: erode first, then dilate. Separates highlighted regions or removes small objects.
  • Closing: dilate first, then erode. Removes small dark regions inside the highlighted parts.
  • Morphological gradient: the difference between the dilated and the eroded image. It highlights the edges of blobs.
  • Top hat: the difference between the original image and its opening, used for background extraction.
  • Black hat: the difference between the closing and the original image.

4.1.4 Flood filling

As I understand it, flood fill picks a connected region of one color and fills it with another color. It is like drawing a circle in a paint program and then filling the white interior with a different color.

4.1.5 Image Pyramid

Two concepts come first:

  • Image upsampling
  • Image downsampling

An image pyramid starts from the highest-resolution image and reduces the resolution step by step until some termination condition is met. The resulting collection of multi-resolution images is called an image pyramid.

4.1.6 Thresholding

In short, thresholding reassigns each pixel value depending on whether it exceeds a given threshold.

Threshold types in Python:

Option                   Pixel value > thresh    Otherwise
cv2.THRESH_BINARY        maxval                  0
cv2.THRESH_BINARY_INV    0                       maxval
cv2.THRESH_TRUNC         thresh                  current gray value
cv2.THRESH_TOZERO        current gray value      0
cv2.THRESH_TOZERO_INV    0                       current gray value

cv2.threshold(src, thresh, maxval, type[, dst])

'''
Parameters:
	- src: grayscale image
	- thresh: the threshold
	- maxval: the maximum value
	- type: one of the options in the table above
'''

[Figure: the Olympic rings, shown as an example for this section; image not available]

4.2 imgproc image transformation

4.2.1 Edge detection

General steps:

  • Filtering: the Gaussian and other linear filters described above are commonly used here.
  • Enhancement: compute how strongly the intensity changes in the neighborhood of each point, so that points with significant changes stand out.
  • Detection: points with large neighborhood changes are then screened to decide which of them are actual edges.

Canny operator

The Canny operator was developed in 1986 by John F. Canny, who also founded the computational theory of edge detection. It aims for:

  • Low error rate: mark as many real edges as possible while minimizing false positives caused by noise.
  • Good localization: the marked edge should be as close as possible to the actual edge in the image.
  • Minimal response: an edge in the image should be marked only once, and image noise should not be marked as an edge.

Sobel operator

The Sobel operator is a discrete differential operator mainly used for edge detection.

Laplacian operator

The Laplacian is a second-order differential operator in n-dimensional Euclidean space, defined as the divergence (div) of the gradient (grad).

Scharr filter

As the name suggests, it is a filter; it exists to work alongside the Sobel operator.

4.2.2 Hough Transform

The Hough transform is one of the basic methods for recognizing geometric shapes, such as lines and circles.

4.2.3 Mapping transformation

Remap

Remaps pixel coordinates.

Affine transformation

An affine transformation can be understood as a combination of operations such as scaling, rotation, and translation of the image.

Perspective transformation

A perspective (projective) transformation is the result of a change in viewing angle. It uses the condition that the perspective center, the image point, and the target point are collinear. Following the law of perspective rotation, the image plane (perspective plane) is rotated by some angle around the trace line (perspective axis). This changes the original bundle of projection rays while keeping the projected geometric figure on the image plane unchanged.

4.2.4 Histogram equalization

Histogram equalization computes the image's intensity histogram and then redistributes the intensities so that they are spread more evenly.


5 features2d component of OpenCV

Image features can be divided into three types:

  • Edges
  • Corner points (key points of interest)
  • Blobs (regions of interest)

5.1 Corner detection

To me, a corner point feels similar to an edge point, except that the area around it shows obvious brightness changes in more than one direction.

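  • Harris corner detection
dst = cv.cornerHarris(src, blockSize, ksize, k)
'''
Parameters
	- src: input image of data type float32
	- blockSize: size of the neighborhood considered for corner detection
	- ksize: kernel size used for the Sobel derivative
	- k: free parameter in the corner detection equation, in the range [0.04, 0.06]
'''
  • Shi-Tomasi corner detection
corners = cv2.goodFeaturesToTrack(image, maxCorners, qualityLevel, minDistance)
'''
Parameters
	- image: input grayscale image
	- maxCorners: number of corners to return
	- qualityLevel: minimum acceptable corner quality, between 0 and 1
	- minDistance: minimum Euclidean distance between corners, to avoid adjacent feature points
Returns
	- corners: the corners found; all corners below the quality level are excluded
'''
  • Sub-pixel corner detection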

5.2 Feature detection and matching

  • FAST
  • STAR
  • SIFT
  • SURF
  • ORB
  • MSER
  • GFTT
  • HARRIS
  • Dense
  • SimpleBlob

The above are 10 feature detection algorithms

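SIFT

# In OpenCV 4.4 and later this is cv.SIFT_create()
sift = cv.xfeatures2d.SIFT_create()
kp,des = sift.detectAndCompute(gray, None)
'''
Parameters
	- gray: the image to detect keypoints in; note that it must be grayscale
Returns:
	- kp: keypoint information, including position, scale, and orientation
	- des: keypoint descriptors; each keypoint has a feature vector of 128 gradient values
'''
cv.drawKeypoints(image, keypoints, outputimage, color, flags)
'''
Parameters:
	- image: the original image
	- keypoints: the keypoint information to draw on the image
	- outputimage: the output image; it can be the original image
	- color: drawing color, set through the (b,g,r) values
	- flags: drawing mode
		+ cv2.DRAW_MATCHES_FLAGS_DEFAULT: create the output image matrix and draw the matched feature points on it; only the center point of each keypoint is drawn
		+ cv2.DRAW_MATCHES_FLAGS_DRAW_OVER_OUTIMG: do not create the output image matrix; draw the matches on the existing output image
		+ cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS: draw each keypoint with its size and orientation
		+ cv2.DRAW_MATCHES_FLAGS_NOT_DRAW_SINGLE_POINT: single (unmatched) feature points are not drawn
'''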

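FAST

fast = cv.FastFeatureDetector_create(threshold, nonmaxSuppression)
'''
Parameters
	- threshold: the threshold t; the default is 10
	- nonmaxSuppression: whether to apply non-maximum suppression; the default is True
Returns
	- fast: the FastFeatureDetector object that was created
'''
kp = fast.detect(grayImg, None)
'''
Parameters
	- grayImg: the image to detect keypoints in; note that it must be grayscale
Returns
	- kp: keypoint information, including position, scale, and orientation
'''

cv.drawKeypoints(image, keypoints, outputimage, color, flags)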

ORB algorithm

orb = cv.ORB_create(nfeatures)
'''
	- nfeatures: maximum number of feature points
'''

kp, des = orb.detectAndCompute(gray,None)
'''
Parameters
	- gray: the image to detect keypoints in; note that it must be grayscale
Returns
	- kp: keypoints
	- des: descriptors
'''

cv.drawKeypoints(image, keypoints, outputimage, color, flags)

As the image field has developed, features2d is now often used as auxiliary image processing for higher-level applications such as deep learning. Finally, we look at video operations to round out this basic tour of OpenCV.

6 video operation

6.1 Video reading

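To read a video in OpenCV, create a VideoCapture object and specify the video file to read.

# Create the object that reads the video
cap = cv.VideoCapture(filepath)
'''
Parameters
	- filepath: path of the video file
'''

# Get a property of the video
retval = cap.get(propId)
'''
Parameters
	- propId: a number from 0 to 18; each number identifies one video property
'''

# Modify a property of the video
retval = cap.set(propId, value)

# Check whether the video was opened successfully
isornot = cap.isOpened()

# Grab one frame of the video
ret, frame = cap.read()
'''
Returns
	- ret: True if the frame was read successfully
	- frame: the frame that was read
'''

# Display the frame
cv.imshow("frame", frame)

# Release the video object
cap.release()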

6.2 Save video

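Save the video using a VideoWriter object.

out = cv2.VideoWriter(filename, fourcc, fps, frameSize)
'''
Parameters
	- filename: where to save the video
	- fourcc: 4-byte code that specifies the video codec
	- fps: frame rate
	- frameSize: frame size
'''

retval = cv2.VideoWriter_fourcc(c1, c2, c3, c4)
'''
Parameters
	- c1,c2,c3,c4: the 4-byte code of the video codec; a list of codes can be found at fourcc.org, and availability depends heavily on the platform
'''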

Summary

This article mainly explained the underlying core of OpenCV, basic image operations, and image processing.


Origin blog.csdn.net/suren_jun/article/details/128412819