OpenCV in action: A comprehensive guide from image processing to deep learning

This article introduces the use of the OpenCV library for image processing and deep learning in a simple way. From basic concepts and operations to complex image transformations and deep learning models, it walks you through practical OpenCV usage with detailed code and explanations.

1. Introduction to OpenCV

What is OpenCV?

OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. It is written primarily in C++ (early versions exposed a C API), provides interfaces for languages such as Python, Java, and MATLAB, and implements many general-purpose algorithms for image processing and computer vision.

# Import the OpenCV library
import cv2

# Print the OpenCV version
print(cv2.__version__)

Output (the version number depends on your installation):

4.5.2

The design goal of OpenCV is to provide a simple and extensible computer vision library, so that it can be easily used in practical applications, research, and development.

History and development of OpenCV

The origin of OpenCV can be traced back to 1999, when it was started by a group of R&D engineers at Intel Corporation. In 2000, OpenCV was released as open source, with the aim of promoting the development of computer vision and helping more people apply this technology. Since then, OpenCV has continued to evolve, adding numerous new features, and has become one of the most popular computer vision libraries in the world.

Applications of OpenCV

OpenCV has a very wide range of applications, including but not limited to:

  • Face recognition and object recognition: an important use of OpenCV in fields such as security monitoring and interaction design.
  • Image and video analysis: for example image enhancement, image segmentation, and video tracking.
  • Image synthesis and 3D reconstruction: OpenCV can be used to create AR or VR effects, generate 3D models, and so on.
  • Machine learning: OpenCV has a large number of built-in machine learning algorithms that can be used for image classification, clustering, and other tasks.
  • Deep learning: the dnn module in OpenCV provides interfaces for deep learning models, so users can load pre-trained models for image recognition, object detection, and other tasks.

# For example, the following code shows how to read and display an image with OpenCV
import cv2

# Read an image
img = cv2.imread('image.jpg')

# Display the image
cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

To sum up, OpenCV has become an important tool in both academia and industry thanks to its powerful features, its open source license, and its wide range of applications.

2. OpenCV installation and configuration

The way to install OpenCV varies with the operating system and usage environment. Below we introduce installation on Windows, Linux, and macOS, and how to configure a Python environment to use OpenCV.

Installing OpenCV on Windows

On Windows, it is recommended to install OpenCV with pip, the Python package manager. Run the following command on the command line:

pip install opencv-python

If you need the additional (contrib) modules of OpenCV, such as xfeatures2d, install the opencv-contrib-python package instead. It already contains the main modules, so install only one of the two packages in a given environment:

pip install opencv-contrib-python

Installing OpenCV on Linux

On Linux, we can also use pip to install OpenCV. Open a terminal and run the following command:

pip install opencv-python

Similarly, if you need the additional modules of OpenCV, install the opencv-contrib-python package instead:

pip install opencv-contrib-python

Installing OpenCV on macOS

On macOS, we can also use pip to install OpenCV. Open a terminal and run the following command:

pip install opencv-python

If you need the additional modules of OpenCV, install the opencv-contrib-python package instead:

pip install opencv-contrib-python

Configure the Python environment to use OpenCV

After installing OpenCV, we can import the cv2 module in the Python environment to use the functions of OpenCV. You can create a new Python script and enter the following code in it to test whether OpenCV is installed successfully:

import cv2

# Print the OpenCV version
print(cv2.__version__)

If the OpenCV version number you installed is output, congratulations, you have successfully installed and configured OpenCV!

In general, installing and using OpenCV is relatively simple on Windows, Linux, and macOS alike. Only a few commands are needed to start your OpenCV journey.

3. OpenCV basics

In this part, we will introduce some basic knowledge of OpenCV, including image loading, display and saving, as well as basic image operations and color space conversion.

Image loading, display and saving

In OpenCV, we usually use the imread() function to load an image, the imshow() function to display it, and the imwrite() function to save it.

Here is an example:

import cv2

# Load an image
img = cv2.imread('image.jpg')

# Display the image
cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

# Save the image
cv2.imwrite('new_image.jpg', img)

Basic operations on images

OpenCV provides a series of functions to perform basic operations on images, including but not limited to:

  • Get and modify pixel values
  • Get the basic properties of the image (such as size, number of channels, number of pixels, etc.)
  • Set the ROI (Region of Interest) of the image
  • Split and merge image channels

# Get a pixel value
px = img[100,100]
print(px)

# Modify a pixel value
img[100,100] = [255,255,255]
print(img[100,100])

# Get image properties
print(img.shape)
print(img.size)
print(img.dtype)

# Set the ROI
roi = img[100:200, 100:200]

# Split and merge image channels
b,g,r = cv2.split(img)
img = cv2.merge((b,g,r))

Image color space conversion

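OpenCV provides more than 150 color space conversion codes, but the most commonly used ones are BGR<->Gray and BGR<->HSV. Note that OpenCV stores color images in BGR channel order by default, not RGB, which is why the conversion codes below start with BGR.

We can use the cv2.cvtColor() function to convert between color spaces, as in the following example: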

# Convert to a grayscale image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Convert to an HSV image
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

This is a brief introduction to the basic operations of OpenCV, which are the basics we need to master before doing more advanced image processing.

4. Basics of image processing and computer vision

In computer vision, image processing is a key step. It includes operations such as image thresholding, edge detection, image filtering, morphological operations, and image binarization. Below we introduce them one by one.

Image thresholding

Image thresholding is the process of converting a grayscale image into a binary image. OpenCV provides the cv2.threshold() function to do this.

import cv2
import numpy as np

# Load the image as grayscale (flag 0)
img = cv2.imread('image.jpg',0)

# Apply thresholding
ret,thresh1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)

# Display the result
cv2.imshow('threshold',thresh1)
cv2.waitKey(0)
cv2.destroyAllWindows()

Edge detection

Edge detection is a common task in computer vision that helps identify the outlines of objects in images. Canny is a widely used edge detection algorithm, and OpenCV provides the cv2.Canny() function for it.

import cv2
import numpy as np

# Load the image as grayscale
img = cv2.imread('image.jpg',0)

# Run Canny edge detection (lower threshold 100, upper threshold 200)
edges = cv2.Canny(img,100,200)

# Display the result
cv2.imshow('edges',edges)
cv2.waitKey(0)
cv2.destroyAllWindows()

Image filtering

Image filtering is a common image preprocessing method in computer vision. OpenCV provides various filtering functions, such as cv2.filter2D(), cv2.blur(), and cv2.GaussianBlur().

import cv2
import numpy as np

# Load the image
img = cv2.imread('image.jpg')

# Smooth the image with a 5x5 Gaussian filter
blur = cv2.GaussianBlur(img,(5,5),0)

# Display the result
cv2.imshow('blur',blur)
cv2.waitKey(0)
cv2.destroyAllWindows()

Image Morphological Operations

Morphological operations are a family of operations based on image shape, including erosion, dilation, opening, and closing. OpenCV provides cv2.erode(), cv2.dilate(), cv2.morphologyEx(), and other functions to perform them.

import cv2
import numpy as np

# Load the image as grayscale
img = cv2.imread('image.jpg',0)

# Create a 5x5 structuring element
kernel = np.ones((5,5),np.uint8)

# Apply dilation
dilation = cv2.dilate(img,kernel,iterations = 1)

# Display the result
cv2.imshow('dilation',dilation)
cv2.waitKey(0)
cv2.destroyAllWindows()

Binary image

Binarization is the process of reducing an image to only two colors, that is, black and white. A binarized image is very helpful for many image processing tasks (such as edge detection and object recognition), and in OpenCV the cv2.threshold() function can be used to perform it.

import cv2
import numpy as np

# Load the image as grayscale
img = cv2.imread('image.jpg',0)

# Binarize the image
ret,thresh1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)

# Display the result
cv2.imshow('binary',thresh1)
cv2.waitKey(0)
cv2.destroyAllWindows()

The above is the basic knowledge of image processing and computer vision. With this knowledge, you can perform more complex image processing tasks.

5. OpenCV practical cases

Face Detection

First, let's implement a simple face detection program. This program reads an image and then uses a pretrained Haar cascade classifier to detect faces in the image.

import cv2

# Load the pretrained face cascade classifier
# (the XML file must be in the working directory, or pass its full path)
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Read the image
img = cv2.imread('face.jpg')

# Convert the image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces with the cascade classifier
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))

# Draw a rectangle around each detected face
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)

# Display the result
cv2.imshow('Faces found', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Real-time face detection

Next, let's implement a real-time face detection program. This program captures video from a webcam in real time and detects faces in the video.

import cv2

# Load the pretrained face cascade classifier
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Open the camera
cap = cv2.VideoCapture(0)

while True:
    # Read one frame; stop if the camera returns nothing
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect faces with the cascade classifier
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(30, 30))

    # Draw a rectangle around each detected face
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)

    # Display the result
    cv2.imshow('Faces found', frame)

    # Press 'q' to exit the loop
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the camera
cap.release()

# Close all windows
cv2.destroyAllWindows()

Target Tracking

The next practical case is to use the MeanShift algorithm for target tracking. We will select an object from the video and track this object in subsequent frames.

import cv2
import numpy as np

# Open the camera
cap = cv2.VideoCapture(0)

# Read the first frame
ret, frame = cap.read()

# Set the initial window position (row, height, column, width)
r, h, c, w = 240, 100, 400, 160
track_window = (c, r, w, h)

# Set up the initial ROI used for tracking
roi = frame[r:r+h, c:c+w]
# Build the histogram from the ROI only, not from the whole frame
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv_roi, np.array((0., 60., 32.)), np.array((180., 255., 255.)))
roi_hist = cv2.calcHist([hsv_roi], [0], mask, [180], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

# Termination criteria: stop after 10 iterations or once the window shifts by less than 1 pixel
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

while True:
    ret, frame = cap.read()

    if ret:
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)

        # Use the MeanShift algorithm to find the new position
        ret, track_window = cv2.meanShift(dst, track_window, term_crit)

        # Draw the new window position on the image
        x, y, w, h = track_window
        img2 = cv2.rectangle(frame, (x, y), (x+w, y+h), 255, 2)
        cv2.imshow('img2', img2)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break

cap.release()
cv2.destroyAllWindows()

Edge detection

Edge detection is an important step in image processing, which can help us identify the outline of objects from images. The following practical case is to use the Canny algorithm for edge detection.

import cv2
import numpy as np

# Read the image as grayscale
img = cv2.imread('road.jpg', 0)

# Detect edges with the Canny algorithm
edges = cv2.Canny(img, 50, 150)

# Display the original image and the edge detection result
cv2.imshow('Original Image', img)
cv2.imshow('Edge Image', edges)

cv2.waitKey(0)
cv2.destroyAllWindows()

Image stitching

Image stitching combines two or more overlapping images, subject to geometric and photometric constraints, into a single wide field of view image that covers the fields of view of all input images. The following practical case shows how to use OpenCV for image stitching.

import cv2
import numpy as np

# Read the two images
img1 = cv2.imread('road1.jpg')
img2 = cv2.imread('road2.jpg')

# Stitch the two images into a single panorama
stitcher = cv2.Stitcher.create()
result, pano = stitcher.stitch([img1, img2])

if result == cv2.Stitcher_OK:
    cv2.imshow('Panorama', pano)
    cv2.waitKey()
    cv2.destroyAllWindows()
else:
    print("Error during stitching.")
6. Deep Learning and OpenCV

The OpenCV library not only provides a large number of basic image processing functions, but also provides strong support for the field of deep learning. It can be used to load pre-trained models and use these models for tasks such as image classification, object detection, image segmentation, etc. Below we will use some practical cases to gain an in-depth understanding of how OpenCV is used in deep learning.

Load the pretrained model

First, we will learn how to load a pretrained model. We will use the DNN module in OpenCV, which supports several deep learning frameworks, including TensorFlow, Caffe, etc.

import cv2

# Load the pretrained model
net = cv2.dnn.readNetFromCaffe('bvlc_googlenet.prototxt', 'bvlc_googlenet.caffemodel')

Image classification

Next, we will use the loaded model for image classification. We will preprocess an image and then feed it into the model to obtain classification results.

import cv2
import numpy as np

# Load the pretrained model
net = cv2.dnn.readNetFromCaffe('bvlc_googlenet.prototxt', 'bvlc_googlenet.caffemodel')

# Load the label names
with open('synset_words.txt', 'r') as f:
    labels = f.read().strip().split("\n")

# Load the image and preprocess it (resize to 224x224, subtract the per-channel mean)
image = cv2.imread('image.jpg')
blob = cv2.dnn.blobFromImage(image, 1, (224, 224), (104, 117, 123))

# Feed the image into the network and run a forward pass to get the output
net.setInput(blob)
outputs = net.forward()

# Get the prediction
class_id = np.argmax(outputs)
label = labels[class_id]

print('Output class:', label)

Object detection

In addition, we can also use pre-trained models for object detection. We will use a pretrained YOLO model to detect objects in images.

import cv2
import numpy as np

# Load the pretrained model
net = cv2.dnn.readNet('yolov3.weights', 'yolov3.cfg')

# Load the image and preprocess it
image = cv2.imread('image.jpg')
blob = cv2.dnn.blobFromImage(image, 1/255, (416, 416), swapRB=True, crop=False)

# Feed the image into the network and run a forward pass to get the output.
# getUnconnectedOutLayersNames() avoids indexing the result of
# getUnconnectedOutLayers(), whose shape differs between OpenCV versions.
net.setInput(blob)
output_layers = net.getUnconnectedOutLayersNames()
outputs = net.forward(output_layers)

# Process the network output
for output in outputs:
    for detection in output:
        # Each detection is [center_x, center_y, w, h, objectness, class scores...]
        scores = detection[5:]
        class_id = np.argmax(scores)
        confidence = scores[class_id]

        if confidence > 0.5:
            # Mark the detected object on the image
            center_x, center_y, w, h = map(int, detection[0:4] * np.array([image.shape[1], image.shape[0], image.shape[1], image.shape[0]]))
            x = center_x - w // 2
            y = center_y - h // 2
            cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imshow('Image', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

The above are examples of the application of OpenCV in deep learning. I hope these cases can help you better understand how to use OpenCV for deep learning tasks.

Summary and Outlook

In this blog, we explored how to use OpenCV for a variety of image processing and deep learning tasks. From basic image reading and display, to image transformations, edge detection, and deep learning based image classification and object detection, we have provided detailed code and explanations.

OpenCV is a powerful and easy-to-use library that provides many tools for image processing and computer vision. Whether you are a researcher, a developer, or just a beginner interested in image processing and computer vision, OpenCV can help you realize your ideas quickly.

In the future, OpenCV will continue to develop and add more functions and tools. For example, the developers of OpenCV are already considering how to better support 3D image processing and augmented reality. At the same time, with the development of deep learning, OpenCV will continue to provide better support, including loading more pre-trained models, and providing more tools to help developers train their own models.

Overall, OpenCV is an important tool in the field of image processing and computer vision, and it is worth becoming proficient in whether you are a beginner or an expert. I hope this blog is helpful to you. If you have any questions, please feel free to ask me.

Origin blog.csdn.net/magicyangjay111/article/details/132044038