OpenCV connected component labeling and analysis

In this tutorial, you will learn how to perform connected component labeling and analysis using OpenCV. Specifically, we'll focus on OpenCV's most commonly used connected component labeling function: cv2.connectedComponentsWithStats.
Connected component labeling (also known as connected component analysis, blob extraction, or region labeling) is an algorithmic application of graph theory to determine the connectivity of "blob"-like regions in a binary image.

We often use connected component analysis in the same context as we use contours; however, connected component labeling often allows us to perform more fine-grained filtering of blobs in binary images. When working with contour analysis, we are often constrained by the hierarchy of contours (i.e. one contour is contained within another). With connected component analysis, we can more easily segment and analyze these structures.

A good example of connected component analysis is computing the connected components of a binary (i.e. thresholded) license plate image and filtering blobs based on their properties (e.g. width, height, area, solidity, etc.). That's exactly what we're here to do today.

1. OpenCV connected component labeling and analysis

In the first part of this tutorial, we'll review four functions provided by OpenCV for performing connected component labeling and analysis. The most popular of these functions is cv2.connectedComponentsWithStats.
First, we'll configure our development environment and review our project directory structure.
Next, we will implement two forms of connected component analysis:

  • One approach will demonstrate how to use OpenCV's connected component labeling and analysis functions, compute statistics for each connected component, and then extract/visualize each connected component individually.
  • The second method shows a practical example of connected component analysis. We threshold the license plate and then use connected component analysis to extract only the license plate characters.

1.1 OpenCV connected component labeling and analysis functions

OpenCV provides four connected component analysis functions:

  • cv2.connectedComponents
  • cv2.connectedComponentsWithStats
  • cv2.connectedComponentsWithAlgorithm
  • cv2.connectedComponentsWithStatsWithAlgorithm

The most popular method is cv2.connectedComponentsWithStats, which returns the following information:

  • Bounding boxes for connected components
  • Area of connected components in pixels
  • Centroid/center (x, y) coordinates of connected components

The first method, cv2.connectedComponents, is the same as the second, except that it does not return the statistics above. In the vast majority of cases you will need the statistics, so simply use cv2.connectedComponentsWithStats.
The third method, cv2.connectedComponentsWithAlgorithm, implements a faster, more efficient connected component analysis algorithm.

If you compile OpenCV with parallel processing support, cv2.connectedComponentsWithAlgorithm and cv2.connectedComponentsWithStatsWithAlgorithm will run faster than the first two methods.
But in general, stick with cv2.connectedComponentsWithStats until you are comfortable working with connected component labeling.
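
To make the difference between the plain and WithStats variants concrete, below is a minimal, self-contained sketch; the tiny test array is made up purely for illustration, and the *WithAlgorithm variants accept the same inputs plus an extra algorithm flag (e.g. cv2.CCL_DEFAULT), depending on your OpenCV build:

import cv2
import numpy as np

# a tiny binary test image: two white blobs on a black background
binary = np.zeros((6, 6), dtype="uint8")
binary[1:3, 1:3] = 255
binary[4:6, 4:6] = 255

# cv2.connectedComponents returns only the label count and the label map
(numLabels, labels) = cv2.connectedComponents(binary, connectivity=8)

# cv2.connectedComponentsWithStats additionally returns per-component
# statistics (bounding box + area) and centroids
(numLabels, labels, stats, centroids) = cv2.connectedComponentsWithStats(
    binary, connectivity=8, ltype=cv2.CV_32S)

print(numLabels)     # 3 -> the background plus two blobs
print(stats[1])      # [x, y, width, height, area] of the first blob
print(centroids[1])  # (cX, cY) of the first blob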

1.2 Project structure

Before we implement connected component labeling and analysis with OpenCV, let's take a look at our project directory structure.
[Figure: project directory structure]
We will apply connected component analysis to automatically filter out the characters from the license plate image (license_plate.png).
To accomplish this task and learn more about connected component analysis, we will implement two Python scripts:

  • basic_connected_components.py : Demonstrates how to apply connected component labeling, extract each component and its statistics, and visualize them on our screen.
  • filtering_connected_components.py : Applies connected component labeling and filters out non-license-plate characters by checking the width, height, and area (in pixels) of each connected component.

2. Implementation

2.1 Using OpenCV to implement basic connected component labeling

Let's start implementing connected component analysis with OpenCV. Open the basic_connected_components.py file in the project folder and let's get to work:

# import the necessary packages
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True, help="path to input image")
ap.add_argument("-c", "--connectivity", type=int, default=4, help="connectivity for connected component analysis")
args = vars(ap.parse_args())  # convert the parsed arguments to a dictionary

We have two command line arguments:

  • --image: the path to the input image
  • --connectivity: either 4-connectivity or 8-connectivity

Next, we load the input image, convert it to grayscale, and threshold it:

# load the input image, convert it to grayscale, and threshold it
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

After thresholding, the following image will be obtained:

[Figure: thresholded license plate image]
Note that the license plate characters appear white on a black background. However, there is also a bunch of noise in the thresholded image that appears as foreground (white). Our goal is to apply connected component analysis to filter out these noisy regions, leaving only the license plate characters.

But before we start, let's learn how to use the cv2.connectedComponentsWithStats function:

output = cv2.connectedComponentsWithStats(thresh, args["connectivity"], cv2.CV_32S)
(numLabels, labels, stats, centroids) = output

We use OpenCV's cv2.connectedComponentsWithStats to perform connected component analysis, passing in three arguments:

  • The thresholded (binary) image
  • The connectivity (4 or 8)
  • The label data type (cv2.CV_32S should be used)

cv2.connectedComponentsWithStats then returns a 4-tuple:

  • numLabels: the total number of unique labels detected (i.e., the total number of connected components)
  • labels: a mask with the same spatial dimensions as our input thresholded image. For each location in labels, we have an integer ID value that corresponds to the connected component the pixel belongs to. You will learn how to filter the labels matrix later in this section.
  • stats: statistics for each connected component, including the bounding box coordinates and the area in pixels.
  • centroids: the centroid (i.e., center) (x, y) coordinates of each connected component.
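
To make these return values concrete, here is a quick sketch of their shapes (assuming the unpacking above); the print statements are purely for illustration:

# inspect the shapes of the values returned above
print(labels.shape)     # same (height, width) as thresh; each pixel stores its component ID
print(labels.dtype)     # int32, because we requested cv2.CV_32S
print(stats.shape)      # (numLabels, 5): x, y, width, height, area for each component
print(centroids.shape)  # (numLabels, 2): (cX, cY) for each component, stored as floats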

Let's start parsing the values:

# loop over the number of unique connected component labels
for i in range(0, numLabels):
    # label 0 is the background connected component, which we usually ignore
    if i == 0:
        text = "examining component {}/{} (background)".format(
            i + 1, numLabels)
    # otherwise, we are examining an actual connected component
    else:
        text = "examining component {}/{}".format(i + 1, numLabels)
    # print a status message for the current connected component
    print("[INFO] {}".format(text))
    # extract the connected component statistics and centroid for the current label
    x = stats[i, cv2.CC_STAT_LEFT]
    y = stats[i, cv2.CC_STAT_TOP]
    w = stats[i, cv2.CC_STAT_WIDTH]
    h = stats[i, cv2.CC_STAT_HEIGHT]
    area = stats[i, cv2.CC_STAT_AREA]
    (cX, cY) = centroids[i]

A note on the if/else statement:

  • The first connected component, with ID 0, is always the background. We usually ignore the background, but if you do need it, keep in mind that it is always label 0.
  • Otherwise, if i > 0, the connected component is worth examining further.

From the stats and centroids arrays we extract:

  • The starting x-coordinate of the connected component
  • The starting y-coordinate of the connected component
  • The width (w) of the connected component
  • The height (h) of the connected component
  • The centroid (x, y) coordinates of the connected component
    # visualize the bounding box and centroid of the current connected component:
    # clone the original image, then draw the component's bounding box and centroid on it
    output = image.copy()
    cv2.rectangle(output, (x, y), (x + w, y + h), (0, 255, 0), 3)
    cv2.circle(output, (int(cX), int(cY)), 4, (0, 0, 255), -1)

We create an output image that we can draw on. We then draw the bounding box of the current connected component as a green rectangle and its centroid as a red circle.

Our final code block demonstrates how to create a mask for the current connected component:

    # construct a mask for the current connected component
    componentMask = (labels == i).astype("uint8") * 255
    # show the output image and the component mask
    cv2.imshow("Output", output)
    cv2.imshow("Connected Component", componentMask)
    cv2.waitKey(0)

We first find all locations in labels that are equal to the current component ID, i. We then convert the boolean result to an unsigned 8-bit integer mask with a background value of 0 and a foreground value of 255. Finally, the annotated image and the component mask are displayed.
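
If the labels == i trick is new to you, the following toy example (with a made-up 3x4 label map) shows exactly what it produces:

import numpy as np

# a made-up label map with a background (0) and two components (1 and 2)
labels = np.array([[0, 1, 1, 0],
                   [0, 1, 0, 2],
                   [0, 0, 2, 2]], dtype="int32")

# the boolean comparison keeps only the pixels belonging to component 1,
# which we then scale into a standard 0/255 8-bit mask
componentMask = (labels == 1).astype("uint8") * 255
print(componentMask)
# [[  0 255 255   0]
#  [  0 255   0   0]
#  [  0   0   0   0]]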

The first connected component is actually our background; we typically skip it, since the background is rarely needed. The remaining connected components are then displayed one at a time. For each connected component, we draw its bounding box (green rectangle) and centroid/center (red circle). You may have noticed that some of these connected components are license plate characters, while others are just "noise". We will address this issue in the next section.

2.2 Complete code

# import the necessary packages
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", default="plate.jpg", help="path to input image")
ap.add_argument("-c", "--connectivity", type=int, default=4, help="connectivity for connected component analysis")
args = vars(ap.parse_args())  # convert the parsed arguments to a dictionary

# load the input image, convert it to grayscale, and threshold it
image = cv2.imread(args["image"])
cv2.imshow("src", image)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
cv2.imshow("threshold", thresh)

# apply connected component analysis to the thresholded image
output = cv2.connectedComponentsWithStats(thresh, args["connectivity"], cv2.CV_32S)
(numLabels, labels, stats, centroids) = output

# loop over the number of unique connected component labels
for i in range(0, numLabels):
    # label 0 is the background connected component, which we usually ignore
    if i == 0:
        text = "examining component {}/{} (background)".format(
            i + 1, numLabels)
    # otherwise, we are examining an actual connected component
    else:
        text = "examining component {}/{}".format(i + 1, numLabels)
    # print a status message for the current connected component
    print("[INFO] {}".format(text))
    # extract the connected component statistics and centroid for the current label
    x = stats[i, cv2.CC_STAT_LEFT]
    y = stats[i, cv2.CC_STAT_TOP]
    w = stats[i, cv2.CC_STAT_WIDTH]
    h = stats[i, cv2.CC_STAT_HEIGHT]
    area = stats[i, cv2.CC_STAT_AREA]
    (cX, cY) = centroids[i]

    # visualize the bounding box and centroid of the current connected component:
    # clone the original image, then draw the component's bounding box and centroid on it
    output = image.copy()
    cv2.rectangle(output, (x, y), (x + w, y + h), (0, 255, 0), 3)
    cv2.circle(output, (int(cX), int(cY)), 4, (0, 0, 255), -1)

    # construct a mask for the current connected component
    componentMask = (labels == i).astype("uint8") * 255
    # show the output image and the component mask
    cv2.imshow("Output", output)
    cv2.imshow("Connected Component", componentMask)
    cv2.waitKey(0)

2.3 Filter connected components

Our previous code samples showed how to use OpenCV to extract connected components, but not how to filter them. The script below applies the same connected component analysis and then keeps only the components whose width, height, and area fall within ranges typical of license plate characters:

# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", default="plate.jpg", help="path to image")
ap.add_argument("-c", "--connectivity", type=int, default=4, help="connectivity for connected component analysis")
args = vars(ap.parse_args())

# load the input image, convert it to grayscale, and threshold it
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY)

# apply connected component analysis to the thresholded image
output = cv2.connectedComponentsWithStats(thresh, connectivity=args["connectivity"], ltype=cv2.CV_32S)
(numLabels, labels, stats, centroids) = output

# initialize an empty mask to store the filtered license plate characters
mask = np.zeros(gray.shape, dtype="uint8")

# loop over the connected components, skipping the background (label 0)
for i in range(1, numLabels):
    # extract the bounding box and area of the current component
    x = stats[i, cv2.CC_STAT_LEFT]    # stats[i, 0]
    y = stats[i, cv2.CC_STAT_TOP]     # stats[i, 1]
    w = stats[i, cv2.CC_STAT_WIDTH]   # stats[i, 2]
    h = stats[i, cv2.CC_STAT_HEIGHT]  # stats[i, 3]
    area = stats[i, cv2.CC_STAT_AREA] # stats[i, 4]
    # ensure the width, height, and area are neither too small nor too large;
    # these min/max bounds were found by printing the width, height, and area
    # of each component while displaying it on screen, then noting the ranges
    # of the license plate characters -- do the same for your own application
    keepWidth = w > 50 and w < 500
    keepHeight = h > 150 and h < 650
    keepArea = area > 500 and area < 25000

    # if all three tests pass, keep the component by adding it to the mask
    if all((keepWidth, keepHeight, keepArea)):
        print("[INFO] keep connected component '{}'".format(i))
        componentMask = (labels == i).astype("uint8") * 255
        mask = cv2.bitwise_or(mask, componentMask)

# show the original input image and the mask of license plate characters
cv2.imshow("Image", image)
cv2.imshow("Characters", mask)
cv2.waitKey(0)

[Figure: mask containing only the filtered license plate characters]
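
The width, height, and area bounds used above (50-500 px wide, 150-650 px tall, 500-25000 px in area) were tuned for this particular image. As the comments in the script suggest, a simple way to find good bounds for your own images is to print the raw measurements before filtering, for example with a small debugging loop like this:

# debugging sketch: dump the raw measurements of every non-background
# component so you can pick sensible min/max bounds for your own images
for i in range(1, numLabels):
    (x, y, w, h, area) = stats[i]
    print("component {}: w={}, h={}, area={}".format(i, w, h, area))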

If we were building an Automatic License Plate/Number Plate Recognition (ALPR/ANPR) system, we would take these characters and pass them to an Optical Character Recognition (OCR) algorithm for recognition. But it all hinges on being able to binarize the characters and extract them, and connected component analysis allows us to do exactly that!
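
As a rough illustration of that hand-off (not part of the original tutorial), the sketch below crops each kept character from the thresholded image using the stats we already computed and orders them left to right; the ocr_engine call at the end is a hypothetical placeholder for whatever OCR library you choose:

# collect the bounding-box crop of every component that passes the filter,
# keyed by its x-coordinate so we can sort the characters left to right
chars = []
for i in range(1, numLabels):
    x = stats[i, cv2.CC_STAT_LEFT]
    y = stats[i, cv2.CC_STAT_TOP]
    w = stats[i, cv2.CC_STAT_WIDTH]
    h = stats[i, cv2.CC_STAT_HEIGHT]
    area = stats[i, cv2.CC_STAT_AREA]
    if 50 < w < 500 and 150 < h < 650 and 500 < area < 25000:
        chars.append((x, thresh[y:y + h, x:x + w]))

# sort by x-coordinate and drop the keys, leaving the character crops in order
chars = [roi for (x, roi) in sorted(chars, key=lambda c: c[0])]
# plate_text = ocr_engine(chars)  # hypothetical OCR step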

2.4 C++ code example

For completeness, here is the connected components demo from the OpenCV C++ samples (see the reference list below): it thresholds a grayscale image with a trackbar-controlled value, labels the result with connectedComponents, and paints each component in a random color.

#include <opencv2/core/utility.hpp>
#include "opencv2/imgproc.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/highgui.hpp"
#include <iostream>

using namespace cv;
using namespace std;

Mat img;
int threshval = 100;

static void on_trackbar(int, void*)
{
    // threshold the grayscale image so the blobs of interest stay white
    Mat bw = threshval < 128 ? (img < threshval) : (img > threshval);
    Mat labelImage(img.size(), CV_32S);
    int nLabels = connectedComponents(bw, labelImage, 8);
    // assign a random color to every label; label 0 (the background) stays black
    std::vector<Vec3b> colors(nLabels);
    colors[0] = Vec3b(0, 0, 0); // background
    for (int label = 1; label < nLabels; ++label) {
        colors[label] = Vec3b((rand() & 255), (rand() & 255), (rand() & 255));
    }
    // paint each pixel with the color of its connected component
    Mat dst(img.size(), CV_8UC3);
    for (int r = 0; r < dst.rows; ++r) {
        for (int c = 0; c < dst.cols; ++c) {
            int label = labelImage.at<int>(r, c);
            Vec3b &pixel = dst.at<Vec3b>(r, c);
            pixel = colors[label];
        }
    }
    imshow("Connected Components", dst);
}

int main(int argc, const char** argv)
{
    CommandLineParser parser(argc, argv, "{@image|stuff.jpg|image for converting to a grayscale}");
    parser.about("\nThis program demonstrates connected components and use of the trackbar\n");
    parser.printMessage();
    cout << "\nThe image is converted to grayscale and displayed, another image has a trackbar\n"
            "that controls thresholding and thereby the extracted contours which are drawn in color\n";
    String inputImage = parser.get<string>(0);
    img = imread(samples::findFile(inputImage), IMREAD_GRAYSCALE);
    if (img.empty())
    {
        cout << "Could not read input image file: " << inputImage << endl;
        return EXIT_FAILURE;
    }
    imshow("Image", img);
    namedWindow("Connected Components", WINDOW_AUTOSIZE);
    createTrackbar("Threshold", "Connected Components", &threshval, 255, on_trackbar);
    on_trackbar(threshval, 0);
    waitKey(0);
    return EXIT_SUCCESS;
}

References

  • https://pyimagesearch.com/2021/02/22/opencv-connected-component-labeling-and-analysis/
  • https://docs.opencv.org/4.x/de/d01/samples_2cpp_2connected_components_8cpp-example.html
