Computer Vision: Practical Application of the Distance Transform Algorithm


Preface: Hello everyone, I am Dream. Computer vision (CV) is an important field of artificial intelligence. In this distance transform task, we use the D4 distance metric to process an image. Working through the experiment should give you a better feel for how distance metrics are applied in computer vision and image processing. Let's take a look at the actual calculation results and visualizations!

Distance transform is a commonly used method that assigns each pixel its distance to the nearest pixel of a reference set. In the implementation used here, every foreground pixel receives its distance to the nearest background pixel, and background pixels receive 0. This is useful for tasks such as image analysis, object detection, and image registration. The D4 distance is defined as the sum of the absolute differences between two pixels in the horizontal and vertical directions.

To test the distance transform, we first randomly generate an 8*8 image of 0s and 1s and then set 10 randomly chosen pixels to foreground. Foreground pixels are represented by 1 and background pixels by 0. Next, we implement a distance function that calculates the D4 distance between two pixels. We then iterate over every pixel in the image, calculate the distance from each foreground pixel to its nearest background pixel, and store the result in a distance matrix.

Finally, we visualize the original image and the distance transform result. The original image is shown in grayscale, where foreground pixels (1) appear white and background pixels (0) appear black. The distance transform result is displayed with the "cool" colormap, where values near 0 appear cyan and larger distances shift toward magenta.

1. Import the necessary libraries

First, import the two libraries we need: NumPy and Matplotlib.

import numpy as np
import matplotlib.pyplot as plt

2. Generate random images and define a distance metric

Randomly generate an 8*8 image whose pixel values are 0 or 1, where 0 is a background pixel and 1 is a foreground pixel:

image = np.random.randint(2, size=(8, 8))
print('Original image:\n', image)

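Then set 10 randomly chosen positions to foreground. A position may be drawn twice or may already be 1, so the final number of foreground pixels is generally not exactly 10: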

for _ in range(10):
    # Draw a random (row, column) position and mark it as foreground
    x, y = np.random.randint(8, size=2)
    image[x, y] = 1

print('Image after selecting foreground pixels:\n', image)

Image after selecting foreground pixels:
[[1 0 1 1 0 1 0 1]
 [0 1 1 0 1 0 0 0]
 [1 1 1 1 1 1 0 1]
 [0 0 1 0 1 0 1 0]
 [0 1 1 0 0 1 0 1]
 [0 1 1 1 1 1 1 1]
 [1 1 1 1 1 0 1 1]
 [0 1 0 1 0 0 1 1]]
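If exactly 10 distinct foreground pixels on an otherwise empty background are wanted, one option is to sample positions without replacement. This is only a sketch of that variant, not the setup used in this experiment; the seed and the names `rng`, `sparse_image`, and `flat_idx` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility (assumption)

size = 8
num_foreground = 10

# Start from an all-background image.
sparse_image = np.zeros((size, size), dtype=int)

# Draw 10 distinct flat indices out of the 64 pixels, without replacement.
flat_idx = rng.choice(size * size, size=num_foreground, replace=False)

# Convert the flat indices to (row, column) pairs and mark them as foreground.
rows, cols = np.unravel_index(flat_idx, sparse_image.shape)
sparse_image[rows, cols] = 1

print(sparse_image)
print('Foreground count:', sparse_image.sum())
```

Because the indices are drawn without replacement, the foreground count is always 10, whereas the loop above can hit the same position more than once.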




3. Perform distance transformation


D4 distance introduction: The D4 distance between pixels p(x, y) and q(s, t) is defined as: D4(p, q) = |x - s| + |y - t|
In the D4 distance transform algorithm, D4 denotes the 4-neighborhood distance metric, also known as the city-block or Manhattan distance. It counts only horizontal and vertical steps between pixels, so a move to a diagonal neighbor costs 2 rather than 1.
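To make the definition concrete, here is a minimal sketch that prints the D4 distance from the center of a 5*5 grid to every cell. The grid size and the names `center` and `d4_grid` are chosen only for illustration and are not part of the experiment.

```python
import numpy as np

center = (2, 2)  # illustrative reference pixel (assumption)

# Row index and column index of every cell in a 5x5 grid.
rows, cols = np.indices((5, 5))

# D4 distance: |row difference| + |column difference|.
d4_grid = np.abs(rows - center[0]) + np.abs(cols - center[1])

print(d4_grid)
# [[4 3 2 3 4]
#  [3 2 1 2 3]
#  [2 1 0 1 2]
#  [3 2 1 2 3]
#  [4 3 2 3 4]]
```

Cells at equal distance form a diamond around the center, and each diagonal neighbor sits at distance 2 because reaching it takes one horizontal and one vertical step.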

The algorithm steps are as follows:

  1. Initialize a distance matrix with the same size as the original image, where the distance value of every background pixel is 0.
  2. Select a foreground pixel from the image as the starting point.
  3. Traverse each background pixel in the image and calculate the D4 distance from it to the starting pixel.
  4. Compare each distance with the minimum found so far for this starting point; if the current distance is smaller, update the minimum.
  5. Repeat steps 3 and 4 until all background pixels have been traversed, then store the minimum as the distance value of the starting pixel.
  6. Select the next foreground pixel as the starting point and repeat the above steps until all foreground pixels have been processed.
  7. The final distance matrix is the result of the distance transform: the value at each foreground pixel is its distance to the nearest background pixel, and background pixels are 0.
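The steps above can also be sketched compactly with NumPy broadcasting. This is only one possible formulation, not the implementation used in this article: the function name `d4_transform` and the 3*3 test image are hypothetical, and the sketch assumes the image contains at least one background pixel.

```python
import numpy as np

def d4_transform(image):
    """D4 distance from every pixel to its nearest background (0) pixel."""
    image = np.asarray(image)
    # Coordinates of all background pixels, shape (K, 2).
    background = np.argwhere(image == 0)
    if background.size == 0:
        raise ValueError('image has no background pixels')
    # Row and column index of every pixel, each of shape (H, W).
    rows, cols = np.indices(image.shape)
    # D4 distance of every pixel to every background pixel, shape (H, W, K).
    d = (np.abs(rows[..., None] - background[:, 0])
         + np.abs(cols[..., None] - background[:, 1]))
    # Keep the smallest distance per pixel; background pixels get 0.
    return d.min(axis=-1)

# Hypothetical test image with a single background pixel in the top-right corner.
tiny = np.array([[1, 1, 0],
                 [1, 1, 1],
                 [1, 1, 1]])
print(d4_transform(tiny))
# [[2 1 0]
#  [3 2 1]
#  [4 3 2]]
```

The intermediate array holds one distance per pixel and background pixel, which is fine for an 8*8 image but grows quickly with image size.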

Define the distance function

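def dist(p1, p2, metric='D4'):
    """Return the D4 or D8 distance between pixels p1 and p2, given as (row, column)."""
    if metric == 'D4':
        # Sum of the horizontal and vertical differences
        return abs(p1[0] - p2[0]) + abs(p1[1] - p2[1])
    elif metric == 'D8':
        # Larger of the horizontal and vertical differences
        return max(abs(p1[0] - p2[0]), abs(p1[1] - p2[1]))
    raise ValueError('unknown metric: ' + str(metric))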
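As a quick check of the two metrics, the sketch below evaluates the function on one pair of points. The function is repeated here, with an explicit error for unknown metric names, so that the snippet runs on its own; the points `p` and `q` are made up for illustration.

```python
def dist(p1, p2, metric='D4'):
    if metric == 'D4':
        return abs(p1[0] - p2[0]) + abs(p1[1] - p2[1])
    elif metric == 'D8':
        return max(abs(p1[0] - p2[0]), abs(p1[1] - p2[1]))
    # Defensive addition: reject metric names other than 'D4' and 'D8'.
    raise ValueError('unknown metric: ' + str(metric))

p, q = (0, 0), (3, 4)  # illustrative points (assumption)

print(dist(p, q, metric='D4'))  # 3 + 4 = 7
print(dist(p, q, metric='D8'))  # max(3, 4) = 4
```

D8 is never larger than D4 for the same pair of points, because a diagonal step counts as 1 under D8 and as 2 under D4.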

Generate the distance matrix

matrix = np.zeros_like(image)
for i in range(image.shape[0]):
    for j in range(image.shape[1]):
        # A background pixel has distance 0
        if image[i, j] == 0:
            matrix[i, j] = 0
        else:
            # Start from a value larger than any possible distance
            min_dist = 99999
            for m in range(image.shape[0]):
                for n in range(image.shape[1]):
                    # Only measure the distance to background pixels
                    if image[m, n] == 0:
                        d = dist((i, j), (m, n), metric='D4')
                        if d < min_dist:
                            min_dist = d
            matrix[i, j] = min_dist

print('Result after distance transformation:\n', matrix)

The result after distance transformation:
[[1 0 1 1 0 1 0 1]
 [0 1 1 0 1 0 0 0]
 [1 1 2 1 2 1 0 1]
 [0 0 1 0 1 0 1 0]
 [0 1 1 0 0 1 0 1]
 [0 1 2 1 1 1 1 2]
 [1 2 1 2 1 0 1 2]
 [0 1 0 1 0 0 1 2]]
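The nested loops above compare every foreground pixel with every background pixel, which becomes slow on larger images. A well-known alternative for the D4 metric is a two-pass (forward and backward) scan that only looks at the four direct neighbors. The sketch below is not the method used in this article; the function name `d4_two_pass` is hypothetical, and the input is the image shown earlier so that the output can be compared with the result above.

```python
import numpy as np

def d4_two_pass(image):
    """Two-pass D4 distance transform: foreground (1) to nearest background (0)."""
    image = np.asarray(image)
    h, w = image.shape
    big = h + w  # larger than any D4 distance that can occur inside the image
    # Background starts at 0, foreground at the placeholder value.
    dist_map = np.where(image == 0, 0, big).astype(int)

    # Forward pass: propagate distances from the top and left neighbors.
    for i in range(h):
        for j in range(w):
            if i > 0:
                dist_map[i, j] = min(dist_map[i, j], dist_map[i - 1, j] + 1)
            if j > 0:
                dist_map[i, j] = min(dist_map[i, j], dist_map[i, j - 1] + 1)

    # Backward pass: propagate distances from the bottom and right neighbors.
    for i in range(h - 1, -1, -1):
        for j in range(w - 1, -1, -1):
            if i < h - 1:
                dist_map[i, j] = min(dist_map[i, j], dist_map[i + 1, j] + 1)
            if j < w - 1:
                dist_map[i, j] = min(dist_map[i, j], dist_map[i, j + 1] + 1)

    # If the image has no background at all, every value stays at `big`.
    return dist_map

# The image from the experiment above, copied so the snippet runs on its own.
sample = np.array([
    [1, 0, 1, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 1, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0, 1],
    [0, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 1, 0, 0, 1, 1],
])
print(d4_two_pass(sample))
```

Each pixel is visited twice regardless of how many background pixels exist, so the cost grows linearly with the number of pixels instead of quadratically.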




4. Visualization

Here, we use a grayscale image to represent the original image: with the "gray" colormap, foreground pixels (1) appear white and background pixels (0) appear black. The "cool" colormap is used to visualize the result of the distance transform.
Original Image

plt.imshow(image, cmap='gray')
plt.title('Original Image')
plt.show()

Distance Transformed Image

plt.imshow(matrix, cmap='cool')
plt.title('Distance Transformed Image')
plt.colorbar()
plt.show()
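To compare the two plots directly, they can be placed side by side with the distance values written into each cell. This is a minimal sketch rather than the plotting code of this article: the helper name `show_side_by_side` and the small demo arrays are made up for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_side_by_side(image, matrix):
    """Draw the binary image and its distance transform in one figure."""
    fig, (ax_img, ax_dist) = plt.subplots(1, 2, figsize=(9, 4))

    ax_img.imshow(image, cmap='gray')
    ax_img.set_title('Original Image')

    im = ax_dist.imshow(matrix, cmap='cool')
    ax_dist.set_title('Distance Transformed Image')

    # Write each distance value into its cell (x is the column, y is the row).
    for (i, j), value in np.ndenumerate(matrix):
        ax_dist.text(j, i, str(value), ha='center', va='center')

    fig.colorbar(im, ax=ax_dist)
    return fig

# Made-up 3x3 demo data: one background pixel in the top-right corner.
demo_image = np.array([[1, 1, 0],
                       [1, 1, 1],
                       [1, 1, 1]])
demo_matrix = np.array([[2, 1, 0],
                        [3, 2, 1],
                        [4, 3, 2]])

fig = show_side_by_side(demo_image, demo_matrix)
```

Calling `plt.show()` afterwards displays the figure; passing `image` and `matrix` from the experiment instead of the demo arrays shows the actual result.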

This article introduces distance metrics in computer vision, tests them with randomly generated pixels, and visualizes the results. Next, I will continue to expand this article.

5. Analysis of results

The code above produces the distance transform result. In the result, background pixels have the value 0 and appear cyan under the "cool" colormap, while each foreground pixel is colored by its D4 distance to the nearest background pixel: the larger the distance, the closer its color moves toward magenta. The distance-transformed image therefore shows at a glance how deep each foreground pixel lies inside a foreground region.

Summary

Distance metrics have a wide range of applications in computer vision (CV). Tasks such as image segmentation, image registration, object detection, and object tracking often need the distances between pixels in order to process and analyze an image. The distance transform exposes the spatial relationships between pixels and provides a basis for subsequent image processing work.


Origin blog.csdn.net/weixin_51390582/article/details/131896271