Segmentation algorithm|Efficient Graph-Based Image Segmentation

The basic structure of this article:

  • Paper notes
  • Source code analysis notes

Part 1: The paper's segmentation idea (undirected graph segmentation)

1 Introduction to terms and symbols

Undirected graph $G = (V, E)$: each pixel is a vertex $v_i \in V$.

$V$ is the set of vertices and $E$ is the set of edges between adjacent vertices, $(v_i, v_j) \in E$. Each edge has a weight $w((v_i, v_j))$ that measures the dissimilarity of its two endpoints.

$C$ (component): one part of the image after segmentation (similar to a region).

$S$ (segmentation): the result of partitioning $V$ into components, with each component $C \in S$.

2 Segmentation idea

2.1 Segmentation principle

Edges between vertices in the same component should have weights that are as small as possible, while edges between vertices in different components should have larger weights.

2.2 Segmentation method

Define a predicate $D$: its value determines whether there is a boundary between two components, that is, whether they stay separate (true) or are merged (false).

Define the internal difference: a property of a single component.

For a component $C \subseteq V$, it is the maximum edge weight in the minimum spanning tree of $C$, written $MST(C, E)$:

$$Int(C) = \max_{e \in MST(C,E)} w(e)$$

Define the difference between components: a property of a pair of components.

The difference between two components is the minimum weight of an edge connecting them:

$$Dif(C_1,C_2) = \min_{v_i \in C_1,\, v_j \in C_2,\, (v_i,v_j) \in E} w((v_i,v_j))$$

If no edge connects the two components, $Dif(C_1,C_2) = \infty$.

2.3 Judgment conditions for segmentation

With these definitions, the rule for $D$ can be stated: there is a boundary when $Dif(C_1,C_2)$ is greater than the internal difference of at least one of the components (that is, $Int(C_1)$ or $Int(C_2)$), each relaxed by a threshold.

The expression is:

$$D(C_1,C_2) = \begin{cases} \text{true} & \text{if } Dif(C_1,C_2) > MInt(C_1,C_2) \\ \text{false} & \text{otherwise} \end{cases}$$

where $MInt(C_1,C_2) = \min(Int(C_1)+\tau(C_1),\ Int(C_2)+\tau(C_2))$.

$\tau$ is the threshold function. It controls how much larger the difference between two components must be than the minimum internal difference before a boundary is declared.

The threshold function is computed as

$$\tau(C) = k/|C|$$

$|C|$ is the size of $C$, that is, its number of vertices, and $k$ is a constant parameter. For small components (few vertices) $\tau$ is large, which keeps them from being judged as true (separated by a boundary) too easily.
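As a quick numeric check of the rule above, here is a minimal C++ sketch of the predicate. The names `Component`, `tau`, `mint` and `boundary` are hypothetical helpers introduced only for illustration; they do not come from the paper or its source code.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical summary of a component: its internal difference Int(C)
// and its size |C| (number of vertices).
struct Component {
    float internal_diff;
    int size;
};

// tau(C) = k / |C|
float tau(const Component& c, float k) {
    assert(c.size > 0);
    return k / c.size;
}

// MInt(C1, C2) = min(Int(C1) + tau(C1), Int(C2) + tau(C2))
float mint(const Component& c1, const Component& c2, float k) {
    return std::min(c1.internal_diff + tau(c1, k),
                    c2.internal_diff + tau(c2, k));
}

// D(C1, C2): true when there is evidence for a boundary,
// i.e. when Dif(C1, C2) > MInt(C1, C2).
bool boundary(float dif, const Component& c1, const Component& c2, float k) {
    return dif > mint(c1, c2, k);
}
```

For example, with k = 100 and Dif = 10, two components with Int = 2 and 4 vertices each give MInt = 2 + 100/4 = 27, so D is false and they are merged. Two components with Int = 2 and 100 vertices each give MInt = 2 + 100/100 = 3, so D is true and the boundary is kept.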

3 Segmentation Algorithm

3.1 Definition of Segmentation Situation

Too fine: as I understand it, the image is over-segmented. In the paper's terms, there is some pair of components with no evidence of a boundary between them ($D$ is false), so they should have been merged.

Too coarse: the opposite of too fine, where the image is under-segmented. In the paper's terms, some proper refinement of the segmentation is still not too fine.

Proper refinement: suppose an image has two segmentation results $T$ and $S$, each a set of components.

If every component of $T$ is contained in (or equal to) some component of $S$, then $T$ is a refinement of $S$. If in addition $T \neq S$, then $T$ is a proper refinement of $S$: $T$ is finer than $S$, and $S$ is coarser than $T$.

3.2 Algorithm Pseudocode

In brief:

[Initialization] Each pixel is treated as a vertex of an undirected graph, and adjacent pixels are connected by an edge. The weight of an edge is the dissimilarity computed between its two vertices, and the edges are sorted by weight in increasing order.

[Merging] Initially each vertex is its own component. The edges are then examined one by one in sorted order. For each edge whose endpoints lie in different components, the $D$ predicate above decides whether a boundary remains; if not, the two components are merged. This is repeated until the exit condition is met.

[Exit condition] The current segmentation is neither too fine nor too coarse. The initial state is necessarily too fine, because every vertex is its own component, and merging moves it step by step toward coarser segmentations. The paper proves that this process is monotonic and reaches the required segmentation. In the code, the loop simply runs once over all the sorted edges.

Part 2: Reading the code

Files in the package

  1. image.h
  2. disjoint-set.h
  3. segment-graph.h
  4. imutil.h
  5. convolve.h
  6. imconv.h
  7. filter.h
  8. segment-image.h

I ordered the files roughly by dependency: a file with a larger number may use the functions defined in the earlier header files.

1 The first file: image.h

1.1 Defines an image class as a class template. Since the segmentation algorithm operates on images, an image class makes the later operations convenient.

The methods in the class are:

  • image(const int width, const int height, const bool init): stores the width and height and allocates the pixel data; when init is true, the image is filled with 0
  • ~image(): frees the pixel data
  • void init(const T & val): fills the entire image with val
  • image<T> *copy(): returns a new image that is a copy of this one
  • int width(): returns the width of the image
  • int height(): returns the height of the image

1.2 Defines two accessor macros:

imRef(im,x,y): the value of the image im at coordinates (x, y)

imPtr(im,x,y): a pointer to the pixel of the image im at coordinates (x, y)
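To make the interface above concrete, here is a simplified sketch of such an image class. It is not the original image.h: the flat row-major storage, the `at` helper, the deleted copy operations and the default value of `init` are assumptions made for this example, and the two macros are only written in the spirit of imRef and imPtr.

```cpp
#include <cassert>
#include <cstring>

// Simplified sketch of an image class template (assumes T is a plain
// type such as unsigned char, int or float, so memset/memcpy are safe).
template <class T>
class image {
public:
    image(const int width, const int height, const bool init = true)
        : w(width), h(height), data(nullptr) {
        assert(width > 0 && height > 0);
        data = new T[w * h];
        if (init) std::memset(data, 0, w * h * sizeof(T));  // fill with 0
    }
    ~image() { delete[] data; }
    image(const image&) = delete;             // copy only through copy()
    image& operator=(const image&) = delete;

    // Fill the entire image with val.
    void init(const T& val) {
        for (int i = 0; i < w * h; i++) data[i] = val;
    }
    // Return a newly allocated copy; the caller owns it.
    image<T>* copy() const {
        image<T>* im = new image<T>(w, h, false);
        std::memcpy(im->data, data, w * h * sizeof(T));
        return im;
    }
    int width() const { return w; }
    int height() const { return h; }
    // Hypothetical accessor: pixel at column x, row y.
    T& at(int x, int y) {
        assert(x >= 0 && x < w && y >= 0 && y < h);
        return data[y * w + x];
    }

private:
    int w, h;
    T* data;
};

// Accessors in the spirit of imRef / imPtr; im is a pointer to an image.
#define imRef(im, x, y) ((im)->at((x), (y)))
#define imPtr(im, x, y) (&(im)->at((x), (y)))
```

With this sketch, `imRef(im, x, y) = v;` writes a pixel and `*imPtr(im, x, y)` reads it back through a pointer.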

2 The second file: disjoint-set.h (a disjoint-set data structure, also known as union-find)

2.1 Header file

Include the header file image.h

2.2 Content

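2.2.1 Defines the structure uni_elt (one element of the forest):

The structure has three members:

  • rank: the rank of the node, used to keep the trees shallow when two trees are merged
  • p: the parent of the node; following p repeatedly leads to the root of the tree
  • size: the number of nodes the tree contains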

2.2.2 Defines the class universe

The methods of universe are:

  • universe(int elements): creates a uni_elt array of length elements; element i starts with rank = 0, size = 1, p = i
  • ~universe(): frees the array
  • int find(int x): finds the root of the tree that contains x
  • void join(int x, int y): merges the trees that contain x and y
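The description above corresponds to a standard union-find with union by rank. The following is a minimal sketch under that assumption, not a copy of disjoint-set.h: it stores the elements in a std::vector instead of a raw array, and the `size` and `num_sets` helpers are added here for illustration.

```cpp
#include <cassert>
#include <vector>

// One element of the forest (fields follow the notes above).
struct uni_elt {
    int rank;  // used to keep trees shallow when joining
    int p;     // parent index; a root is its own parent
    int size;  // number of elements in the tree (meaningful at the root)
};

class universe {
public:
    explicit universe(int elements) : elts(elements), num(elements) {
        for (int i = 0; i < elements; i++) elts[i] = {0, i, 1};
    }
    // Follow parent links up to the root, then point x directly at it.
    int find(int x) {
        int y = x;
        while (y != elts[y].p) y = elts[y].p;
        elts[x].p = y;
        return y;
    }
    // Merge two trees; in this sketch x and y must be distinct roots.
    void join(int x, int y) {
        assert(x != y && elts[x].p == x && elts[y].p == y);
        if (elts[x].rank > elts[y].rank) {
            elts[y].p = x;
            elts[x].size += elts[y].size;
        } else {
            elts[x].p = y;
            elts[y].size += elts[x].size;
            if (elts[x].rank == elts[y].rank) elts[y].rank++;
        }
        num--;
    }
    int size(int x) const { return elts[x].size; }  // x should be a root
    int num_sets() const { return num; }

private:
    std::vector<uni_elt> elts;
    int num;  // current number of trees
};
```

A typical call is `u.join(u.find(a), u.find(b))` after checking that the two roots differ.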

3 The third file segment-graph.h

3.1 Include the header file "disjoint-set.h"

3.2 Content

3.2.1 Structure edge: {weight, indices of the two endpoint vertices}

3.2.2 Threshold function THRESHOLD: computes c divided by the component size, which is the $\tau(C) = k/|C|$ of the paper

3.2.3 Operator overloading: operator < is overloaded to compare the weights of two edges a and b

3.2.4 Defines the function segment_graph(int num_vertices, int num_edges, edge *edges, float c), which returns the resulting universe object; c plays the role of the constant k

  The operations performed by the function:

  • Sort the edges by weight in increasing order
  • Create a disjoint-set forest with num_vertices trees, one per vertex
  • Initialize a threshold array that holds the threshold of each tree in the forest (initially c, since every tree has a single vertex)
  • Loop through each edge
    • Determine whether the two endpoints of the current edge already belong to the same tree
    • If they are in different trees, check whether the weight of the edge is within the thresholds of the trees at both ends
      • If it is within both, merge the two trees, find the new root, and update the threshold of the new root to the edge weight plus c divided by the new size
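The steps above can be sketched as follows. This is an illustration under stated assumptions rather than the original segment_graph: it takes a std::vector of edges and returns the forest by value, the compact `forest` struct stands in for the universe class, and the merge test uses "less than or equal", which is how I read the comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

struct edge {
    float w;   // dissimilarity between the two endpoints
    int a, b;  // endpoint vertex indices
};

// Compact union-find, standing in for the universe class.
struct forest {
    std::vector<int> parent, size;
    explicit forest(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);
    }
    int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }
    // x and y must be distinct roots; returns the root of the merged tree.
    int join(int x, int y) {
        if (size[x] < size[y]) std::swap(x, y);
        parent[y] = x;
        size[x] += size[y];
        return x;
    }
};

// c plays the role of the constant k in tau(C) = k / |C|.
forest segment_graph_sketch(int num_vertices, std::vector<edge> edges, float c) {
    assert(num_vertices > 0 && c >= 0);
    // Sort edges by weight, smallest first.
    std::sort(edges.begin(), edges.end(),
              [](const edge& x, const edge& y) { return x.w < y.w; });
    forest u(num_vertices);
    // threshold[r] holds Int(C) + tau(C) for the component rooted at r.
    // A single vertex has Int = 0 and |C| = 1, so every entry starts at c.
    std::vector<float> threshold(num_vertices, c);
    for (const edge& e : edges) {
        int a = u.find(e.a), b = u.find(e.b);
        if (a == b) continue;  // endpoints already in the same component
        if (e.w <= threshold[a] && e.w <= threshold[b]) {
            int r = u.join(a, b);
            // Edges arrive in increasing order, so e.w is the largest
            // weight in the merged component, i.e. its new Int(C).
            threshold[r] = e.w + c / u.size[r];
        }
    }
    return u;
}
```

For a chain of four vertices with edge weights 1, 1 and 10 and c = 2, the first three vertices end up in one component and the last vertex stays alone, because after two merges the threshold has dropped to 1 + 2/3.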

4 imutil.h

Contains two functions

min_max: returns the minimum and maximum values of the input image

threshold: returns a new binary image in which each value is the result of comparing the corresponding pixel of the original image with the threshold t

5 convolve.h

5.1 Introduction

        Two methods of image convolution are defined in this file to realize the convolution operation

5.2 Defining functions

Function convolve_even (convolution with an even, that is, symmetric mask): for each pixel, the mask is laid out along the x direction on both sides of the pixel, symmetrically to the left and to the right. Each mask coefficient is multiplied by the sum of the two pixels at that distance on the left and on the right, the center pixel is multiplied only once, and all the products are added together to give the pixel value at the corresponding position of the new image.

Function convolve_odd (convolution with an odd, that is, antisymmetric mask): the same, except that the two directions are subtracted instead of added (the pixel on the left minus the pixel on the right).
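The two functions can be illustrated on a single image row. The sketch below is an assumption-laden simplification: it works on a std::vector instead of an image, the function names are hypothetical, and pixels outside the row are replaced by the nearest border pixel, which is my reading of the border handling rather than something stated above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Convolve one row with a symmetric (even) mask. mask[0] is the center
// weight; mask[i] applies to both neighbors at distance i.
std::vector<float> convolve_even_row(const std::vector<float>& row,
                                     const std::vector<float>& mask) {
    assert(!row.empty() && !mask.empty());
    const int width = static_cast<int>(row.size());
    const int len = static_cast<int>(mask.size());
    std::vector<float> out(width);
    for (int x = 0; x < width; x++) {
        float sum = mask[0] * row[x];  // the center is counted once
        for (int i = 1; i < len; i++) {
            float left = row[std::max(x - i, 0)];           // clamp at border
            float right = row[std::min(x + i, width - 1)];  // clamp at border
            sum += mask[i] * (left + right);
        }
        out[x] = sum;
    }
    return out;
}

// Same loop for an antisymmetric (odd) mask: left minus right.
std::vector<float> convolve_odd_row(const std::vector<float>& row,
                                    const std::vector<float>& mask) {
    assert(!row.empty() && !mask.empty());
    const int width = static_cast<int>(row.size());
    const int len = static_cast<int>(mask.size());
    std::vector<float> out(width);
    for (int x = 0; x < width; x++) {
        float sum = mask[0] * row[x];
        for (int i = 1; i < len; i++) {
            float left = row[std::max(x - i, 0)];
            float right = row[std::min(x + i, width - 1)];
            sum += mask[i] * (left - right);
        }
        out[x] = sum;
    }
    return out;
}
```

For the row {1, 2, 3} and the even mask {0.5, 0.25}, the middle pixel becomes 0.5 * 2 + 0.25 * (1 + 3) = 2.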

6 imconv.h

6.1 Introduction

This file defines a set of functions that convert an image from one data type to another. There are too many to list them all.

6.2 Some of the functions

Function imageRGBtoGRAY: converts an RGB image into a grayscale image, with R, G, B weights 0.299, 0.587, 0.114

Function imageGRAYtoRGB: converts a grayscale image into an RGB image

Function imageUCHARtoFLOAT: converts the data type of the pixel values from uchar to float

7 filter.h

7.1 Introduction

This header file is where the convolution is actually applied to images. It defines the functions that implement the filtering operations.

7.2 Content

A Function normalize: normalizes a mask so that its weights sum to 1

B A macro that creates filter functions: you specify the name of the function and the expression fun that the filter uses to compute its values.

A Gaussian filter make_gaussian is defined in the code, using $fun = \exp(-0.5 \cdot (i/\sigma)^2)$

[Filter input parameter] $\sigma$

[Mask length] $len = (int)\lceil \sigma \cdot WIDTH \rceil + 1$, where WIDTH = 4.0 by default in the file

[What the function does] Returns a mask of vector type with length len, where the value at each position i is computed by fun
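As a sketch of what the generated filter function and normalize might look like, the code below builds half of a symmetric Gaussian mask and normalizes it. The names `make_gauss_mask` and `normalize_mask` are hypothetical, the Gaussian is assumed to have the usual form exp(-0.5 (i/sigma)^2), and the normalization assumes the mask is later applied symmetrically, so every entry except the center counts twice.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

const double WIDTH = 4.0;  // the mask extends to about 4 sigma

// Half of a symmetric Gaussian mask: mask[i] = exp(-0.5 * (i / sigma)^2),
// for i = 0 .. len - 1 with len = ceil(sigma * WIDTH) + 1.
std::vector<float> make_gauss_mask(float sigma) {
    assert(sigma > 0);
    int len = static_cast<int>(std::ceil(sigma * WIDTH)) + 1;
    std::vector<float> mask(len);
    for (int i = 0; i < len; i++) {
        float t = i / sigma;
        mask[i] = std::exp(-0.5f * t * t);
    }
    return mask;
}

// Scale the mask so that the full symmetric filter sums to 1:
// mask[0] is counted once, every other entry twice.
void normalize_mask(std::vector<float>& mask) {
    assert(!mask.empty());
    float sum = std::fabs(mask[0]);
    for (std::size_t i = 1; i < mask.size(); i++) {
        sum += 2 * std::fabs(mask[i]);
    }
    for (float& m : mask) m /= sum;
}
```

With sigma = 0.5 the mask has length ceil(0.5 * 4.0) + 1 = 3, and its first entry is exp(0) = 1 before normalization.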

C smooth function

Convolves the input image twice with the Gaussian mask, once along each axis, to get a filtered image of the same size

D laplacian function

I will not explain this one in detail; it simply computes the Laplacian pixel by pixel

Now we come to the most important header file

8 segment-image.h

8.1 Introduction

This header file combines the functions and methods from the files above to actually carry out the segmentation of the image.

8.2 Content

A Function diff: computes the dissimilarity of two pixels

It computes the square root of the sum of the squared differences between the r, g, and b channels of the two pixels, that is, their Euclidean distance in RGB space

B segment_image(rgb image

Implemented operations:

  • Smooth each channel with a filter whose parameter is sigma
  • Build the edges of the undirected graph: traverse all pixels, link each pixel to its adjacent pixels (to the right, below, lower right, and upper right), and store each edge with the weight computed by the diff function
  • Segment the undirected graph with segment_graph
  • Post-process small components by merging trees smaller than min_size
  • Pick a random color for each component and color the output image with it
  • The result is ready to be output!
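The graph-building step in the list above can be sketched like this. The code is an illustration, not the original segment_image: `rgb_f`, `pixel_diff` and `build_edges` are hypothetical names, and the image is assumed to be a flat vector of float RGB pixels stored row by row.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct rgb_f { float r, g, b; };     // hypothetical float RGB pixel
struct edge { float w; int a, b; };  // weight and the two endpoint indices

// Euclidean distance between two pixels in RGB space.
float pixel_diff(const rgb_f& p, const rgb_f& q) {
    float dr = p.r - q.r, dg = p.g - q.g, db = p.b - q.b;
    return std::sqrt(dr * dr + dg * dg + db * db);
}

// Build the 8-connected grid graph of a width x height image.
// Each pixel links to its right, lower, lower-right and upper-right
// neighbors, so every adjacent pair is stored exactly once.
std::vector<edge> build_edges(const std::vector<rgb_f>& px,
                              int width, int height) {
    assert(width > 0 && height > 0);
    assert(px.size() == static_cast<std::size_t>(width) * height);
    const int dx[4] = {1, 0, 1, 1};
    const int dy[4] = {0, 1, 1, -1};
    std::vector<edge> edges;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            for (int d = 0; d < 4; d++) {
                int nx = x + dx[d], ny = y + dy[d];
                if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
                int a = y * width + x;    // index of the current pixel
                int b = ny * width + nx;  // index of the neighbor
                edges.push_back({pixel_diff(px[a], px[b]), a, b});
            }
        }
    }
    return edges;
}
```

A 3 x 2 image produces 4 horizontal, 3 vertical and 4 diagonal edges, 11 in total. The resulting edge list is what a function like segment_graph takes as input.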

Part 3: Supplementary notes

1 The open source code also contains two files, misc.h and pnmfile.h. The former overloads a number of operators, and the latter handles file operations (reading and writing images), so I did not study them closely.

2 Build and run segment.cpp from the code package to perform the segmentation


Origin blog.csdn.net/weixin_45581089/article/details/119968046