Summary of semi-automated extraction methods for buildings

Based on boundaries:

Boundary-based interactive extraction methods require the user to specify a small number of key points, or an approximate location, on the target boundary. The method then tracks the boundary accurately using features such as edge strength and continuity. Common boundary-based methods are the Snake algorithm and Intelligent Scissors. Boundary-based methods are widely used, but their disadvantages are obvious. For example, the user has to trace all the way around the object when marking the boundary, so the workload is large, especially for objects with complex boundaries; and when the contrast between the target boundary and its surroundings is low, the key points or approximate boundary locations are easily placed in the wrong position.

Put simply, the user marks points all the way around the perimeter of the object to be extracted.

An edge is the set of contiguous pixels along the boundary between two different regions of an image. It reflects a discontinuity in local image characteristics, that is, an abrupt change in properties such as gray level, color, or texture. In general, edge-based segmentation means edge detection on gray values, based on the observation that the gray value at an edge shows a step-type or roof-type change.

The gray values of the pixels on the two sides of a step edge differ markedly, while a roof edge lies at the turning point where the gray value changes from rising to falling. Because of this, differential operators can be used for edge detection: edges are located at extrema of the first derivative and at zero crossings of the second derivative, and in practice this is implemented by convolving the image with a template (kernel).
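As a rough illustration of detecting a step edge with a template, the sketch below slides the two standard 3x3 Sobel kernels over a tiny synthetic image and sums the absolute responses. The toy image, the function names, and the use of plain Python lists are assumptions made for this example; a real pipeline would normally use an image library. The template is applied without flipping (correlation), which does not change the gradient magnitude.

```python
# Sobel templates approximating the first derivative in x and y.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def apply_template(image, kernel):
    """Slide a 3x3 template over the image (interior pixels only, no padding)."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        row = []
        for c in range(1, cols - 1):
            acc = 0
            for dr in range(3):
                for dc in range(3):
                    acc += kernel[dr][dc] * image[r + dr - 1][c + dc - 1]
            row.append(acc)
        out.append(row)
    return out


def gradient_magnitude(image):
    """Approximate gradient magnitude as |gx| + |gy|."""
    gx = apply_template(image, SOBEL_X)
    gy = apply_template(image, SOBEL_Y)
    return [[abs(a) + abs(b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]


# Toy image (assumed): dark left half, bright right half, i.e. a vertical step edge.
image = [[10, 10, 10, 200, 200, 200] for _ in range(5)]
magnitude = gradient_magnitude(image)
for row in magnitude:
    print(row)
```

The response is zero in the flat areas and peaks in the two columns that straddle the step, which is the first-derivative extremum the text describes.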

Region-based workflows mostly follow these steps:

Image preprocessing → initial segmentation → feature extraction → weighting → foreground and background model selection → threshold setting → target extraction → accuracy assessment

Based on seed points:

These methods divide the image into regions according to a similarity criterion. The main types are seeded region growing, region splitting and merging, and the watershed method.

Seeded region growing starts from a set of seed pixels, each representing a different region. Pixels in the neighborhood of a seed that satisfy the growth criterion are merged into that seed's region, and the newly added pixels then act as new seeds. The merging continues until no more pixels meet the criterion. The key to this method is choosing suitable initial seed pixels and a reasonable growth criterion.
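The growing process can be sketched in a few lines. The version below is a minimal example under stated assumptions: a single seed, 4-connectivity, and a growth criterion of "gray value within a fixed tolerance of the seed value". The image values, the `region_grow` name, and the tolerance are all hypothetical.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a region from `seed`, adding 4-neighbours whose gray value is
    within `tolerance` of the seed's gray value."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    frontier = deque([seed])  # newly added pixels act as new seeds
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    frontier.append((nr, nc))
    return region


# Toy image (assumed): a dark patch on the left, a bright area on the right.
image = [
    [12, 11, 90, 95],
    [10, 13, 92, 91],
    [11, 12, 14, 93],
]
patch = region_grow(image, seed=(0, 0), tolerance=10)
print(sorted(patch))
```

Comparing against the seed value keeps the region from drifting; comparing against the running region mean is a common alternative criterion.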

The basic idea of region splitting and merging is to first divide the image arbitrarily into several disjoint regions, and then split or merge these regions according to the relevant criteria until the segmentation is complete. This method is suitable for both grayscale image segmentation and texture image segmentation.

The watershed method is a mathematical-morphology segmentation method based on topology. Its basic idea is to treat the image as topographic relief, as in geodesy, where the gray value of each pixel represents the altitude of that point. Each local minimum together with its area of influence is called a catchment basin, and the boundaries between catchment basins form the watersheds. The algorithm can be pictured as a flooding process: the lowest points of the image are submerged first, and the water then gradually fills each valley. When the water level reaches a certain height it would overflow into a neighboring basin, so a dam is built where the overflow would occur. This is repeated until every point in the image is submerged, and the dams that have been built form the watersheds separating the basins. The watershed algorithm responds well to weak edges, but noise in the image causes it to over-segment.
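The flooding picture above can be sketched with a priority queue that always submerges the lowest pending pixel next. This is a simplified marker-based variant written for illustration, not a production implementation: the basin markers are assumed to be given, 4-connectivity is used, and the names `flood` and `WATERSHED` are hypothetical.

```python
import heapq

WATERSHED = -1  # label given to dam pixels


def flood(heights, markers):
    """Marker-based flooding: pixels are processed from low to high altitude.
    A pixel touching exactly one basin joins it; a pixel touching two or more
    basins becomes a dam."""
    rows, cols = len(heights), len(heights[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means not yet flooded
    heap, queued = [], set()

    def neighbours(r, c):
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                yield nr, nc

    def enqueue_dry_neighbours(r, c):
        for nr, nc in neighbours(r, c):
            if labels[nr][nc] == 0 and (nr, nc) not in queued:
                queued.add((nr, nc))
                heapq.heappush(heap, (heights[nr][nc], nr, nc))

    for (r, c), label in markers.items():
        labels[r][c] = label
    for r, c in markers:
        enqueue_dry_neighbours(r, c)

    while heap:
        _, r, c = heapq.heappop(heap)
        basins = {labels[nr][nc] for nr, nc in neighbours(r, c) if labels[nr][nc] > 0}
        if len(basins) == 1:
            labels[r][c] = basins.pop()
            enqueue_dry_neighbours(r, c)
        else:
            labels[r][c] = WATERSHED  # water from different basins meets here
    return labels


# Toy relief (assumed): two valleys separated by a ridge in the middle column.
heights = [
    [1, 2, 9, 2, 1],
    [0, 2, 9, 2, 0],
    [1, 2, 9, 2, 1],
]
labels = flood(heights, markers={(1, 0): 1, (1, 4): 2})
for row in labels:
    print(row)
```

Supplying markers by hand, rather than starting a basin at every local minimum, is one common way to limit the over-segmentation that noise causes.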

Based on Graph Cuts:

Graph Cuts is guaranteed to find an optimal solution, but only for binary (two-label) problems; with more than two labels, an exponential number of comparisons is required. The error is also larger when there are too few seed points or when the background resembles the target.

From GrabCut to OneCut: Image Segmentation - Programmer Sought

grabcut in one-cut is an easy-to-use and fast image segmentation algorithm_zhangyumengs' Blog-CSDN Blog

Construction and Implementation of Graph in Classical Graph Cut Algorithm: Graph-Cut_PandasRan's Blog-CSDN Blog_graphcut

Graph Cuts

 

The graph used by Graph Cuts has two kinds of vertices and two kinds of edges. In addition to the ordinary vertices there are two terminal vertices, S (the source, standing for the foreground) and T (the sink, standing for the background).

The first kind: each ordinary vertex corresponds to one pixel in the image. Every pair of neighboring vertices (that is, every pair of neighboring pixels) is connected by an edge. These edges are called n-links.

The second kind: every ordinary vertex is also connected to each of the two terminal vertices. These edges are called t-links.

Each edge has a weight. A cut in Graph Cuts is a subset of the edges of the graph, which may include both kinds of edges, and the sum of the weights of the edges in the cut is the cost of the cut. Removing all the edges in this set separates the graph into an "S" part and a "T" part, which is why it is called a "cut". The cut with the smallest total weight is called the minimum cut, and it is the result of the graph cut. The Ford-Fulkerson theorem shows that the maximum flow of a network equals the capacity of its minimum cut, so the max-flow/min-cut algorithm developed by Boykov and Kolmogorov can be used to find the minimum cut of the s-t graph. The minimum cut divides the vertices of the graph into two disjoint subsets S and T, where s ∈ S, t ∈ T and S ∪ T = V. These two subsets correspond to the foreground and background pixel sets of the image, which completes the segmentation.
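To make the s-t graph concrete, here is a small sketch that builds t-links and n-links for a four-pixel, one-dimensional "image" and finds the minimum cut. It uses the textbook Edmonds-Karp max-flow algorithm for brevity, not the Boykov-Kolmogorov algorithm mentioned above, and the pixel values and weight formulas are invented for the example.

```python
from collections import defaultdict, deque


def min_cut(capacity, source, sink):
    """Edmonds-Karp max flow. Returns (flow value, vertices on the source side)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual[u][v] += cap
            residual[v][u] += 0  # make sure the reverse edge exists

    def reachable_with_parents():
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        return parents

    flow = 0
    while True:
        parents = reachable_with_parents()
        if sink not in parents:
            # No augmenting path left: reachable vertices form the S side.
            return flow, set(parents)
        bottleneck, v = float("inf"), sink
        while parents[v] is not None:
            bottleneck = min(bottleneck, residual[parents[v]][v])
            v = parents[v]
        v = sink
        while parents[v] is not None:
            u = parents[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Toy 1-D image (assumed): gray values on a 0..10 scale, bright = foreground.
pixels = [9, 8, 2, 1]
capacity = {"s": {}}
for i, value in enumerate(pixels):
    capacity["s"][i] = value                       # t-link to the source
    capacity.setdefault(i, {})["t"] = 10 - value   # t-link to the sink
for i in range(len(pixels) - 1):
    # n-link: similar neighbours get a heavier edge (hypothetical weighting).
    weight = max(1, 5 - abs(pixels[i] - pixels[i + 1]))
    capacity[i][i + 1] = weight
    capacity[i + 1][i] = weight

flow, source_side = min_cut(capacity, "s", "t")
foreground = sorted(p for p in source_side if p != "s")
print(flow, foreground)
```

The cut passes through the weak n-link between the second and third pixels, so the two bright pixels end up on the S (foreground) side.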

Based on rectangular box (GrabCut):

The user only marks the background region (by drawing a rectangle around the target) and does not need to mark the foreground.

If the background is complex, or the background is very similar to the target, the segmentation is unreliable, and the method is also somewhat slow.

Compared with Graph Cut, GrabCut differs in three ways:

(1) Graph Cut models the target and the background with grayscale histograms, whereas GrabCut replaces them with Gaussian mixture models (GMMs) over the three RGB channels;

(2) Graph Cut performs the energy minimization (segmentation) in a single pass, whereas GrabCut replaces it with an iterative process that alternates between estimating the segmentation and learning the model parameters;

(3) Graph Cut requires the user to specify some seed points for both the target and the background, whereas GrabCut only needs the set of background pixels. In other words, the user only has to draw a box around the target; all pixels outside the box are treated as background, which is enough to fit the GMMs and obtain a good segmentation. That is, GrabCut allows incomplete labeling.

Image Segmentation (3) From Graph Cut to Grab Cut_zouxy09's Blog - CSDN Blog

One cut based on Graph Cuts

GrabCut in One Cut (OpenCV implementation of a fast image segmentation based on the graph cut algorithm grabcut)----the best graph cut at present_shiter's blog-CSDN blog_grabcut in one cut

The construction and implementation of graphs in the classic graph cut algorithm: one-Cut_PandasRan's Blog-CSDN Blog_onecut

The main improvement in One Cut is a modified prior penalty term compared with the original graph cut: the term is expressed as the L1 distance between the foreground and background histograms of the whole image. The authors state that this modification improves performance and avoids the NP-hard optimization problem that arises in the GrabCut bounding-box method. GrabCut needs to iterate an EM-style algorithm, which may not reach the global optimum, whereas One Cut solves directly for the optimum of its criterion function.
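The L1 histogram distance itself is easy to compute. The sketch below only illustrates that quantity for a given labeling; it is not the One Cut optimization, and the quantized color bins, the masks, and the function name are assumptions for the example.

```python
from collections import Counter


def histogram_l1_distance(bins, mask):
    """L1 distance between the (unnormalized) foreground and background
    histograms. `bins` holds one quantized color bin per pixel; `mask` is
    1 for foreground and 0 for background."""
    fg = Counter(b for b, m in zip(bins, mask) if m)
    bg = Counter(b for b, m in zip(bins, mask) if not m)
    return sum(abs(fg[b] - bg[b]) for b in set(fg) | set(bg))


# Six pixels falling into three color bins (assumed values).
bins = [0, 0, 1, 1, 2, 2]

# Labeling where foreground and background share no color bin.
separated = histogram_l1_distance(bins, [1, 1, 0, 0, 0, 0])
# Labeling where every bin is split evenly between the two sides.
mixed = histogram_l1_distance(bins, [1, 0, 1, 0, 1, 0])
print(separated, mixed)
```

A labeling whose foreground and background colors do not overlap gives the largest possible distance (the pixel count), while a labeling that mixes the colors evenly gives zero, so a larger distance indicates better appearance separation.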

Interactive extraction of rectangular buildings based on multi-star constraint graph cuts:

This method can be classified as seed-line segmentation based on region growing. It adds multi-star shape prior information to improve segmentation accuracy: star-shape constraints are added to the Graph Cuts framework to obtain a graph-cut model that incorporates the star-shape prior, and the segmentation is optimal when the energy function is minimized. The goal of star-constrained graph cut is to find, among all star-shaped sets, the one that minimizes the energy function, which gives the optimal segmentation result.

The technical route is as follows: first preprocess the image; then perform superpixel segmentation using the simple linear iterative clustering (SLIC) algorithm (replacing single pixels with superpixels effectively reduces the complexity of image processing); then obtain the target foreground and background superpixels from seed lines drawn through manual interaction; then extract the building patch using the multi-star-constrained graph cut; and finally regularize the outline of the building patch.


Origin blog.csdn.net/m0_51864191/article/details/127904287