Survey: Recent Advances in Features Extraction and Description Algorithms: A Comprehensive Survey


 

Abstract - Computer vision is one of the most active research fields in information technology today. Giving machines and robots the ability to see and comprehend the surrounding world at the speed of sight creates endless potential applications and opportunities. Feature detection and description algorithms can indeed be considered the retina of the eyes of such machines and robots. However, these algorithms are typically computationally intensive, which prevents them from achieving the speed of sight required for real-time performance. In addition, they differ in their capabilities: some may favor a specific type of input and work better on it than others. Thus, a compact report of their pros and cons, as well as their recent advances and performance, is in demand. This paper is dedicated to providing a comprehensive overview of the recent advances in feature detection and description algorithms. Specifically, it starts by overviewing fundamental concepts. It then compares, reports, and discusses their performance and capabilities. The maximally stable extremal regions (MSER) algorithm and the scale invariant feature transform (SIFT) algorithm, being two of the best-performing classes of algorithms, are selected to report their recent derivative algorithms.

Index Terms - computer vision, image processing, robotics, feature detection, feature description, MSER, SIFT

 

Ⅰ. Introduction
Feature detection and description of static and dynamic scenes is an active area of research and one of the most studied topics in the computer vision literature. The concept of feature detection and description refers to the process of identifying points in an image (interest points) that can be used to describe the image's contents, such as edges, corners, ridges, and blobs. It is mainly aimed at detecting, tracking, and analyzing objects in a video stream in order to describe the semantics of their actions and behaviors [1]. It also has a long list of potential applications, which include, but are not limited to, access control to sensitive buildings, crowd and population statistical analysis, human detection and tracking, detection of suspicious behavior, traffic analysis, vehicle tracking, and detection of military targets.
Over the past few years, we have witnessed a substantial increase in the amount of uniform and non-uniform visual inputs (mainly due to the availability of low-cost capturing devices such as the cameras built into smartphones, as well as the availability of free image-hosting applications, websites, and servers such as Instagram and Facebook). This has motivated research groups to propose many new, powerful, and automated feature detection and description algorithms that can adapt to an application's needs in terms of accuracy and performance.

Most of the proposed algorithms require intensive computations (especially when used with high-definition video streams or high-resolution satellite imagery). Hardware accelerators with substantial processing power are required to speed up the computation of these algorithms for real-time applications. Digital signal processor (DSP), field-programmable gate array (FPGA), system-on-chip (SoC), application-specific integrated circuit (ASIC), and graphics processing unit (GPU) platforms, with their smarter, parallel, and manageable hardware processing designs, can become targets to alleviate this problem.
Feature detection and description algorithms can be ported to such hardware platforms in order to accelerate their computations. However, hardware constraints such as memory, power, scalability, and interface format constitute major bottlenecks to scaling them up to high resolutions. Typical solutions to these hardware-related issues are to reduce the resolution or to sacrifice the accuracy of the detected features. On the other hand, the state of the art in machine vision and robotics has recently concluded that addressing these problems at the algorithmic level would make a substantial contribution [2][3]. In other words, targeting the computer vision algorithms themselves could solve most of the hardware-related issues associated with memory and power requirements, and could bring great changes to such systems [4]. This challenge invites researchers to invent, implement, and test new algorithms; feature detection and description algorithms belong to this category and are an essential tool for many visual computing applications.

To ensure the robustness of vision algorithms, a necessary prerequisite is that they be designed to cover a wide range of possible scenarios with a high degree of repeatability and invariance. Ultimately, studying all of these scenarios and parameters is nearly impossible; however, a clear understanding of all these variables is critical to a successful design. The key factors that affect performance include the real-time processing platform (FPGA, SoC, GPU, etc., with their associated constraints on memory, frequency, and power, which may impose modifications that affect the algorithm's desired properties), the monitored environment (e.g., illumination, reflections, shadows, view direction, and angle), and the application of interest (e.g., the objects of interest, the tolerable miss/false-alarm rates and the desired trade-off between them, and the allowed latency). Therefore, a careful study of computer vision algorithms is essential.
This paper is dedicated to providing a comprehensive overview of the recent advances in feature detection and description algorithms. Specifically, the paper starts by overviewing the fundamental concepts that constitute the core of feature detection and description algorithms. It then compares, reports, and discusses their performance and capabilities. The maximally stable extremal regions (MSER) algorithm and the scale invariant feature transform (SIFT) algorithm, being two of the best-performing classes, are selected to report their most recent derivative algorithms.

The rest of the paper is organized as follows. Section II overviews the recent feature detection and description algorithms proposed in the literature. It also summarizes and compares their performance and accuracy under various transformations. In Section III, the MSER and SIFT algorithms are studied in detail through their most recent derivatives. Finally, Section IV concludes with an outlook on future work.

 

Ⅱ. Definitions and Principles
This section describes the stages of generating a set of feature descriptors from an original color or grayscale image, namely feature detection and feature description. It also summarizes the metrics used to measure the quality of the generated feature descriptors.
A. Local Features
Local image features (also known as interest points, key points, and salient features) can be defined as specific patterns that distinguish a pixel's immediate neighborhood, and they are usually associated with one or more image properties [5][6]. Such properties include edges, corners, and regions, among others. Figure 1 below presents a summary of such local features. Indeed, these local features represent essential anchor points that can summarize the content of a frame (with the aid of feature descriptors) while searching an image (or a video). These local features are then converted into numerical descriptors, which represent unique and compact summaries of them.

Local (described and invariant) features provide a powerful tool for a wide range of computer vision and robotics applications, such as real-time visual surveillance, image retrieval, video mining, object tracking, mosaicking, object detection, and wide-baseline matching, to name a few [7]. To illustrate the usefulness of such local features, consider the following example. Given an aerial image, a detected edge can represent a street, a corner may be a street intersection, and a homogeneous region may represent a car, a roundabout, or a building (depending, of course, on the resolution).
The term detector (also called extractor) traditionally refers to the algorithm or technique that detects (or extracts) these local features and prepares them to be passed to a further processing stage that describes their contents, i.e., the feature descriptor algorithm. That is, feature extraction acts as an intermediate stage between different computer vision and image processing functions. In this work, the terms extractor and detector are used interchangeably.

Figure 1: Illustration of image local features. (a) input image, (b) corners, (c) edges, and (d) regions.

 

B. Ideal Local Features
Typically, a local feature has a spatial extent due to its local pixel neighborhood. That is, local features represent a subset of the frame that is semantically meaningful, e.g., corresponding to an object (or part of an object). Ultimately, localizing all such features is not feasible, since it requires a high-level understanding of the frame (scene) content as a prerequisite [5]. Accordingly, feature detection algorithms attempt to locate these features directly based on the intensity patterns in the input frame. The selection of these local features can indeed greatly impact the overall system performance [6].
Ideal features (and feature detectors) should typically have the following important qualities [5]:
(1) Distinctiveness: the intensity patterns underlying the detected features should exhibit rich variations that can be used to distinguish features and match them.
(2) Locality: features should be local, so as to reduce the probability of occlusion and to allow simple estimation of geometric and photometric deformations between two frames with different views.
(3) Quantity: the number of detected features (i.e., the feature density) should be sufficiently (not excessively) large, so as to reflect the frame's content in a compact form.
(4) Accuracy: detected features should be localized accurately with respect to different scales, shapes, and pixel locations in the frame.
(5) Efficiency: features should be identified efficiently within a short time, making them suitable for real-time (i.e., time-critical) applications.
(6) Repeatability: given two frames of the same object (or scene) taken under different viewing conditions, a high percentage of the features detected in the part visible in both frames should be found in both frames. Repeatability is greatly affected by the following two qualities.
(7) Invariance: in scenarios where large deformations are expected (scale, rotation, etc.), the detector algorithm should model the deformation mathematically as accurately as possible, so as to minimize its effect on the extracted features.
(8) Robustness: in scenarios where small deformations are expected (noise, blur, discretization effects, compression artifacts, etc.), it is usually sufficient to make the detection algorithm less sensitive to such deformations (i.e., with no drastic decrease in accuracy).
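As a concrete illustration of quality (6), repeatability between two frames can be measured as the fraction of features from one frame that reappear, within a small tolerance, in the other. The following is a minimal sketch, assuming features are given as (x, y) coordinates already mapped into a common reference frame (function name, tolerance, and the toy data are all illustrative, not taken from the surveyed papers):

```python
def repeatability(feats_a, feats_b, tol=1.5):
    """Fraction of features in feats_a that have a match in feats_b
    within `tol` pixels. Both lists hold (x, y) tuples assumed to be
    expressed in the same reference frame. Returns 0.0 for an empty
    feats_a."""
    if not feats_a:
        return 0.0
    matched = 0
    for (xa, ya) in feats_a:
        if any((xa - xb) ** 2 + (ya - yb) ** 2 <= tol ** 2
               for (xb, yb) in feats_b):
            matched += 1
    return matched / len(feats_a)

# Toy example: 3 of the 4 features of frame A reappear in frame B.
a = [(10, 10), (20, 5), (30, 30), (50, 50)]
b = [(10.5, 10.2), (20, 5), (29, 31), (80, 80)]
print(repeatability(a, b))  # -> 0.75
```

In practice the two feature sets must first be related by the ground-truth transformation between the views (e.g., a homography), which this sketch assumes has already been applied.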

Table I: Overview of state-of-the-art feature detectors [6]

Intuitively, a given computer vision application may favor one quality over another [5]. Repeatability, arguably the most important quality, depends directly on the other qualities (i.e., improving one of them will also improve repeatability). Nevertheless, with regard to the other qualities, compromises typically need to be made. For example, distinctiveness and locality are competing properties (the more local a feature, the less distinctive it becomes, making feature matching more difficult). Efficiency and quantity are another example of competing qualities: a high feature density is likely to improve object/scene recognition, but it will negatively impact the computation time.

 

C. Feature Detectors
The technical literature is rich with new feature detection and description algorithms, as well as surveys that compare their performance and qualities, such as those mentioned in the previous section. The reader may refer to [5][8][9][10][11][12][13][14][15] for some elegant surveys in the literature. However, no ideal detector exists to date. This is mainly due to the virtually infinite number of possible computer vision applications (which may require one or more feature types) and the possible variations in the imaging conditions of the scenarios (large variations in scale, viewpoint, illumination and contrast, image quality, compression, etc.). The computational efficiency of such detectors becomes even more important when they are considered for real-time applications [6][8][9].
The most important local features include: (1) edges: pixel patterns at which the intensity changes suddenly (with strong gradient magnitudes); (2) corners: points at which two (or more) edges intersect in their local neighborhood; and (3) regions: closed sets of connected points that share a homogeneity criterion, typically the intensity value.
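All three feature types are rooted in local intensity derivatives. As a hedged sketch (a toy illustration, not any particular detector from the surveyed literature), the snippet below computes a gradient-magnitude edge map with central differences and a simplified Harris-style corner response on a synthetic image containing one bright square:

```python
import numpy as np

# Synthetic 12x12 image: dark background with a bright square.
img = np.zeros((12, 12))
img[4:9, 4:9] = 1.0

# Central-difference gradients (edges = strong gradient magnitude).
gy, gx = np.gradient(img)
edge_map = np.sqrt(gx ** 2 + gy ** 2)

# Simplified Harris-style corner response R = det(M) - k * trace(M)^2,
# with the structure tensor M summed over a 3x3 neighborhood.
k = 0.05
Ixx, Iyy, Ixy = gx * gx, gy * gy, gx * gy

def box3(a):
    # 3x3 box filter via shifted sums (avoids a SciPy dependency).
    p = np.pad(a, 1)
    return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
               for i in range(3) for j in range(3))

Sxx, Syy, Sxy = box3(Ixx), box3(Iyy), box3(Ixy)
R = (Sxx * Syy - Sxy ** 2) - k * (Sxx + Syy) ** 2

# The edge map peaks along the square's sides, while the corner
# response peaks near its four corners, e.g. around pixel (4, 4).
print(edge_map[4, 6] > edge_map[0, 0])  # edge on the top side
print(R[4, 4] > R[4, 6])                # corner beats mid-edge
```

Region detectors such as MSER instead group connected pixels by an intensity homogeneity criterion, as discussed in Section III.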

One can intuitively note a strong correlation between these local features. For example, multiple edges sometimes surround a region, i.e., the edges trace and define the region's boundary. Similarly, intersecting edges define a corner [8]. Table 1 shows an overview of the well-known feature detectors. Table 2 compares the performance of many state-of-the-art detectors.
As reported in many comparative surveys in the computer vision literature [5][10][13], MSER [16] and SIFT [17] have shown excellent performance in terms of invariance and the other feature qualities (see Table 2, last two rows). Owing to these facts, the MSER and SIFT algorithms have been extended into many different derivatives with enhancements (to be reported in the following section). Accordingly, the following section of this paper considers reporting the algorithms derived from the MSER and SIFT algorithms.

 

Ⅲ. MSER and SIFT Derivative Algorithms
This section discusses some well-known derivatives of the MSER and SIFT algorithms. These algorithms were proposed to enhance the performance of MSER and SIFT in terms of computational complexity, accuracy, and execution time.
A. MSER Derivatives
Matas et al. proposed the maximally stable extremal regions (MSER) algorithm in 2002. Since then, a number of region detection algorithms based on the MSER technique have been proposed. The following lists five MSER derivatives in chronological order.
(1) N-dimensional extension: the algorithm was first extended to 3D segmentation in 2006 by extending the neighborhood search and stability criteria from 2D to 3D image intensity data [18]. Later, in 2007, Vedaldi proposed another extension to N-dimensional data spaces in [19], and later in the same year, [20] provided an extension that can exploit vector-valued functions over the three color channels.

(2) Linear-time MSER algorithm: in 2008, Nistér and Stewénius proposed in [21] a new algorithm that simulates a real flood-filling process. The new algorithm has several advantages over the standard MSER algorithm, such as better cache locality and linear complexity. An early hardware design for it was proposed in [22].

Table II: Performance summary of the main feature detection algorithms [5]

(3) Extended MSER (X-MSER) algorithm: the standard MSER algorithm searches for extremal regions only in the input intensity frame. In 2015, however, the authors of [23] noted the correlation between the intensity and depth images, proposed extending the search to the depth (spatial) domain, and introduced the extended MSER (X-MSER) detector, for which a patent was obtained [24].
(4) Parallel MSER algorithm: one major disadvantage of MSER is the need to run the algorithm twice on every frame in order to detect both the dark and the bright extremal regions. To circumvent this problem, the authors of [25] proposed a parallel MSER algorithm. Parallel in this context refers to the ability to detect both types of extremal regions in a single run. This algorithm was shown to have great advantages over the standard MSER algorithm, such as significant reductions in the required execution time, hardware resources, and power. Several US patents are associated with the parallel MSER algorithm (e.g., [26]).
(5) Other MSER derivatives: other MSER-inspired algorithms include the extremal regions of extremal levels algorithm [27][28] and the tree-based Morse regions (TBMR) algorithm [29].
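To make the stability criterion underlying standard MSER and the derivatives above concrete, the following is a minimal toy sketch (illustrative only, not the production algorithm, and far from linear time): it thresholds a tiny grayscale image at every level, tracks the area of the connected component containing a seed pixel, and rates stability as the relative area change across neighboring thresholds. Detecting bright regions, the second pass that the parallel MSER algorithm [25] eliminates, is done here by simply inverting the image:

```python
import numpy as np

def component_area(img, seed, t):
    """Area of the 4-connected component of pixels <= t containing seed."""
    if img[seed] > t:
        return 0
    h, w = img.shape
    seen, stack = {seed}, [seed]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w \
                    and (nr, nc) not in seen and img[nr, nc] <= t:
                seen.add((nr, nc))
                stack.append((nr, nc))
    return len(seen)

def most_stable_threshold(img, seed, delta=2):
    """Threshold whose component area changes least, relative to its
    size, over +/- delta levels (the MSER stability idea, simplified)."""
    best_t, best_score = None, float("inf")
    for t in range(delta, 256 - delta):
        a = component_area(img, seed, t)
        if a == 0:
            continue
        change = component_area(img, seed, t + delta) \
            - component_area(img, seed, t - delta)
        score = change / a
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Dark 3x3 blob (value 10) on a bright background (value 200).
img = np.full((7, 7), 200, dtype=int)
img[2:5, 2:5] = 10
print(most_stable_threshold(img, (3, 3)))        # dark-region pass -> 12
print(most_stable_threshold(255 - img, (0, 0)))  # bright pass, via inversion -> 57
```

The real algorithm tracks all components simultaneously over the threshold sweep with a union-find (or flood-fill) structure rather than re-flooding per threshold, which is what makes the linear-time variant [21] possible.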

 

B. SIFT Derivatives
SIFT is a method that combines a local feature detector with a local-histogram-based descriptor. It detects a set of interest points in an image and, for each of them, computes a histogram-based descriptor with 128 values. Since Lowe proposed the SIFT algorithm in 2004, a number of attempts have been made to reduce the width of the SIFT descriptor in order to reduce the descriptor computation and matching times. Other algorithms use different window sizes and histogram computation patterns around each interest point, either to speed up the descriptor computation process or to increase its robustness to different transformations. It may be noted that, compared with the MSER algorithm, SIFT has richer derivatives. The reason is simply that there is not much to be done with the simple processing flow of MSER, unlike the more complicated SIFT. A brief summary of the SIFT algorithm's derivatives is discussed below.
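The 128-value figure comes from the descriptor's layout: a 16x16 patch around the keypoint is divided into a 4x4 grid of cells, and each cell contributes an 8-bin gradient-orientation histogram (4 x 4 x 8 = 128). A minimal, hedged sketch of that layout follows; it deliberately omits Lowe's Gaussian weighting, trilinear interpolation, and rotation to the dominant orientation, so it illustrates only the histogram structure:

```python
import numpy as np

def sift_like_descriptor(patch):
    """128-value descriptor from a 16x16 patch: 4x4 cells x 8
    orientation bins, magnitude-weighted and length-normalized.
    Simplified: no Gaussian weighting, no interpolation, and no
    rotation to the dominant orientation."""
    assert patch.shape == (16, 16)
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)           # [0, 2*pi)
    bins = (ang / (2 * np.pi) * 8).astype(int) % 8   # 8 orientation bins
    desc = np.zeros((4, 4, 8))
    for r in range(16):
        for c in range(16):
            desc[r // 4, c // 4, bins[r, c]] += mag[r, c]
    desc = desc.ravel()
    n = np.linalg.norm(desc)
    return desc / n if n > 0 else desc               # unit length

patch = np.outer(np.arange(16), np.ones(16))  # vertical intensity ramp
d = sift_like_descriptor(patch)
print(d.shape)  # (128,)
```

On this ramp patch every gradient points the same way, so all the descriptor mass lands in one orientation bin of each cell; narrowing or reshaping this 128-value vector is precisely what several of the derivatives below target.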

(1) ASIFT: Yu and Morel proposed an affine-invariant version of the SIFT algorithm [30], called ASIFT. This derivative simulates all the image views obtainable by varying the latitude and longitude angles. It then applies the standard SIFT method itself. ASIFT was proven to outperform SIFT and to be fully affine invariant [30]. However, its main drawback is a dramatic increase in the computational load. The ASIFT code can be found at [31].
(2) CSIFT: another variation of the SIFT algorithm, targeting color spaces, is CSIFT [32]. It basically modifies the SIFT descriptor (in a color-invariant space) and was found to be more robust to blur and affine changes, but less robust to illumination changes, as compared to the standard SIFT.
(3) n-SIFT: the n-SIFT algorithm is simply a direct extension of the standard SIFT algorithm to images (or data) with multiple dimensions [33]. The algorithm creates feature vectors by using hyperspherical coordinates of the gradients and multidimensional histograms. The n-SIFT features were shown to extract and match features in 3D and 4D images efficiently compared with the traditional SIFT algorithm.
(4) PCA-SIFT: PCA-SIFT [34] applies principal component analysis (PCA) to the normalized gradient patches instead of the weighted and smoothed histograms of gradients (HoG) used in the standard SIFT. More importantly, it uses a window of size 41x41 pixels to generate a descriptor of length 39x39x2 = 3042, but it reduces the dimensionality of the descriptor vector from 3042 to 36 by using PCA, which may be preferable on memory-limited devices.

(5) SIFT-SIFER retrofit: the main difference between SIFT and the SIFT with error resilience (SIFER) [35] algorithm is that SIFER (at the cost of an increased computational load) achieves better scale-space management, using a higher-granularity image pyramid representation and better scale-tuned filtering using a cosine modulated Gaussian (CMG) filter. The algorithm improved the accuracy and robustness of the features by 20 percent for some criteria. However, the cost of this accuracy improvement is an execution time about twice as slow as that of the SIFT algorithm.
(6) Other derivatives: other SIFT derivatives include SURF [36], the SIFT CS-LBP retrofit, the RootSIFT retrofit, and the CenSurE and STAR algorithms, which are summarized in [7].
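The dimensionality-reduction step behind PCA-SIFT, item (4) above, can be sketched with plain numpy: learn a projection from a training set of gradient-patch vectors, then project each 3042-dimensional raw descriptor onto the top 36 principal components. This is a toy illustration: random data stands in for real gradient patches, and the names are ours, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
D, K = 3042, 36                         # raw length, PCA target length
train = rng.standard_normal((500, D))   # stand-in for training patches

# Learn the projection: top-K principal components of the training set.
mean = train.mean(axis=0)
# SVD of the centered data yields the eigenvectors of the covariance
# matrix as the rows of vt, ordered by decreasing variance.
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:K]                     # shape (36, 3042)

def pca_sift_project(raw_descriptor):
    """Compress a 3042-d gradient-patch vector to 36 dimensions."""
    return components @ (raw_descriptor - mean)

d = pca_sift_project(rng.standard_normal(D))
print(d.shape)  # (36,)
```

In the actual PCA-SIFT algorithm the projection matrix is computed once offline from a large corpus of keypoint patches and then reused, so the per-descriptor cost is a single matrix-vector product.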

 

Ⅳ. Conclusion

The purpose of this paper is to provide new computer vision researchers with a brief introduction to the fundamentals of image feature detection and description. It also provides an overview of the state-of-the-art algorithms proposed in the literature. First, the basic concepts related to these algorithms were reviewed. A brief comparison of their performance and capabilities according to different metrics was also provided. The quality of the features extracted by the algorithms was compared under transformations that exist in real-life applications, such as image rotation, scaling, and affine changes. The metrics used in the comparison include repeatability, localization, robustness, and efficiency. Among these algorithms, we selected the two most common ones, the MSER algorithm and the SIFT algorithm, together with their derivative algorithms, for a detailed exploration. The discussion highlighted the main aspects of the new derivatives that distinguish them from their original forms.

 

 



Origin www.cnblogs.com/Alliswell-WP/p/TranslationOfPapers_001.html