Features and classification - "convolutional neural network computer vision" study notes

  Feature extraction and classification are the two key stages of a typical computer vision system. The accuracy, robustness and efficiency of the system depend largely on the quality of the image features and of the classifier. Feature extraction methods fall into two categories: hand-crafted (manually designed) methods and feature-learning methods. Classifiers likewise fall into two groups: shallow models and deep models.

  A feature is any distinctive aspect or characteristic that is used to solve a computational task related to a particular application. A combination of n features can be represented as an n-dimensional vector, called a feature vector. The quality of a feature vector depends on its ability to discriminate between image samples from different classes. Good features should be informative, invariant to noise and to a range of transformations, and fast to compute.

  Classification is at the core of modern computer vision and pattern recognition. The task of a classifier is to use the feature vector to assign an image or region of interest (ROI) to a category. The difficulty of a classification task depends on how much the feature values vary among images of the same class, relative to how much they differ between images of different classes. In practice, the task is complicated by noise (shadows, occlusion, perspective distortion, etc.), outliers (e.g., an image in the "building" category may also contain people), ambiguity (e.g., the same rectangular shape may correspond to a table or to a window of a building), missing labels, the availability of only a small number of training samples, and an imbalance between positive and negative training samples. Consequently, designing a classifier is a challenging task.
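  The idea that difficulty depends on within-class variability relative to between-class difference can be made concrete with a small sketch. The function name `separability` and the toy feature vectors are made up for illustration, and this variance ratio is only one possible way to quantify the idea, not a measure taken from the notes.

```python
import numpy as np

def separability(features, labels):
    """Between-class scatter divided by within-class scatter (larger = easier)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)
    within = 0.0
    between = 0.0
    for c in np.unique(labels):
        group = features[labels == c]
        class_mean = group.mean(axis=0)
        # Spread of the samples around their own class mean.
        within += ((group - class_mean) ** 2).sum()
        # Distance of the class mean from the overall mean, weighted by class size.
        between += len(group) * ((class_mean - overall_mean) ** 2).sum()
    return between / within

# Toy 2-D feature vectors (made-up numbers, for illustration only).
labels = np.array([0, 0, 1, 1])
easy = np.array([[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]])  # tight, far apart
hard = np.array([[0.0, 2.0], [2.0, 0.0], [1.0, 3.0], [3.0, 1.0]])  # spread, overlapping

print(separability(easy, labels), separability(hard, labels))
```

  The tightly clustered, well-separated classes get a much larger ratio than the overlapping ones, which matches the intuition that they are easier to classify.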

  

Traditional feature descriptors: traditional (hand-designed) feature extraction methods fall into two categories, global and local. Global methods define a set of features that describes the entire image, so the details of shape are ignored. Global features are therefore unsuitable for recognizing partially occluded objects. Local methods, by contrast, extract features from a local region around each keypoint and so handle occlusion better. Some local feature extraction methods are described below.

    (1) HOG descriptor (Histogram of Oriented Gradients) - describes the shape and appearance of objects in an image by histograms of edge (gradient) directions. The implementation has four steps:

      1. Gradient computation. A centred one-dimensional discrete derivative mask is applied to the image in the horizontal and vertical directions.

      2. Orientation histograms per cell. Each pixel within a cell casts a vote for an orientation bin, weighted by the gradient magnitude at that pixel.

      3. Descriptor blocks. To handle changes in illumination and contrast, the gradient strengths are normalized locally by grouping cells into larger, spatially connected blocks. The HOG descriptor is the vector of normalized cell histograms from all block regions.

      4. Block normalization. Each block can be normalized with the L1 norm or the L2 norm.
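    The four HOG steps can be sketched in plain NumPy as follows. This is a simplified illustration, not a reference implementation: the function name `hog_descriptor` and the parameter values (8x8-pixel cells, 2x2-cell blocks, 9 unsigned orientation bins) are assumptions, votes go to a single bin without interpolation, and the image is assumed to be grayscale and at least one block in size.

```python
import numpy as np

def hog_descriptor(image, cell=8, block=2, bins=9, eps=1e-6):
    """Simplified HOG: centred gradients, per-cell histograms, L2 block norm."""
    img = np.asarray(image, dtype=float)

    # Step 1: centred 1-D derivative mask [-1, 0, 1] along x and y.
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in degrees, folded into [0, 180).
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0

    # Step 2: each pixel votes for one orientation bin, weighted by magnitude.
    n_cy, n_cx = img.shape[0] // cell, img.shape[1] // cell
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)
    hist = np.zeros((n_cy, n_cx, bins))
    for cy in range(n_cy):
        for cx in range(n_cx):
            rows = slice(cy * cell, (cy + 1) * cell)
            cols = slice(cx * cell, (cx + 1) * cell)
            hist[cy, cx] = np.bincount(
                bin_index[rows, cols].ravel(),
                weights=magnitude[rows, cols].ravel(),
                minlength=bins,
            )

    # Steps 3 and 4: overlapping blocks of cells, each L2-normalized.
    blocks = []
    for by in range(n_cy - block + 1):
        for bx in range(n_cx - block + 1):
            v = hist[by:by + block, bx:bx + block].ravel()
            blocks.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))
    return np.concatenate(blocks)

# Usage: a horizontal intensity ramp has gradients along x only.
ramp = np.tile(np.arange(32, dtype=float), (32, 1))
descriptor = hog_descriptor(ramp)
print(descriptor.shape)
```

    For the 32x32 ramp there are 4x4 cells and 3x3 overlapping blocks, each contributing 2 x 2 x 9 values, and all of the gradient energy falls into the first orientation bin of every cell.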

    (2) SIFT - Scale-Invariant Feature Transform

      SIFT provides a set of features of an object that are robust to scaling and rotation. It has the following four steps:

      1. Scale-space extrema detection. SIFT uses the Difference of Gaussians (DoG) and searches the DoG images over all scales and image locations for local extrema.

      2. Accurate keypoint localization. This step removes potentially unstable points from the list of keypoints by finding those that have low contrast or are poorly localized along an edge.

      3. Orientation assignment. To achieve invariance to image rotation, each keypoint is assigned a consistent orientation based on local image properties. The keypoint descriptor can then be represented relative to this orientation.

      4. Keypoint descriptor.

      The mathematical ideas behind SIFT are complex and would take years of study to master.
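      As a rough illustration of step 1 only, the sketch below builds a small DoG stack and looks for points that are extrema among their 26 neighbours in space and scale. It is a toy under stated assumptions: a single octave with no downsampling, no sub-pixel refinement or edge rejection, and hypothetical names and parameter values (`gaussian_blur`, `dog_extrema`, `sigma0=1.6`, a contrast threshold of 0.01 for intensities in [0, 1]).

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with reflected borders (pure NumPy)."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="reflect")
    # Filter along rows, then along columns; "valid" removes the padding.
    tmp = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, tmp)

def dog_extrema(img, sigma0=1.6, k=2 ** 0.5, levels=5, threshold=0.01):
    """Return (row, col, sigma) for local extrema of a single-octave DoG stack."""
    img = np.asarray(img, dtype=float)
    blurred = [gaussian_blur(img, sigma0 * k ** i) for i in range(levels)]
    # Each DoG image is the difference of two adjacent blur levels.
    dog = np.stack([b1 - b0 for b0, b1 in zip(blurred, blurred[1:])])
    keypoints = []
    for s in range(1, dog.shape[0] - 1):
        for y in range(1, img.shape[0] - 1):
            for x in range(1, img.shape[1] - 1):
                value = dog[s, y, x]
                if abs(value) < threshold:
                    continue  # skip low-contrast responses
                # 3x3x3 neighbourhood across space and adjacent scales.
                cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                if value == cube.max() or value == cube.min():
                    keypoints.append((y, x, sigma0 * k ** s))
    return keypoints

# Usage: a synthetic Gaussian blob centred at (16, 16).
yy, xx = np.mgrid[0:33, 0:33]
blob = np.exp(-((yy - 16) ** 2 + (xx - 16) ** 2) / (2 * 3.0 ** 2))
keypoints = dog_extrema(blob)
print(keypoints)
```

      On the synthetic blob, the centre pixel is detected as a scale-space extremum. A practical implementation would follow this with the localization, orientation and descriptor steps listed above.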

    (3) SURF - Speeded-Up Robust Features

      SURF is a speeded-up version of SIFT. In SIFT, the Laplacian of Gaussian (LoG) is approximated by the DoG to build the scale space. SURF accelerates this process by approximating the LoG with box filters.
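      Box filters are cheap because the sum over any rectangle can be read from an integral image (summed-area table) with four lookups, whatever the size of the rectangle. The sketch below shows only that building block, not the SURF detector itself; the function names `integral_image` and `box_sum` are made up for illustration.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with an extra zero row and column at the top-left."""
    img = np.asarray(img, dtype=float)
    table = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    table[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return table

def box_sum(table, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] using four table lookups."""
    bottom, right = top + height, left + width
    return (table[bottom, right] - table[top, right]
            - table[bottom, left] + table[top, left])

# Usage: compare against a direct sum over the same rectangle.
img = np.arange(20, dtype=float).reshape(4, 5)
table = integral_image(img)
print(box_sum(table, 1, 2, 2, 3), img[1:3, 2:5].sum())
```

      Because the cost per box does not grow with the box size, larger filters can be applied to the original image instead of repeatedly blurring and resizing it.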

    Limitations of traditional hand-engineered features

    Earlier progress in computer vision was built on hand-engineered features. However, feature engineering is difficult, time-consuming and requires expert knowledge of the problem domain. Another drawback of hand-engineered features is that they are too sparse in terms of the information they can capture from an image. Feature-learning algorithms such as deep neural networks can address these problems.

Origin www.cnblogs.com/candyRen/p/11988753.html