Pattern Recognition (II): Feature Vectors and the Feature Space

1.2 Feature vectors and the feature space
a. Feature vector:
Let x1, x2, x3, ..., xn be n characteristic measurements of an object under analysis. Together they form an n-dimensional feature vector x = (x1, x2, x3, ..., xn)^T. x is a mathematical abstraction of the original object (sample): it stands in for the original object and serves as its model.
b. Feature space:
Classifying an object means classifying its model, i.e. its feature vector. The set of all possible values of x forms an n-dimensional space, called the feature space. A feature vector x is a point in the feature space, so feature vectors are also called feature points.
c. Random vectors:
Because of random factors in the measurement system, and because different objects of the same class are themselves scattered in the feature space, the measured value of a feature of one object, or of objects of the same class, is a random variable. A vector whose components are random variables is called a random vector. The feature vectors of objects in the same class are randomly distributed in the feature space according to some statistical law.
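Distribution function of a random vector X = (X1, X2, ..., Xn)^T:
$$F(x_1, x_2, \ldots, x_n) = P(X_1 \le x_1, X_2 \le x_2, \ldots, X_n \le x_n)$$
Joint probability density function:
$$p(x_1, x_2, \ldots, x_n) = \frac{\partial^n F(x_1, x_2, \ldots, x_n)}{\partial x_1 \, \partial x_2 \cdots \partial x_n}$$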
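Numerical characteristics of a random vector x with joint probability density p(x):
1. Mean vector:
$$\mu = E[x] = \int x \, p(x) \, dx$$
2. Conditional expectation (here conditioned on a class $\omega_i$):
$$\mu_i = E[x \mid \omega_i] = \int x \, p(x \mid \omega_i) \, dx$$
3. Covariance matrix:
$$\Sigma = E\left[(x - \mu)(x - \mu)^T\right]$$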
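In practice the mean vector and covariance matrix are estimated from a finite set of samples. The following is a minimal sketch in plain Python, assuming the estimate that divides by N (the number of samples); the function names and the sample values are hypothetical and chosen only for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mean_vector(samples: List[Vector]) -> Vector:
    """Sample estimate of the mean vector: the component-wise average."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[j] for s in samples) / n for j in range(dim)]


def covariance_matrix(samples: List[Vector]) -> Matrix:
    """Sample estimate of E[(x - mu)(x - mu)^T], dividing by N (assumption)."""
    n = len(samples)
    mu = mean_vector(samples)
    dim = len(mu)
    cov = [[0.0] * dim for _ in range(dim)]
    for s in samples:
        d = [s[j] - mu[j] for j in range(dim)]  # deviation from the mean
        for j in range(dim):
            for k in range(dim):
                cov[j][k] += d[j] * d[k] / n
    return cov


# Hypothetical 2-D feature vectors of one class.
samples = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
mu = mean_vector(samples)         # [3.0, 6.0]
cov = covariance_matrix(samples)  # symmetric 2 x 2 matrix
```

The diagonal entries of `cov` are the variances of the individual features, and the off-diagonal entries measure how strongly two features vary together.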
2.1 Cluster analysis
a. Basic idea:
Assumption: the set of objects contains several natural classes, and the individuals within each natural class are strongly similar in certain properties.
Principle: partition the given patterns into several groups so that the patterns within each group are similar, while patterns in different groups differ more strongly.
Characteristics: 1. similar patterns are grouped into one class; 2. the result depends on the pattern similarity measure and on the clustering procedure; 3. it is unsupervised classification.
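The principle can be illustrated with a very simple threshold scheme. This is only a sketch of one possible unsupervised grouping, not a method described in this text: each pattern joins the first group whose first member lies within a chosen distance threshold, and otherwise starts a new group. The function name, the threshold, and the data are all hypothetical.

```python
import math
from typing import List

Vector = List[float]


def euclidean(x: Vector, y: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def threshold_clustering(patterns: List[Vector], threshold: float) -> List[List[Vector]]:
    """Group patterns without labels, using only a distance threshold."""
    clusters: List[List[Vector]] = []
    for p in patterns:
        for cluster in clusters:
            # Compare against the first member of the group (its "leader").
            if euclidean(p, cluster[0]) <= threshold:
                cluster.append(p)
                break
        else:
            # No existing group is close enough: start a new one.
            clusters.append([p])
    return clusters


# Hypothetical 2-D patterns forming two visibly separate groups.
patterns = [[0.0, 0.0], [0.5, 0.0], [5.0, 5.0], [5.2, 4.9], [0.1, 0.3]]
clusters = threshold_clustering(patterns, threshold=1.0)
```

With these values the five patterns fall into two groups. A different threshold or a different distance measure can change the grouping, which is why the choice of similarity measure matters.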
Types of feature quantities:
1. Physical quantities: weight, length, speed
2. Ordinal quantities: level, skill, hours
3. Nominal quantities: gender, state, type
Example: classifying a set of animals.
[Figure: the animals to be classified]
Classification according to different individual features:
[Figure: classification by one feature]
[Figure: classification by another feature]
Two features may also be combined for classification:
[Figure: classification by a combination of two features]
Summary: Which features should be chosen? How many features? In which units (scale) should they be expressed? Which distance measure should be used? All of these choices have a significant impact on the classification result.
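The effect of the chosen units can be seen in a small experiment. The sketch below uses hypothetical (height, weight) measurements and a hypothetical helper `nearest`; it only illustrates that the nearest neighbour under the Euclidean distance can change when one feature is rescaled.

```python
import math
from typing import Dict, List

Vector = List[float]


def euclidean(x: Vector, y: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def nearest(query: Vector, candidates: Dict[str, Vector]) -> str:
    """Name of the candidate closest to the query."""
    return min(candidates, key=lambda name: euclidean(query, candidates[name]))


# Hypothetical objects described by (height, weight); height in metres.
query_m = [1.80, 70.0]
others_m = {"B": [1.60, 71.0], "C": [1.79, 75.0]}

# The same objects with height expressed in centimetres.
query_cm = [180.0, 70.0]
others_cm = {"B": [160.0, 71.0], "C": [179.0, 75.0]}

nearest_in_m = nearest(query_m, others_m)     # "B": weight dominates the distance
nearest_in_cm = nearest(query_cm, others_cm)  # "C": height now dominates
```

The objects are identical in both runs; only the unit of one feature changed, yet the most similar object differs.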
Main applications of clustering algorithms:
A. when training samples cannot be obtained;
B. when training samples can be obtained, but only at a large cost in labor, money, and time;
C. as preprocessing for subsequent, more complex classification algorithms;
D. data compression;
E. data mining and knowledge discovery.
2.2 Similarity measures between patterns
Measures used to describe how similar the features of two patterns are:
1. distance measures;
2. similarity measures;
3. matching measures.
a. Distance measures (measures of difference)
Let d(x, y) denote the distance between vector x and vector y.
Commonly used distance measures are:
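1. Euclidean distance:
$$d(x, y) = \lVert x - y \rVert = \left[ \sum_{i=1}^{n} (x_i - y_i)^2 \right]^{1/2}$$
2. Absolute-value distance (also called city-block or Manhattan distance):
$$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$$
3. Chebyshev distance:
$$d(x, y) = \max_{1 \le i \le n} |x_i - y_i|$$
4. Minkowski distance:
$$d(x, y) = \left[ \sum_{i=1}^{n} |x_i - y_i|^m \right]^{1/m}$$
Setting m = 1 gives the absolute-value distance, m = 2 gives the Euclidean distance, and letting m tend to infinity gives the Chebyshev distance.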
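The four distance measures listed above can be sketched in a few lines of plain Python. The function names and the two example vectors are assumptions made for illustration.

```python
import math
from typing import Sequence


def euclidean(x: Sequence[float], y: Sequence[float]) -> float:
    """Square root of the sum of squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x: Sequence[float], y: Sequence[float]) -> float:
    """Absolute-value (city-block) distance: sum of absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def chebyshev(x: Sequence[float], y: Sequence[float]) -> float:
    """Largest absolute component difference."""
    return max(abs(a - b) for a, b in zip(x, y))


def minkowski(x: Sequence[float], y: Sequence[float], m: float) -> float:
    """Minkowski distance of order m (m >= 1)."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return sum(abs(a - b) ** m for a, b in zip(x, y)) ** (1.0 / m)


# Hypothetical feature vectors.
x = [0.0, 0.0]
y = [3.0, 4.0]

d_euclid = euclidean(x, y)     # 5.0
d_manhattan = manhattan(x, y)  # 7.0
d_chebyshev = chebyshev(x, y)  # 4.0
d_mink_2 = minkowski(x, y, 2)  # agrees with the Euclidean distance
```

For the same pair of vectors the three named distances differ (5, 7, and 4 here), so the choice of measure can change which patterns count as close.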
5. Mahalanobis distance:
$$d^2(x, y) = (x - y)^T \Sigma^{-1} (x - y)$$
where $\Sigma$ is the covariance matrix of the pattern population.
Property of the Mahalanobis distance: it is invariant under every nonsingular linear transformation. That is, it is unaffected by scaling, rotation, and translation of the coordinate system, and it removes, as far as possible in a statistical sense, the correlation between components.
For example:
[Figure: example of the Mahalanobis distance]
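The invariance property can be checked numerically. The sketch below is restricted to two dimensions so that the matrix inverse can be written out by hand; the covariance matrix, the vectors, the transformation A, and the shift are all hypothetical values. It assumes the usual rule that a covariance matrix Σ becomes A Σ A^T under the linear map x → A x + b.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def inverse_2x2(m: Matrix) -> Matrix:
    """Inverse of a nonsingular 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(row) for row in zip(*m)]


def apply(a: Matrix, x: Vector) -> Vector:
    """Matrix-vector product a @ x."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]


def mahalanobis_sq(x: Vector, y: Vector, cov: Matrix) -> float:
    """Squared Mahalanobis distance (x - y)^T cov^{-1} (x - y), 2-D only."""
    d = [x[0] - y[0], x[1] - y[1]]
    t = apply(inverse_2x2(cov), d)
    return d[0] * t[0] + d[1] * t[1]


# Hypothetical covariance matrix and feature vectors.
cov = [[2.0, 1.0], [1.0, 2.0]]
x = [1.0, 0.0]
y = [0.0, 1.0]
d2 = mahalanobis_sq(x, y, cov)

# Hypothetical nonsingular transformation (det = 6) plus a translation.
A = [[2.0, 1.0], [0.0, 3.0]]
shift = [5.0, -1.0]
x_t = [u + s for u, s in zip(apply(A, x), shift)]
y_t = [u + s for u, s in zip(apply(A, y), shift)]
cov_t = matmul(matmul(A, cov), transpose(A))  # covariance after the map
d2_t = mahalanobis_sq(x_t, y_t, cov_t)
```

Here `d2` and `d2_t` agree up to rounding error, whereas the Euclidean distance between the two points changes under the same transformation.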

Origin blog.csdn.net/DOUBLE121PIG/article/details/93513883