Watermelon Book Notes - SVM (1)

SVM Overview

  A support vector machine (SVM) is a class of generalized linear classifiers that perform binary classification of data by supervised learning; its decision boundary is the maximum-margin hyperplane solved from the training samples. (Baidu Encyclopedia)

Margin and Support Vectors

  The basic idea of classification learning is to find, in the sample space, a dividing hyperplane that separates the samples of different classes in the training set D = {(x1, y1), (x2, y2), ..., (xm, ym)}. Many such hyperplanes may exist (as shown below), so our aim is to find the best one.

 Fig 1: multiple dividing hyperplanes exist that separate the two classes of training samples

 

As the figure shows, the red dividing hyperplane should be the best, because it has the greatest "tolerance" of local perturbations of the samples. Owing to limitations of the training set or to noise, samples outside the training set may lie closer to the boundary between the two classes; many dividing hyperplanes would then make mistakes, while the red one is affected least. In other words, the classification results it produces are the most "robust", and its generalization ability is the strongest.

Robustness (robust): strong and sturdy; here it means that the trained model can still classify abnormal (noisy) data well.

Generalization: the ability of a learned model to apply to new samples that did not appear in the training set.

 

In sample space, a dividing hyperplane can be described by the linear equation:  ωᵀx + b = 0

where ω = (ω1; ω2; ...; ωd) is the normal vector, which determines the direction of the hyperplane, and b is the displacement term, which determines the distance between the hyperplane and the origin. A dividing hyperplane is therefore completely determined by its normal vector ω and displacement b.

For example, in three-dimensional space a plane can be written as Ax + By + Cz + D = 0. Its normal vector is (A, B, C), the distance from the origin to the plane is |D| / √(A² + B² + C²), and the distance from a point (x0, y0, z0) to the plane is:

d = |Ax0 + By0 + Cz0 + D| / √(A² + B² + C²)

Thus the distance from any point x in sample space to the hyperplane (ω, b) is:

r = |ωᵀx + b| / ‖ω‖
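As a quick numeric check of this formula (the hyperplane coefficients and the points below are assumed for illustration, not taken from the book):

```python
import numpy as np

def point_to_hyperplane(x, w, b):
    """Distance r = |w.x + b| / ||w|| from point x to the hyperplane w.x + b = 0."""
    return abs(w @ x + b) / np.linalg.norm(w)

# Assumed example hyperplane 3*x1 + 4*x2 - 5 = 0, so ||w|| = 5;
# the origin is at distance |0 + 0 - 5| / 5 = 1.
r = point_to_hyperplane(np.array([0.0, 0.0]), np.array([3.0, 4.0]), -5.0)
print(r)  # -> 1.0
```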

Suppose the hyperplane classifies the training samples correctly, i.e. for every (xi, yi) ∈ D, if yi = +1 then ωᵀxi + b > 0, and if yi = -1 then ωᵀxi + b < 0.

At the same time, since the hyperplane can be rescaled (scaling ω and b together does not change it), we can always obtain:

ωᵀxi + b ≥ +1, if yi = +1;
ωᵀxi + b ≤ -1, if yi = -1.

Support vector: the sample points closest to the hyperplane, for which the equality in the formulas above holds exactly, are called support vectors.


Margin: the sum of the distances from two heterogeneous (one positive, one negative) support vectors to the hyperplane is called the margin γ:

γ = 2 / ‖ω‖

 
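For instance (using an assumed normal vector, not one from the book), the margin is easy to compute:

```python
import numpy as np

# For a dividing hyperplane with normal vector w = (6, 8), ||w|| = 10,
# so the margin between the two classes is gamma = 2 / ||w|| = 0.2.
w = np.array([6.0, 8.0])
gamma = 2 / np.linalg.norm(w)
print(gamma)  # -> 0.2
```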

 The support vector machine seeks the dividing hyperplane with the "maximum margin", i.e. the ω and b that maximize γ subject to the constraints:

max(ω,b)  2 / ‖ω‖
s.t.  yi(ωᵀxi + b) ≥ 1,  i = 1, 2, ..., m

From this formulation the margin seems to depend only on ω, but b also affects the value of ω through the constraints.

 

Maximizing 2/‖ω‖ is equivalent to minimizing ‖ω‖²/2, so the problem can be rewritten as follows; this is the basic (primal) form of the support vector machine:

min(ω,b)  ½‖ω‖²
s.t.  yi(ωᵀxi + b) ≥ 1,  i = 1, 2, ..., m
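The primal problem is normally solved via its dual or with an off-the-shelf solver, but to make it concrete, here is a minimal sketch that minimizes a soft-margin relaxation (hinge loss with a large penalty C, which approximates the hard-margin problem on separable data) by subgradient descent; the toy dataset and hyperparameters are assumptions for illustration:

```python
import numpy as np

# Assumed toy linearly separable dataset (not from the book).
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0],        # class +1
              [-1.0, -1.0], [-2.0, 0.0], [-3.0, -2.0]])  # class -1
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

def train_linear_svm(X, y, C=10.0, lr=0.01, epochs=2000):
    """Minimize 0.5*||w||^2 + C * sum(max(0, 1 - y_i(w.x_i + b))) by subgradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1  # points violating y_i(w.x_i + b) >= 1
        grad_w = w - C * (y[viol][:, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = train_linear_svm(X, y)
print(np.sign(X @ w + b))  # predictions should match y
```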

 Dual Problem

The primal problem can be solved through its dual. Introducing a Lagrange multiplier αi ≥ 0 for each constraint gives the Lagrangian:

L(ω, b, α) = ½‖ω‖² + Σi αi (1 - yi(ωᵀxi + b))

Setting the partial derivatives of L with respect to ω and b to zero yields:

ω = Σi αi yi xi,    0 = Σi αi yi

Substituting these back into L eliminates ω and b and gives the dual problem:

max(α)  Σi αi - ½ Σi Σj αi αj yi yj xiᵀxj
s.t.  Σi αi yi = 0,  αi ≥ 0,  i = 1, 2, ..., m

Solving for α then yields the model:

f(x) = ωᵀx + b = Σi αi yi xiᵀx + b

Since the primal problem has inequality constraints, the solution must satisfy the KKT conditions:

αi ≥ 0;   yi f(xi) - 1 ≥ 0;   αi (yi f(xi) - 1) = 0

Hence for every sample either αi = 0 or yi f(xi) = 1. The samples with αi > 0 lie exactly on the maximum-margin boundaries: they are the support vectors, and the final model depends only on them.
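As a sanity check of the derivation, here is an assumed two-point toy problem small enough that the dual can be solved by hand and the recovered (ω, b) verified:

```python
import numpy as np

# Assumed toy problem: one point per class on the x-axis.
X = np.array([[0.0, 0.0], [2.0, 0.0]])
y = np.array([-1.0, 1.0])

# The constraint sum(alpha_i * y_i) = 0 forces alpha_1 = alpha_2 = a, so the dual
# objective reduces to g(a) = 2a - 0.5 * a^2 * (x2.x2) = 2a - 2a^2; maximize on a grid.
a = np.linspace(0.0, 2.0, 20001)
g = 2 * a - 2 * a**2
alpha = np.full(2, a[np.argmax(g)])  # optimum at a = 0.5

w = (alpha * y) @ X   # w = sum_i alpha_i y_i x_i
b = y[1] - w @ X[1]   # from y_s(w.x_s + b) = 1, i.e. b = y_s - w.x_s for a support vector
print(w, b)           # -> [1. 0.] -1.0  (hyperplane x1 = 1, halfway between the points)
```

Both points have αi > 0 and satisfy yi(ωᵀxi + b) = 1 exactly, consistent with the KKT conditions: both are support vectors.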


Origin: www.cnblogs.com/lovejjy/p/11812141.html