VC 维 (VC Dimension)

Definition

VC Dimension: short for Vapnik-Chervonenkis dimension, a measure of the capacity (complexity) of a model class. It is defined as follows: place x points in the model's input space and assign each point one of two labels, in every possible way (2^x labelings). If, for some arrangement of the x points, the model can classify all of them correctly under every labeling, we say the model shatters those points. The largest such x is the VC dimension.

Examples

1. Linear function

If a straight line in two-dimensional space is used as the discriminant function, what is the VC dimension of the classifier?
Answer: 3
Explanation: For 3 points in general position, no matter how they are labeled, some straight line separates the two classes.
With 4 points, there is always a label sequence for which no straight line separates the two classes (for example, the XOR pattern on the corners of a square).

If a plane in three-dimensional space is used as the discriminant function, what is the VC dimension of the classifier?
Answer: 4
Summary: in general, a linear classifier (a hyperplane) in d-dimensional space has VC dimension d + 1.
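The claim above can be checked numerically. The sketch below (my own code, assuming NumPy and SciPy are available; `separable` and `shattered` are names I made up) tests linear separability exactly with a feasibility LP: a labeling is separable if and only if some (w, b) satisfies y_i (w · x_i + b) ≥ 1 for all i. The d + 1 simplex vertices are shattered, while adding an interior point makes some labeling fail.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def separable(points, labels):
    # Feasibility LP: does some (w, b) satisfy y_i * (w . x_i + b) >= 1?
    X = np.hstack([np.asarray(points, float),
                   np.ones((len(points), 1))])       # append bias column
    y = np.asarray(labels, float)
    res = linprog(c=np.zeros(X.shape[1]),
                  A_ub=-y[:, None] * X, b_ub=-np.ones(len(points)),
                  bounds=[(None, None)] * X.shape[1], method="highs")
    return res.status == 0  # feasible => linearly separable

def shattered(points):
    # Shattered: every one of the 2^n labelings is linearly separable.
    return all(separable(points, labels)
               for labels in itertools.product([-1, 1], repeat=len(points)))

for d in (2, 3):
    simplex = np.vstack([np.zeros(d), np.eye(d)])    # d+1 affinely independent points
    extra = np.vstack([simplex, np.full(d, 0.25)])   # add a point inside their hull
    print(d, shattered(simplex), shattered(extra))
    # -> 2 True False
    # -> 3 True False
```

For the extra point, labeling it positive and the hull vertices negative cannot be separated, which is consistent with VC dimension d + 1.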

2. Sine function

If a sine function in one-dimensional space is used as the discriminant function, what is the VC dimension of the classifier?
Answer: ∞
Explanation: We can always adjust the frequency and phase so that the classifier is correct on any number of sample points under any label sequence.
For example, by tuning a and b in sin(ax + b), one class of points can be placed below the curve and the other class above it.
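The standard explicit construction (due to Vapnik) places the points at x_i = 10^-i; then the frequency w = π(1 + Σ_i b_i 10^i), with b_i = 0 for a positive label and b_i = 1 for a negative one, makes sign(sin(w x_i)) match every labeling. A small check (my own code; `sine_classifier` and `fit_frequency` are names I made up):

```python
import math
import itertools

def sine_classifier(w):
    # sign(sin(w * x)) as a one-dimensional classifier
    return lambda x: 1 if math.sin(w * x) > 0 else -1

def fit_frequency(labels):
    # Vapnik's construction: for points x_i = 10^-i, the frequency
    # w = pi * (1 + sum_i b_i * 10^i), with b_i = (1 - y_i) / 2,
    # realizes the labeling y exactly.
    return math.pi * (1 + sum((1 - y) // 2 * 10 ** (i + 1)
                              for i, y in enumerate(labels)))

n = 5
points = [10.0 ** -(i + 1) for i in range(n)]
# All 2^n labelings are realized, for every n: the VC dimension is infinite.
ok = all(
    all(sine_classifier(fit_frequency(labels))(x) == y
        for x, y in zip(points, labels))
    for labels in itertools.product([-1, 1], repeat=n)
)
print(ok)  # True
```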

3. Nearest Neighbor Model

Using the nearest-neighbor (1-NN) classifier in any space, what is the VC dimension of the classification model?
Answer: ∞
Explanation: However many sample points you are given, say q, and however the labels are assigned, suppose the positive class ends up with m points and the negative class with n. Build the classifier by simply storing the labeled samples: the positive set is {x_1, x_2, ..., x_m} and the negative set is {y_1, y_2, ..., y_n}. Now classify the q samples with the nearest-neighbor rule. Take x_i as an example: its distance to the stored copy of x_i is 0, so its nearest neighbor is itself and it is judged positive. The same argument applies to every other point, so every labeling is reproduced exactly.

As a supplement: if the nearest-neighbor classifier is replaced by a k-nearest-neighbor classifier with k > 1, the VC dimension drops. Why is left as an exercise; it is quite simple.
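A hint for the exercise, as a sketch (my own code; `knn_predict` is a name I made up): with k = 3 and only 3 stored points, the vote is always over all three samples, so a lone minority label can never be reproduced, and the memorization trick breaks down.

```python
def knn_predict(train, x, k):
    # majority vote among the k nearest stored (point, label) pairs
    neighbors = sorted(train, key=lambda s: abs(s[0] - x))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if votes > 0 else -1

train = [(0, 1), (1, -1), (2, -1)]  # labeling: +, -, -
# 1-NN reproduces every training label, but 3-NN votes over all three
# samples, so the lone positive point is outvoted:
print(knn_predict(train, 0, k=1))  # 1  (correct)
print(knn_predict(train, 0, k=3))  # -1 (the labeling +, -, - is not realizable)
```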


Origin blog.csdn.net/qq_43391414/article/details/111692672