opencv advanced 04 - Understanding the K-nearest neighbor algorithm

A machine learning algorithm generates a model from data; that is, it is an algorithm for learning (hereinafter simply "algorithm"). We feed experience to the algorithm, and it generates a model from that empirical data. When faced with a new situation, the model provides us with a judgment (a prediction). For example, we may judge a child to be a good candidate for an athlete based on "tall, long legs, and light weight". Quantify these data and give them to the computer, and it will generate a model from them. When a new situation arises (judging whether another child can become an athlete), the model gives the corresponding judgment.

Data and features

For example, to evaluate a group of children, we must first obtain their basic data. This set of data includes items such as height, leg length, and weight. Items that reflect the performance or nature of an object (or an event) in some respect are called attributes or features. A specific value, such as the "188 cm" reflecting height, is a feature value or attribute value. The collection of this data, "(height=188 cm, leg length=56 cm, weight=46 kg), ..., (height=189 cm, leg length=55 cm, weight=48 kg)", is called a data set, and the data for each child is called a sample.

The process of learning a model from data is called learning or training. The data used in the training process is called training data, each sample in it is called a training sample, and the set of training samples is called a training set.

Of course, to obtain a model, in addition to data you also need to attach a label to each sample, for example "((tall, long legs, light weight), good seed)". "Good seed" here is the label, and a sample together with its label is usually called an example.
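As a minimal sketch, with made-up numbers, such a labeled training set could be stored as NumPy arrays:

```python
import numpy as np

# Each row is one sample: (height in cm, leg length in cm, weight in kg);
# the values are illustrative only.
samples = np.array([[188, 56, 46],
                    [189, 55, 48],
                    [160, 40, 70]], dtype=np.float32)

# One label per sample: 1 = "good seed" (athlete material), 0 = not
labels = np.array([1, 1, 0], dtype=np.int32)
```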

After the model is learned, it must be tested to measure its effect; the samples used for testing are called test samples. When a test sample is input, its label (target category) is not provided; instead, the model determines the label of the sample (which category it belongs to). Comparing the model's predicted labels with the actual labels of the test samples gives the model's accuracy.
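As a minimal sketch, the accuracy is simply the fraction of test samples whose predicted label matches the actual label (both arrays are hypothetical):

```python
import numpy as np

actual    = np.array([1, 0, 1, 1, 0])  # true labels of the test samples
predicted = np.array([1, 0, 0, 1, 0])  # labels the model produced

accuracy = np.mean(predicted == actual)  # fraction of correct predictions
print(accuracy)  # 0.8
```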

Most machine learning algorithms are derived from everyday practice. The K-nearest neighbor algorithm is one of the simplest machine learning algorithms; it is mainly used to assign objects to known classes and is widely applied in daily life. For example, how does a coach select a group of long-distance runners? He may well be using the K-nearest neighbor algorithm: he selects candidates who are tall, long-legged, and light, with small knee and ankle circumferences, prominent Achilles tendons, and high arches. He feels that such children have the potential of athletes, or rather that their characteristics are very close to those of athletes.

The basic idea of the K-nearest neighbor algorithm

The essence of the K-nearest neighbor algorithm is to classify a given object according to known feature values. For example, when seeing a father and son together, under normal circumstances one can immediately tell which is the father and which is the son by judging their ages. This is classification by the feature value of the age attribute.

The above example is the simplest case: classification based on a single feature dimension. Actual scenarios may be more complicated, with multiple feature dimensions. For example, to classify a sports video, we need to determine whether the video shows a table tennis match or a football match.

To determine the classification, features need to be defined. Two features are defined here: the athletes' "waving" action and the athletes' "kicking" action. Of course, we cannot classify a video as a "table tennis match" just because we see a "waving" action, since some football players are used to waving to communicate with teammates on the field. Likewise, we cannot classify a video as a "football match" just because we see a "kick", because some table tennis players express their emotions with "kicks".

We counted the number of "waving" and "kicking" actions within a fixed period of time in each video and found the following pattern:

  • In videos of table tennis matches, there are far more "waving" actions than "kicking" actions.
  • In videos of football matches, there are far more "kicking" actions than "waving" actions.

According to the analysis of a group of videos, the data shown in Table 20-1 are obtained.

[Table 20-1: counts of "waving" and "kicking" actions in the sample videos]

For ease of observation, the above data are drawn as a scatter diagram, as shown in Figure 20-1.

[Figure 20-1: scatter diagram of the video data; x-axis: number of "waving" actions, y-axis: number of "kicking" actions]


As can be seen from Figure 20-1, the data points exhibit clustering characteristics:

  • The data points of the table tennis match videos cluster in the region where the x-axis coordinates lie in [3000, 5000] and the y-axis coordinates lie in [1, 500].
  • The data points of the football match videos cluster in the region where the x-axis coordinates lie in [1, 500] and the y-axis coordinates lie in [3000, 5000].

Now suppose there is a video, Test, to be classified, and statistics show that it contains 2000 "waving" actions and 100 "kicking" actions. If its position is marked in Figure 20-1, it can be found that the nearest neighbor of Test's position is a table tennis match video, so Test can be judged to be a table tennis match video.
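A minimal sketch of this nearest-neighbor judgment; the four training points are invented so as to fall inside the two clusters described above:

```python
import numpy as np

# (number of "waving" actions, number of "kicking" actions) per video;
# the points are made up to match the clusters in Figure 20-1
train = np.array([[4000,  200],   # table tennis
                  [3500,  300],   # table tennis
                  [ 300, 4200],   # football
                  [ 200, 3800]],  # football
                 dtype=np.float32)
kinds = ["table tennis", "table tennis", "football", "football"]

test = np.array([2000, 100], dtype=np.float32)  # the video Test

# Euclidean distance from Test to every training video
dists = np.linalg.norm(train - test, axis=1)
print(kinds[int(np.argmin(dists))])  # -> table tennis
```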

The above example is extreme, black and white, but real classification data often involves many parameters, and the judgment is not so simple. Therefore, to improve the reliability of the algorithm, the implementation selects the k nearest points and assigns the point to be classified to whichever class holds the majority among those k points. To make the vote decisive, k is usually chosen to be odd, for the same reason that committees usually have an odd number of members: to obtain a clear voting result.
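The effect of an odd k on the vote can be seen in a two-line sketch (the neighbor labels are hypothetical):

```python
from collections import Counter

# An even k can tie: 2 votes each, no clear winner
print(Counter(["A", "A", "B", "B"]).most_common())        # [('A', 2), ('B', 2)]

# An odd k in a two-class vote always yields a clear winner
print(Counter(["A", "A", "B", "B", "A"]).most_common(1))  # [('A', 3)]
```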

For example, suppose well-known twin artists A and B look very similar, and we want to judge whether the person in an image T is artist A or artist B. The steps to implement this with the K-nearest neighbor algorithm are as follows:

(1) Collect 100 photos each of artist A and artist B.
(2) Determine several important features for identifying a person, and use these features to annotate the photos of artists A and B with feature values.

For example, using a certain 4 features, each photo can be expressed as a vector such as [156, 34, 890, 457] (that is, a sample point). Following this method, we obtain a data set FA holding the feature values of artist A's 100 photos and a data set FB holding the feature values of artist B's 100 photos; each set contains 100 such feature vectors. In short, the photos are represented numerically, giving artist A's numerical feature set (data set) FA and artist B's numerical feature set FB.

(3) Compute the features of the image T to be recognized, and represent T by its feature value. For example, the feature value TF of image T might be [257, 896, 236, 639].
(4) Calculate the distance between the feature value TF of image T and each feature value in FA and FB.

(5) Find the sample points producing the k shortest distances (that is, the k nearest neighbors of T), count how many of these k points belong to FA and how many to FB, and assign image T to whichever artist's data set contributes more points.

For example, take the 11 nearest points. If 7 of them belong to FA and 4 belong to FB, image T is judged to show artist A; conversely, if 6 of the 11 points belong to FB and 5 belong to FA, image T is judged to show artist B.
The above is the basic idea of the K-nearest neighbor algorithm.
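Putting steps (1) through (5) together, here is a minimal sketch; since the real photo features are not available, FA and FB are filled with random stand-in vectors:

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

# Steps (1)-(2): 100 four-feature vectors per artist (random stand-ins here)
FA = rng.uniform(0, 1000, size=(100, 4))  # features of artist A's photos
FB = rng.uniform(0, 1000, size=(100, 4))  # features of artist B's photos

# Step (3): feature value of the image T to be recognized
TF = np.array([257, 896, 236, 639], dtype=np.float64)

# Step (4): Euclidean distance from TF to every sample in FA and FB
samples = np.vstack([FA, FB])
labels = ["A"] * 100 + ["B"] * 100
dists = np.linalg.norm(samples - TF, axis=1)

# Step (5): vote among the k = 11 nearest samples
k = 11
nearest = np.argsort(dists)[:k]
print(Counter(labels[i] for i in nearest).most_common(1)[0][0])  # "A" or "B"
```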

Computational thinking

The "feeling" of a computer is achieved through logical calculations and numerical calculations. Therefore, in most cases, we need to numerically process the objects to be processed by the computer and quantify them into specific values ​​for subsequent processing.

After obtaining the feature values of each sample, the K-nearest neighbor algorithm calculates the distance between the feature value of the sample to be recognized and the feature values of each known, classified sample, finds the k nearest neighbors, and assigns the sample to be recognized to the class that accounts for the largest share of those k neighbors.

01. Normalization

For simple cases, it is sufficient to compute the distance (gap) directly from the feature values.

For example, in a certain TV drama, it is known through technical means that the suspect is 186 cm tall and the victim is 172 cm tall. Facing the police, both A and B claim to be the victim.

At this point, we can determine who is the real victim by measuring the height of the two:

  • A's height is 185 cm: distance to the suspect's height = 186-185 = 1 cm, distance to the victim's height = 185-172 = 13 cm. A's height is closer to the suspect's, so A is identified as the suspect.
  • B's height is 173 cm: distance to the suspect's height = 186-173 = 13 cm, distance to the victim's height = 173-172 = 1 cm. B's height is closer to the victim's, so B is identified as the victim.

The above example is a very simple special case; actual scenarios may require more parameters for judgment.

For example, in a foreign TV drama, the police learned through technical means that the suspect was 180 cm tall and was missing a finger, while the victim was 173 cm tall with all ten fingers intact. A and B, who both turned themselves in, each claimed to be the victim.
When there are multiple parameters, the parameters are generally combined into a list (array) for comprehensive judgment. In this example, (height, number of fingers) is used as the feature.

Therefore, the suspect's feature value is (180, 9) and the victim's feature value is (173, 10).

At this point, the following judgments can be made for the two:

  • A is 175 cm tall and missing a finger, so A's feature value is (175, 9).
  • Distance between A's and the suspect's feature values = (180-175) + (9-9) = 5
  • Distance between A's and the victim's feature values = (175-173) + (10-9) = 3

Since A's feature value is closer to the victim's, A is judged to be the victim.

  • B is 178 cm tall with all ten fingers intact, so B's feature value is (178, 10).
  • Distance between B's and the suspect's feature values = (180-178) + (10-9) = 3
  • Distance between B's and the victim's feature values = (178-173) + (10-10) = 5

Since B's feature value is closer to the suspect's, B is judged to be the suspect.

Of course, we know that these results are wrong, because height and finger count have different dimensions (and thus different weights). When computing the distance between feature values, the weights of the different parameters must be taken into account. Usually, because the dimensions of the parameters are inconsistent, the parameters need to be processed so that they all carry equal weight.

In general, normalizing the parameters is enough. When normalizing, each feature value is usually divided by the maximum value of that feature across all samples (or by the difference between the maximum and minimum values).

For example, in the case above, each height is divided by the maximum height of 180 cm, and each finger count is divided by the maximum finger count of 10, giving new feature values. The calculation is:

Normalized feature = (height / maximum height 180, finger count / maximum finger count 10)

Therefore, after normalization:

  • The suspect's feature value is (180/180, 9/10) = (1, 0.9)
  • The victim's feature value is (173/180, 10/10) = (0.96, 1)

Now the two can be judged by their normalized feature values:

  • A's feature value is (175/180, 9/10) = (0.97, 0.9)
  • Distance between A's and the suspect's feature values = (1-0.97) + (0.9-0.9) = 0.03
  • Distance between A's and the victim's feature values = (0.97-0.96) + (1-0.9) = 0.11

Since A's feature value is now closer to the suspect's, A is judged to be the suspect.

  • B's feature value is (178/180, 10/10) = (0.99, 1)
  • Distance between B's and the suspect's feature values = (1-0.99) + (1-0.9) = 0.11
  • Distance between B's and the victim's feature values = (0.99-0.96) + (1-1) = 0.03

Since B's feature value is now closer to the victim's, B is judged to be the victim.
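The whole normalized comparison above fits in a few lines; a minimal sketch that reproduces the numbers just computed:

```python
import numpy as np

suspect = np.array([180.0, 9.0])   # (height in cm, number of fingers)
victim  = np.array([173.0, 10.0])
A       = np.array([175.0, 9.0])
B       = np.array([178.0, 10.0])

max_vals = np.array([180.0, 10.0])  # maximum height and maximum finger count

def normalized_distance(p, q):
    # Normalize each feature by its maximum, then sum the absolute differences
    return np.sum(np.abs(p / max_vals - q / max_vals))

print(normalized_distance(A, suspect), normalized_distance(A, victim))  # ~0.03, ~0.11
print(normalized_distance(B, suspect), normalized_distance(B, victim))  # ~0.11, ~0.03
```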

02. Distance calculation

In the preceding discussion we calculated distances several times. The method used was to subtract corresponding elements of the feature values and then sum the differences.

For example, given feature values A(185, 75) and B(175, 86) in the form (height, weight), let us judge whether the point C(170, 80) is closer to feature value A or feature value B:

  • Distance between C and A = (185-170) + (75-80) = 15 + (-5) = 10
  • Distance between C and B = (175-170) + (86-80) = 5 + 6 = 11

By this calculation, C is closer to A, so C would be assigned to A's class.

Of course, we know this judgment is wrong: the calculation of the distance between C and A produced a negative number, which cancelled part of the positive sum. To avoid this cancellation of positives and negatives, we usually sum the absolute values of the differences:

  • Distance between C and A = |185-170| + |75-80| = 15 + 5 = 20
  • Distance between C and B = |175-170| + |86-80| = 5 + 6 = 11

Summing the absolute values shows that C is closer to B, so C is assigned to B's class. This distance, expressed as a sum of absolute differences, is called the Manhattan distance.
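A minimal sketch of the Manhattan distance calculation above:

```python
import numpy as np

A = np.array([185, 75])  # (height, weight)
B = np.array([175, 86])
C = np.array([170, 80])

def manhattan(p, q):
    # Sum of absolute element-wise differences
    return np.sum(np.abs(p - q))

print(manhattan(C, A))  # |185-170| + |75-80| = 20
print(manhattan(C, B))  # |175-170| + |86-80| = 11
```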

Calculating distance this way is basically sufficient, but there are better methods. For example, the sum of squared differences can be used. For two feature vectors $(x_1, x_2, \dots, x_n)$ and $(y_1, y_2, \dots, y_n)$, the calculation is:

$$d = \sum_{i=1}^{n}(x_i - y_i)^2$$
The more general form takes the square root of the sum of squares. This distance is the widely used Euclidean distance, calculated as:

$$d = \sqrt{\sum_{i=1}^{n}(x_i - y_i)^2}$$
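Since this series is about OpenCV, it is worth noting that OpenCV's ml module ships a ready-made K-nearest-neighbor classifier that computes the Euclidean distances for us. A minimal sketch with random stand-in data:

```python
import cv2
import numpy as np

# 20 random 2-D training points, each labeled 0 or 1 (stand-in data)
train = np.random.randint(0, 100, (20, 2)).astype(np.float32)
labels = np.random.randint(0, 2, (20, 1)).astype(np.float32)

knn = cv2.ml.KNearest_create()
knn.train(train, cv2.ml.ROW_SAMPLE, labels)

# Classify a new point by a vote among its k = 3 nearest neighbors
newcomer = np.random.randint(0, 100, (1, 2)).astype(np.float32)
ret, results, neighbours, dist = knn.findNearest(newcomer, 3)
print(results)     # predicted label of the new point
print(neighbours)  # labels of its 3 nearest neighbors
print(dist)        # distances to those neighbors
```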

After reading this, try explaining the K-nearest neighbor algorithm in your own plain words; see you in the comment area!


Origin blog.csdn.net/hai411741962/article/details/132298394