KD-tree algorithm of KNN

KD-tree algorithm is to model the data set, and then search for the nearest neighbor, the final step is to predict.

KD refers to the tree K is the dimension of the sample characteristics.

First, the establishment of KD tree

m n-dimensional feature samples, wherein calculating the variance of n, whichever is the largest variance of the k-dimensional feature as the root node. Select k-dimensional feature as a median of the cutting points, the median is less than the left subtree of the discharge, the discharge is greater than the median right subtree, recursively generated.

For example

6 two-dimensional samples, {(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)}:

1, to find the root, six data points in the x, y dimension on the variance are 6.97,5.37, x dimension maximum variance, choosing the x-dimension for trie;

2, to cut points, the median dimension is x (7,2), therefore the value of this point in the x-dimension is divided;

3, x = 7 is divided into left and right space portions, then using this method recursively, the final result is:

Second, the search for the nearest neighbor

After kd tree generation can predict where the target sample test set up.

First find a sample containing the target

 

Guess you like

Origin www.cnblogs.com/pacino12134/p/11333238.html