k-NN Algorithm Overview

One, the k-NN algorithm (k-nearest neighbors): given a new data point, compute its distance to every existing data point, take the k nearest points, and assign the new point to the class that occurs most often among those k points.
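The procedure described above can be sketched in a few lines of Python. This is only a minimal illustration, not a tuned implementation: the function name `knn_classify`, the sample points, and the choice of Euclidean distance are assumptions made for the example.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Distance from the query to every existing point (Euclidean, as an assumption).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The class that occurs most often among those neighbors wins.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_classify(points, labels, (1.1, 1.0), k=3))  # prints "a"
```

Sorting all distances costs O(n log n) per query, which is acceptable for small data sets; an odd k is a common way to reduce tied votes in two-class problems.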

 • Common problems:

  1. If k is too small, the model is prone to overfitting: accuracy is high on the training set but much lower on the test set.

  2. Unbalanced feature weights. When the distance between sample points is computed, features whose values differ by orders of magnitude contribute unequally, so the influence of some features on the distance becomes too large or too small. Normalize the features to avoid this problem.
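One common way to normalize is min-max scaling, which maps every feature to the range [0, 1]. The sketch below assumes that technique (the text does not say which normalization is meant); the function name `min_max_normalize` and the sample data are hypothetical.

```python
def min_max_normalize(rows):
    """Rescale every feature (column) of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append(tuple(
            # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ))
    return normalized

# Hypothetical data: (height in metres, income in currency units).
# Without scaling, income differences would dominate any distance.
rows = [(1.60, 30000.0), (1.80, 32000.0), (1.70, 90000.0)]
scaled = min_max_normalize(rows)
print(scaled)
```

When the data is split into training and test sets, the minimum and maximum are normally computed on the training set only and then reused to scale test points.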

 • Distance measures: Euclidean distance, Manhattan distance, Chebyshev distance (the largest coordinate difference), and so on.
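For reference, the distance measures can be written as short Python functions. The third one assumes that the "maximum" measure in the list refers to Chebyshev distance, the largest difference along any single coordinate; the function names are chosen only for this example.

```python
def euclidean(p, q):
    # Straight-line distance: square root of the summed squared differences.
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def manhattan(p, q):
    # Sum of the absolute differences along each coordinate.
    return sum(abs(a - b) for a, b in zip(p, q))

def chebyshev(p, q):
    # Largest absolute difference along any single coordinate (assumed meaning of "maximum").
    return max(abs(a - b) for a, b in zip(p, q))

p, q = (0.0, 0.0), (3.0, 4.0)
print(euclidean(p, q), manhattan(p, q), chebyshev(p, q))  # prints "5.0 7.0 4.0"
```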

 

Two, the kd tree (k-dimensional tree): it partitions the space into several regions, so that a search only needs to examine the regions relevant to the query.
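A minimal sketch of the idea follows, assuming the usual construction: split on one axis per level at the median point, then search for the nearest neighbor while skipping any region that cannot contain a closer point. The names `build_kdtree` and `nearest`, the dictionary node layout, and the sample points are illustrative choices, not taken from a particular library.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively split `points` on one axis per level, at the median."""
    if not points:
        return None
    axis = depth % len(points[0])  # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return the stored point closest to `query`."""
    if node is None:
        return best
    if best is None or math.dist(node["point"], query) < math.dist(best, query):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    # Search the side of the split that contains the query first.
    best = nearest(near, query, best)
    # Visit the other side only if the splitting plane is closer than the best match so far.
    if abs(diff) < math.dist(best, query):
        best = nearest(far, query, best)
    return best

# Hypothetical 2-D sample points.
points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(points)
print(nearest(tree, (9, 2)))  # prints "(8, 1)"
```

The pruning test in `nearest` is what makes the search "local": whole subtrees are skipped whenever the splitting plane is farther away than the best point found so far.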


Origin www.cnblogs.com/yyf2019/p/11578878.html