I. About KNN
1. The KNN algorithm, also known as the k-nearest-neighbor algorithm, is a classification method in data mining. "K-nearest neighbor" means that each sample can be represented by its k closest neighboring samples.
2. The core idea of the KNN algorithm is that if the majority of a sample's k nearest neighbors in the feature space belong to a certain category, then the sample is assigned to that category as well and takes on the characteristics of the samples in it. The method decides the class of the sample to be classified based only on the category of the nearest one or several samples. In other words, the classification decision involves only a very small number of neighboring samples. Since KNN relies mainly on the surrounding neighbors rather than on discriminating class domains, it is better suited than other methods to sample sets whose class domains overlap heavily.
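The decision rule described above can be sketched in a few lines. This is a minimal illustration with hypothetical one-dimensional data (the function name `knn_predict` and the toy values are my own, not from the original program):

```python
import numpy as np
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Classify `query` by majority vote among its k nearest training samples."""
    distances = np.abs(train_x - query)          # distance to every training sample
    nearest = np.argsort(distances)[:k]          # indices of the k closest samples
    votes = train_y[nearest]                     # their class labels
    return Counter(votes).most_common(1)[0][0]   # most frequent label wins

train_x = np.array([70, 89, 126, 196, 203, 211])
train_y = np.array([1, 1, 1, 2, 2, 2])
print(knn_predict(train_x, train_y, 200, k=3))  # the 3 nearest points are all class 2
```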
II. Code implementation
# -*- coding: utf-8 -*-
"""
Use a Python program to simulate the KNN algorithm
Created on Sat Jun 22 18:38:22 2019
@author: Zhen
"""
import numpy as np
import collections as cs

data = np.array([
    [203, 1], [126, 1], [89, 1], [70, 1],
    [196, 2], [211, 2], [221, 2],
    [311, 3], [271, 3]
])
feature = data[:, 0]  # features
print(feature)
label = data[:, -1]  # labels
print(label)

predictPoint = 200  # data point to predict
print("The feature of the input to predict is: " + str(predictPoint))

distance = list(map(lambda x: abs(predictPoint - x), feature))  # distance from each point to the prediction point
print(distance)

sortIndex = np.argsort(distance)  # sort; returns the original indices in sorted order
print(sortIndex)

sortLabel = label[sortIndex]  # reorder the labels by the sorted indices
print(sortLabel)

# k = 3  # set k to 3
for k in range(1, label.size + 1):
    # The most frequent class among the first k labels is the predicted class
    result = cs.Counter(sortLabel[0:k]).most_common(1)[0][0]
    print("When k = " + str(k) + ", the predicted class is: " + str(result))
III. Results
[203 126  89  70 196 211 221 311 271]
[1 1 1 1 2 2 2 3 3]
The feature of the input to predict is: 200
[3, 74, 111, 130, 4, 11, 21, 111, 71]
[0 4 5 6 8 1 2 7 3]
[1 2 2 2 3 1 1 3 1]
When k = 1, the predicted class is: 1
When k = 2, the predicted class is: 1
When k = 3, the predicted class is: 2
When k = 4, the predicted class is: 2
When k = 5, the predicted class is: 2
When k = 6, the predicted class is: 2
When k = 7, the predicted class is: 1
When k = 8, the predicted class is: 1
When k = 9, the predicted class is: 1
IV. Summary
1. As the training data and results above show, when k is small (e.g. k = 1), abnormal points in the training data can easily cause prediction errors, so k should generally not be too small.
2. When k is large, classes with more training samples are more likely to win the vote; therefore, the training data should first be rebalanced across the classes.
3. The value of k is generally related to the number of classes: the more classes there are, the larger k should be. A common choice is between the number of classes and twice the number of classes.
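Point 1 above can be demonstrated with a small sketch (the data here are hypothetical, chosen to contain one abnormal point): with k = 1 a single outlier decides the prediction, while a larger k lets the majority vote correct it.

```python
import numpy as np
from collections import Counter

def predict(train, query, k):
    """Majority vote among the k training samples nearest to `query`."""
    feats, labels = train[:, 0], train[:, 1]
    order = np.argsort(np.abs(feats - query))
    return Counter(labels[order][:k]).most_common(1)[0][0]

# Class 1 clusters near 100 and class 2 near 200, but one abnormal
# class-2 point (105) sits inside the class-1 region.
train = np.array([[90, 1], [95, 1], [100, 1], [110, 1],
                  [105, 2],                     # abnormal point
                  [190, 2], [200, 2], [210, 2]])

print(predict(train, 104, k=1))  # the outlier alone decides: predicts 2
print(predict(train, 104, k=3))  # majority of 3 neighbors corrects it: predicts 1
```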