KNN algorithm to achieve --python

1.KNN algorithm as the classification algorithm, also known as k-nearest neighbor algorithm.
The core idea 2.KNN algorithm is a new sample in the feature space, k-nearest sample most of a class, then this sample also fall into this category.
Here Insert Picture Description
Here we calculate the distance between samples using the Euler's formula.
Here Insert Picture Description

import math

import numpy as np

from sklearn import datasets
import matplotlib.pyplot as plt

raw_data_X = [[3.393533211, 2.331273381],
              [3.110073483, 1.781539638],
              [1.343808831, 3.368360954],
              [3.582294042, 4.679179110],
              [2.280362439, 2.866990263],
              [7.423436942, 4.696522875],
              [5.745051997, 3.533989803],
              [9.172168622, 2.511101045],
              [7.792783481, 3.424088941],
              [7.939820817, 0.791637231]
             ]
raw_data_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

X_train = np.array(raw_data_X)
y_train = np.array(raw_data_y)


First, we will establish the data, then convert the data into a numpy array.

plt.scatter(X_train[y_train == 0,0],X_train[y_train == 0,1],color = 'g')
plt.scatter(X_train[y_train == 1,0],X_train[y_train == 1,1],color = 'r')
plt.show()

Then use matplotlib draw the corresponding scatter plots.
Here Insert Picture Description
Then, insert a new point.

x = np.array([8.093607318, 3.365731514])

plt.scatter(X_train[y_train == 0,0],X_train[y_train == 0,1],color = 'g')
plt.scatter(X_train[y_train == 1,0],X_train[y_train == 1,1],color = 'r')
plt.scatter(x[0],x[1],color = 'b')
plt.show()

Here Insert Picture Description
Using the Euclidean distance equation, the new point is determined, what category.
Our most recent six points from group to determine the basis.

distances = []

for x_train in X_train:
    d = sqrt(np.sum((x_train - x)**2))
    distances.append(d)

nearest = np.argsort(distance)

c = [y_train[k] for k in nearest[:6]]

C is output:
Here Insert Picture Description
found, the new data points determined to 1.
If the data points, it may also be utilized collections.

from collections import Counter

vote = Counter(c)
vote.most_common()

Output
Here Insert Picture Description

Released two original articles · won praise 0 · Views 43

Guess you like

Origin blog.csdn.net/qq_45779388/article/details/104532401