Machine learning 03: K-means algorithm

1). Working through the k-means clustering process by hand with playing cards: more than 30 cards, 3 clusters

 

Figure 1 Statistical table

Figure 2 Actual situation of the first round

Figure 3 Actual situation of the second round
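The rounds played by hand above can be mirrored in a few lines of Python. This is only a minimal sketch: the card values and the three starting centers below are hypothetical and are not the cards shown in the figures. Each round deals every card to the nearest center and then replaces each center with the average of its pile; the process stops when a round leaves the centers unchanged.

```python
import numpy as np

# Hypothetical card values (NOT the cards from the figures above).
cards = np.array([1, 2, 3, 3, 6, 7, 8, 9, 11, 12, 13, 13], dtype=float)
# Hypothetical starting centers: three cards picked as the initial piles.
centers = np.array([1.0, 7.0, 13.0])

def one_round(cards, centers):
    """One manual round: deal each card to the nearest center, then re-average each pile."""
    labels = np.argmin(np.abs(cards[:, None] - centers[None, :]), axis=1)
    new_centers = np.array([cards[labels == j].mean() for j in range(len(centers))])
    return labels, new_centers

rounds = 0
while True:
    labels, new_centers = one_round(cards, centers)
    rounds += 1
    if np.array_equal(new_centers, centers):
        break  # the piles no longer change, so the process has converged
    centers = new_centers

print("rounds played:", rounds)
print("pile of each card:", labels)
print("final centers:", centers)
```

With these made-up values the second round reproduces the piles of the first, which is the stopping signal.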

2). * Write the k-means algorithm from scratch, cluster the iris petal length data, and display the result with a scatter plot. (Bonus points)

PS: Our artificial intelligence teacher taught this algorithm before, so the code is basically the same.

Source code:   

# Import the data set
from sklearn.datasets import load_iris
import numpy as np

data = load_iris().data  # all four attributes of the 150 samples
n = len(data)            # total number of samples
m = data.shape[1]        # number of attributes per sample
k = 3                    # number of cluster centers
dist = np.zeros([n, k + 1])  # distance matrix; the last column stores each sample's cluster

center = data[:k, :].copy()  # use the first k samples as the initial centers
center_new = np.zeros([k, m])
while True:
    # Assignment step: distance from every sample to every center
    for i in range(n):
        for j in range(k):
            dist[i, j] = np.sqrt(np.sum((data[i, :] - center[j, :]) ** 2))
        dist[i, k] = np.argmin(dist[i, :k])  # index of the nearest center

    # Update step: each new center is the mean of the samples assigned to it
    for i in range(k):
        index = dist[:, k] == i  # boolean mask of the samples in cluster i
        center_new[i, :] = data[index, :].mean(axis=0)

    # Stop when the centers no longer move
    if np.all(center == center_new):
        break
    else:
        center = center_new.copy()  # copy, otherwise both names share one array
print("Classification of 150 samples\n", dist[:, k])

 

Figure 4 Clustering results
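The same loop can also be written with NumPy broadcasting and applied to the petal length column only. The sketch below is one possible variant, not the version taught in class: the function name `kmeans`, the `max_iter` cap, and the empty-cluster handling are assumptions added here. The empty-cluster guard matters for a single column, because the first three petal lengths in the iris data are 1.4, 1.4 and 1.3, so two of the initial centers coincide.

```python
import numpy as np
from sklearn.datasets import load_iris

def kmeans(data, k, max_iter=100):
    """Plain k-means using the first k samples as initial centers (hypothetical helper)."""
    center = data[:k].astype(float).copy()
    for _ in range(max_iter):
        # Distances from every sample to every center, shape (n, k).
        dist = np.linalg.norm(data[:, None, :] - center[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        center_new = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else center[j]
            for j in range(k)
        ])
        if np.allclose(center, center_new):
            break
        center = center_new
    return labels, center

# Petal length is column 2 of the iris data; keep it 2-D with shape (150, 1).
petal_length = load_iris().data[:, 2].reshape(-1, 1)
labels, center = kmeans(petal_length, k=3)
print("cluster of each sample:\n", labels)
print("final centers:\n", center)
```

Passing `c=labels` to `plt.scatter` then colors the points by cluster. Because the starting centers are simply the first samples, the result depends on the order of the data and may differ from what `sklearn.cluster.KMeans` returns.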

3). Use sklearn.cluster.KMeans to cluster the iris petal length data and display the result with a scatter plot.

Source code:

from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

iris = load_iris()
data = iris.data[:, 2]      # petal length is column 2 (column 1 is sepal width)
x = data.reshape(-1, 1)     # KMeans expects a 2-D array of shape (n_samples, n_features)
y = KMeans(n_clusters=3)    # build a KMeans model with 3 centroids
y.fit(x)                    # train the model, i.e. compute the k-means clustering

y_pre = y.predict(x)        # assign each sample to the nearest cluster center found by fit

# Petal length is plotted against itself, so the points lie on a diagonal, colored by cluster
plt.scatter(x[:, 0], x[:, 0], c=y_pre, s=50, cmap='rainbow')
plt.show()

 

 

 

 

Figure 5 Code and scatter plot
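After fitting, the model object also exposes what it learned. The sketch below shows how the cluster centers and the within-cluster sum of squares might be inspected; `random_state=0` and `n_init=10` are assumptions added here only to make the run repeatable and are not part of the code above.

```python
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

# Petal length is column 2 of iris.data; KMeans expects a 2-D array.
x = load_iris().data[:, 2].reshape(-1, 1)

# random_state and n_init are fixed here only for repeatability (assumed values).
model = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = model.fit_predict(x)  # fit the model and return the cluster index of each sample

print("cluster centers:\n", model.cluster_centers_)  # shape (3, 1)
print("inertia (within-cluster sum of squares):", model.inertia_)
print("samples per cluster:", [int((labels == j).sum()) for j in range(3)])
```

The cluster numbers themselves are arbitrary, so only the grouping is meaningful, not which cluster is called 0, 1 or 2.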

4). Cluster the complete iris data set and display the result with a scatter plot.

Source code:

from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

iris = load_iris()
x = iris.data               # all four attributes
y = KMeans(n_clusters=3)    # build a KMeans model with 3 centroids
y.fit(x)                    # train the model, i.e. compute the k-means clustering
y_pre = y.predict(x)        # assign each sample to the nearest cluster center found by fit
print("prediction result:\n", y_pre)
# Plot petal length (column 2) against petal width (column 3), colored by cluster
plt.scatter(x[:, 2], x[:, 3], c=y_pre, s=100, cmap='rainbow', alpha=0.5)
plt.show()

  

Figure 6 Predicted results and their scatter plots
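One way to read the predicted result is to count how the three known species fall into the three clusters. This check is not part of the assignment; it is a small sketch that uses `iris.target` only for comparison, with `random_state=0` and `n_init=10` assumed for repeatability.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

iris = load_iris()
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(iris.data)

# table[s, c] = number of samples of species s that landed in cluster c.
table = np.zeros((3, 3), dtype=int)
for species, cluster in zip(iris.target, labels):
    table[species, cluster] += 1

print(iris.target_names)  # row order of the table
print(table)
```

Each row sums to 50 because the data set has 50 samples per species. A row concentrated in one column means that species was recovered almost as a single cluster.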

5). Think about it: what can the k-means algorithm be used for?

Answer: k-means is one of the best-known clustering algorithms and, because it is simple and efficient, one of the most widely used. Given a set of data points and a number of clusters k specified by the user, the algorithm repeatedly assigns the points to k clusters according to a distance function and updates the cluster centers until the assignment stops changing. In everyday life it can be used to group things by certain characteristics, for example assessing the level of the Chinese football team from previous years' data, estimating this year's seed quality from previous years' seed quality, general data classification, and so on.
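The phrase "according to a distance function" can be made concrete with the assignment step alone. The sketch below uses hypothetical 2-D points and centers, chosen so that Euclidean and Manhattan distance disagree about one point; the function names are illustrative only. Note that averaging the members of a cluster, as k-means does, corresponds to squared Euclidean distance, so swapping the distance generally calls for a different update rule as well.

```python
import numpy as np

def euclidean(a, b):
    """Straight-line distance along the last axis."""
    return np.sqrt(np.sum((a - b) ** 2, axis=-1))

def manhattan(a, b):
    """Sum of absolute coordinate differences along the last axis."""
    return np.sum(np.abs(a - b), axis=-1)

def assign(points, centers, distance=euclidean):
    """Return, for each point, the index of the nearest center under `distance`."""
    d = distance(points[:, None, :], centers[None, :, :])  # shape (n_points, n_centers)
    return d.argmin(axis=1)

# Hypothetical points and centers, picked so the two distances disagree on the first point.
points = np.array([[0.0, 0.0], [4.0, 0.0]])
centers = np.array([[3.0, 3.0], [5.0, 0.0]])

print("euclidean:", assign(points, centers))
print("manhattan:", assign(points, centers, manhattan))
```

The first point is about 4.24 from the first center and 5 from the second in Euclidean terms, but 6 and 5 in Manhattan terms, so it changes cluster when the distance function changes.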

Origin www.cnblogs.com/lcj170/p/12709350.html