3. K-means algorithm

1. Class practice

# Class practice
from sklearn.datasets import load_iris

# Import the iris data
iris = load_iris()
iris
iris.keys()
data = iris['data']  # iris data
target = iris.target  # labels: which species each flower belongs to
iris.feature_names  # feature names: sepal length, sepal width, petal length, petal width
# 'sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)'

2. Homework

1). Work through the k-means clustering process by hand with playing cards: more than 30 cards, 3 clusters.

Pass                    Cluster 1   Cluster 2   Cluster 3
First assignment
  centers               1           8           13
  sum                   18          127         86
  mean                  18/8        127/18      86/7
Second assignment
  centers               2.25        7.05        12.28
  sum                   18          107         106
  mean                  18/8        107/16      106/9
Third assignment
  centers               2.25        6.68        11.77
  sum                   18          107         106
  mean                  18/8        107/16      106/9
Final cluster centers   2.25        6.68        11.77
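
The update step in the figures above is plain arithmetic: each new center is the sum of the card values in a cluster divided by the number of cards in it. The sketch below only re-checks those sums and counts. The names `first_pass`, `second_pass`, `truncate2` and `new_centers` are made up for illustration, and truncating to two decimals is an assumption about how the centers were written down.

```python
import math

# (sum of card values, number of cards) per cluster, copied from the figures above
first_pass = [(18, 8), (127, 18), (86, 7)]
second_pass = [(18, 8), (107, 16), (106, 9)]

def truncate2(x):
    """Truncate to two decimals (assumed to be how the centers were recorded)."""
    return math.floor(x * 100) / 100

def new_centers(pass_totals):
    """New center of each cluster = sum of its cards / number of its cards."""
    return [truncate2(total / count) for total, count in pass_totals]

centers_after_first = new_centers(first_pass)
centers_after_second = new_centers(second_pass)
print(centers_after_first)
print(centers_after_second)
```

Both passes account for the same 33 cards with a total value of 231, and the third pass reproduces the sums of the second, which is why the centers stop moving.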

2). * Write the k-means algorithm yourself, cluster the iris petal length data with it, and display the result with a scatter plot. (Bonus points)
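
The original post gives no answer for this item. As a rough idea of what a hand-written version could look like, here is a minimal sketch of k-means on one-dimensional data in plain Python. The function names `assign` and `kmeans_1d`, the starting centers, and the short list of sample values (chosen to look like petal lengths in cm) are all assumptions for illustration, not the author's solution.

```python
def assign(values, centers):
    """Assignment step: index of the nearest center for each value."""
    return [min(range(len(centers)), key=lambda j: abs(v - centers[j]))
            for v in values]

def kmeans_1d(values, centers, max_iter=100):
    """Alternate assignment and mean-update steps until the centers stop moving."""
    centers = list(centers)
    for _ in range(max_iter):
        labels = assign(values, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # An empty cluster keeps its previous center
            new_centers.append(sum(members) / len(members) if members else old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(values, centers)

# Hypothetical sample values, not the full iris data set
sample = [1.4, 1.3, 1.5, 1.7, 4.7, 4.5, 4.9, 4.0, 6.0, 5.1, 5.9, 5.6]
centers, labels = kmeans_1d(sample, centers=[1.0, 4.0, 6.0])
print(centers)
print(labels)
```

To apply this to the real data, the petal length column of the iris data could be passed in as `values`, and the returned labels could be used as the `c=` argument of `plt.scatter`. The result depends on the starting centers, so different starting points may give different clusters.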

3). Use sklearn.cluster.KMeans to cluster the iris petal length data and display the result with a scatter plot.

from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Import the iris data
iris = load_iris()
data = iris['data']  # iris data
petal = data[:, 2]  # petal length data
# Reshape into a single column; -1 lets NumPy infer the number of rows
X_petal = petal.reshape(-1, 1)
model1 = KMeans(n_clusters=3)  # build a model with 3 cluster centers
model1.fit(X_petal)  # train the model
Y_petal = model1.predict(X_petal)  # after training, predict the cluster of each petal length
# c sets the color of each point, cmap supplies the color map
# x-axis: petal length; y-axis: predicted cluster
plt.rcParams['font.sans-serif'] = ['SimHei']  # so that Chinese labels display correctly
plt.scatter(X_petal[:, 0], Y_petal, c=Y_petal, cmap="rainbow")
plt.xlabel("petal length (cm)")
plt.ylabel("iris classification")
# Note: k-means numbers its clusters arbitrarily, so cluster 0 is not necessarily setosa
plt.yticks(range(3), labels=['setosa', 'versicolor', 'virginica'])
plt.show()
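
The tick labels in the code above pair cluster 0, 1, 2 with setosa, versicolor, virginica, but k-means assigns cluster numbers arbitrarily, so that pairing may not hold on a given run. One way to check, sketched here with a hypothetical helper `majority_label_per_cluster` and made-up toy arrays, is to look up the most common true label inside each cluster.

```python
from collections import Counter

def majority_label_per_cluster(cluster_ids, true_labels):
    """Map each cluster index to the most common true label among its members."""
    mapping = {}
    for cluster in sorted(set(cluster_ids)):
        members = [t for c, t in zip(cluster_ids, true_labels) if c == cluster]
        mapping[cluster] = Counter(members).most_common(1)[0][0]
    return mapping

# Hypothetical toy data: predicted cluster ids and true species labels
cluster_ids = [1, 1, 1, 0, 0, 2, 2, 2, 0]
true_labels = [0, 0, 0, 1, 1, 2, 2, 1, 1]
mapping = majority_label_per_cluster(cluster_ids, true_labels)
print(mapping)
```

With the iris example, `Y_petal` and `iris.target` would take the place of the toy arrays, and the resulting mapping could be used to order the tick labels.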

 

4). Cluster the complete iris data and display the result with a scatter plot.

from sklearn.datasets import load_iris
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Import the iris data
iris = load_iris()
X_iris = iris.data  # complete iris data (all four features)
model = KMeans(n_clusters=3)  # build a model with 3 cluster centers
model.fit(X_iris)  # train the model
Y_iris = model.predict(X_iris)  # after training, predict the cluster of every sample
# Plot petal length against petal width, colored by predicted cluster
plt.scatter(X_iris[:, 2], X_iris[:, 3], c=Y_iris, cmap="rainbow")
plt.xlabel("petal length (cm)")
plt.ylabel("petal width (cm)")
plt.show()

 

5). Think about it: what is the k-means algorithm used for?

The k-means algorithm is a clustering algorithm: it groups data into clusters without needing labels.

In practice it can support market segmentation: customers are divided into segments so that marketing and service can be tailored to each group.

It can also be used in social network analysis, where the interactions between people are examined to find groups of people who are closely related to each other.


Origin www.cnblogs.com/cyxxixi/p/12709892.html