K-Means algorithm and code implementation

1. K-Means algorithm

  The K-Means algorithm, also known as the k-means or k-average algorithm, is a widely used clustering algorithm. It is an unsupervised, similarity-based algorithm that uses the distance between data objects as its measure of similarity: the smaller the distance between two data objects, the higher their similarity and the more likely they belong to the same cluster. It is called K-Means because it finds k different clusters, and the center of each cluster is computed as the mean of the values the cluster contains.
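As a small illustration of the two quantities mentioned above, the distance between data objects and the mean of a cluster, here is a minimal Python sketch. It assumes Euclidean distance (the text does not name a specific metric) and represents each data object as a tuple of numbers; the function names `distance` and `cluster_mean` are hypothetical and chosen only for this example.

```python
import math


def distance(a, b):
    """Euclidean distance between two data objects (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_mean(points):
    """Center of a cluster: the coordinate-wise mean of its objects."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


print(distance((0, 0), (3, 4)))                  # 5.0
print(cluster_mean([(1.0, 1.0), (3.0, 5.0)]))    # (2.0, 3.0)
```

A smaller `distance` value means two objects are more similar, so they are more likely to end up in the same cluster.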

2. The concept of clustering

  Clustering means that the categories of the samples are not given in advance. Instead, you decide the number of classes according to your needs, and the samples are then assigned to those categories. Take garbage as an example: given a pile of garbage, it can be clustered into two piles, recyclable and non-recyclable, or into three piles, recyclable, non-recyclable, and kitchen waste. Items that are clustered into the same pile under a given criterion can be understood as similar to one another.

3. The idea of the K-Means algorithm

1. Randomly select k data objects from the data set as the initial centers of the k clusters, so that each cluster corresponds to one data object;
2. Assign each of the remaining data objects to the nearest cluster, according to its distance from the center of each cluster;
3. Update the center of each cluster (i.e., recompute the mean of all objects within each cluster, then reassign each data object);
4. Stop when the criterion function converges or the cluster centers no longer change; otherwise go to step 3.
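
The steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the original listing: it uses Euclidean distance, takes the criterion function to be the sum of squared distances from each object to its cluster center, and lets an empty cluster keep its previous center. The names `kmeans`, `sse`, `max_iter`, and `seed` are hypothetical.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster a list of numeric tuples into k groups (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 1: pick k distinct data objects as the initial cluster centers.
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(max_iter):
        # Step 2: assign every object to the cluster with the nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Step 3: recompute each center as the mean of its cluster.
        # Assumption: an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        # Step 4: stop once the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    # If max_iter is hit first, clusters reflect the last assignment pass.
    return centers, clusters


def sse(centers, clusters):
    """Assumed criterion: sum of squared distances to the cluster centers."""
    return sum(
        math.dist(p, c) ** 2
        for c, cluster in zip(centers, clusters)
        for p in cluster
    )


# Two well-separated groups of three points each (made-up sample data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, clusters = kmeans(points, k=2)
print(sorted(centers))
print(sse(centers, clusters))
```

Because the initial centers are chosen at random, different seeds can lead to different results on less clearly separated data; running the algorithm several times and keeping the result with the lowest `sse` is a common way to reduce this effect.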

4. The code implementation

 

 

 

Run result:

 

 

Origin www.cnblogs.com/wukuanglin/p/11488042.html