K-means clustering algorithm

K-means is one of the simpler clustering algorithms, and it is also widely used. This is a brief explanation, written mainly as a note to myself for later review.

The main idea of K-means clustering is to make the points in each cluster as close as possible to that cluster's center.
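Stated a little more formally, "as close as possible" is usually taken to mean minimizing the within-cluster sum of squared distances. The formula is not part of the original note; it is the standard objective, written here with the same symbols as the algorithm description (k clusters C_i with centers u_i), and E is a name introduced only for illustration:

```latex
E = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - u_i \rVert_2^2,
\qquad
u_i = \frac{1}{|C_i|} \sum_{x \in C_i} x
```

A smaller E means the points in each cluster sit closer to their center. The iterative procedure never increases E, but it is only guaranteed to reach a local minimum, which is why the random choice of initial centers matters.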

The algorithm for K-means clustering can be described as:


Input: dataset D = {x_1, x_2, ..., x_n}
Number of clusters: k


Algorithm:

  1. Randomly select k samples from dataset D as the initial cluster centers {u_1, u_2, ..., u_k}
  2. repeat
  3. let C_i = ∅ (1 ≤ i ≤ k)
  4. for j = 1, 2, ..., n do
  5. for i = 1, 2, ..., k do
  6. calculate the distance between sample x_j and cluster center u_i
  7. end for
  8. add sample x_j to the cluster C_i whose center u_i is nearest
  9. end for
  10. recalculate the center of every cluster, that is, its mean vector
  11. until the cluster centers no longer change (convergence can take a long time, so a maximum number of iterations may be set, or iteration may stop once the change in the cluster centers falls below a threshold ϵ)

Output: clusters {C_1, C_2, ..., C_k}
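The procedure above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name kmeans and the parameters max_iter, eps and seed are assumptions made for this sketch, Euclidean distance is assumed as the distance measure, and an empty cluster simply keeps its previous center, a case the description above does not cover.

```python
import math
import random


def kmeans(points, k, max_iter=100, eps=1e-9, seed=None):
    """Cluster `points` (a list of equal-length tuples) into k groups.

    Returns (clusters, centers). All parameter names are illustrative.
    """
    rng = random.Random(seed)
    # Pick k distinct samples as the initial cluster centers.
    centers = rng.sample(points, k)
    clusters = [[] for _ in range(k)]

    for _ in range(max_iter):
        # Start every iteration with empty clusters.
        clusters = [[] for _ in range(k)]

        # Assign each sample to the cluster whose center is nearest
        # (Euclidean distance is an assumption of this sketch).
        for x in points:
            nearest = min(range(k), key=lambda i: math.dist(x, centers[i]))
            clusters[nearest].append(x)

        # Recompute each center as the mean vector of its cluster.
        # An empty cluster keeps its old center (assumption).
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[i])

        # Stop once no center has moved by more than eps.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= eps:
            break

    return clusters, centers


# Usage: two well-separated groups on a line.
data = [(0.0,), (1.0,), (10.0,), (11.0,)]
clusters, centers = kmeans(data, k=2, seed=0)
print(sorted(centers))  # [(0.5,), (10.5,)]
```

Both stopping rules mentioned in the last step appear here: max_iter caps the number of repetitions, and eps plays the role of the threshold ϵ on how far the centers moved. On this tiny dataset every choice of initial centers ends in the same two clusters, but in general different seeds can give different results.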


