"Statistical Learning Methods", Chapter 14: Clustering, k-means

k-means clustering

k-means clustering assigns n samples to k different classes (clusters); each sample belongs to the class whose center is nearest to it.

Each sample can belong to only one class, so k-means clustering is hard clustering.

Model

  • k < n
  • G_{i} \cap G_{j} = \varnothing (i \neq j), \bigcup_{i=1}^{k}G_{i} = X

Strategy

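  • Distance: squared Euclidean distance
  • Loss function: the sum of the distances from each sample to the center of the class it belongs to
  • Finding the partition that minimizes this loss is an NP-hard problem, so it is solved iteratively in practice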
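Written out, the loss could look like the sketch below. It assumes squared Euclidean distance and reuses the classes G_{l} from the model; the symbols W and \bar{x}_{l} (the mean of class G_{l}) are notation introduced here for illustration and do not appear in the notes above.

```latex
W = \sum_{l=1}^{k} \sum_{x_{i} \in G_{l}} \lVert x_{i} - \bar{x}_{l} \rVert^{2},
\qquad
\bar{x}_{l} = \frac{1}{|G_{l}|} \sum_{x_{i} \in G_{l}} x_{i}
```

The strategy is then to choose the partition G_{1}, \dots, G_{k} that makes W as small as possible.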

Algorithm

Minimize the objective function iteratively:

  1. Initialization: randomly select k samples as the initial class centers.
  2. Cluster the samples: compute the distance from each sample to every class center and assign each sample to the class of its nearest center.
  3. Compute the new class centers: take the mean of the samples in each class as that class's new center.
  4. If the iteration has converged or the stopping condition is met, output the result. Otherwise, let t = t + 1 and return to step 2.
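The four steps can be sketched in plain Python as follows. This is only an illustrative sketch, not the implementation from the linked repository. The function names (`kmeans`, `squared_distance`, `mean_point`), the `max_iter` and `seed` parameters, the use of squared Euclidean distance, and the choice of "assignments stopped changing" as the convergence test are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(samples, k, max_iter=100, seed=0):
    """Run the four steps and return (centers, labels)."""
    rng = random.Random(seed)
    # Step 1: pick k distinct samples as the initial centers.
    centers = [tuple(p) for p in rng.sample(samples, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each sample to the class of its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(x, centers[j]))
            for x in samples
        ]
        # Step 4: stop once the assignment no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each center as the mean of its class.
        for j in range(k):
            members = [x for x, label in zip(samples, labels) if label == j]
            if members:  # keep the old center if a class became empty
                centers[j] = mean_point(members)
    return centers, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, labels = kmeans(data, k=2)
    print(centers)
    print(labels)
```

Because the initial centers are chosen at random, different seeds can lead to different local minima on harder data; rerunning with several seeds and keeping the result with the smallest loss is a common workaround.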

Source: github.com/iOSDevLog/s...

Reproduced in: https://juejin.im/post/5cfe7baef265da1b6a348a19

Origin blog.csdn.net/weixin_34406061/article/details/91443530