KMeans Clustering

#lannister/machinelearning

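The k-means algorithm takes k as its parameter and partitions n objects into k clusters so that similarity is high within a cluster and low between clusters. It proceeds as follows:
1. Randomly select k points as the initial cluster centers.
2. Assign each remaining point to the cluster whose center is nearest to it.
3. For each cluster, compute the mean of all its points and use it as the new cluster center.
4. Repeat steps 2 and 3 until the cluster centers no longer change.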
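
Steps 2 and 3 can be written as two update rules. The notation is chosen here for illustration and is not from the original notes: x^{(i)} is the i-th point, \mu_k is the center of cluster k, c^{(i)} is the index of the cluster assigned to point i, and Euclidean distance is assumed, as in the code further down.

```latex
\[
\begin{aligned}
c^{(i)} &:= \arg\min_{k \in \{1,\dots,K\}} \bigl\lVert x^{(i)} - \mu_k \bigr\rVert^2
  && \text{(step 2: assign to nearest center)} \\
\mu_k &:= \frac{1}{\lvert C_k \rvert} \sum_{i \in C_k} x^{(i)},
  \qquad C_k = \{\, i : c^{(i)} = k \,\}
  && \text{(step 3: recompute the mean)}
\end{aligned}
\]
```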

Clustering looks for relations or connections in data that has no labels.
K-means is one of the most widely used clustering algorithms.

K-means for non-separated clusters (T-shirt sizing)

  • Find closest centroids

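    function idx = findClosestCentroids(X, centroids)
        % idx(i) is the index of the centroid closest to example X(i,:)
        K = size(centroids, 1);
        distance = zeros(K, 1); % scratch: distance from one example to each centroid
        idx = zeros(size(X,1), 1);
        for i = 1:size(X, 1)
            for k = 1:K
                % Euclidean distance between example i and centroid k
                distance(k) = sqrt(sum((X(i,:)-centroids(k,:)).^2));
            end
            [~, index] = min(distance); % only the position of the minimum is needed
            idx(i) = index;
        end
    end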
  • Compute Means

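    function centroids = computeCentroids(X, idx, K)
        % Row k of centroids is the mean of the examples assigned to centroid k
        n = size(X, 2);
        centroids = zeros(K, n);
        for k = 1:K
            members = (idx == k); % logical mask of the examples in cluster k
            % sum(members) is the number of examples assigned to kth centroid;
            % an empty cluster yields 0/0 = NaN here
            centroids(k,:) = sum(X(members,:), 1) / sum(members);
        end
    end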
  • Randomly initialize cluster centroids

    centroids = zeros(K, size(X, 2));
    randidx = randperm(size(X, 1));
    % Take the first K examples as centroids
    centroids = X(randidx(1:K), :);
  • K-Means Clustering on Pixels

    % Run K-Means
    for i = 1:max_iters
        % Output progress
        fprintf('K-Means iteration %d/%d...\n', i, max_iters);
        if exist('OCTAVE_VERSION')
            fflush(stdout); % Octave buffers output; flush so progress appears immediately
        end
        % For each example in X, assign it to the closest centroid
        idx = findClosestCentroids(X, centroids);
        % Given the memberships, compute new centroids
        centroids = computeCentroids(X, idx, K);
    end
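
To see the assignment and update steps work together, here is a minimal self-contained sketch on toy data. The data points, the fixed starting centroids, and the early-stop check are assumptions made for illustration (the loop above always runs max_iters iterations); the stop condition follows step 4 of the algorithm description. Everything is inlined so the snippet runs on its own in Octave or MATLAB.

```octave
% Toy data: two well-separated groups in 2-D (made up for illustration).
X = [1.0 1.0; 1.2 0.8; 0.8 1.1; 5.0 5.0; 5.2 4.9; 4.8 5.1];
K = 2;
max_iters = 10;

% Fixed initial centroids keep the run reproducible; the random
% initialization shown earlier would normally replace this line.
centroids = X([1 4], :);

for iter = 1:max_iters
    % Assignment step: squared Euclidean distance from every point
    % to every centroid, one column per centroid.
    dist2 = zeros(size(X, 1), K);
    for k = 1:K
        delta = X - repmat(centroids(k, :), size(X, 1), 1);
        dist2(:, k) = sum(delta .^ 2, 2);
    end
    [~, idx] = min(dist2, [], 2);

    % Update step: mean of the points assigned to each centroid.
    new_centroids = centroids;
    for k = 1:K
        members = (idx == k);
        if any(members)   % an empty cluster keeps its previous centroid
            new_centroids(k, :) = mean(X(members, :), 1);
        end
    end

    % Stop once the centroids no longer change.
    if isequal(new_centroids, centroids)
        break;
    end
    centroids = new_centroids;
end

disp(idx');
disp(centroids);
```

Comparing squared distances gives the same nearest centroid as comparing Euclidean distances, so the square root can be skipped in the assignment step.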

Reposted from www.cnblogs.com/dajunjun/p/11710952.html