A C++ Implementation of the Kmeans Clustering Algorithm for Image Processing

Kmeans is a very common clustering algorithm. Given the number of clusters N, Kmeans automatically finds N centroids in the sample data and thereby divides the sample data into N categories. The following briefly introduces the principle of Kmeans clustering and links to a Kmeans implementation that I wrote myself.

1. Kmeans principle

  1. Input: a set of data (data), the number of categories to cluster into (ClusterCnt), the maximum number of iterations (IterCnt), and the iteration cut-off accuracy (eps).

      Output: a label for each item of data (range 0 ~ ClusterCnt-1), indicating which category that item belongs to.

  2. First, select the initial positions of the ClusterCnt centroids. This choice matters: as a general principle, the centroids should be as far apart from each other in value as possible (the larger the distance, the better). The lazy way is to select them at random, or to take the first ClusterCnt data items as the initial centroids, provided that no two initial centroids have the same value.

  3. Start clustering, which is an iterative process. For each data item, calculate the distance (difference) between it and every centroid, pick the centroid with the smallest distance, and assign the item to that centroid's category by giving it the same label value. Traverse all the data in turn, so that after the first iteration every data item has a label value.

  4. Calculate the new centroids. After each iteration is completed, calculate the mean of the data in each category and use this mean as that category's new centroid for the next round. The centroids are recalculated in this way after every iteration until the conditions in step 5 are met.

  5. After each iteration, calculate the variance of the values in each category, then take the mean of the variances of all categories, var, as the convergence criterion. When the change between the current var and the previous var is less than eps, or when the number of iterations exceeds IterCnt, the iteration stops and the clustering is complete.

  6. Output the label of each data item. Items with the same label value have been clustered into one category by kmeans, so that all data are divided into the ClusterCnt categories that were set.
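The six steps above can be sketched roughly as follows. This is only a minimal illustration, not the implementation in my repository: the names Sample, sqDist, initCentroids and kmeansSketch are made up for this example, the samples are assumed to be 3-component values (such as RGB), the initial centroids are chosen the "lazy" way from step 2, and the variance in step 5 is measured around the centroids used for the current assignment, which is a simplification.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Sample = std::array<double, 3>;  // hypothetical sample type, e.g. one pixel's R, G, B

// Squared Euclidean distance between two samples.
double sqDist(const Sample& a, const Sample& b) {
    double s = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        s += (a[d] - b[d]) * (a[d] - b[d]);
    }
    return s;
}

// Step 2, "lazy" variant: the first clusterCnt distinct samples become the centroids.
std::vector<Sample> initCentroids(const std::vector<Sample>& data, int clusterCnt) {
    std::vector<Sample> centroids;
    for (const Sample& s : data) {
        if (static_cast<int>(centroids.size()) >= clusterCnt) break;
        bool duplicate = false;
        for (const Sample& c : centroids) {
            if (c == s) { duplicate = true; break; }
        }
        if (!duplicate) centroids.push_back(s);
    }
    return centroids;  // may hold fewer entries if the data has too few distinct values
}

// Returns one label per sample, each in the range 0 ~ clusterCnt-1.
std::vector<int> kmeansSketch(const std::vector<Sample>& data, int clusterCnt,
                              int iterCnt, double eps) {
    assert(clusterCnt > 0);
    std::vector<Sample> centroids = initCentroids(data, clusterCnt);
    const std::size_t k = centroids.size();
    std::vector<int> labels(data.size(), 0);
    if (k == 0) return labels;  // no data, nothing to cluster

    double lastVar = std::numeric_limits<double>::max();
    for (int iter = 0; iter < iterCnt; ++iter) {
        std::vector<Sample> sum(k, Sample{});
        std::vector<double> sqSum(k, 0.0);
        std::vector<std::size_t> count(k, 0);

        // Step 3: give every sample the label of its nearest centroid.
        for (std::size_t i = 0; i < data.size(); ++i) {
            std::size_t best = 0;
            double bestDist = sqDist(data[i], centroids[0]);
            for (std::size_t c = 1; c < k; ++c) {
                const double dist = sqDist(data[i], centroids[c]);
                if (dist < bestDist) { bestDist = dist; best = c; }
            }
            labels[i] = static_cast<int>(best);
            for (std::size_t d = 0; d < 3; ++d) sum[best][d] += data[i][d];
            sqSum[best] += bestDist;
            ++count[best];
        }

        // Step 5: mean over all categories of the per-category variance
        // (measured around the centroids used above; a simplification).
        double var = 0.0;
        for (std::size_t c = 0; c < k; ++c) {
            if (count[c] > 0) var += sqSum[c] / static_cast<double>(count[c]);
        }
        var /= static_cast<double>(k);

        // Step 4: the mean of each category becomes its new centroid.
        // An empty category keeps its previous centroid.
        for (std::size_t c = 0; c < k; ++c) {
            if (count[c] == 0) continue;
            for (std::size_t d = 0; d < 3; ++d) {
                centroids[c][d] = sum[c][d] / static_cast<double>(count[c]);
            }
        }

        if (std::fabs(var - lastVar) < eps) break;  // var has stopped changing
        lastVar = var;
    }
    return labels;  // Step 6
}
```

For example, calling kmeansSketch(data, 2, 20, 1e-6) on four samples that form two well-separated pairs should give both members of each pair the same label. Because the lazy initialization depends on the order of the data, a real implementation would probably want a better way to spread the initial centroids.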

 

2. Application to images

  Applying the kmeans algorithm to the classification of image pixels is straightforward: use the RGB value of each pixel as the input data, calculate the distance between each pixel and the centroids, and iterate until the stopping condition is met, at which point every pixel has a label value. Then, following the resulting label image, paint all pixels of the same category in one color and pixels of different categories in different colors. This can be used for image segmentation and similar tasks.
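The coloring step can be sketched as below. It assumes the labels have already been produced by some kmeans run and that the image is stored as a flat list of pixels; the names Rgb and colorizeLabels and the idea of passing a palette with one color per category are made up for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Rgb = std::array<std::uint8_t, 3>;  // hypothetical pixel type: R, G, B

// Builds the segmented image: every pixel takes the palette color of its label.
// labels[i] is the category of pixel i; palette[c] is the color for category c.
std::vector<Rgb> colorizeLabels(const std::vector<int>& labels,
                                const std::vector<Rgb>& palette) {
    std::vector<Rgb> out(labels.size(), Rgb{});
    for (std::size_t i = 0; i < labels.size(); ++i) {
        const int label = labels[i];
        // Precondition: every label must index into the palette.
        assert(label >= 0 && static_cast<std::size_t>(label) < palette.size());
        out[i] = palette[static_cast<std::size_t>(label)];
    }
    return out;
}
```

A natural choice for the palette is the final centroid of each category, so that every region is painted in its average color. Any set of clearly different colors works as well.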

  OpenCV also provides an API for the Kmeans algorithm (cv::kmeans). As shown in the figure below, three flags can be set for the selection of the initial centroids: random selection (KMEANS_RANDOM_CENTERS), selection by the kmeans++ algorithm (KMEANS_PP_CENTERS), and user-supplied initial labels (KMEANS_USE_INITIAL_LABELS). For specific usage, please refer to the OpenCV documentation.

  

3. Examples

Original image, and the result of kmeans clustering (10 categories)

   

4. Code

  See my code on Gitee: https://gitee.com/rxdj/myKmeans.git
