Image segmentation method based on the K-Means Clustering

"""K-Means image segmentation."""
import numpy as np
import PIL.Image as image
from sklearn.cluster import KMeans

# loadData reads an image and returns its pixels as training samples
def loadData(filePath):
    f = open(filePath, 'rb')  # open the file in binary mode
    data = []
    # convert to RGB so that RGBA or grayscale files also yield 3 channels
    img = image.open(f).convert("RGB")
    m, n = img.size  # width and height of the image
    for i in range(m):
        for j in range(n):
            x, y, z = img.getpixel((i, j))
            # scale each channel to the range 0-1 and save it to data
            data.append([x / 256.0, y / 256.0, z / 256.0])
    f.close()
    # np.array instead of np.mat, which newer scikit-learn versions do not accept
    return np.array(data), m, n

# note: row holds the image width and col the image height
imgData, row, col = loadData("./picture/apple.png")
# set the number of cluster centers to 3
label = KMeans(n_clusters=3).fit_predict(imgData)
# reshape so that label[i][j] is the cluster label of pixel (i, j)
label = label.reshape([row, col])
# create a new grayscale image to hold the K-Means result
pic_new = image.new("L", (row, col))
# give each pixel a gray level according to its label
for i in range(row):
    for j in range(col):
        # 255 rather than 256 keeps the value inside the 8-bit range of mode "L"
        pic_new.putpixel((i, j), int(255 / (label[i][j] + 1)))
pic_new.save("./picture/km.jpg", "JPEG")

K-Means algorithm:

K-Means is usually described as an unsupervised clustering algorithm. In unsupervised learning the labels of the training samples are unknown, and the goal is to learn the inherent properties and regularities of the unlabeled samples, which then provide a basis for further data analysis. K-Means measures the similarity between samples with a chosen metric and iteratively updates the cluster centers. When the cluster centers stop moving, or move by less than a threshold, the iteration ends and the samples are divided into different categories. Clustering tries to partition the samples of a data set into several subsets that are usually disjoint. Each subset is called a "cluster", and under this partition each cluster may correspond to some underlying class.
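
The notion of similarity inside a cluster is commonly made precise with a squared-error objective. The notation below is an assumption chosen for this sketch: D is the sample set, C = {C_1, ..., C_k} is its partition into k clusters, and \mu_i is the mean of cluster C_i.

```latex
E = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - \mu_i \rVert_2^2,
\qquad
\mu_i = \frac{1}{|C_i|} \sum_{x \in C_i} x
```

A smaller E means the samples sit closer to their own cluster center. The iterative update does not find the global minimum of E in general; with Euclidean distance each iteration only lowers E or leaves it unchanged, which is why the result depends on the initial centers.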

Algorithm steps:

Randomly select the initial cluster centers
Assign every sample point to its nearest cluster center, using the chosen metric
Compute the mean of the sample points in each cluster and use it as the cluster center for the next iteration
Compute the gap between the new cluster centers and the current ones; if the gap is below the threshold, the iteration ends

Algorithm pseudo-code:

Here, D is the sample set and C is the resulting partition into clusters.
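
The four steps listed earlier can be sketched in plain Python as follows. This is a minimal illustration and not the scikit-learn implementation used in the experiment: the names kmeans and squared_distance, the threshold and max_iter defaults, the fixed seed, and the handling of empty clusters are all assumptions made for this example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(samples, k, threshold=1e-6, max_iter=100, seed=0):
    """Cluster `samples` (a list of tuples) into k clusters.

    Returns (centers, labels). Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Step 1: randomly select k samples as the initial cluster centers.
    centers = rng.sample(samples, k)
    for _ in range(max_iter):
        # Step 2: assign every sample to its nearest center.
        clusters = [[] for _ in range(k)]
        for x in samples:
            nearest = min(range(k), key=lambda i: squared_distance(x, centers[i]))
            clusters[nearest].append(x)
        # Step 3: the mean of each cluster becomes its new center.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members)
                             for d in range(dim))
                new_centers.append(mean)
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[i])
        # Step 4: stop once no center moved by more than the threshold
        # (measured here as squared distance).
        shift = max(squared_distance(c, nc) for c, nc in zip(centers, new_centers))
        centers = new_centers
        if shift < threshold:
            break
    labels = [min(range(k), key=lambda i: squared_distance(x, centers[i]))
              for x in samples]
    return centers, labels


# Usage: two well-separated groups of 2-D points.
D = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, labels = kmeans(D, k=2)
print(labels)
```

Which group receives label 0 and which receives label 1 depends on the random initial centers, so only the grouping itself is meaningful.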

Image segmentation experiment: Image segmentation uses features of the image such as grayscale, color, texture and shape to divide it into several non-overlapping regions, so that these features are similar within a region and differ significantly between regions. Regions with distinctive properties can then be extracted from the segmented image for further study. In this experiment we set n_clusters = 3 for the apple image and n_clusters = 2 for the cat image.

1. Experimental procedure

Create the kms.py script and import the required Python packages
Load the local image and preprocess it
Run the K-Means clustering algorithm
Map the cluster labels to pixels and save the output image
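
The preprocessing and output steps can also be written without per-pixel loops. The sketch below is an alternative formulation under stated assumptions, not the code used in the experiment: it assumes the image is already available as a height x width x 3 uint8 NumPy array (with Pillow this can be obtained via np.asarray(img.convert("RGB"))), and the helper names pixels_to_samples and labels_to_gray are hypothetical. Unlike the loops in the article, it uses row-major (height, width) order.

```python
import numpy as np


def pixels_to_samples(rgb):
    """Flatten an H x W x 3 uint8 array into (H*W) x 3 float samples in [0, 1]."""
    height, width, channels = rgb.shape
    samples = rgb.reshape(height * width, channels).astype(np.float64) / 255.0
    return samples, height, width


def labels_to_gray(labels, height, width):
    """Map cluster labels 0, 1, 2, ... to gray levels 255, 127, 85, ..."""
    labels = np.asarray(labels).reshape(height, width)
    return (255 // (labels + 1)).astype(np.uint8)


# Tiny synthetic 2 x 2 "image" standing in for a real picture.
rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)
samples, h, w = pixels_to_samples(rgb)

# Hand-written labels standing in for the output of KMeans(...).fit_predict(samples).
gray = labels_to_gray([0, 1, 2, 0], h, w)
print(samples.shape, gray.shape)
```

The resulting gray array can be turned back into a Pillow image with Image.fromarray(gray) before saving.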

2. Experimental data

Test image:

3. Experimental results

4. Summary

In this experiment we tried different numbers of cluster centers, which produced different clustering results. Getting the desired result takes several attempts, so having to choose the value of K by hand is a drawback of this method.

During this experiment we ran into the following problems:

(1)IndentationError: unindent does not match any outer indentation level

(2)ValueError: cannot reshape array of size 500 into shape (500,500)

Solution:

Both problems were caused by Python indentation. The line that opens f and the line that closes f must be at the same indentation level, and that misalignment is what raised the error here. An error like (2) usually suggests a problem with the data format, but before investigating the data format, first check that the code is indented correctly.
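
The original failing code is not shown, so the following is only one plausible way in which an indentation slip produces error (2). If the append call sits one level too shallow, it runs once per outer iteration instead of once per pixel, and a 500 x 500 image yields 500 samples instead of 250000. The function names are hypothetical.

```python
def collect_correct(width, height):
    data = []
    for i in range(width):
        for j in range(height):
            pixel = (i, j)
            data.append(pixel)  # inside the inner loop: one entry per pixel
    return data


def collect_misindented(width, height):
    data = []
    for i in range(width):
        for j in range(height):
            pixel = (i, j)
        data.append(pixel)  # one level too shallow: one entry per outer iteration
    return data


print(len(collect_correct(500, 500)))      # 250000
print(len(collect_misindented(500, 500)))  # 500
```

With only 500 labels, reshaping to (500, 500) cannot succeed, which matches the message in (2).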
