Clustering: DBSCAN

First, the algorithm principle

DBSCAN forms clusters based on distance and density.

  1.  Select a starting point. If every point has already been chosen as a starting point or assigned to a cluster, stop.
  2.  Collect all points whose distance to the selected point is below a threshold (eps) into a set.
  3.  If the set from step 2 contains enough points (min_samples), assign those points to a cluster, pick another point in that cluster as the new center, and return to step 2. Otherwise, go back to step 1.
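
The three steps above can be sketched roughly as follows. This is a minimal illustration, not scikit-learn's implementation: the function name simple_dbscan is hypothetical, it builds the full distance matrix (so it only suits small inputs), and it treats "enough points" as at least min_samples neighbors, counting the point itself.

```python
import numpy as np

def simple_dbscan(X, eps, min_samples):
    """Label each row of X with a cluster id; -1 marks noise (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Full pairwise Euclidean distance matrix (fine for small inputs only)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    labels = np.full(n, -1)
    visited = np.zeros(n, dtype=bool)
    cluster_id = 0
    for start in range(n):                 # step 1: pick an unvisited starting point
        if visited[start]:
            continue
        visited[start] = True
        neighbors = np.flatnonzero(dist[start] < eps)   # step 2: points within eps
        if len(neighbors) < min_samples:
            continue                       # too sparse: go back to step 1
        labels[start] = cluster_id
        queue = list(neighbors)
        while queue:                       # step 3: grow the cluster from new centers
            p = queue.pop()
            if labels[p] == -1:
                labels[p] = cluster_id     # unlabeled point joins the cluster
            if visited[p]:
                continue
            visited[p] = True
            p_neighbors = np.flatnonzero(dist[p] < eps)
            if len(p_neighbors) >= min_samples:
                queue.extend(p_neighbors)  # p is dense enough to act as a center
        cluster_id += 1
    return labels
```

Points that never gather enough neighbors keep the label -1, which matches the convention scikit-learn uses for noise.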

 

Second, the code

from scipy.spatial import distance
from sklearn.cluster import dbscan
import numpy as np

# Test helper shipped with scikit-learn that generates clustered sample data
from sklearn.cluster.tests.common import generate_clustered_data

min_samples = 10  # number of neighbors a point needs to seed or extend a cluster
eps = 0.0309      # neighborhood radius, on the normalized distance scale

X = generate_clustered_data(seed=1, n_samples_per_cluster=1000)

# Pairwise distance matrix, scaled so the largest distance is 1
D = distance.squareform(distance.pdist(X))
D = D / np.max(D)
core_samples, labels = dbscan(D, metric="precomputed", eps=eps,
                              min_samples=min_samples)
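
To get a feel for what dbscan returns, the labels can be summarized as below. This is only a sketch: it uses make_blobs as stand-in data rather than the article's X, and the eps and min_samples values are illustrative assumptions chosen for that stand-in data.

```python
import numpy as np
from sklearn.cluster import dbscan
from sklearn.datasets import make_blobs

# Stand-in data (assumption): three well-separated blobs, not the article's X
X, _ = make_blobs(n_samples=300, centers=[[0, 0], [5, 5], [0, 5]],
                  cluster_std=0.3, random_state=0)

# Illustrative parameters for this data, using plain Euclidean distance
core_samples, labels = dbscan(X, eps=0.5, min_samples=10)

# Label -1 marks noise; every other label is a cluster id
n_clusters = len(set(labels.tolist())) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))
print("clusters:", n_clusters, "noise points:", n_noise)
```

core_samples holds the indices of the dense points that seeded or extended a cluster, while labels assigns every input row either a cluster id or -1.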

 

Third, the results

Judging from the resulting plot, this kind of data is actually not a good fit for DBSCAN; k-means does better here. DBSCAN is better suited to ring-shaped clusters, irregularly shaped clusters, or high-density clusters.
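
The point about ring-shaped clusters can be checked with a small experiment. The sketch below is an illustration on synthetic data from make_circles, not the article's dataset, and the parameter values are assumptions picked for that data. It scores both algorithms against the true ring membership with the adjusted Rand index (1.0 means perfect agreement).

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_circles
from sklearn.metrics import adjusted_rand_score

# Two concentric rings (illustrative data, not the article's dataset)
X, y = make_circles(n_samples=1000, factor=0.4, noise=0.03, random_state=0)

# eps is smaller than the gap between the rings, so the rings stay separate
db_labels = DBSCAN(eps=0.15, min_samples=5).fit_predict(X)
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

db_score = adjusted_rand_score(y, db_labels)
km_score = adjusted_rand_score(y, km_labels)
print("DBSCAN adjusted Rand index:", db_score)
print("KMeans adjusted Rand index:", km_score)
```

k-means splits the plane around two centroids, so it cuts across both rings, whereas DBSCAN follows the dense band of each ring.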

Origin www.cnblogs.com/ylxn/p/11822889.html