Machine learning - clustering - hierarchical clustering algorithm notes

Hierarchical clustering method

Hierarchical clustering methods decompose a given data set level by level until some termination condition is met. They fall into two categories:

1) Agglomerative hierarchical clustering: the AGNES algorithm

 A bottom-up strategy: each object starts as its own cluster, and the clusters are then merged into progressively larger clusters until a termination condition is satisfied.

2) Divisive hierarchical clustering: the DIANA algorithm

 A top-down strategy: all objects start in a single cluster, which is then gradually split into smaller and smaller clusters until a termination condition is reached.

The AGNES and DIANA algorithms

1) The AGNES (AGglomerative NESting) algorithm initially treats each object as its own cluster, then merges clusters step by step according to some criterion. In the basic version, the distance between two clusters is the distance between their closest pair of data points, one taken from each cluster. The merging process repeats until the desired final number of clusters is reached.
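
The merging loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: it assumes Euclidean distance, uses the closest-pair (minimum) distance between clusters, and stops at a requested number of clusters k. The names single_link, agnes and the sample points are made up for this example.

```python
import math

def single_link(a, b):
    """Minimum distance: the closest pair of points across clusters a and b."""
    return min(math.dist(p, q) for p in a for q in b)

def agnes(points, k):
    """Merge the two closest clusters until only k clusters remain."""
    clusters = [[p] for p in points]  # every object starts as its own cluster
    while len(clusters) > k:
        # find the pair of clusters with the smallest single-link distance
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs,
                   key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters

# hypothetical sample data: two tight pairs and one isolated point
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
result = agnes(points, 3)
print(result)
```

Recomputing every cluster distance on each pass keeps the sketch short but is slow for large data sets; library implementations cache and update the distances instead.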

2) The DIANA (DIvisive ANAlysis) algorithm is the reverse of the above process and belongs to divisive hierarchical clustering. It first places all objects in a single cluster, then splits that cluster according to some principle (such as the maximum Euclidean distance). Splitting continues until the user-specified number of clusters is reached or the distance between the two closest clusters exceeds a given threshold.
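
A simplified sketch of the top-down process might look like the following. It makes several assumptions not stated in these notes: Euclidean distance, always splitting the cluster with the largest diameter, seeding the new "splinter" group with the point that is farthest on average from the others, and stopping only on a cluster count k (the distance-threshold stopping rule is left out). The names diameter, mean_dist, split and diana are made up for this example.

```python
import math

def diameter(cluster):
    """Largest pairwise Euclidean distance inside a cluster."""
    return max(math.dist(p, q) for p in cluster for q in cluster)

def mean_dist(p, group, n):
    """Sum of the distances from p to every point in group, divided by n."""
    return sum(math.dist(p, q) for q in group) / n

def split(cluster):
    """Split one cluster (two or more points) into a remainder and a splinter group."""
    rest = list(cluster)
    # seed the splinter group with the point farthest, on average, from the others
    seed = max(rest, key=lambda p: mean_dist(p, rest, len(rest) - 1))
    rest.remove(seed)
    splinter = [seed]
    moved = True
    while moved:
        moved = False
        for p in list(rest):
            if len(rest) == 1:
                break  # never empty the remainder
            to_rest = mean_dist(p, rest, len(rest) - 1)  # p itself contributes 0
            to_splinter = mean_dist(p, splinter, len(splinter))
            if to_splinter < to_rest:  # p sits closer to the splinter group
                rest.remove(p)
                splinter.append(p)
                moved = True
    return rest, splinter

def diana(points, k):
    """Start from one cluster and keep splitting until k clusters exist."""
    clusters = [list(points)]
    while len(clusters) < k:
        target = max(clusters, key=diameter)  # split the widest cluster
        if diameter(target) == 0.0:
            break  # nothing left that can be split
        clusters.remove(target)
        clusters.extend(split(target))
    return clusters

# hypothetical sample data: two tight pairs and one isolated point
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
result = diana(points, 3)
print(result)
```

On this small data set the sketch first separates the isolated point and then splits the remaining four points into the two tight pairs.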

Different definitions of the distance between clusters in AGNES

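1) The minimum distance

  The distance between the two nearest samples of the two sets (single linkage); it readily forms chain-like structures

2) The maximum distance

  The distance between the two farthest samples of the two sets (complete linkage); it is unstable when outliers are present

3) The average distance

  The mean of the pairwise distances between the samples of the two sets (average linkage); the related Ward criterion instead works with squared distances and merges the two sets whose union gives the smallest increase in the within-cluster sum of squares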
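
The definitions above can be compared on a tiny example. The sketch below assumes Euclidean distance and uses made-up function names; ward_cost uses the common textbook form of the Ward merge cost, (|A||B| / (|A| + |B|)) times the squared distance between the two centroids, which is an addition to these notes rather than something stated in them.

```python
import math

def min_dist(a, b):
    """Single linkage: distance between the two nearest samples."""
    return min(math.dist(p, q) for p in a for q in b)

def max_dist(a, b):
    """Complete linkage: distance between the two farthest samples."""
    return max(math.dist(p, q) for p in a for q in b)

def avg_dist(a, b):
    """Average linkage: mean of all pairwise distances between the sets."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))

def centroid(cluster):
    """Coordinate-wise mean of the points in a cluster."""
    return tuple(sum(xs) / len(cluster) for xs in zip(*cluster))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if a and b were merged."""
    weight = len(a) * len(b) / (len(a) + len(b))
    return weight * math.dist(centroid(a), centroid(b)) ** 2

# hypothetical clusters on a line, so the distances are easy to check by hand
a = [(0.0, 0.0), (1.0, 0.0)]
b = [(3.0, 0.0), (5.0, 0.0)]
print(min_dist(a, b), max_dist(a, b), avg_dist(a, b), ward_cost(a, b))
```

The pairwise distances here are 3, 5, 2 and 4, so the minimum, maximum and average distances can be read off directly.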

 For more details, see also this article on hierarchical clustering

Origin www.cnblogs.com/yang901112/p/11615541.html