SPSS system clustering

Foreword:

The reference textbook for this column is "SPSS22.0 From Beginner to Master". Because the software has been updated since the book was written, some of its content no longer matches the current version. This column was created to account for those changes and to make learning easier. The software used in this column is SPSS25.0.

Click this link to download all the data files used in this column: SPSS data analysis column attachment


Table of contents

1. System clustering

2. SPSS implementation

3. Result analysis


1. System clustering

System clustering, more commonly called hierarchical clustering in English, is a method that merges data points into clusters step by step. The number of clusters does not have to be fixed in advance; instead, the method builds a cluster tree (dendrogram) that represents the hierarchical structure of the data, so clustering results at several levels are available at once. Hierarchical methods are usually divided into two types, agglomerative and divisive, and the agglomerative type is the more common.

The basic principle of agglomerative hierarchical clustering is as follows: first treat each data point as its own cluster, then at each step find the two closest clusters and merge them into a new cluster, and repeat until a stopping rule is met (for example, until all points belong to a single cluster). The result is a tree of nested clusters, which can be shown as a dendrogram or as a heat map with an attached dendrogram. Because the method usually requires the distances between all pairs of data points, its computational cost is high on large data sets.
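The merge loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not what SPSS runs internally: the function names `single_linkage` and `agglomerate`, the `stop_at` parameter, and the five sample points are all made up for the example, and single linkage (distance between the closest members) is just one of several possible rules for measuring the distance between clusters.

```python
from itertools import combinations
from math import dist


def single_linkage(a, b, points):
    """Distance between clusters a and b: the closest pair of members."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points, stop_at=1):
    """Merge the two closest clusters until `stop_at` clusters remain.

    Returns the remaining clusters and the merge history as
    (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(i,) for i in range(len(points))]  # each point starts alone
    history = []
    while len(clusters) > stop_at:
        # Find the closest pair of clusters under the linkage rule.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: single_linkage(pair[0], pair[1], points),
        )
        history.append((a, b, single_linkage(a, b, points)))
        # Replace the two merged clusters with their union.
        merged = tuple(sorted(a + b))
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
    return clusters, history


# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5), (10.0, 0.0)]
clusters, history = agglomerate(points, stop_at=2)
for a, b, d in history:
    print(f"merge {a} + {b} at distance {d:.2f}")
print(clusters)
```

The stopping rule here is simply a target number of clusters. Searching all cluster pairs at every step is what makes this naive version slow on large data sets.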

System clustering has the following advantages:
1. The number of clusters does not need to be specified in advance; clustering results at different levels are obtained from a single run.
2. It provides rich structural information, such as the clustering results at each level and the relationships between clusters.
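To illustrate why the number of clusters need not be chosen in advance, the sketch below replays a recorded merge order and stops early to obtain any desired number of clusters. The `cut` function and the `merges` list are hypothetical and exist only for this example; each merge is written as a pair of point indices, one taken from each of the two clusters being joined.

```python
def cut(n_points, merges, k):
    """Replay the first n_points - k merges and return the k clusters."""
    label = list(range(n_points))  # every point starts in its own cluster
    for a, b in merges[: n_points - k]:
        old, new = label[b], label[a]
        # Relabel every member of b's cluster with a's cluster label.
        label = [new if current == old else current for current in label]
    groups = {}
    for point, current in enumerate(label):
        groups.setdefault(current, []).append(point)
    return sorted(groups.values())


# Hypothetical merge order for 5 points, earliest merge first.
merges = [(0, 1), (2, 3), (0, 2), (0, 4)]
for k in (4, 3, 2, 1):
    print(k, cut(5, merges, k))
```

The same merge history yields the 4-, 3-, 2- and 1-cluster solutions, which is the information a dendrogram presents graphically.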

However, system clustering also has some shortcomings:
1. The computational cost is high, especially on large data sets, because the distance between every pair of data points must be calculated.
2. It is sensitive to noise and outliers.
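The cost in the first point can be made concrete: n data points have n(n-1)/2 distinct pairs, so the number of distances grows roughly with the square of the sample size. The helper name `pairwise_count` below is made up for this small check.

```python
from itertools import combinations


def pairwise_count(n):
    """Number of distinct pairs among n data points: n(n-1)/2."""
    return n * (n - 1) // 2


# Sanity check against an explicit enumeration for a small case.
assert pairwise_count(5) == len(list(combinations(range(5), 2)))

for n in (10, 1_000, 100_000):
    print(f"{n:>7} points -> {pairwise_count(n):>13,} distances")
```

At 100,000 cases the distance matrix already holds about five billion entries, which is why hierarchical methods are usually applied to small or moderate samples.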

In practical applications, system clustering is often used to explore the intrinsic structure of data, to discover clustering patterns at different levels, and to help decision-makers understand the data.

2. SPSS implementation

(1) Open the "data10-02" data file and select "Analyze" - "Classify" - "Hierarchical Cluster" (system clustering). The dialog box shown in the figure below pops up.

(2) Set the corresponding options as shown in the figure below.

(3) Click the "Statistics" button to open the "Hierarchical Cluster Analysis: Statistics" dialog box. Set the corresponding options as shown in the figure below, and then click Continue to return to the main dialog box.

(4) Click the "Method" button to open the dialog box shown in the figure below. Keep the system default options and click Continue.

(5) Click the "Plots" button to open the "Hierarchical Cluster Analysis: Plots" dialog box. Set the corresponding options as shown below, and then click Continue to return to the main dialog box.

(6) Click the "Save" button to open the "Hierarchical Cluster Analysis: Save" dialog box. Set the corresponding options as shown below, and then click Continue to return to the main dialog box.

(7) After completing all settings, click OK.

3. Result analysis

Origin blog.csdn.net/m0_64087341/article/details/134275330