Data Mining Cluster Analysis - Elbow Method

1 Read data

from sklearn.datasets import load_iris
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv('X.csv')
df

Running this displays the DataFrame.

2 Elbow method

data = np.array(df)
sse = []  # within-cluster sum of squared errors (SSE) for each k
for k in range(1, 11):
    kmeans = KMeans(n_clusters=k, random_state=42)
    kmeans.fit(data)
    # inertia_ is the sum of squared distances of samples to their nearest centroid
    sse.append(kmeans.inertia_)
plt.plot(range(1, 11), sse)
plt.xlabel('Number of clusters')
plt.ylabel('SSE')
plt.title("Elbow Method")
plt.show()

Running this plots SSE against the number of clusters.

The curve has an obvious inflection point at k = 4, the point after which adding more clusters no longer reduces SSE sharply, so the optimal number of clusters is roughly 4.
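
Reading the elbow off a plot is a judgment call, so it can help to cross-check it numerically. The sketch below uses one common heuristic, not the method used in this post: pick the k whose (k, SSE) point lies farthest from the straight line joining the first and last points of the curve. The helper name `find_elbow` is hypothetical, and the data comes from `make_blobs` as a stand-in, since `X.csv` is not included here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def find_elbow(ks, sse):
    """Return the k whose (k, SSE) point is farthest from the chord
    joining the first and last points of the SSE curve."""
    ks = np.asarray(list(ks), dtype=float)
    sse = np.asarray(sse, dtype=float)
    # Rescale both axes to [0, 1] so neither one dominates the distance.
    x = (ks - ks.min()) / (ks.max() - ks.min())
    y = (sse - sse.min()) / (sse.max() - sse.min())
    # Perpendicular distance of every point to the chord.
    dx, dy = x[-1] - x[0], y[-1] - y[0]
    dist = np.abs(dx * (y - y[0]) - dy * (x - x[0])) / np.hypot(dx, dy)
    return int(ks[np.argmax(dist)])


# Assumption: synthetic stand-in data with 4 blobs, not the post's X.csv.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=42)

ks = list(range(1, 11))
sse = [
    KMeans(n_clusters=k, n_init=10, random_state=42).fit(X).inertia_
    for k in ks
]

best_k = find_elbow(ks, sse)
print("Estimated elbow at k =", best_k)
```

To apply this to the data in this post, pass the `sse` list built in the loop above together with `range(1, 11)`. The heuristic is only a sanity check: on curves without a sharp bend it can disagree with a visual reading.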


Origin blog.csdn.net/qq_52351946/article/details/130946677