Iris classification with sklearn KNN

Classifying iris flowers with sklearn's KNN algorithm

Data set download: GitHub

1. Data preparation

  The iris data set is one of the most commonly used examples for learning classification. This article uses the original data, 150 valid samples in total, with content and format unmodified.
  The first 10 rows of data are as follows:
[Figure: sample data]
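Since the preview image is not available here, the following is a minimal sketch of how the first 10 rows could be inspected. It assumes the copy of the iris data bundled with scikit-learn (load_iris) as a stand-in for the article's iris.csv, so the column names and the "species" label column are assumptions and may differ from the CSV used in this article.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled iris data stands in for the article's iris.csv
iris = load_iris(as_frame=True)
preview = iris.frame.drop(columns="target")

# Append the species name as the last column, mirroring "4 features + 1 label"
preview["species"] = iris.target_names[iris.target.to_numpy()]

print(preview.shape)     # number of rows and columns
print(preview.head(10))  # first 10 rows
```

The layout matches what the rest of the article relies on: four numeric feature columns followed by one label column.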

2. Import the packages

  The required imports come from pandas and sklearn:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

3. Split into training and test sets

  First, extract the first four columns of the data set as features and the last column as the classification label. Then use train_test_split() to randomly divide the features and labels into a training set and a test set, with the test set proportion set to 20%, i.e. 30 samples. Finally, flatten the labels of the training set and the test set into one-dimensional arrays (this conversion is optional and only makes the labels easier to view).

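# Read the data
iris_data_set = pd.read_csv("D:\\iris.csv")
# x holds the 4 feature columns
x = iris_data_set.iloc[:, 0:4].values
# y holds the label column
y = iris_data_set.iloc[:, -1].values

# Split into training and test sets (20% of the samples go to the test set)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

# Flatten the labels into 1-D arrays (a no-op when y is already 1-D)
y_train = y_train.flatten()
y_test = y_test.flatten()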
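Because train_test_split() shuffles randomly, each run of the code above produces a different split. The sketch below shows one possible way to make the split repeatable and class-balanced. The random_state=42 value, the stratify=y option and the use of load_iris in place of iris.csv are assumptions added for illustration; the article itself does not use them.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: bundled iris data instead of the article's CSV file
x, y = load_iris(return_X_y=True)

# random_state fixes the shuffle; stratify keeps the class ratio in both sets
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42, stratify=y
)

print(x_train.shape, x_test.shape)  # 120 training rows and 30 test rows
print(y_train.ndim, y_test.ndim)    # labels are already one-dimensional here
```

With 150 samples and test_size=0.2, the test set holds 30 samples, and stratification gives each of the three classes 10 of them.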

4. Train the model and predict

  First, build the KNN model by calling KNeighborsClassifier(), where n_neighbors=3 sets the value of K to 3. Then fit the model on the features and classification labels of the training set. Finally, apply the model to the features of the test set to obtain the predicted classes.

# Build the model
knn_model = KNeighborsClassifier(n_neighbors=3)
# Train
knn_model.fit(x_train, y_train)
# Predict
y_pre = knn_model.predict(x_test)
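To make the role of K more concrete, here is a small from-scratch sketch of the voting idea behind KNN: find the K closest training points by Euclidean distance and return the most common label among them. This is an illustration only, not sklearn's implementation; the knn_predict helper and the toy points are hypothetical, and tie-breaking may differ from KNeighborsClassifier.

```python
from collections import Counter

import numpy as np


def knn_predict(train_x, train_y, query, k=3):
    """Classify one sample by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = np.linalg.norm(train_x - query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Majority vote over the labels of those neighbours
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters labelled "a" and "b"
train_x = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
train_y = np.array(["a", "a", "a", "b", "b", "b"])

print(knn_predict(train_x, train_y, np.array([1.1, 1.0])))  # near the "a" cluster
print(knn_predict(train_x, train_y, np.array([4.9, 5.0])))  # near the "b" cluster
```

A smaller K follows the nearest points more closely, while a larger K smooths the decision by letting more neighbours vote.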

5. Result output and analysis

  Print the actual classes of the test set next to the classes predicted by the model for a direct comparison.
  The confusion matrix is an important basis for evaluating a classification model; calling confusion_matrix() returns the confusion matrix of the model.
  A classification model can be evaluated with many metrics, which classification_report() outputs as a text report.

print("True labels:", y_test)
print("Predicted labels:", y_pre)

# Confusion matrix
conf_mat = confusion_matrix(y_test, y_pre)
print(conf_mat)

# Text report of classification metrics (precision, recall, F1 score, etc.)
print(classification_report(y_test, y_pre))

  The final result is as follows:
[Figure: program output]
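As a reading aid for the confusion matrix, the sketch below uses six hand-made labels rather than real model output, so the numbers are purely illustrative. Rows correspond to the true class and columns to the predicted class, in the order given by the labels argument.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical labels: one versicolor sample is misclassified as virginica
y_true = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]
labels = ["setosa", "versicolor", "virginica"]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)

# Per-class precision, recall and F1 score for the same labels
print(classification_report(y_true, y_pred, labels=labels))
```

Here the matrix is [[2 0 0], [0 1 1], [0 0 2]]: the off-diagonal 1 in the second row is the versicolor sample predicted as virginica, which lowers the recall of versicolor to 0.5 and the precision of virginica to 2/3.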

6. Summary

  As shown above, the sklearn API makes it easy to split a data set, build and train a model, and predict classes with very little code, and it can also compute the classification metrics of the model.
  The complete code is as follows:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report

# Read the data
iris_data_set = pd.read_csv("D:\\iris.csv")
# x holds the 4 feature columns
x = iris_data_set.iloc[:, 0:4].values
# y holds the label column (a 2-D column here, flattened after the split)
y = iris_data_set.iloc[:, 4:].values

# Split into training and test sets (20% of the samples go to the test set)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

# Flatten the labels into 1-D arrays
y_train = y_train.flatten()
y_test = y_test.flatten()

# Build the model (K = 3, as in section 4), train, and predict
knn_model = KNeighborsClassifier(n_neighbors=3)
knn_model.fit(x_train, y_train)
y_pre = knn_model.predict(x_test)

print("True labels:", y_test)
print("Predicted labels:", y_pre)

# Confusion matrix
conf_mat = confusion_matrix(y_test, y_pre)
print(conf_mat)

# Text report of classification metrics (precision, recall, F1 score, etc.)
print(classification_report(y_test, y_pre))
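The complete code evaluates a single value of K on a single random split, so its scores can vary from run to run. As a hedged extension, the sketch below compares a few K values with 5-fold cross-validation. The candidate K values, cv=5 and the use of load_iris instead of iris.csv are all assumptions for illustration and are not part of the original article.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: bundled iris data instead of the article's CSV file
x, y = load_iris(return_X_y=True)

# Mean 5-fold cross-validated accuracy for a few candidate K values
scores_by_k = {}
for k in (1, 3, 5, 7, 9):
    model = KNeighborsClassifier(n_neighbors=k)
    scores_by_k[k] = cross_val_score(model, x, y, cv=5).mean()

for k, score in scores_by_k.items():
    print(f"k={k}: mean accuracy {score:.3f}")
```

Averaging over several folds gives a steadier estimate than one 30-sample test set, which can help when deciding which K to keep.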

Extended learning

  1. Iris classification in Python with a BP neural network
  2. Understanding classification metrics in machine learning: accuracy, precision, recall, F1-score, ROC curve, PR curve, AUC
  3. Multidimensional data visualization in Python

Origin blog.csdn.net/michael_f2008/article/details/107574888