Classification Analysis of the Iris Dataset Based on SVM

1. Introduction of the authors

Jialun Zhang, male, graduate student (class of 2022), School of Electronic Information, Xi'an Polytechnic University
Research direction: Machine Vision and Artificial Intelligence
Email: [email protected]

Chen Mengdan, female, graduate student (class of 2022), School of Electronic Information, Xi'an Polytechnic University, member of Zhang Hongwei's artificial intelligence research group
Research direction: Machine Vision and Artificial Intelligence
Email: [email protected]

2. The support vector machine (SVM) algorithm

A support vector machine (SVM) is a supervised learning model whose decision boundary is the maximum-margin hyperplane separating the training samples.
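For reference, the hard-margin SVM can be written as the following optimization problem (a standard textbook formulation, not taken from the original post): given training samples $(x_i, y_i)$ with $y_i \in \{-1, +1\}$,

$$\min_{w,b}\ \frac{1}{2}\lVert w\rVert^2 \quad \text{s.t.}\quad y_i\,(w^{\top}x_i + b) \ge 1,\ i = 1, \dots, n.$$

The samples that satisfy the constraint with equality are the support vectors, and the resulting hyperplane $w^{\top}x + b = 0$ attains the maximum margin $2/\lVert w\rVert$.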

Figure (1) shows the decision boundary and support vectors.

Figure (2) shows the soft margin and the hard margin.
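As a rough illustration of the soft/hard-margin trade-off (a minimal sketch assuming scikit-learn is available; the values of C are chosen only for demonstration), the penalty parameter C of sklearn.svm.SVC controls how strongly margin violations are punished: a small C gives a soft margin with many support vectors, while a very large C approximates a hard margin.

# Sketch: effect of the penalty C on the margin (illustrative values only)
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X, y = X[y != 2, :2], y[y != 2]      # two classes and two features for a simple, (almost) separable case

for C in (0.01, 1.0, 100.0):         # small C -> soft margin, large C -> (nearly) hard margin
    clf = SVC(kernel='linear', C=C).fit(X, y)
    print(f'C={C:>6}: support vectors per class = {clf.n_support_}')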
Through the kernel trick, SVM can also perform nonlinear classification, which makes it one of the common kernel learning methods.
Common kernel functions:

RBF (Gaussian) kernel: K(x, x′) = exp(−γ‖x − x′‖²)

Sigmoid kernel: K(x, x′) = tanh(γ·xᵀx′ + r)
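As a quick sanity check of the RBF formula above (a minimal sketch assuming scikit-learn; gamma and the sample points are arbitrary demonstration values), the hand-computed Gram matrix should match sklearn.metrics.pairwise.rbf_kernel:

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
gamma = 0.5

# Manual evaluation of K(x, x') = exp(-gamma * ||x - x'||^2)
diff = X[:, None, :] - X[None, :, :]
K_manual = np.exp(-gamma * np.sum(diff ** 2, axis=-1))

print(np.allclose(K_manual, rbf_kernel(X, gamma=gamma)))  # expected: True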

2.1 Iris data set

Use .info() to view the overall information of the dataset: the feature (column) names, their data types, and the number of non-null entries.

# Use .info() to view the overall information of the data
iris_features.info()

Use .value_counts() to view the sample classes in the dataset and the number of samples belonging to each class.
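The corresponding call (also part of the complete listing in Section 3.1); for the iris dataset, each of the three classes contains 50 samples:

# Count the number of samples per class (0: setosa, 1: versicolor, 2: virginica)
pd.Series(iris_target).value_counts()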

2.2 Visualization of iris data set

2.2.1 Scatter plot

Merge the label and feature information and visualize the two-dimensional distribution of the sample classes over every pair of features. The iris dataset has four features and three sample classes, so the distribution of the three classes is mapped onto the coordinate plane of each feature pair. When both coordinates correspond to the same feature (the diagonal), the histogram shows how the samples of each class are distributed along that single feature. In this way the distribution of each class can be observed for every two-dimensional combination of features.

# Merge labels and features
iris_all = iris_features.copy()  # copy so the original data is not modified
iris_all['target'] = iris_target
sns.pairplot(data=iris_all, diag_kind='hist', hue='target')  # pairwise scatter plots of the features, colored by label
plt.show()


2.2.2 Box plot

The box plot shows the class distribution along a single feature dimension. It makes it easy to observe the one-dimensional distribution of each sample class and the separability between the classes under each individual feature of the dataset.

for col in iris_features.columns:
    sns.boxplot(x='target', y=col, saturation=0.5, palette='pastel', data=iris_all)
    plt.title(col)
    plt.show()  # one box plot per feature


2.2.3 Three-dimensional scatter plot (3D)

Select three of the features and draw a 3D scatter plot. Using the three chosen feature dimensions, the sample classes in the dataset can be modeled in three dimensions. By changing the selected features, guided by the single-feature separability seen in the box plots, the dataset can be modeled in 3D for different feature combinations until the combination with the clearest separation is found.

# Select the first three features and draw a 3D scatter plot
from mpl_toolkits.mplot3d import Axes3D  # needed for the 3D projection in older matplotlib versions

fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111, projection='3d')
iris_all_class0 = iris_all[iris_all['target']==0].values
iris_all_class1 = iris_all[iris_all['target']==1].values
iris_all_class2 = iris_all[iris_all['target']==2].values
# 'setosa'(0), 'versicolor'(1), 'virginica'(2)
ax.scatter(iris_all_class0[:,0], iris_all_class0[:,1], iris_all_class0[:,2],label='setosa')
ax.scatter(iris_all_class1[:,0], iris_all_class1[:,1], iris_all_class1[:,2],label='versicolor')
ax.scatter(iris_all_class2[:,0], iris_all_class2[:,1], iris_all_class2[:,2],label='virginica')
plt.legend()
plt.show()
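To try other feature combinations, the three column indices can be made explicit (a small variation on the code above; the indices i, j, k are placeholders to change):

# Plot an arbitrary triplet of feature columns; i, j, k index the four iris features
i, j, k = 0, 2, 3
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')
for label, name in zip((0, 1, 2), ('setosa', 'versicolor', 'virginica')):
    sub = iris_all[iris_all['target'] == label].values
    ax.scatter(sub[:, i], sub[:, j], sub[:, k], label=name)
ax.set_xlabel(iris_features.columns[i])
ax.set_ylabel(iris_features.columns[j])
ax.set_zlabel(iris_features.columns[k])
plt.legend()
plt.show()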


3. Implementation of SVM algorithm

3.1 Complete code

import numpy as np
import pandas as pd
import pylab
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.svm import SVC
from matplotlib.colors import ListedColormap
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.axes._axes import _log as matplotlib_axes_logger
matplotlib_axes_logger.setLevel('ERROR')  # silence matplotlib warnings about passing a single color via 'c'
from sklearn.datasets import load_iris

data = load_iris()  # load the iris data
iris_target = data.target  # class labels
iris_features = pd.DataFrame(data=data.data, columns=data.feature_names)  # convert the features to a pandas DataFrame

iris_features.info()

pd.Series(iris_target).value_counts()

#### Data visualization ####
# Scatter plots
# Merge labels and features
iris_all = iris_features.copy()  # copy so the original data is not modified
iris_all['target'] = iris_target
sns.pairplot(data=iris_all, diag_kind='hist', hue='target')  # pairwise scatter plots of the features, colored by label
plt.show()

# Box plots
for col in iris_features.columns:
    sns.boxplot(x='target', y=col, saturation=0.5, palette='pastel', data=iris_all)
    plt.title(col)
    plt.show()  # one box plot per feature

# Select the first three features and draw a 3D scatter plot
from mpl_toolkits.mplot3d import Axes3D  # needed for the 3D projection in older matplotlib versions

fig = plt.figure(figsize=(10,8))
ax = fig.add_subplot(111, projection='3d')
iris_all_class0 = iris_all[iris_all['target']==0].values
iris_all_class1 = iris_all[iris_all['target']==1].values
iris_all_class2 = iris_all[iris_all['target']==2].values
# 'setosa'(0), 'versicolor'(1), 'virginica'(2)
ax.scatter(iris_all_class0[:,0], iris_all_class0[:,1], iris_all_class0[:,2],label='setosa')
ax.scatter(iris_all_class1[:,0], iris_all_class1[:,1], iris_all_class1[:,2],label='versicolor')
ax.scatter(iris_all_class2[:,0], iris_all_class2[:,1], iris_all_class2[:,2],label='virginica')
plt.legend()
plt.show()

iris = datasets.load_iris()
a, b = 0, 2                    # indices of the two features used for the 2D visualization
X_reduced = iris.data[:, :4]
X = X_reduced[:, [a, b]]       # keep only two features for 2D visualization
y = iris.target                # labels take the values 0, 1, 2 (see the output above)
x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5   # x range: min/max of the first selected feature -/+ 0.5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5   # y range: min/max of the second selected feature -/+ 0.5
plt.figure(2, figsize=(8, 6))
plt.clf()
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Set1, edgecolor='w')  # scatter plot; c colors the points by label, edgecolor is white
plt.xlabel(iris.feature_names[a])
plt.ylabel(iris.feature_names[b])
plt.xlim(x_min, x_max)    # plotting range of the x axis
plt.ylim(y_min, y_max)    # plotting range of the y axis
plt.xticks(())            # hide the x-axis ticks
plt.yticks(())            # hide the y-axis ticks

X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3,random_state=0)
sc = StandardScaler()
sc.fit(X_train)  # fit the scaler on the training set
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

def plot_decision_regions(X, y, classifier, test_idx=None, resolution=0.02):
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    # evaluate the classifier on a dense grid and draw the decision regions
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                           np.arange(x2_min, x2_max, resolution))
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    plt.contourf(xx1, xx2, Z, alpha=0.4, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())  # note: must be a call, not an assignment (see Section 3.3, problem 4)
    # plot all samples, one color/marker per class
    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1], alpha=0.8, c=cmap(idx), marker=markers[idx], label=cl)
    # highlight the test samples
    if test_idx:
        X_test, y_test = X[test_idx, :], y[test_idx]
        plt.scatter(X_test[:, 0], X_test[:, 1], c='black', alpha=0.8, linewidths=1, marker='o', s=10, label='test set')
        
# New figure for the 2x2 grid of decision regions
plt.figure(figsize=(10, 8))

pylab.subplot(2, 2, 1)   # first subplot of the 2x2 grid
X_combined_std = np.vstack((X_train_std, X_test_std))
y_combined = np.hstack((y_train, y_test))
svm = SVC(kernel='linear', random_state=0, C=1.0)  # SVM with the 'linear' kernel and its two hyperparameters
svm.fit(X_train_std, y_train)   # train
plot_decision_regions(X_combined_std, y_combined, classifier=svm, test_idx=range(105,150))
plt.ylabel(iris.feature_names[b])
plt.title('Linear')

pylab.subplot(2, 2, 2)
X_combined_std = np.vstack((X_train_std, X_test_std))
y_combined = np.hstack((y_train, y_test))
svm = SVC(kernel='poly', random_state=0, degree=2, gamma=0.3, C=100)
# SVM with the 'poly' kernel and four parameters; degree is the polynomial-kernel-specific hyperparameter d
svm.fit(X_train_std, y_train)  # train
plot_decision_regions(X_combined_std, y_combined, classifier=svm, test_idx=range(105,150))
plt.title('poly')

pylab.subplot(2, 2, 3)
X_combined_std = np.vstack((X_train_std, X_test_std))
y_combined = np.hstack((y_train, y_test))
svm = SVC(kernel='rbf', random_state=0, gamma=0.9, C=1.5)
svm.fit(X_train_std, y_train)
plot_decision_regions(X_combined_std, y_combined, classifier=svm, test_idx=range(105,150))
plt.ylabel(iris.feature_names[b])
plt.xlabel(iris.feature_names[a])
plt.title('rbf')

pylab.subplot(2, 2, 4)
X_combined_std = np.vstack((X_train_std, X_test_std))
y_combined = np.hstack((y_train, y_test))
svm = SVC(kernel='sigmoid', random_state=0, gamma=0.3, C=50)
svm.fit(X_train_std, y_train)
plot_decision_regions(X_combined_std, y_combined, classifier=svm, test_idx=range(105,150))
plt.xlabel(iris.feature_names[a])
plt.title('sigmoid')

plt.show()

3.2 Running results

In the SVM classification experiments, the pair of feature indices is treated as a variable; changing it yields the two-dimensional distribution of the sample classes over different feature pairs. Four different SVM kernel functions are then used to classify each feature pair and are compared, and finally the hyperparameters are adjusted to obtain the best kernel function and classification result.
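The results below were obtained by re-running the script of Section 3.1 with different feature indices (a, b) and kernel parameters. A compact way to reproduce such a sweep programmatically might look like the following sketch (not the original procedure; it reuses the imports and the iris object from the listing above, and the parameter values are only illustrative):

# Sketch: test accuracy of the four kernels on every feature pair (illustrative parameters)
from itertools import combinations
from sklearn.metrics import accuracy_score

kernels = {'linear':  SVC(kernel='linear', C=1.0),
           'poly':    SVC(kernel='poly', degree=2, gamma=0.3, C=100),
           'rbf':     SVC(kernel='rbf', gamma=0.9, C=1.5),
           'sigmoid': SVC(kernel='sigmoid', gamma=0.3, C=50)}

for a, b in combinations(range(4), 2):
    X_pair = iris.data[:, [a, b]]
    X_tr, X_te, y_tr, y_te = train_test_split(X_pair, iris.target, test_size=0.3, random_state=0)
    sc = StandardScaler().fit(X_tr)
    scores = {name: accuracy_score(y_te, clf.fit(sc.transform(X_tr), y_tr).predict(sc.transform(X_te)))
              for name, clf in kernels.items()}
    print(f'features {a}:{b} ->', scores)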

Features 0 and 2, two-dimensional feature distribution map. Related parameter settings:
Linear: C=1
Poly: degree=2, gamma=0.3, C=100
Rbf: gamma=0.9, C=1.5
Sigmoid: gamma=0.3, C=50.0

Features 0 and 2, classification results of the four SVM kernel functions.

Features 0 and 3, two-dimensional feature distribution map.
Related parameter settings:
Linear: C=1
Poly: degree=3, gamma=0.7, C=10
Rbf: gamma=0.9, C=1.5
Sigmoid: gamma=0.2, C=2.5

Features 0 and 3, classification results of the four SVM kernel functions.

Features 0 and 1, two-dimensional feature distribution map.
Related parameter settings:
Linear: C=1
Poly: degree=2, gamma=0.8, C=10
Rbf: gamma=0.9, C=1.5
Sigmoid: gamma=0.3, C=1.0

Features 0 and 1, classification results of the four SVM kernel functions.

Features 1 and 2, two-dimensional feature distribution map.
Related parameter settings:
Linear: C=1
Poly: degree=3, gamma=0.7, C=5
Rbf: gamma=0.9, C=1.5
Sigmoid: gamma=0.2, C=2.5

Features 1 and 2, classification results of the four SVM kernel functions.

Features 1 and 3, two-dimensional feature distribution map.
Related parameter settings:
Linear: C=1
Poly: degree=3, gamma=0.9, C=1.5
Rbf: gamma=0.9, C=10
Sigmoid: gamma=0.2, C=2.5

Features 1 and 3, classification results of the four SVM kernel functions.

Features 2 and 3, two-dimensional feature distribution map.
Related parameter settings:
Linear: C=1
Poly: degree=3, gamma=0.9, C=3
Rbf: gamma=0.9, C=50
Sigmoid: gamma=0.5, C=5

Features 2 and 3, classification results of the four SVM kernel functions.

3.3 Problems and Analysis

(1) A feature index out of range: the iris dataset has four features, so the feature indices must lie within [0, 1, 2, 3].
(2) Subplot layout: in pylab.subplot(2, 2, n), the first 2 is the number of rows, the second 2 is the number of columns, and n is the index of the subplot. To put the four results in a single row, the call should instead be pylab.subplot(1, 4, …).
(3) When selecting the kernel function, pay attention to the exact spelling of the kernel name passed to SVC (including capitalization); a misspelled name raises an error.
(4) When re-running the dimensionality-reduction plot after changing the feature indices, an error is reported even though the figure can still be produced, and the Python kernel has to be restarted before every new run. The most likely cause is the original line plt.ylim = (xx2.min(), xx2.max()) inside plot_decision_regions: it rebinds the pyplot function plt.ylim to a tuple, so the later call plt.ylim(y_min, y_max) fails on the next run of the same session. Writing it as a function call, as in the listing above, removes the need to restart.
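If the root cause is indeed the overwritten pyplot function, the offending line and the corrected version (already applied in Section 3.1) are:

# Wrong: rebinds the pyplot function plt.ylim to a tuple
# plt.ylim = (xx2.min(), xx2.max())

# Right: calls the function to set the y-axis limits
plt.ylim(xx2.min(), xx2.max())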


Origin blog.csdn.net/m0_37758063/article/details/131082668