SVM Model Applications (4): Hyperparameter Selection for the SVM Model

The usual approach to SVM hyperparameter optimization is to let C and g vary over given ranges; for each candidate pair (C, g), the training set is used as the base data and K-fold cross-validation (K-CV) is applied to obtain the validation accuracy for that (C, g) combination. The pair that achieves the highest cross-validation accuracy is taken as the best parameters. If several (C, g) pairs tie for the highest validation accuracy, the pair with the smallest C among them is chosen; if several values of g still remain for that smallest C, the first pair encountered in the search is used. The reason for preferring a small C is that an overly large penalty parameter C tends to overfit, giving a model with poor generalization.
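
As a minimal sketch of how a single grid point would be scored under this scheme (assuming scikit-learn's cross_val_score and the X_train/y_train arrays defined in the code below; the (C, g) values here are purely illustrative):

from sklearn import svm
from sklearn.model_selection import cross_val_score

C, G = 1.0, 0.125  # one candidate (C, g) pair, values chosen only for illustration
clf = svm.SVC(kernel='rbf', C=C, gamma=G)
# K-fold cross-validation (here K=5) on the training data; the mean accuracy
# is the score recorded for this grid point
scores = cross_val_score(clf, X_train, y_train, cv=5, scoring='accuracy')
print(scores.mean())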

This search strategy can be implemented as a grid search over the parameters. The penalty parameter C varies over [2^cmin, 2^cmax], i.e. the best C is sought within this range, with defaults cmin = -8, cmax = 8. The RBF parameter g likewise varies over [2^gmin, 2^gmax], with defaults gmin = -8, gmax = 8. C and g form the horizontal and vertical axes of the grid, and cstep and gstep are the step sizes used when stepping through the grid, so C takes the values 2^cmin, 2^(cmin+cstep), …, 2^cmax, and similarly for g; the default step size is 1. The best (C, g) combination is found by evaluating every point of this grid.
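
A minimal sketch of how such a grid could be built and searched with scikit-learn's GridSearchCV (the cmin/cmax/cstep values follow the defaults described above; X_train and y_train are defined in the code below):

import numpy as np
from sklearn import svm
from sklearn.model_selection import GridSearchCV

cmin, cmax, cstep = -8, 8, 1  # defaults described above
gmin, gmax, gstep = -8, 8, 1
param_grid = {
    'C':     [2.0 ** k for k in np.arange(cmin, cmax + cstep, cstep)],
    'gamma': [2.0 ** k for k in np.arange(gmin, gmax + gstep, gstep)],
}
# exhaustive search over the grid, scoring each (C, g) pair with 5-fold CV
grid = GridSearchCV(svm.SVC(kernel='rbf'), param_grid, cv=5, scoring='accuracy')
grid.fit(X_train, y_train)
print(grid.best_params_)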

import numpy as np
from sklearn import svm
from sklearn.linear_model import LogisticRegression

my_matrix=np.loadtxt("E:\\pima-indians-diabetes.txt",delimiter=",",skiprows=0)  # load the Pima Indians diabetes data (comma-separated)

lenth_x=len(my_matrix[0])  # number of columns (8 features + 1 label)

data_y=my_matrix[:,lenth_x-1]  # labels: the last column

data_x=my_matrix[:,0:lenth_x-1]  # features: all columns except the last
print(data_x[0:2],len(data_x[0]),len(data_x))
data_shape=data_x.shape
data_rows=data_shape[0]
data_cols=data_shape[1]

data_col_max=data_x.max(axis=0)  # column-wise maximum of the 2-D array
data_col_min=data_x.min(axis=0)  # column-wise minimum of the 2-D array
for i in xrange(0, data_rows, 1):  # min-max normalize every input column
    for j in xrange(0, data_cols, 1):
        data_x[i][j] = \
            (data_x[i][j] - data_col_min[j]) / \
            (data_col_max[j] - data_col_min[j])
print(data_x[0:2])
(array([[   6.   ,  148.   ,   72.   ,   35.   ,    0.   ,   33.6  ,
           0.627,   50.   ],
       [   1.   ,   85.   ,   66.   ,   29.   ,    0.   ,   26.6  ,
           0.351,   31.   ]]), 8, 768)
[[ 0.35294118  0.74371859  0.59016393  0.35353535  0.          0.50074516
   0.23441503  0.48333333]
 [ 0.05882353  0.42713568  0.54098361  0.29292929  0.          0.39642325
   0.11656704  0.16666667]]
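
As an aside, the same column-wise normalization could be done in a single call with scikit-learn's MinMaxScaler (a sketch; the code above uses the explicit loop):

from sklearn.preprocessing import MinMaxScaler

# column-wise min-max scaling to [0, 1], equivalent to the nested loop above
data_x = MinMaxScaler().fit_transform(data_x)
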
n_train=int(len(data_y)*0.7)  # 70% of the data for training, 15% for hyperparameter selection, 15% for final testing
n_select=int(len(data_y)*0.85)
X_train=data_x[:n_train]
y_train=data_y[:n_train]
print(len(y_train))
X_select=data_x[n_train:n_select]
y_select=data_y[n_train:n_select]
print(len(y_select))
X_test=data_x[n_select:]
y_test=data_y[n_select:]
print(len(y_test))
537
115
116
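
Note that this split is purely sequential (first 70%, next 15%, last 15%); if the rows happened to be ordered, a shuffled split would be safer. A sketch using scikit-learn's train_test_split (the results below use the sequential split above):

from sklearn.model_selection import train_test_split

# 70% for training, then split the remaining 30% evenly into selection and test sets
X_train, X_rest, y_train, y_rest = train_test_split(
    data_x, data_y, test_size=0.3, random_state=0)
X_select, X_test, y_select, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=0)
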
result = []
for i in (-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5):
    C = 2 ** i  # penalty parameter C = 2^i
    for j in (-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5):
        G = 2 ** j  # RBF kernel parameter gamma = 2^j
        clf1 = svm.SVC(kernel='rbf', gamma=G, C=C).fit(X_train, y_train)
        y_predictions1 = clf1.predict(X_select)
        k = 0  # number of correct predictions on the selection set
        for m in range(len(y_select)):
            if y_predictions1[m] == y_select[m]:
                k += 1
        result.append([C, G, k])
result1 = sorted(result, key=lambda x: x[2])  # sort by correct count, ascending

for i in result1:  # print the grid points from worst to best
    print i
[0.03125, 0.03125, 81]
[0.03125, 0.0625, 81]
[0.03125, 0.125, 81]
[0.03125, 0.25, 81]
[0.03125, 0.5, 81]
[0.03125, 1, 81]
[0.03125, 2, 81]
[0.03125, 4, 81]
[0.03125, 8, 81]
[0.03125, 16, 81]
[0.03125, 32, 81]
[0.0625, 0.03125, 81]
[0.0625, 0.0625, 81]
[0.0625, 0.125, 81]
[0.0625, 0.25, 81]
[0.0625, 0.5, 81]
[0.0625, 1, 81]
[0.0625, 8, 81]
[0.0625, 16, 81]
[0.0625, 32, 81]
[0.125, 0.03125, 81]
[0.125, 0.0625, 81]
[0.125, 0.125, 81]
[0.125, 0.25, 81]
[0.125, 16, 81]
[0.125, 32, 81]
[0.25, 0.03125, 81]
[0.25, 0.0625, 81]
[0.25, 0.125, 81]
[0.25, 0.25, 81]
[0.25, 32, 81]
[0.5, 0.03125, 81]
[0.5, 0.0625, 81]
[0.5, 32, 81]
[1, 0.03125, 81]
[0.125, 0.5, 82]
[0.0625, 2, 83]
[0.5, 0.125, 83]
[0.0625, 4, 84]
[1, 0.0625, 84]
[2, 0.03125, 84]
[32, 16, 86]
[0.125, 8, 87]
[0.25, 16, 87]
[8, 32, 87]
[16, 32, 87]
[32, 32, 87]
[4, 32, 88]
[8, 16, 90]
[16, 16, 90]
[0.125, 1, 91]
[8, 0.125, 91]
[2, 32, 92]
[4, 0.25, 92]
[16, 0.0625, 92]
[16, 0.125, 92]
[16, 8, 92]
[32, 0.125, 92]
[0.125, 4, 93]
[0.5, 1, 93]
[1, 0.125, 93]
[1, 0.25, 93]
[1, 0.5, 93]
[1, 1, 93]
[2, 0.0625, 93]
[2, 0.125, 93]
[2, 0.25, 93]
[2, 0.5, 93]
[4, 0.03125, 93]
[4, 0.0625, 93]
[4, 16, 93]
[8, 0.25, 93]
[16, 0.25, 93]
[32, 0.0625, 93]
[32, 8, 93]
[0.25, 0.5, 94]
[0.25, 1, 94]
[0.25, 2, 94]
[0.5, 0.25, 94]
[4, 0.125, 94]
[4, 0.5, 94]
[8, 0.03125, 94]
[8, 0.0625, 94]
[16, 0.03125, 94]
[16, 1, 94]
[32, 0.03125, 94]
[32, 4, 94]
[0.125, 2, 95]
[0.25, 4, 95]
[0.5, 0.5, 95]
[0.5, 2, 95]
[1, 32, 95]
[8, 0.5, 95]
[8, 2, 95]
[8, 8, 95]
[16, 2, 95]
[32, 0.25, 95]
[32, 0.5, 95]
[32, 2, 95]
[0.25, 8, 96]
[0.5, 4, 96]
[0.5, 16, 96]
[1, 2, 96]
[2, 1, 96]
[16, 0.5, 96]
[16, 4, 96]
[32, 1, 96]
[0.5, 8, 97]
[1, 4, 97]
[1, 8, 97]
[1, 16, 97]
[2, 2, 97]
[2, 4, 97]
[2, 8, 97]
[2, 16, 97]
[4, 2, 97]
[4, 8, 97]
[8, 1, 97]
[8, 4, 97]
[4, 1, 98]
[4, 4, 98]

Based on the results above, we can choose C=0.5 and G=8 as the model's hyperparameters.

clf_final= svm.SVC(kernel='rbf', gamma=8, C=0.5).fit(X_train,y_train)
clf2=LogisticRegression()  # model 2: logistic regression, still with default parameters
clf2.fit(X_train,y_train)
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)
X_test_final=data_x[n_train:]  # final evaluation on the last 30% (selection + test portions)
y_test_final=data_y[n_train:]
y_predictions_final=clf_final.predict(X_test_final)
y_predictions2=clf2.predict(X_test_final)
k,h=0,0  # counts of correct predictions for the SVM and the logistic regression
for i in range(len(y_test_final)):
    if y_predictions_final[i]==y_test_final[i]:
        k+=1
for i in range(len(y_test_final)):
    if y_predictions2[i]==y_test_final[i]:
        h+=1 
print(k,h)
(186, 181)
accuracy_svm=float(k)/float(len(y_test_final))
accuracy_LogR=float(h)/float(len(y_test_final))
print"The accuracy of SVM is %f, and the accuracy of LogisticRegression is %f"%(accuracy_svm,accuracy_LogR)
The accuracy of SVM is 0.805195, and the accuracy of LogisticRegression is 0.783550
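
For reference, the same two accuracies could also be computed with scikit-learn's accuracy_score (a sketch, equivalent to the manual counting above):

from sklearn.metrics import accuracy_score

# the same two accuracies via scikit-learn's helper
print(accuracy_score(y_test_final, y_predictions_final))
print(accuracy_score(y_test_final, y_predictions2))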

After optimizing the SVM hyperparameters, the model's prediction accuracy clearly exceeds that of logistic regression, and the result is better than the first and second experiments, which used only the default parameters.

Reposted from blog.csdn.net/elite666/article/details/79872460