SVM model application (4) Hyperparameter selection of SVM model

The most commonly used approach to SVM hyperparameter optimization is to let C and g vary within given ranges. For each candidate pair of C and g, the training set is treated as the original data set and K-fold cross-validation (K-CV) is used to obtain the validation classification accuracy for that combination; the pair of C and g with the highest validation accuracy on the training set is then taken as the best parameters. If several pairs of C and g achieve the same highest validation accuracy, the pair with the smallest C is chosen; if there are still several values of g for that smallest C, the first pair of C and g found during the search is used. The reason is that an overly large penalty coefficient C leads to overfitting and poor generalization.
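
The sketch below illustrates this selection rule, not the exact procedure of the original experiment: the K-CV accuracy of a single candidate pair is computed with scikit-learn's cross_val_score, and ties are broken toward the smaller C by replacing the current best only on a strict improvement. The candidate lists candidate_C and candidate_g and the training arrays X_train, y_train are assumed to exist.

from sklearn import svm
from sklearn.model_selection import cross_val_score

def kcv_accuracy(C, g, X_train, y_train, k=5):
    # mean validation accuracy of an RBF-kernel SVM under K-fold CV
    clf = svm.SVC(kernel='rbf', C=C, gamma=g)
    return cross_val_score(clf, X_train, y_train, cv=k).mean()

# candidate_C and candidate_g are assumed lists of values in increasing order;
# iterating C in increasing order and keeping the first best pair realizes the tie-breaking rule
best = None
for C in candidate_C:
    for g in candidate_g:
        acc = kcv_accuracy(C, g, X_train, y_train)
        if best is None or acc > best[2]:
            best = (C, g, acc)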

This optimization idea can be realized with a grid search. The penalty parameter C is varied over [2^cmin, 2^cmax], i.e. the best C is sought within this range, with default values cmin=-8 and cmax=8. The RBF kernel parameter g is likewise varied over [2^gmin, 2^gmax], with default values gmin=-8 and gmax=8. C and g form the horizontal and vertical axes of the grid, and cstep and gstep are the grid step sizes: C takes the values 2^cmin, 2^(cmin+cstep), …, 2^cmax, and similarly for g; the default step size is 1. The best combination of C and g is then found by evaluating every point of this grid.
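
The same exponential grid can also be handed to scikit-learn's GridSearchCV, which evaluates every grid point with K-fold cross-validation. A minimal sketch, assuming the default ranges cmin = gmin = -8, cmax = gmax = 8, a step of 1, and existing training arrays X_train and y_train:

import numpy as np
from sklearn import svm
from sklearn.model_selection import GridSearchCV

# exponential grid 2^-8, 2^-7, ..., 2^8 for both C and gamma
param_grid = {
    'C': 2.0 ** np.arange(-8, 9),
    'gamma': 2.0 ** np.arange(-8, 9),
}
search = GridSearchCV(svm.SVC(kernel='rbf'), param_grid, cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)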

import numpy as np
from sklearn import svm
from sklearn.linear_model import LogisticRegression

my_matrix=np.loadtxt("E:\\pima-indians-diabetes.txt",delimiter=",",skiprows=0) 

length_x=len(my_matrix[0])

data_y=my_matrix[:,length_x-1]

data_x=my_matrix[:,0:length_x-1]
print(data_x[0:2],len(data_x[0]),len(data_x))
data_shape=data_x.shape
data_rows=data_shape[0]
data_cols=data_shape[1]

data_col_max=data_x.max(axis=0)# column-wise maxima of the 2-D array
data_col_min=data_x.min(axis=0)# column-wise minima of the 2-D array
for i in range(0, data_rows, 1):# min-max normalize the input array
    for j in range(0, data_cols, 1):
        data_x[i][j] = \
            (data_x[i][j] - data_col_min[j]) / \
            (data_col_max[j] - data_col_min[j])
print(data_x[0:2])
(array([[   6.   ,  148.   ,   72.   ,   35.   ,    0.   ,   33.6  ,
           0.627,   50.   ],
       [   1.   ,   85.   ,   66.   ,   29.   ,    0.   ,   26.6  ,
           0.351,   31.   ]]), 8, 768)
[[ 0.35294118  0.74371859  0.59016393  0.35353535  0.          0.50074516
   0.23441503  0.48333333]
 [ 0.05882353  0.42713568  0.54098361  0.29292929  0.          0.39642325
   0.11656704  0.16666667]]
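
As a side note, the same min-max normalization can be written without the explicit double loop; a minimal sketch operating on the same data_x array (scikit-learn's MinMaxScaler gives an equivalent result):

# vectorized min-max normalization, equivalent to the nested loops above
data_x = (data_x - data_x.min(axis=0)) / (data_x.max(axis=0) - data_x.min(axis=0))

# or, using scikit-learn:
# from sklearn.preprocessing import MinMaxScaler
# data_x = MinMaxScaler().fit_transform(data_x)
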
n_train=int(len(data_y)*0.7)# use 70% of the data for training, 15% for hyperparameter selection, and 15% for testing
n_select=int(len(data_y)*0.85)
X_train=data_x[:n_train]
y_train=data_y[:n_train]
print(len(y_train))
X_select=data_x[n_train:n_select]
y_select=data_y[n_train:n_select]
print(len(y_select))
X_test=data_x[n_select:]
y_test=data_y[n_select:]
print(len(y_test))
537
115
116
result =[]
for i in (-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5):
    C = 2 ** i
    for j in (-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5):
        G = 2 ** j
        clf1 = svm.SVC(kernel='rbf', gamma=G, C=C).fit(X_train,y_train)
        y_predictions1=clf1.predict(X_select)
        k=0  # number of correct predictions on the selection set
        for m in range(len(y_select)):  # use m so the outer loop index i is not shadowed
            if y_predictions1[m]==y_select[m]:
                k+=1
        result.append([C,G,k])
result1 = sorted(result, key=lambda x:x[2])

for i in result1:
    print(i)
[0.03125, 0.03125, 81]
[0.03125, 0.0625, 81]
[0.03125, 0.125, 81]
[0.03125, 0.25, 81]
[0.03125, 0.5, 81]
[0.03125, 1, 81]
[0.03125, 2, 81]
[0.03125, 4, 81]
[0.03125, 8, 81]
[0.03125, 16, 81]
[0.03125, 32, 81]
[0.0625, 0.03125, 81]
[0.0625, 0.0625, 81]
[0.0625, 0.125, 81]
[0.0625, 0.25, 81]
[0.0625, 0.5, 81]
[0.0625, 1, 81]
[0.0625, 8, 81]
[0.0625, 16, 81]
[0.0625, 32, 81]
[0.125, 0.03125, 81]
[0.125, 0.0625, 81]
[0.125, 0.125, 81]
[0.125, 0.25, 81]
[0.125, 16, 81]
[0.125, 32, 81]
[0.25, 0.03125, 81]
[0.25, 0.0625, 81]
[0.25, 0.125, 81]
[0.25, 0.25, 81]
[0.25, 32, 81]
[0.5, 0.03125, 81]
[0.5, 0.0625, 81]
[0.5, 32, 81]
[1, 0.03125, 81]
[0.125, 0.5, 82]
[0.0625, 2, 83]
[0.5, 0.125, 83]
[0.0625, 4, 84]
[1, 0.0625, 84]
[2, 0.03125, 84]
[32, 16, 86]
[0.125, 8, 87]
[0.25, 16, 87]
[8, 32, 87]
[16, 32, 87]
[32, 32, 87]
[4, 32, 88]
[8, 16, 90]
[16, 16, 90]
[0.125, 1, 91]
[8, 0.125, 91]
[2, 32, 92]
[4, 0.25, 92]
[16, 0.0625, 92]
[16, 0.125, 92]
[16, 8, 92]
[32, 0.125, 92]
[0.125, 4, 93]
[0.5, 1, 93]
[1, 0.125, 93]
[1, 0.25, 93]
[1, 0.5, 93]
[1, 1, 93]
[2, 0.0625, 93]
[2, 0.125, 93]
[2, 0.25, 93]
[2, 0.5, 93]
[4, 0.03125, 93]
[4, 0.0625, 93]
[4, 16, 93]
[8, 0.25, 93]
[16, 0.25, 93]
[32, 0.0625, 93]
[32, 8, 93]
[0.25, 0.5, 94]
[0.25, 1, 94]
[0.25, 2, 94]
[0.5, 0.25, 94]
[4, 0.125, 94]
[4, 0.5, 94]
[8, 0.03125, 94]
[8, 0.0625, 94]
[16, 0.03125, 94]
[16, 1, 94]
[32, 0.03125, 94]
[32, 4, 94]
[0.125, 2, 95]
[0.25, 4, 95]
[0.5, 0.5, 95]
[0.5, 2, 95]
[1, 32, 95]
[8, 0.5, 95]
[8, 2, 95]
[8, 8, 95]
[16, 2, 95]
[32, 0.25, 95]
[32, 0.5, 95]
[32, 2, 95]
[0.25, 8, 96]
[0.5, 4, 96]
[0.5, 16, 96]
[1, 2, 96]
[2, 1, 96]
[16, 0.5, 96]
[16, 4, 96]
[32, 1, 96]
[0.5, 8, 97]
[1, 4, 97]
[1, 8, 97]
[1, 16, 97]
[2, 2, 97]
[2, 4, 97]
[2, 8, 97]
[2, 16, 97]
[4, 2, 97]
[4, 8, 97]
[8, 1, 97]
[8, 4, 97]
[4, 1, 98]
[4, 4, 98]

We can choose C=0.5 and G=8 as the hyperparameters of the model; although a few pairs with a larger C score slightly higher on the selection set, a smaller penalty coefficient C is preferred here to limit the risk of overfitting, as discussed above.

clf_final= svm.SVC(kernel='rbf', gamma=8, C=0.5).fit(X_train,y_train)
clf2=LogisticRegression()# model 2: logistic regression, still with the default parameters
clf2.fit(X_train,y_train)
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)
X_test_final=data_x[n_train:]
y_test_final=data_y[n_train:]
y_predictions_final=clf_final.predict(X_test_final)
y_predictions2=clf2.predict(X_test_final)
k,h=0,0
for i in range(len(y_test_final)):
    if y_predictions_final[i]==y_test_final[i]:
        k+=1
for i in range(len(y_test_final)):
    if y_predictions2[i]==y_test_final[i]:
        h+=1 
print(k,h)
(186, 181)
accuracy_svm=float(k)/float(len(y_test_final))
accuracy_LogR=float(h)/float(len(y_test_final))
print"The accuracy of SVM is %f, and the accuracy of LogisticRegression is %f"%(accuracy_svm,accuracy_LogR)
The accuracy of SVM is 0.805195, and the accuracy of LogisticRegression is 0.783550

After optimizing the SVM hyperparameters, the prediction accuracy of the model clearly exceeds that of logistic regression, an improvement over the first and second experiments, which used only the default parameters.
