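The support vector machine, usually abbreviated SVM, is a binary classification model. Its basic form is the linear classifier with the largest margin in feature space. Its learning strategy is margin maximization, which ultimately reduces to solving a convex quadratic programming problem.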
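Two main topics:
1. How the primal formulation arises from the practical problem;
2. The mathematical derivation from the primal problem to the dual problem.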
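The most basic idea in classification is to find a plane (a separating hyperplane) in the sample space of the training set that separates samples of different classes. Many such hyperplanes exist, so the one to choose is the one with the largest distance (the margin) to the closest samples of the two classes, which are the support vectors. "Support vectors" are vectors by nature, but they play an important role: in classification they are the vectors closest to the decision boundary. In other words, the boundary is determined by these vectors and they "support" it, which is where the name comes from. (They are the sample points closest to the optimal separating plane, and each can be treated as a vector.)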
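As a worked form of the idea above, the maximum-margin choice is commonly written as the optimization problem below. The notation is an assumption that the text does not fix: $m$ training samples $(x_i, y_i)$ with labels $y_i \in \{-1, +1\}$, and a hyperplane $w^{\top}x + b = 0$ scaled so that the closest samples satisfy $y_i(w^{\top}x_i + b) = 1$.

```latex
% Width of the margin between the hyperplanes w^T x + b = +1 and w^T x + b = -1
\gamma = \frac{2}{\lVert w \rVert}

% Maximizing \gamma is equivalent to the primal problem
\begin{aligned}
\min_{w,\,b}\quad & \frac{1}{2}\lVert w \rVert^{2} \\
\text{s.t.}\quad & y_i\,(w^{\top}x_i + b) \ge 1, \qquad i = 1,\dots,m
\end{aligned}
```

Under this scaling, the support vectors are exactly the samples for which the constraint holds with equality.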
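Applying the method of Lagrange multipliers to the primal formulation yields the "dual problem":
1. Attach a Lagrange multiplier to each constraint of the primal formulation to obtain the Lagrangian L of the problem;
2. Set the partial derivatives of L with respect to w and b to zero and substitute the results back into L, which eliminates w and b and gives the dual problem.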
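The two steps can be written out as follows. This is a sketch under assumed notation that the text does not fix: the primal problem is taken to be $\min_{w,b} \frac{1}{2}\lVert w\rVert^{2}$ subject to $y_i(w^{\top}x_i + b) \ge 1$ for $i = 1,\dots,m$, with labels $y_i \in \{-1,+1\}$ and multipliers $\alpha_i \ge 0$.

```latex
% Step 1: Lagrangian with one multiplier per constraint
L(w, b, \alpha) = \frac{1}{2}\lVert w \rVert^{2}
  + \sum_{i=1}^{m} \alpha_i \bigl(1 - y_i\,(w^{\top}x_i + b)\bigr),
  \qquad \alpha_i \ge 0

% Step 2: stationarity in w and b
\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{m} \alpha_i y_i x_i,
\qquad
\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{m} \alpha_i y_i = 0

% Substituting back eliminates w and b and gives the dual problem
\begin{aligned}
\max_{\alpha}\quad & \sum_{i=1}^{m} \alpha_i
  - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j\, x_i^{\top}x_j \\
\text{s.t.}\quad & \sum_{i=1}^{m} \alpha_i y_i = 0, \qquad \alpha_i \ge 0,\; i = 1,\dots,m
\end{aligned}
```

The samples enter the dual only through the inner products $x_i^{\top}x_j$, which is what later allows a kernel function to take their place.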
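The goal of learning is to find a separating hyperplane in feature space that assigns instances to different classes, where the side the normal vector w points to is the positive class;
The distance of a point from the separating hyperplane indicates how confident the classification prediction is;
An intuitive reading of margin maximization: finding the hyperplane with the largest geometric margin on the training set means classifying the training data with sufficiently high confidence. That is, the hyperplane not only separates the positive and negative instances, it also separates the hardest instances (the points closest to the hyperplane) with sufficiently high confidence. Such a hyperplane should generalize well to unseen instances;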
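To make the notion of confidence concrete, here is a minimal sketch that computes the geometric margin of each sample for a given hyperplane. The function name `geometric_margins`, the toy data, and the values of `w` and `b` are all illustrative assumptions, and labels are assumed to be -1 or +1.

```python
import numpy as np


def geometric_margins(X, y, w, b):
    """Signed distance of each sample to the hyperplane w.x + b = 0.

    A positive value means the sample lies on the side its label predicts;
    a larger value means a more confident classification.
    """
    return y * (X @ w + b) / np.linalg.norm(w)


# Hypothetical toy data: two positive samples and one negative sample
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])

# Hypothetical hyperplane x1 + x2 - 4 = 0
w = np.array([1.0, 1.0])
b = -4.0

margins = geometric_margins(X, y, w, b)
print(margins)        # one signed distance per sample
print(margins.min())  # geometric margin of the hyperplane on this data
```

The smallest entry is the margin of the hyperplane on the training set; margin maximization looks for the `w` and `b` that make this smallest entry as large as possible.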
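Nonlinear classification problem: a problem that can only be classified well with a nonlinear model;
Solving a nonlinear problem: apply a nonlinear transformation that turns the nonlinear problem into a linear one;
Solving a nonlinear classification problem with a linear method takes two steps: 1. use a transformation to map the data from the original space to a new space; 2. in the new space, learn a classification model from the training data with a linear classification method;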
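The two steps can be sketched on a small one-dimensional example. The data and the mapping x -> (x, x^2) are illustrative assumptions: the positive class lies on both sides of the negative class, so no single threshold on x can separate them, but in the mapped space a horizontal line can.

```python
import numpy as np
from sklearn import svm

# Hypothetical 1-D data: class 1 on the outside, class 0 in the middle
x = np.array([-3.0, -2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0])
y = np.array([1, 1, 0, 0, 0, 0, 0, 1, 1])

# Step 1: map the data from the original space to a new space
X_original = x.reshape(-1, 1)
X_mapped = np.column_stack([x, x ** 2])

# Step 2: learn a linear classifier in each space
linear_original = svm.SVC(C=10, kernel='linear').fit(X_original, y)
linear_mapped = svm.SVC(C=10, kernel='linear').fit(X_mapped, y)

acc_original = linear_original.score(X_original, y)
acc_mapped = linear_mapped.score(X_mapped, y)
print('training accuracy in the original space:', acc_original)
print('training accuracy in the mapped space:  ', acc_mapped)
```

A linear classifier in the original space is a single threshold on x and cannot fit this data, while the same linear method separates it perfectly after the mapping.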
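Kernel trick: a nonlinear transformation maps the input space to a feature space, so that a hypersurface model in the input space corresponds to a hyperplane model (a support vector machine) in the feature space. The mapping itself never has to be computed explicitly, because only inner products in the feature space are needed and a kernel function supplies them directly;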
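A small numerical check illustrates the correspondence. The sketch below assumes a degree-2 polynomial kernel on 2-D inputs, chosen only because its feature map is short enough to write out; the names `phi` and `poly_kernel` are hypothetical.

```python
import numpy as np


def phi(x):
    """Explicit feature map for a 2-D input: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])


def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed in the input space only."""
    return np.dot(x, z) ** 2


x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

explicit = np.dot(phi(x), phi(z))  # inner product in the feature space
implicit = poly_kernel(x, z)       # same value without building phi
print(explicit, implicit)
```

Both numbers agree, which is why a learner that only needs inner products can work in the feature space while evaluating the kernel on the original inputs.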
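In the code, C controls the penalty for misclassification: the larger C is, the harder the SVM tries to classify every training sample correctly, at the cost of a narrower margin.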
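The effect of C can be seen on a toy dataset. The sketch below is an illustration under assumptions: the data (two clusters plus one positive sample placed close to the negative cluster) and the helper `margin_width` are hypothetical, and the margin width is taken as 2 / ||w|| from the fitted linear model.

```python
import numpy as np
from sklearn import svm

# Hypothetical data: the last positive sample sits close to the negative cluster
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1],
              [3, 3], [4, 3], [3, 4], [4, 4],
              [1.5, 1.5]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1])


def margin_width(C):
    """Fit a linear SVM with penalty C and return its margin width 2 / ||w||."""
    clf = svm.SVC(C=C, kernel='linear').fit(X, y)
    return 2.0 / np.linalg.norm(clf.coef_)


soft = margin_width(0.01)    # small C: violations are cheap, margin stays wide
hard = margin_width(1000.0)  # large C: every sample must be fit, margin shrinks
print('margin width with C = 0.01:', soft)
print('margin width with C = 1000:', hard)
```

With a large C the boundary is squeezed between the negative cluster and the nearby positive sample; with a small C the model tolerates margin violations and keeps a wider margin.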
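The Gaussian kernel can be viewed as a function that measures how close a pair of samples is: its value is 1 when the two samples coincide and decays toward 0 as the distance between them grows.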
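A minimal sketch of this behaviour, assuming the usual form exp(-||x1 - x2||^2 / (2 * sigma^2)). The helper name `gaussian_similarity` is hypothetical. The comparison with scikit-learn's `rbf_kernel` relies on that function computing exp(-gamma * ||x - z||^2), so gamma = 1 / (2 * sigma^2) is the matching setting.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel


def gaussian_similarity(x1, x2, sigma):
    """Gaussian kernel value between two 1-D sample vectors."""
    return np.exp(-np.sum((x1 - x2) ** 2) / (2 * sigma ** 2))


x1 = np.array([1.0, 2.0, 1.0])
x2 = np.array([0.0, 4.0, -1.0])
sigma = 2.0

same = gaussian_similarity(x1, x1, sigma)  # identical samples
near = gaussian_similarity(x1, x2, sigma)  # squared distance 9
far = gaussian_similarity(x1, x2 + 10.0, sigma)  # much larger distance

# Same value through scikit-learn, using gamma = 1 / (2 * sigma^2)
sk = rbf_kernel(x1.reshape(1, -1), x2.reshape(1, -1),
                gamma=1 / (2 * sigma ** 2))[0, 0]
print(same, near, far, sk)
```

A smaller sigma makes the similarity fall off faster with distance, which gives a more flexible decision boundary.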
# -*- coding: utf-8 -*-
import matplotlib.pyplot as plt
import numpy as np
import scipy.io as scio
from sklearn import svm
import plotData as pd
import visualizeBoundary as vb
import gaussianKernel as gk
plt.ion()
np.set_printoptions(formatter={'float': '{: 0.6f}'.format})
# ===================== Part 1: Loading and Visualizing Data =====================
# We start the exercise by first loading and visualizing the dataset.
# The following code will load the dataset into your environment and
# plot the data
print('Loading and Visualizing data ... ')
# Load from ex6data1:
data = scio.loadmat('ex6data1.mat')
X = data['X']
y = data['y'].flatten()
m = y.size
# Plot training data
pd.plot_data(X, y)
input('Program paused. Press ENTER to continue')
# ===================== Part 2: Training Linear SVM =====================
# The following code will train a linear SVM on the dataset and plot the
# decision boundary learned
#
print('Training Linear SVM')
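# You should try to change the C value below and see how the decision
# boundary varies (e.g., try C = 1000). The larger C is, the harder the
# SVM tries to classify every training sample correctly.
c = 1000
# C has to be passed by keyword in current scikit-learn releases
clf = svm.SVC(C=c, kernel='linear', tol=1e-3)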
clf.fit(X, y)
pd.plot_data(X, y)
vb.visualize_boundary(clf, X, 0, 4.5, 1.5, 5)
input('Program paused. Press ENTER to continue')
# ===================== Part 3: Implementing Gaussian Kernel =====================
# You will now implement the Gaussian kernel to use
# with the SVM. You should now complete the code in gaussianKernel.py
# Nonlinear classification
print('Evaluating the Gaussian Kernel')
x1 = np.array([1, 2, 1])
x2 = np.array([0, 4, -1])
sigma = 2
sim = gk.gaussian_kernel(x1, x2, sigma)
print('Gaussian kernel between x1 = [1, 2, 1], x2 = [0, 4, -1], sigma = {} : {:0.6f}\n'
      '(for sigma = 2, this value should be about 0.324652)'.format(sigma, sim))
input('Program paused. Press ENTER to continue')
# ===================== Part 4: Visualizing Dataset 2 =====================
# The following code will load the next dataset into your environment and
# plot the data
#
print('Loading and Visualizing Data ...')
# Load from ex6data2:
data = scio.loadmat('ex6data2.mat')
X = data['X']
y = data['y'].flatten()
m = y.size
# Plot training data
pd.plot_data(X, y)
input('Program paused. Press ENTER to continue')
# ===================== Part 5: Training SVM with RBF Kernel (Dataset 2) =====================
# After you have implemented the kernel, we can now use it to train the
# SVM classifier
#
print('Training SVM with RBF (Gaussian) Kernel (this may take 1 to 2 minutes) ...')
c = 1
sigma = 0.1
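def gaussian_kernel(x_1, x_2):
    # Build the kernel matrix between the rows of x_1 and the rows of x_2
    n1 = x_1.shape[0]
    n2 = x_2.shape[0]
    result = np.zeros((n1, n2))
    for i in range(n1):
        for j in range(n2):
            result[i, j] = gk.gaussian_kernel(x_1[i], x_2[j], sigma)
    return result
# clf = svm.SVC(C=c, kernel=gaussian_kernel)
# scikit-learn's 'rbf' kernel is exp(-gamma * ||x - z||^2), so
# gamma = 1 / (2 * sigma^2) matches gaussian_kernel above
clf = svm.SVC(C=c, kernel='rbf', gamma=1 / (2 * sigma ** 2))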
clf.fit(X, y)
print('Training complete!')
pd.plot_data(X, y)
vb.visualize_boundary(clf, X, 0, 1, .4, 1.0)
input('Program paused. Press ENTER to continue')
# ===================== Part 6: Visualizing Dataset 3 =====================
# The following code will load the next dataset into your environment and
# plot the data
#
print('Loading and Visualizing Data ...')
# Load from ex6data3:
data = scio.loadmat('ex6data3.mat')
X = data['X']
y = data['y'].flatten()
m = y.size
# Plot training data
pd.plot_data(X, y)
input('Program paused. Press ENTER to continue')
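# ===================== Part 7: Training SVM with RBF Kernel (Dataset 3) =====================
# Reuses c and sigma from Part 5; gamma = 1 / (2 * sigma^2) matches the Gaussian kernel
clf = svm.SVC(C=c, kernel='rbf', gamma=1 / (2 * sigma ** 2))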
clf.fit(X, y)
pd.plot_data(X, y)
vb.visualize_boundary(clf, X, -.5, .3, -.8, .6)
input('ex6 Finished. Press ENTER to exit')
# ==========================================
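Code in plotData.py
import matplotlib.pyplot as plt
import numpy as np
def plot_data(X, y):
    # Positive samples (y == 1) as blue '+', negative samples (y == 0) as yellow 'o'
    plt.figure()
    pos = np.where(y == 1)[0]
    neg = np.where(y == 0)[0]
    plt.scatter(X[pos, 0], X[pos, 1], marker="+", c='b')
    plt.scatter(X[neg, 0], X[neg, 1], marker="o", c='y', s=15)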
Code in gaussianKernel.py
import numpy as np
def gaussian_kernel(x1, x2, sigma):
    # Gaussian (RBF) similarity: exp(-||x1 - x2||^2 / (2 * sigma^2))
    x1 = x1.flatten()
    x2 = x2.flatten()
    sim = np.exp(np.sum((x1 - x2) ** 2) / (-2 * sigma ** 2))
    return sim
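Code in visualizeBoundary.py
import matplotlib.pyplot as plt
import numpy as np
def visualize_boundary(clf, X, x_min, x_max, y_min, y_max):
    # Predict on a grid with step h and draw the boundary between the two classes
    h = .02
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    # Predictions are 0 or 1, so the class boundary is the 0.5 level
    plt.contour(xx, yy, Z, levels=[0.5], colors='r')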