- Algorithm characteristics:
①. All points are separated correctly; ②. the margin is maximized; ③. the non-linear-separability error is minimized.
- Algorithm derivation:
Part Ⅰ
Meaning of linear separability:
The smallest convex sets (convex hulls) each containing all data points of one class do not intersect one another.
Introduce the smoothing tools:
plus function: \begin{equation*}(x)_{+} = \max \{ x, 0 \}\end{equation*}
p-function: \begin{equation*}p(x, \beta) = \frac{1}{\beta}\ln(e^{\beta x} + 1)\end{equation*}
The p-function can serve as a smooth approximation of the plus function. Plots of the two functions are shown below:
The first- and second-order derivatives of the p-function are given below:
1st order (sigmoid function): \begin{equation*} s(x, \beta) = \frac{1}{e^{-\beta x} + 1} \end{equation*}
2nd order (delta function): \begin{equation*} \delta (x, \beta) = \frac{\beta e^{\beta x}}{(e^{\beta x} + 1)^2} \end{equation*}
Plots of these functions are shown below:
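Independently of the plots, here is a quick numerical sanity check of these definitions (a standalone sketch, not part of the implementation in Part Ⅳ). It confirms that the gap between the p-function and the plus function shrinks as $\beta$ grows (the maximum gap is $\ln(2)/\beta$, attained at $x = 0$):

```python
import numpy

def plus(x):                       # plus function: (x)_+ = max(x, 0)
    return numpy.maximum(x, 0.0)

def p(x, beta):                    # p-function: smooth approximation of (x)_+
    return numpy.log(numpy.exp(beta * x) + 1) / beta

def s(x, beta):                    # sigmoid: first derivative of p
    return 1 / (numpy.exp(-beta * x) + 1)

def delta(x, beta):                # delta: second derivative of p
    t = numpy.exp(beta * x)
    return beta * t / (t + 1) ** 2

x = numpy.linspace(-2, 2, 401)
for beta in (1, 10, 100):
    # max |p(x, beta) - (x)_+| equals ln(2)/beta, shrinking as beta grows
    print(beta, numpy.max(numpy.abs(p(x, beta) - plus(x))))
```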
Part Ⅱ
The SVM decision hyperplane is given by:
\begin{equation}\label{eq_1}
h(x) = W^Tx + b
\end{equation}
where $W$ is the normal vector of the hyperplane and $b$ is its bias parameter.
Here the $L_2$ norm is used to measure the non-linear-separability error, giving the following SVM primal optimization problem:
\begin{equation}
\begin{split}
\min &\quad\frac{1}{2}\left\|W\right\|_2^2 + \frac{c}{2}\left\|\xi\right\|_2^2 \\
\text{s.t.} &\quad D(A^TW + 1b) \geq 1 - \xi
\end{split}\label{eq_2}
\end{equation}
where $A = [x^{(1)}, x^{(2)}, \cdots , x^{(n)}]$, $D = \mathrm{diag}([\bar{y}^{(1)}, \bar{y}^{(2)}, \cdots, \bar{y}^{(n)}])$, and $\xi$ is the column vector collecting the non-linear-separability errors of all samples.
From the optimization problem (\ref{eq_2}), the error $\xi$ can be expressed equivalently with the plus function:
\begin{equation}\label{eq_3}
\xi = (1 - D(A^TW + 1b))_{+}
\end{equation}
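As a concrete illustration of this notation, here is a toy sketch with hypothetical data and an arbitrary hyperplane (not the spiral data used later): $A$ stacks the samples column-wise, $D$ carries the labels, and $\xi$ follows from the plus function applied elementwise:

```python
import numpy

# hypothetical toy data: four 2-D samples as the columns of A, labels on the diagonal of D
A = numpy.array([[0.0, 1.0, 0.0, 1.0],
                 [0.0, 0.0, 1.0, 1.0]])            # shape (2, 4)
D = numpy.diag([-1.0, 1.0, -1.0, 1.0])             # shape (4, 4)

W = numpy.array([[0.5], [0.5]])                    # arbitrary hyperplane, for illustration only
b = -0.4
one = numpy.ones((4, 1))

# xi = (1 - D(A^T W + 1b))_+ , plus function applied elementwise
xi = numpy.maximum(0.0, 1 - D @ (A.T @ W + one * b))
print(xi.ravel())
```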
The constrained optimization problem can thus be converted into the following unconstrained one:
\begin{equation}\label{eq_4}
\min\quad \frac{1}{2}\left\|W\right\|_2^2 + \frac{c}{2}\left\| \left(1 - D(A^TW + 1b)\right)_{+} \right\|_2^2
\end{equation}
Likewise, the stationarity condition among the KKT conditions of problem (\ref{eq_2}) gives:
\begin{equation}\label{eq_5}
W = AD^T\alpha
\end{equation}
where $\alpha$ is the dual variable. Substituting this change of variables into (\ref{eq_4}) yields:
\begin{equation}\label{eq_6}
\min\quad \frac{1}{2}\alpha^TD^TA^TAD\alpha + \frac{c}{2}\left\| \left(1 - D(A^TAD^T\alpha + 1b)\right)_{+} \right\|_2^2
\end{equation}
Owing to the weight parameter $c$, the above problem is equivalent to the following multi-objective optimization problem:
\begin{equation}
\begin{cases}
\min& \frac{1}{2}\alpha^TD^TA^TAD\alpha \\
\min& \frac{1}{2}\left\| \left(1 - D(A^TAD^T\alpha + 1b)\right)_{+} \right\|_2^2
\end{cases}\label{eq_7}
\end{equation}
Since $D^TA^TAD\succeq 0$, we may make the following equivalent replacement:
\begin{equation}\label{eq_8}
\min\quad \frac{1}{2}\alpha^TD^TA^TAD\alpha \quad\Rightarrow\quad \min\quad \frac{1}{2}\alpha^T\alpha
\end{equation}
Using the smoothing tools of Part Ⅰ, we have the further equivalence:
\begin{equation}\label{eq_9}
\min\quad \frac{1}{2}\left\| \left(1 - D(A^TAD^T\alpha + 1b)\right)_{+} \right\|_2^2 \quad\Rightarrow\quad \min\quad \frac{1}{2}\left\| p(1 - DA^TAD\alpha - D1b, \beta)\right\|_2^2, \quad\text{where $\beta\to\infty$}
\end{equation}
Combining (\ref{eq_8}) and (\ref{eq_9}), optimization problem (\ref{eq_6}) becomes:
\begin{equation}\label{eq_10}
\min\quad \frac{1}{2}\alpha^T\alpha + \frac{c}{2}\left\| p(1 - DA^TAD\alpha - D1b, \beta)\right\|_2^2, \quad\text{where $\beta\to\infty$}
\end{equation}
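Continuing the toy sketch above (same hypothetical $A$ and $D$; $\alpha$ and $b$ chosen arbitrarily), the smooth objective of (\ref{eq_10}) can be evaluated directly, and its gap to the plus-function objective of (\ref{eq_6}) shrinks as $\beta$ grows:

```python
import numpy

def p(x, beta):
    return numpy.log(numpy.exp(beta * x) + 1) / beta

A = numpy.array([[0.0, 1.0, 0.0, 1.0],
                 [0.0, 0.0, 1.0, 1.0]])
D = numpy.diag([-1.0, 1.0, -1.0, 1.0])
alpha = numpy.full((4, 1), 0.1)                    # arbitrary values, for illustration only
b, c = -0.4, 1.0
one = numpy.ones((4, 1))

z = 1 - D @ (A.T @ A @ D @ alpha + one * b)        # common argument of (.)_+ and p(., beta)
J_plus = 0.5 * numpy.sum(alpha ** 2) + 0.5 * c * numpy.sum(numpy.maximum(0.0, z) ** 2)
for beta in (1, 10, 100):
    J_smooth = 0.5 * numpy.sum(alpha ** 2) + 0.5 * c * numpy.sum(p(z, beta) ** 2)
    print(beta, J_smooth - J_plus)                 # gap shrinks toward 0 as beta grows
```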
To strengthen the model's classification ability, the kernel trick is introduced, giving the final optimization problem:
\begin{equation}\label{eq_11}
\min\quad \frac{1}{2}\alpha^T\alpha + \frac{c}{2}\left\| p(1 - DK(A^T, A)D\alpha - D1b, \beta)\right\|_2^2, \quad\text{where $\beta\to\infty$}
\end{equation}
where $K$ denotes the kernel matrix, whose entries $K_{ij}=k(x^{(i)}, x^{(j)})$ are determined by the kernel function. Common kernel functions include:
①. Polynomial kernel: $k(x^{(i)}, x^{(j)}) = ({x^{(i)}}\cdot x^{(j)})^n$
②. Gaussian kernel: $k(x^{(i)}, x^{(j)}) = e^{-\mu\| x^{(i)} - x^{(j)}\|_2^2}$
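As an aside, the Gaussian kernel matrix $K(A^T, A)$ can be assembled without explicit loops via broadcasting (a sketch; the implementation in Part Ⅳ uses a plain double loop instead):

```python
import numpy

def gaussian_kernel_matrix(A, mu):
    # A holds the samples as columns, shape (features, n)
    sq = numpy.sum(A * A, axis=0)                        # squared norms of the columns
    sqdist = sq[:, None] + sq[None, :] - 2 * (A.T @ A)   # pairwise squared distances
    sqdist = numpy.maximum(sqdist, 0.0)                  # guard against tiny negative round-off
    return numpy.exp(-mu * sqdist)                       # K_ij = exp(-mu * ||x_i - x_j||^2)
```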
Combining (\ref{eq_1}), (\ref{eq_5}), and the kernel trick yields the final decision hyperplane equation:
\begin{equation}
h(x) = \alpha^TDK(A^T, x) + b
\end{equation}
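A minimal sketch of evaluating this decision function for a new point $x$ (a hypothetical helper, mirroring the `__get_KAx`/`__calc_hVal` methods in the implementation below):

```python
import numpy

def decide(x, A, D, alpha, b, mu):
    # K(A^T, x): Gaussian kernel values between each training sample (column of A) and x
    diff = A - x.reshape(-1, 1)                          # broadcast x against all columns
    KAx = numpy.exp(-mu * numpy.sum(diff * diff, axis=0)).reshape(-1, 1)
    h = (alpha.T @ D @ KAx)[0, 0] + b                    # h(x) = alpha^T D K(A^T, x) + b
    return 1 if h >= 0 else -1                           # sign gives the predicted class
```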
Part Ⅲ
The following gives auxiliary information for the optimization. Denote the objective function of (\ref{eq_11}) by $J$, and let:
\begin{equation}
J = J_1 + J_2\quad\quad\text{where}
\begin{cases}
J_1 = \frac{1}{2}\alpha^T\alpha \\
J_2 = \frac{c}{2}\left\| p(1 - DK(A^T, A)D\alpha - D1b, \beta)\right\|_2^2
\end{cases}
\end{equation}
For $J_1$:
\begin{gather*}
\frac{\partial J_1}{\partial \alpha} = \alpha \quad \frac{\partial J_1}{\partial b} = 0 \\
\frac{\partial^2J_1}{\partial\alpha^2} = I \quad \frac{\partial^2J_1}{\partial\alpha\partial b} = 0 \quad \frac{\partial^2J_1}{\partial b\partial\alpha} = 0 \quad \frac{\partial^2J_1}{\partial b^2} = 0
\end{gather*}
\begin{gather*}
\nabla J_1 = \begin{bmatrix} \frac{\partial J_1}{\partial \alpha} \\ \frac{\partial J_1}{\partial b}\end{bmatrix} \\ \\
\nabla^2 J_1 = \begin{bmatrix}\frac{\partial^2J_1}{\partial\alpha^2} & \frac{\partial^2J_1}{\partial\alpha\partial b} \\ \frac{\partial^2J_1}{\partial b\partial\alpha} & \frac{\partial^2J_1}{\partial b^2}\end{bmatrix}
\end{gather*}
For $J_2$:
\begin{gather*}
z_i = 1 - \sum_{j}K_{ij}\bar{y}^{(i)}\bar{y}^{(j)}\alpha_j - \bar{y}^{(i)}b \\
\frac{\partial J_2}{\partial \alpha_k} = -c\sum_{i}\bar{y}^{(i)}\bar{y}^{(k)}K_{ik}p(z_i, \beta)s(z_i, \beta) \\
\frac{\partial J_2}{\partial b} = -c\sum_{i}\bar{y}^{(i)}p(z_i, \beta)s(z_i, \beta) \\
\frac{\partial^2 J_2}{\partial \alpha_k\partial \alpha_l} = c\sum_{i}\bar{y}^{(k)}\bar{y}^{(l)}K_{ik}K_{il}[s^2(z_i, \beta) + p(z_i, \beta)\delta(z_i, \beta)] \\
\frac{\partial^2 J_2}{\partial \alpha_k\partial b} = c\sum_{i}\bar{y}^{(k)}K_{ik}[s^2(z_i, \beta) + p(z_i, \beta)\delta(z_i, \beta)] \\
\frac{\partial^2 J_2}{\partial b\partial\alpha_l} = c\sum_{i}\bar{y}^{(l)}K_{il}[s^2(z_i, \beta) + p(z_i, \beta)\delta(z_i, \beta)] \\
\frac{\partial^2 J_2}{\partial b^2} = c\sum_{i}[s^2(z_i, \beta) + p(z_i, \beta)\delta(z_i, \beta)]
\end{gather*}
\begin{gather*}
\nabla J_2 = \begin{bmatrix} \frac{\partial J_2}{\partial\alpha_k} \\ \frac{\partial J_2}{\partial b} \end{bmatrix} \\ \\
\nabla^2 J_2 = \begin{bmatrix} \frac{\partial^2 J_2}{\partial\alpha_k\partial\alpha_l} & \frac{\partial^2 J_2}{\partial\alpha_k\partial b} \\ \frac{\partial^2 J_2}{\partial b\partial\alpha_l} & \frac{\partial^2 J_2}{\partial b^2} \end{bmatrix}
\end{gather*}
The gradient and Hessian of the objective function $J$ are then:
\begin{gather}
\nabla J = \nabla J_1 + \nabla J_2 \\
H = \nabla^2 J = \nabla^2 J_1 + \nabla^2 J_2
\end{gather}
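When implementing these derivatives, a central-difference check of the analytic gradient is cheap insurance (a generic sketch, assuming callables `J` and `grad` over the stacked variable $[\alpha; b]$):

```python
import numpy

def check_grad(J, grad, theta, h=1.e-6):
    # central finite differences over the stacked variable theta = [alpha; b]
    num = numpy.zeros_like(theta)
    for i in range(theta.shape[0]):
        e = numpy.zeros_like(theta)
        e[i, 0] = h
        num[i, 0] = (J(theta + e) - J(theta - e)) / (2 * h)
    return numpy.max(numpy.abs(num - grad(theta)))       # should be near 0
```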
Part Ⅳ
We now carry out the algorithm on a binary classification task:
\begin{equation*}
\begin{cases}
h(x) \geq 0 & y = 1 & \text{positive} \\
h(x) < 0 & y = -1 & \text{negative}
\end{cases}
\end{equation*}
- Code implementation:
```python
# Implementation of the Smooth Support Vector Machine

import numpy
from matplotlib import pyplot as plt


def spiral_point(val, center=(0, 0)):
    # generate one point on each of two interleaved spirals
    rn = 0.4 * (105 - val) / 104
    an = numpy.pi * (val - 1) / 25

    x0 = center[0] + rn * numpy.sin(an)
    y0 = center[1] + rn * numpy.cos(an)
    z0 = -1
    x1 = center[0] - rn * numpy.sin(an)
    y1 = center[1] - rn * numpy.cos(an)
    z1 = 1

    return (x0, y0, z0), (x1, y1, z1)


def spiral_data(valList):
    dataList = list(spiral_point(val) for val in valList)
    data0 = numpy.array(list(item[0] for item in dataList))
    data1 = numpy.array(list(item[1] for item in dataList))
    return data0, data1


# generate the training data set
trainingValList = numpy.arange(1, 101, 1)
trainingData0, trainingData1 = spiral_data(trainingValList)
trainingSet = numpy.vstack((trainingData0, trainingData1))
# generate the test data set
testValList = numpy.arange(1.5, 101.5, 1)
testData0, testData1 = spiral_data(testValList)
testSet = numpy.vstack((testData0, testData1))


class SSVM(object):

    def __init__(self, trainingSet, c=1, mu=1, beta=100):
        self.__trainingSet = trainingSet          # training set data
        self.__c = c                              # weight of the error term
        self.__mu = mu                            # gaussian kernel parameter
        self.__beta = beta                        # smoothing parameter

        self.__A, self.__D = self.__get_AD()

    def get_cls(self, x, alpha, b):
        A, D = self.__A, self.__D
        mu = self.__mu

        x = numpy.array(x).reshape((-1, 1))
        KAx = self.__get_KAx(A, x, mu)
        clsVal = self.__calc_hVal(KAx, D, alpha, b)
        if clsVal > 0:
            return 1
        elif clsVal == 0:
            return 0
        else:
            return -1

    def get_accuracy(self, dataSet, alpha, b):
        '''
        accuracy computation
        '''
        rightCnt = 0
        for row in dataSet:
            clsVal = self.get_cls(row[:2], alpha, b)
            if clsVal == row[2]:
                rightCnt += 1
        accuracy = rightCnt / dataSet.shape[0]
        return accuracy

    def optimize(self, maxIter=100, epsilon=1.e-9):
        '''
        maxIter: maximum number of iterations
        epsilon: convergence criterion -- converged once the gradient approaches 0
        '''
        A, D = self.__A, self.__D
        c = self.__c
        mu = self.__mu
        beta = self.__beta

        alpha, b = self.__init_alpha_b((A.shape[1], 1))
        KAA = self.__get_KAA(A, mu)

        JVal = self.__calc_JVal(KAA, D, c, beta, alpha, b)
        grad = self.__calc_grad(KAA, D, c, beta, alpha, b)
        Hess = self.__calc_Hess(KAA, D, c, beta, alpha, b)

        for i in range(maxIter):
            if self.__converged1(grad, epsilon):
                return alpha, b, True

            # Newton direction, with Armijo backtracking for the step size
            dCurr = -numpy.matmul(numpy.linalg.inv(Hess), grad)
            ALPHA = self.__calc_ALPHA_by_ArmijoRule(alpha, b, JVal, grad, dCurr, KAA, D, c, beta)

            delta = ALPHA * dCurr
            alphaNew = alpha + delta[:-1, :]
            bNew = b + delta[-1, -1]
            JValNew = self.__calc_JVal(KAA, D, c, beta, alphaNew, bNew)
            if self.__converged2(delta, JValNew - JVal, epsilon):
                return alphaNew, bNew, True

            alpha, b, JVal = alphaNew, bNew, JValNew
            grad = self.__calc_grad(KAA, D, c, beta, alpha, b)
            Hess = self.__calc_Hess(KAA, D, c, beta, alpha, b)
        else:
            if self.__converged1(grad, epsilon):
                return alpha, b, True
        return alpha, b, False

    def __converged2(self, delta, JValDelta, epsilon):
        val1 = numpy.linalg.norm(delta)
        val2 = numpy.linalg.norm(JValDelta)
        if val1 <= epsilon or val2 <= epsilon:
            return True
        return False

    def __calc_ALPHA_by_ArmijoRule(self, alphaCurr, bCurr, JCurr, gCurr, dCurr, KAA, D, c, beta, C=1.e-4, v=0.5):
        i = 0
        ALPHA = v ** i
        delta = ALPHA * dCurr
        alphaNext = alphaCurr + delta[:-1, :]
        bNext = bCurr + delta[-1, -1]
        JNext = self.__calc_JVal(KAA, D, c, beta, alphaNext, bNext)
        while True:
            if JNext <= JCurr + C * ALPHA * numpy.matmul(dCurr.T, gCurr)[0, 0]:
                break
            i += 1
            ALPHA = v ** i
            delta = ALPHA * dCurr
            alphaNext = alphaCurr + delta[:-1, :]
            bNext = bCurr + delta[-1, -1]
            JNext = self.__calc_JVal(KAA, D, c, beta, alphaNext, bNext)
        return ALPHA

    def __converged1(self, grad, epsilon):
        if numpy.linalg.norm(grad) <= epsilon:
            return True
        return False

    def __p(self, x, beta):
        # smooth p-function approximating the plus function
        val = numpy.log(numpy.exp(beta * x) + 1) / beta
        return val

    def __s(self, x, beta):
        # sigmoid: first derivative of the p-function
        val = 1 / (numpy.exp(-beta * x) + 1)
        return val

    def __d(self, x, beta):
        # delta: second derivative of the p-function
        term = numpy.exp(beta * x)
        val = beta * term / (term + 1) ** 2
        return val

    def __calc_Hess(self, KAA, D, c, beta, alpha, b):
        Hess_J1 = self.__calc_Hess_J1(alpha)
        Hess_J2 = self.__calc_Hess_J2(KAA, D, c, beta, alpha, b)
        Hess = Hess_J1 + Hess_J2
        return Hess

    def __calc_Hess_J2(self, KAA, D, c, beta, alpha, b):
        Hess_J2 = numpy.zeros((KAA.shape[0] + 1, KAA.shape[0] + 1))
        Y = numpy.matmul(D, numpy.ones((D.shape[0], 1)))
        YY = numpy.matmul(Y, Y.T)
        KAAYY = KAA * YY

        z = 1 - numpy.matmul(KAAYY, alpha) - Y * b
        p = numpy.array(list(self.__p(z[i, 0], beta) for i in range(z.shape[0]))).reshape((-1, 1))
        s = numpy.array(list(self.__s(z[i, 0], beta) for i in range(z.shape[0]))).reshape((-1, 1))
        d = numpy.array(list(self.__d(z[i, 0], beta) for i in range(z.shape[0]))).reshape((-1, 1))
        term = s * s + p * d

        for k in range(Hess_J2.shape[0] - 1):
            for l in range(k + 1):
                val = c * Y[k, 0] * Y[l, 0] * numpy.sum(KAA[:, k:k+1] * KAA[:, l:l+1] * term)
                Hess_J2[k, l] = Hess_J2[l, k] = val
            val = c * Y[k, 0] * numpy.sum(KAA[:, k:k+1] * term)
            Hess_J2[k, -1] = Hess_J2[-1, k] = val
        val = c * numpy.sum(term)
        Hess_J2[-1, -1] = val
        return Hess_J2

    def __calc_Hess_J1(self, alpha):
        I = numpy.identity(alpha.shape[0])
        term = numpy.hstack((I, numpy.zeros((I.shape[0], 1))))
        Hess_J1 = numpy.vstack((term, numpy.zeros((1, term.shape[1]))))
        return Hess_J1

    def __calc_grad(self, KAA, D, c, beta, alpha, b):
        grad_J1 = self.__calc_grad_J1(alpha)
        grad_J2 = self.__calc_grad_J2(KAA, D, c, beta, alpha, b)
        grad = grad_J1 + grad_J2
        return grad

    def __calc_grad_J2(self, KAA, D, c, beta, alpha, b):
        grad_J2 = numpy.zeros((KAA.shape[0] + 1, 1))
        Y = numpy.matmul(D, numpy.ones((D.shape[0], 1)))
        YY = numpy.matmul(Y, Y.T)
        KAAYY = KAA * YY

        z = 1 - numpy.matmul(KAAYY, alpha) - Y * b
        p = numpy.array(list(self.__p(z[i, 0], beta) for i in range(z.shape[0]))).reshape((-1, 1))
        s = numpy.array(list(self.__s(z[i, 0], beta) for i in range(z.shape[0]))).reshape((-1, 1))
        term = p * s

        for k in range(grad_J2.shape[0] - 1):
            val = -c * Y[k, 0] * numpy.sum(Y * KAA[:, k:k+1] * term)
            grad_J2[k, 0] = val
        grad_J2[-1, 0] = -c * numpy.sum(Y * term)
        return grad_J2

    def __calc_grad_J1(self, alpha):
        grad_J1 = numpy.vstack((alpha, [[0]]))
        return grad_J1

    def __calc_JVal(self, KAA, D, c, beta, alpha, b):
        J1 = self.__calc_J1(alpha)
        J2 = self.__calc_J2(KAA, D, c, beta, alpha, b)
        JVal = J1 + J2
        return JVal

    def __calc_J2(self, KAA, D, c, beta, alpha, b):
        tmpOne = numpy.ones((KAA.shape[0], 1))
        x = tmpOne - numpy.matmul(numpy.matmul(numpy.matmul(D, KAA), D), alpha) - numpy.matmul(D, tmpOne) * b
        p = numpy.array(list(self.__p(x[i, 0], beta) for i in range(x.shape[0])))
        J2 = numpy.sum(p * p) * c / 2
        return J2

    def __calc_J1(self, alpha):
        J1 = numpy.sum(alpha * alpha) / 2
        return J1

    def __get_KAA(self, A, mu):
        KAA = numpy.zeros((A.shape[1], A.shape[1]))
        for rowIdx in range(KAA.shape[0]):
            for colIdx in range(rowIdx + 1):
                x1 = A[:, rowIdx:rowIdx+1]
                x2 = A[:, colIdx:colIdx+1]
                val = self.__calc_gaussian(x1, x2, mu)
                KAA[rowIdx, colIdx] = KAA[colIdx, rowIdx] = val
        return KAA

    def __init_alpha_b(self, shape):
        '''
        initialization of alpha and b
        '''
        alpha, b = numpy.zeros(shape), 0
        return alpha, b

    def __get_KAx(self, A, x, mu):
        KAx = numpy.zeros((A.shape[1], 1))
        for rowIdx in range(KAx.shape[0]):
            x1 = A[:, rowIdx:rowIdx+1]
            val = self.__calc_gaussian(x1, x, mu)
            KAx[rowIdx, 0] = val
        return KAx

    def __calc_hVal(self, KAx, D, alpha, b):
        hVal = numpy.matmul(numpy.matmul(alpha.T, D), KAx)[0, 0] + b
        return hVal

    def __calc_gaussian(self, x1, x2, mu):
        val = numpy.exp(-mu * numpy.linalg.norm(x1 - x2) ** 2)
        # val = numpy.sum(x1 * x2)    # linear kernel alternative
        return val

    def __get_AD(self):
        A = self.__trainingSet[:, :2].T
        D = numpy.diag(self.__trainingSet[:, 2])
        return A, D


class SpiralPlot(object):

    def spiral_data_plot(self, trainingData0, trainingData1, testData0, testData1):
        fig = plt.figure(figsize=(5, 5))
        ax1 = plt.subplot()
        ax1.scatter(trainingData1[:, 0], trainingData1[:, 1], c="red", marker="o", s=10, label="training data - Positive")
        ax1.scatter(trainingData0[:, 0], trainingData0[:, 1], c="blue", marker="o", s=10, label="training data - Negative")
        ax1.scatter(testData1[:, 0], testData1[:, 1], c="red", marker="x", s=10, label="test data - Positive")
        ax1.scatter(testData0[:, 0], testData0[:, 1], c="blue", marker="x", s=10, label="test data - Negative")
        ax1.set(xlim=(-0.5, 0.5), ylim=(-0.5, 0.5), xlabel="$x_1$", ylabel="$x_2$")
        plt.legend(fontsize="x-small")
        fig.tight_layout()
        fig.savefig("spiral.png", dpi=100)
        plt.close()

    def spiral_pred_plot(self, trainingData0, trainingData1, testData0, testData1, ssvmObj, alpha, b):
        x = numpy.linspace(-0.5, 0.5, 500)
        y = numpy.linspace(-0.5, 0.5, 500)
        x, y = numpy.meshgrid(x, y)
        z = numpy.zeros(shape=x.shape)
        for rowIdx in range(x.shape[0]):
            for colIdx in range(x.shape[1]):
                z[rowIdx, colIdx] = ssvmObj.get_cls((x[rowIdx, colIdx], y[rowIdx, colIdx]), alpha, b)

        fig = plt.figure(figsize=(5, 5))
        ax1 = plt.subplot()
        ax1.contourf(x, y, z, levels=[-1.5, -0.5, 0.5, 1.5], colors=["blue", "white", "red"], alpha=0.3)
        ax1.scatter(trainingData1[:, 0], trainingData1[:, 1], c="red", marker="o", s=10, label="training data - Positive")
        ax1.scatter(trainingData0[:, 0], trainingData0[:, 1], c="blue", marker="o", s=10, label="training data - Negative")
        ax1.scatter(testData1[:, 0], testData1[:, 1], c="red", marker="x", s=10, label="test data - Positive")
        ax1.scatter(testData0[:, 0], testData0[:, 1], c="blue", marker="x", s=10, label="test data - Negative")
        ax1.set(xlim=(-0.5, 0.5), ylim=(-0.5, 0.5), xlabel="$x_1$", ylabel="$x_2$")
        plt.legend(loc="upper left", fontsize="x-small")
        fig.tight_layout()
        fig.savefig("pred.png", dpi=100)
        plt.close()


if __name__ == "__main__":
    ssvmObj = SSVM(trainingSet, c=0.1, mu=250, beta=100)
    alpha, b, tab = ssvmObj.optimize()
    accuracy1 = ssvmObj.get_accuracy(trainingSet, alpha, b)
    print("Accuracy on trainingSet is {}%".format(accuracy1 * 100))
    accuracy2 = ssvmObj.get_accuracy(testSet, alpha, b)
    print("Accuracy on testSet is {}%".format(accuracy2 * 100))

    spObj = SpiralPlot()
    spObj.spiral_data_plot(trainingData0, trainingData1, testData0, testData1)
    spObj.spiral_pred_plot(trainingData0, trainingData1, testData0, testData1, ssvmObj, alpha, b)
```
Clearly, this spiral data set is not linearly separable.
- Results:
The model attains 100% accuracy on both the training set and the test set.
- Usage suggestions:
①. After the kernel function maps the data to a high-dimensional space, there may be many support vectors;
②. SSVM essentially still solves the primal problem; here $\alpha$ is introduced merely as a change of variables and need not satisfy the dual-feasibility part of the KKT conditions;
③. Whether the data set needs normalization should be decided according to the situation at hand.
- References:
Yuh-Jye Lee and O. L. Mangasarian. "SSVM: A Smooth Support Vector Machine for Classification." Computational Optimization and Applications 20 (2001): 5-22.