Machine Learning 05: Support Vector Machines 2

This article is from the Sync Blog.

PS: I don't know how to make Jianshu display mathematical formulas and lay out content nicely, so if the formatting below looks messy, please follow the link above. From now on I will no longer paste screenshots of mathematical formulas; screenshotted inline formulas break the layout anyway, and the original blog gives a better reading experience.

The previous article introduced the basic principles of support vector machines in machine learning, and at the end it presented a Python method for solving the extremum of a quadratic programming problem. In this article, I will use that method to solve for the $-\vec{\alpha}-$, $-\vec{w}-$, and $-b-$ mentioned above, step by step, as a way to review and verify the key points of support vector machines.

Data

Let's look at a set of test data:

data = {
    '+': [
        [1, 7],
        [2, 8],
        [3, 8],
        [2, 6.5]
    ],
    '-': [
        [5, 1],
        [6, -1],
        [7, 3]
    ]
}

The data dictionary contains two classes of already-labeled samples; the elements of each class are two-dimensional vectors that can be plotted in a Cartesian coordinate system.

According to the principle described in the previous article, we need to use these data to solve for an $-\vec{\alpha}-$ vector, namely the $-\vec{\alpha}-$ that minimizes the value of the quadratic programming objective:
$$
F(\vec{\alpha}) = \frac{1}{2}\vec{\alpha}^{T}H\vec{\alpha} + \vec{c}^{T}\vec{\alpha} + c_0, \quad \vec{y}^{T}\vec{\alpha} = 0, \quad \vec{\alpha} \ge 0
$$

Obviously, in SVM, $-c_0 = 0-$.
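
To make the connection to the data explicit (this is also how the code below constructs these quantities), the entries of $-H-$, $-\vec{c}-$, and $-c_0-$ come directly from the training samples:

$$
H_{ij} = y_i y_j \, \vec{x}_i \cdot \vec{x}_j, \qquad c_i = -1, \qquad c_0 = 0
$$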

Preparing the Parameters

First, prepare the variables $-H, c, c_0-$ that appear in the above equation from the input test data. Refer to the following code:

import numpy as np

def parseXYC(d):
    X = []
    y = []
    c = []
    for _, v in enumerate(d['+']):
        X.append(np.array(v))
        y.append(1)
        c.append(-1)
    for _, v in enumerate(d['-']):
        X.append(np.array(v))
        y.append(-1)
        c.append(-1)
    return X, y, c, 0

X, y, c, c0 = parseXYC(data)

The parseXYC function formats data into $-X, y, c, c_0-$.
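
As a quick sanity check (a small addition of my own, assuming the data dictionary defined earlier), the parsed output for the sample data should look like this:

# Sanity check on the parsed data (not in the original post)
print(len(X), len(y), len(c))   # 7 7 7
print(y)                        # [1, 1, 1, 1, -1, -1, -1]
print(c)                        # [-1, -1, -1, -1, -1, -1, -1]
print(c0)                       # 0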

Then calculate the $-H-$ matrix. This is relatively simple; one line of code does it:

H = np.array([y[i] * y[j] * np.dot(X[i], X[j]) for i in range(len(X)) for j in range(len(X))]).reshape(len(X), len(X))
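
As another small check (my own sketch, not in the original post), $-H-$ should be a symmetric $-7 \times 7-$ matrix for this data set, since $-H_{ij} = y_i y_j \vec{x}_i \cdot \vec{x}_j = H_{ji}-$:

# H is len(X) x len(X) and symmetric by construction (sketch, not in the original)
print(H.shape)              # (7, 7)
print(np.allclose(H, H.T))  # True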

Solve for $-\vec{\alpha}-$

With all the data ready, the next step is to feed it into the optimize.minimize function to compute the result.

A few difficulties are worth mentioning briefly here; a full explanation is beyond the scope of this article:

  1. The SLSQP method that optimize.minimize uses to solve the quadratic program requires both the Jacobian of the objective function and the Jacobians of the constraint functions. Not knowing this left me unable to obtain correct values during testing.
  2. The inequality constraint $-\vec{\alpha} \ge 0-$ could not be passed to optimize.minimize through the constraints parameter. I suspect I constructed the inequality specification incorrectly, which is why the constraint did not take effect. I have not solved this problem yet; I hope readers who understand it can leave a message to enlighten me. As a workaround, I describe the inequality $-\vec{\alpha} \ge 0-$ with the bounds parameter instead.
  3. In the solved $-\vec{\alpha}-$ vector, some elements that should be 0 are not exactly 0. I observed that the precision of the test results is about 1e-16, so I treat any value below that threshold as 0. Plotting the result confirmed that this assumption is reasonable.

The implementation is as follows:

from scipy import optimize

# Define the quadratic programming objective fun and its Jacobian jac
def fun(x, sign=1.):
    return sign * (0.5 * np.dot(x.T, np.dot(H, x)) + np.dot(c, x) + c0)

def jac(x, sign=1.):
    return sign * (np.dot(x.T, H) + c)

# Define the equality constraint function feq and its Jacobian jeq
def feq(x):
    return np.dot(y, x)

def jeq(x):
    return np.array(y)

# Prepare the solver parameters
diff = 1e-16
bounds = [(0, None) for _ in range(len(y))]             # x >= 0
constraints = [{'type': 'eq', 'fun': feq, 'jac': jeq}]  # y*x = 0
options = {'ftol': diff, 'disp': True}
guess = np.array([0 for _ in range(len(X))])

# Compute the result
res_cons = optimize.minimize(fun, guess, method='SLSQP', jac=jac,
                             bounds=bounds, constraints=constraints,
                             options=options)
alpha = [0 if abs(x - 0) <= diff else x for x in res_cons.x]

# Print the result and check whether y*alpha is 0
print('raw alpha: ', res_cons.x)
print('fmt alpha: ', alpha)
print('check y*alpha: ', 'is 0' if (abs(np.dot(y, res_cons.x) - 0) < diff) else 'is not 0')

Solve $-\vec{w}-$ and $-b-$

# Compute w = sum(alpha_i * y_i * X_i)
w = np.sum([np.array([0, 0]) if alpha[i] == 0 else (alpha[i] * y[i] * X[i])
            for i in range(len(alpha))], axis=0)
print('w: ', w)

# Compute b. For a support vector, y_i * (w * x_i + b) = 1, so b = 1/y_i - w * x_i
B = [(0 if alpha[i] == 0 else (1 / y[i] - np.dot(w, X[i]))) for i in range(len(alpha))]
B = list(filter(lambda x: x != 0, B))
b = 0 if len(B) <= 0 else B[0]
print('b: ', b)
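
To verify the result (a minimal verification sketch of my own, not from the original post), every training point should satisfy $-y_i(\vec{w} \cdot \vec{x}_i + b) \ge 1-$, with near-equality for the support vectors, i.e. the points whose $-\alpha_i-$ is non-zero:

# Check the functional margins (verification sketch; assumes X, y, w, b, alpha from above)
for i in range(len(X)):
    margin = y[i] * (np.dot(w, X[i]) + b)
    tag = 'support vector' if alpha[i] != 0 else ''
    print(X[i], y[i], round(margin, 3), tag)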

At this point, the parameter-solving process of the support vector machine is complete.
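
With $-\vec{w}-$ and $-b-$ in hand, classifying a new point only requires checking the sign of $-\vec{w} \cdot \vec{x} + b-$. A minimal helper might look like this (my own sketch; the test points are hypothetical examples, not part of the original data):

def predict(point):
    # Classify a 2D point with the separating hyperplane w*x + b = 0
    return 1 if np.dot(w, np.array(point)) + b >= 0 else -1

print(predict([2, 7]))   # expected: 1  (close to the '+' samples)
print(predict([6, 0]))   # expected: -1 (close to the '-' samples)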

The running result is shown in the following figure:

[Figure: operation result]

Drawing

Finally plot the data as an image.

import matplotlib.pyplot as plt

limit = 11
plt.xlim(-2, limit)
plt.ylim(-2, limit)

# Plot the data points
[plt.scatter(X[i][0], X[i][1], s=100, color=('r' if y[i] > 0 else 'y')) for i in range(len(X))]

# Plot the separating hyperplane L: w*x + b = 0
plt.plot([i for i in range(limit)], [(-b - w[0] * i) / w[1] for i in range(limit)])

# Plot the upper and lower margins: w*x + b = 1 / -1
plt.plot([i for i in range(limit)], [(1 - b - w[0] * i) / w[1] for i in range(limit)])
plt.plot([i for i in range(limit)], [(-1 - b - w[0] * i) / w[1] for i in range(limit)])

plt.show()

The effect is as shown below. The red points are the '+' samples and the yellow points are the '-' samples. The blue line in the middle is the separating line for classification, and the two outer lines are the margin boundaries: each passes through the points in its class that are closest to the separating line. These points are the support vectors, and only the $-\vec{\alpha}-$ components corresponding to them are non-zero.

 
[Figure: classification plot with separating line and margins]
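
One more quantity that is easy to read off the figure: the distance between the two margin lines is $-2 / \lVert \vec{w} \rVert-$, a standard result for the SVM margin. It can be computed directly (my own addition):

# Geometric margin width between the two boundary lines (not in the original post)
print('margin width:', 2 / np.linalg.norm(w))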

The source code of this article
