Machine Learning Practice: Implementing SVM with scikit-learn

 

1. Introduction

Environment: Windows 10, Jupyter Notebook, Python 3.6

 

Support vector machine summary

This is a brief, intuitive introduction to the principles behind support vector machines. SVMs are a powerful classification method for several reasons:

  • They rely on relatively few support vectors, meaning they are very compact models and take up little memory.
  • Once the model is trained, the prediction phase is very fast.
  • Because they are affected only by points near the margin, they work well with high-dimensional data, even data with more dimensions than samples, which is challenging for other algorithms.
  • The integration of kernel methods makes them very versatile and able to adapt to many types of data.

However, SVM also has several disadvantages:

  • Scaling with the number of samples N is O(N^3) in the worst case, or O(N^2) for an efficient implementation. For large training sets, this computational cost can be prohibitive.
  • The results depend strongly on a suitable choice of the softening parameter C. It must be chosen carefully via cross-validation, which becomes expensive as the dataset grows.
  • The results have no direct probabilistic interpretation. Probabilities can be estimated via internal cross-validation (see the probability parameter of SVC, and the short sketch at the end of this introduction), but this extra estimation is costly.

Given these characteristics, I generally turn to SVMs only once other simpler, faster methods that require less tuning have proven insufficient. That said, if you can afford the CPU cycles to train and cross-validate an SVM on your data, the approach can work very well.
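
As an aside on the probability point above, here is a minimal sketch (my own example, not from the original post) of what the probability=True option looks like in practice; the extra internal cross-validation is what makes it slow:

# Minimal sketch of SVC probability estimates (my own example, not from the original post).
# probability=True triggers internal cross-validation (Platt scaling), which slows training.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X_demo, y_demo = make_blobs(n_samples=50, centers=2, random_state=0, cluster_std=0.60)
clf_demo = SVC(kernel='linear', C=10.0, probability=True)
clf_demo.fit(X_demo, y_demo)

print(clf_demo.predict(X_demo[:3]))        # hard class labels
print(clf_demo.predict_proba(X_demo[:3]))  # approximate class probabilities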

 

References:

Python Data Science Handbook:

https://jakevdp.github.io/PythonDataScienceHandbook/05.07-support-vector-machines.html

https://www.kesci.com/home/project/5be0480f954d6e0010618cef/code

Chinese translation of the notebook:

https://www.jianshu.com/p/864adfd2f795

 

2. Simple Linear SVM

1. First generate data

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns; sns.set()



# Generate some random data.
# n_samples=50: number of sample points, centers: number of cluster centers,
# random_state: random seed, cluster_std: how spread out each cluster is

# Note: in newer scikit-learn versions this import is `from sklearn.datasets import make_blobs`
from sklearn.datasets.samples_generator import make_blobs
X, y = make_blobs(n_samples=50, centers=2,
                  random_state=0, cluster_std=0.60)
# Scatter plot of the data
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')

# print(X)
# print(y)

 As shown in the picture:

 

2. Build a model

# Import the model; use a linear kernel and set the C parameter to a very large value
from sklearn.svm import SVC # "Support vector classifier"
model = SVC(kernel='linear',C=1E10)
# Fit the SVM model to the data
model.fit(X, y)
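
As an optional check (my own addition, not in the original post), with a linear kernel the fitted separating hyperplane w·x + b = 0 can be read directly from the model:

# Optional check (my addition): inspect the fitted hyperplane w.x + b = 0 (linear kernel only)
print(model.coef_)       # w, shape (1, 2) for this binary, two-feature problem
print(model.intercept_)  # b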

A helper function for plotting the decision boundary and margins:

def plot_svc_decision_function(model, ax=None, plot_support=True):
    """Plot the decision function for a 2D SVC"""
    if ax is None:
        ax = plt.gca()
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()
    
    # create grid to evaluate model
    x = np.linspace(xlim[0], xlim[1], 30)
    y = np.linspace(ylim[0], ylim[1], 30)
    Y, X = np.meshgrid(y, x)
    xy = np.vstack([X.ravel(), Y.ravel()]).T
    
    P = model.decision_function(xy).reshape(X.shape)
    
    # plot decision boundary and margins
    # ax.contour draws three contour lines here: the decision boundary and the two margins
    # try adjusting parameters such as levels and alpha to see how the plot changes
    ax.contour(X, Y, P, colors='k',
               levels=[-1, 0, 1], alpha=0.5,
               linestyles=['--', '-', '--'])
    
    # plot support vectors
    # the operation below circles the points closest to the decision boundary
    if plot_support:
        ax.scatter(model.support_vectors_[:, 0],
                   model.support_vectors_[:, 1],
                   s=300, linewidth=1, facecolors='none');
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)

3. Result

plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plot_svc_decision_function(model)

Support vectors:

 

# Support vectors
model.support_vectors_
array([[0.44359863, 3.11530945],
       [2.33812285, 3.43116792],
       [2.06156753, 1.96918596]])
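
A few related attributes of the fitted SVC (standard scikit-learn attributes, shown here as an aside) are often useful alongside support_vectors_:

# Related attributes of the fitted model (aside, not in the original post)
print(model.support_)    # indices of the support vectors within the training data X
print(model.n_support_)  # number of support vectors per class
print(model.dual_coef_)  # dual coefficients (alpha_i * y_i) of the support vectors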

4. Try changing the data set

The plot shows that the model is determined entirely by the support vectors.

Next, let’s try it with different numbers of data points to see if the effect changes.

Using 60 and 120 data points respectively

def plot_svm(N=10, ax=None):
    X, y = make_blobs(n_samples=200, centers=2,
                      random_state=0, cluster_std=0.60)
    X = X[:N]
    y = y[:N]
    model = SVC(kernel='linear', C=1E10)
    model.fit(X, y)
    
    ax = ax or plt.gca()
    ax.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
    ax.set_xlim(-1, 4)
    ax.set_ylim(-1, 6)
    plot_svc_decision_function(model, ax)

fig, ax = plt.subplots(1, 2, figsize=(16, 6))
fig.subplots_adjust(left=0.0625, right=0.95, wspace=0.1)
for axi, N in zip(ax, [60, 120]):
    plot_svm(N, axi)
    axi.set_title('N = {0}'.format(N))

The left panel shows the result with 60 points, the right panel with 120 points. As long as the support vectors do not change, adding more data points does not change the model at all! A quick numeric check of this follows below.
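
A sketch of that check (my own, reusing the same generator and random seed as plot_svm above):

# Quick numeric check (my own sketch): the support vectors for N=60 and N=120 coincide
X_all, y_all = make_blobs(n_samples=200, centers=2, random_state=0, cluster_std=0.60)
sv_60 = SVC(kernel='linear', C=1E10).fit(X_all[:60], y_all[:60]).support_vectors_
sv_120 = SVC(kernel='linear', C=1E10).fit(X_all[:120], y_all[:120]).support_vectors_
print(sv_60)
print(sv_120)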

 

5. Tips for using Notebook

In a notebook, you can use IPython's interactive widgets to explore this property of the SVM model interactively:

from ipywidgets import interact, fixed
interact(plot_svm, N=[10, 200], ax=fixed(None))
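
With ipywidgets, passing a list gives a dropdown of those two values; if you prefer a slider, a (min, max, step) tuple works as well (a small variation on the cell above):

# Variation: an integer slider over N instead of a dropdown
interact(plot_svm, N=(10, 200, 10), ax=fixed(None));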

 

 

3. Kernel function SVM

1. Data

# Note: in newer scikit-learn versions this import is `from sklearn.datasets import make_circles`
from sklearn.datasets.samples_generator import make_circles
X, y = make_circles(100, factor=.1, random_state=0, noise=.1)


plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')

2. Three-dimensional visualization

# Add a new dimension r: a radial basis function centered at the origin
from mpl_toolkits import mplot3d
r = np.exp(-(X ** 2).sum(1))
def plot_3D(elev=30, azim=30, X=X, y=y):
    ax = plt.subplot(projection='3d')
    ax.scatter3D(X[:, 0], X[:, 1], r, c=y, s=50, cmap='autumn')
    ax.view_init(elev=elev, azim=azim)
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_zlabel('r')

plot_3D(elev=45, azim=45, X=X, y=y)

We can see that with this additional dimension the data become linearly separable: a separating plane can be drawn at, for example, r = 0.7.

Here we had to choose and carefully tune our projection:

if we had not centered the radial basis function in the right place, we would not have seen such clean, linearly separable results.

In general, the need to make such a choice is a problem: we want to somehow automatically find the best basis functions to use.

One strategy for this is to compute basis functions centered on each point in the data set and let the SVM algorithm filter out the results. This type of basis function transformation is called a kernel transformation because it is based on the similarity relationship (or kernel) between each pair of points.

A potential problem with this strategy, projecting N points into N dimensions, is that it can become very computationally expensive as N grows. However, thanks to a neat little procedure known as the kernel trick, a fit on kernel-transformed data can be done implicitly, that is, without ever building the full N-dimensional representation of the kernel projection! This kernel trick is built into the SVM and is one of the reasons the method is so powerful.
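
To make the "one basis function per point" idea concrete, the sketch below (my own illustration, using scikit-learn's rbf_kernel helper, an arbitrary gamma, and the circles data X from above) computes that N x N similarity matrix explicitly; the kernel trick lets the SVM use exactly this information without ever materializing an N-dimensional feature space:

# Illustration only (my own sketch): the N x N RBF similarity (Gram) matrix,
# K[i, j] = exp(-gamma * ||x_i - x_j||^2), one basis function centered on each point
from sklearn.metrics.pairwise import rbf_kernel

K = rbf_kernel(X, X, gamma=1.0)  # gamma chosen arbitrarily for the illustration
print(K.shape)                   # (100, 100)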

 

3. Model construction: use the RBF kernel

# Use the RBF (radial basis function) kernel
clf = SVC(kernel='rbf', C=1E6)
clf.fit(X, y)

4. Draw

plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
plot_svc_decision_function(clf)
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
            s=300, lw=1, facecolors='none');

 

Using this kernel support vector machine, we learn a suitable nonlinear decision boundary. This kernel transformation strategy is often used in machine learning!

 

4. Adjusting the SVM soft margin

SVM implements a softening parameter that "softens" the margin: it allows some points to creep into the margin when that gives a better fit.

The hardness of the margin is controlled by a tuning parameter, usually called C.

For very large C, the margin is hard and points cannot lie inside it.

For smaller C, the margin is softer and can grow to include some points.

Adjusting the C parameter:

  • As C approaches infinity: the classification must strictly avoid any margin violations
  • As C becomes small: a greater tolerance for errors is allowed

 

The optimal value of the parameter C depends on your dataset and should be tuned using cross-validation or a similar procedure (a minimal tuning sketch follows below).
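
The original post does not show the tuning step itself; a minimal sketch (my own, assuming GridSearchCV and a small candidate grid) might look like this:

# Minimal tuning sketch (my addition): choose C by cross-validation
from sklearn.datasets import make_blobs
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X_cv, y_cv = make_blobs(n_samples=100, centers=2, random_state=0, cluster_std=0.8)
param_grid = {'C': [0.1, 1, 10, 100]}
grid = GridSearchCV(SVC(kernel='linear'), param_grid, cv=5)
grid.fit(X_cv, y_cv)
print(grid.best_params_, grid.best_score_)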

 

1. Data

X, y = make_blobs(n_samples=100, centers=2,
                  random_state=0, cluster_std=0.8)
plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn');

2. C = 10 vs. C = 0.1

fig, ax = plt.subplots(1, 2, figsize=(16, 6))
fig.subplots_adjust(left=0.0625, right=0.95, wspace=0.1)

for axi, C in zip(ax, [10.0, 0.1]):
    model = SVC(kernel='linear', C=C).fit(X, y)
    axi.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
    plot_svc_decision_function(model, axi)
    axi.scatter(model.support_vectors_[:, 0],
                model.support_vectors_[:, 1],
                s=300, lw=1, facecolors='none');
    axi.set_title('C = {0:.1f}'.format(C), size=14)

 

3. gamma = 10 vs. gamma = 0.1

For the RBF kernel, gamma controls how far the influence of a single training point reaches: a large gamma gives a tight, wiggly boundary, while a small gamma gives a smoother one.

X, y = make_blobs(n_samples=100, centers=2,
                  random_state=0, cluster_std=1.1)

fig, ax = plt.subplots(1, 2, figsize=(16, 6))
fig.subplots_adjust(left=0.0625, right=0.95, wspace=0.1)

for axi, gamma in zip(ax, [10.0, 0.1]):
    model = SVC(kernel='rbf', gamma=gamma).fit(X, y)
    axi.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='autumn')
    plot_svc_decision_function(model, axi)
    axi.scatter(model.support_vectors_[:, 0],
                model.support_vectors_[:, 1],
                s=300, lw=1, facecolors='none');
    axi.set_title('gamma = {0:.1f}'.format(gamma), size=14)

 

 

5. SVM for face recognition

We will use the Labeled Faces in the Wild dataset, which contains thousands of collated photos of various public figures. A fetcher for this dataset is built into Scikit-Learn.
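
As a starting point for that section, loading the dataset with the built-in fetcher might look like this (a minimal sketch; the first call downloads the data):

# Minimal loading sketch: fetch_lfw_people is scikit-learn's built-in downloader for this dataset
from sklearn.datasets import fetch_lfw_people

faces = fetch_lfw_people(min_faces_per_person=60)
print(faces.target_names)
print(faces.images.shape)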

 
