1. Learning objectives

2. Key concepts

3. Extension exercises

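Learning objectives

Understand classification tasks and common classifiers
Understand the theory behind support vector machines
Learn how to use support vector machines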

Key concepts

Classification tasks and common classifiers

Classification: see Wikipedia

Classifiers: Generative Model, Discriminative Model
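To make the two families concrete, here is a minimal sketch that fits one classifier of each kind on the Iris data. The choice of GaussianNB as the generative example and LogisticRegression as the discriminative example is an assumption made for illustration; the lesson itself does not name specific models.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

# Generative: models p(x | y) and p(y), then applies Bayes' rule to classify.
generative = GaussianNB().fit(X, y)

# Discriminative: models p(y | x) directly, without describing how x is generated.
discriminative = LogisticRegression(max_iter=1000).fit(X, y)

for name, model in [("GaussianNB", generative),
                    ("LogisticRegression", discriminative)]:
    # score() returns the mean accuracy on the data passed in (training data here).
    print(name, "training accuracy:", model.score(X, y))
```

A support vector machine belongs to the discriminative family: it learns a decision boundary and does not model how each class generates its features.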

Theory of support vector machines

Support Vector Machine (SVM)

Margin: the distance from the separating line to the closest points of the two classes

Support vectors: the points that lie closest to the line
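The two definitions above can be checked on a small example. This is a sketch on made-up toy points (not data from the lesson); the very large C is an assumption used to approximate a hard margin. For a linear SVM with weight vector w, the distance from the separating line to the closest points is 1 / ||w||.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable toy data: three points per class.
X = np.array([[0, 0], [1, 0], [0, 1], [3, 3], [4, 3], [3, 4]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for violations (approximately hard-margin).
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]                      # normal vector of the separating line
margin = 1.0 / np.linalg.norm(w)      # distance from the line to the closest points

print("support vectors:\n", clf.support_vectors_)
print("margin:", margin)
```

In this toy set the closest points lie on the lines x + y = 1 and x + y = 6, so the computed margin should come out close to 5 / (2 * sqrt(2)), roughly 1.77.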

Kernel: linear (hard-margin, soft-margin), RBF (Radial Basis Function), ...

penalty
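As a rough illustration of how the kernel and the penalty interact, the sketch below fits several models on the same two-class, two-feature Iris subset used later in this lesson and counts the support vectors. It assumes that "penalty" refers to the C parameter of sklearn.svm.SVC; the C values are arbitrary choices for the demonstration.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris = load_iris()
mask = iris.target != 0          # keep classes 1 and 2 only
X = iris.data[mask, :2]          # first two features
y = iris.target[mask]

results = {}
for kernel in ["linear", "rbf"]:
    for C in [0.01, 1.0, 100.0]:  # arbitrary penalty values for illustration
        clf = SVC(kernel=kernel, C=C).fit(X, y)
        # n_support_ holds the number of support vectors per class.
        results[(kernel, C)] = int(clf.n_support_.sum())
        print(f"kernel={kernel:6s} C={C:<6} "
              f"support vectors={results[(kernel, C)]} "
              f"training accuracy={clf.score(X, y):.2f}")
```

A small C tolerates more margin violations (a softer margin), while a large C penalizes them heavily and moves the model toward hard-margin behavior.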

http://www.cs.tufts.edu/~roni/Teaching/CLT2008S/LN/lecture21-22.pdf

https://blog.csdn.net/liugan528/article/details/79448379

Using support vector machines

sklearn.svm.SVC
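A minimal sketch of the usual fit / predict / score workflow with sklearn.svm.SVC follows. The synthetic data from make_blobs and the parameter values are assumptions chosen for illustration, so that the in-class exercises below remain unsolved.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical two-class data set, unrelated to the exercises.
X, y = make_blobs(n_samples=200, centers=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")  # these are also the defaults
clf.fit(X_train, y_train)                      # learn the decision boundary

y_pred = clf.predict(X_test)                   # predicted labels for unseen data
print("predictions:", y_pred)
print("test accuracy:", clf.score(X_test, y_test))
```

Changing kernel="rbf" to kernel="linear" is the only edit needed to switch kernels; the rest of the workflow stays the same.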

In-class exercises [This lesson has only in-class exercises, so please complete all of them.]

In [1]:

import numpy as np
from sklearn import datasets

# Load the iris dataset and keep only the first two features
iris = datasets.load_iris()
X = iris.data[:, :2]
y = iris.target
# Drop the samples labelled 0 so that only classes 1 and 2 remain
X = X[y != 0]
y = y[y != 0]
n_sample = len(X)
np.random.seed(0)
order = np.random.permutation(n_sample)
X = X[order]
y = y[order]

# Split the data into training/testing sets
X_train, X_test = X[:int(.9 * n_sample)], X[int(.9 * n_sample):]
y_train, y_test = y[:int(.9 * n_sample)], y[int(.9 * n_sample):]

kernels = ['linear', 'rbf']
print('y_test {}'.format(y_test))

y_test [2 1 1 2 1 2 2 2 1 1]

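Exercise 1

Using the first two features and the last two classes of the Iris dataset (loaded above), train support vector machines with the linear and rbf kernels on 90% of the data, predict the remaining 10%, and print the predictions.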

In [9]:

# your code here

linear: [2 2 1 1 1 1 2 1 1 1]
rbf: [2 2 1 1 1 1 2 1 1 1]

Exercise 2

Compute and print the accuracy and the confusion matrix of each model's predictions from the previous exercise against y_test.

In [10]:

# your code here

for linear kernel, the accuracy is 0.6
[[4 1]
 [3 2]]


for rbf kernel, the accuracy is 0.6
[[4 1]
 [3 2]]
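To see how these numbers are read, the sketch below recomputes them directly from the y_test and prediction arrays printed earlier in this lesson (both kernels printed the same predictions). It only illustrates the metric functions and assumes the arrays are typed in by hand rather than produced by a trained model.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Arrays copied from the printed outputs above.
y_test = np.array([2, 1, 1, 2, 1, 2, 2, 2, 1, 1])
y_pred = np.array([2, 2, 1, 1, 1, 1, 2, 1, 1, 1])

acc = accuracy_score(y_test, y_pred)    # fraction of matching labels
cm = confusion_matrix(y_test, y_pred)   # rows: true class, columns: predicted class

print("accuracy:", acc)
print(cm)
```

Six of the ten predictions match, which gives 0.6. In the confusion matrix, the first row says that four of the five class-1 samples were predicted correctly, and the second row says that three of the five class-2 samples were misclassified as class 1.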

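Exercise 3

Following the plotting style used in the video, plot the points of the dataset on a two-dimensional plane, mark the test-set points with black circles, and draw the decision boundaries of the two models from the previous exercise.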

In [6]:

# your code here

In [5]:

# your code here

Lesson 7 outline_科小神成长计划