[Machine Learning] Fisher Linear Discriminant and the Linear Perceptron


This article describes the ideas behind two classical classification models. Fisher linear discriminant: project the data from the original space onto one dimension so that each class is compact within itself and the classes are separated from each other (distances are measured in one dimension). Linear perceptron: look directly for a hyperplane that separates the data in the original space (distances are measured from points to the hyperplane). The perceptron idea in particular underlies both the support vector machine and neural network models mentioned at the end of the article.

Author | Wenjie

Edit | yuquanle

Fisher Linear Discriminant and the Linear Perceptron

Fisher linear discriminant and the linear perceptron are both used for classification tasks, especially binary classification. What they have in common is that both are linear classifiers; they differ in the idea behind building the classifier, but the goal is the same. Both can also be compared with logistic regression, although the theoretical basis of logistic regression is, of course, probabilistic.

A. Fisher Linear Discriminant


Fisher linear discriminant is a linear classification method. Its core idea is to find a projection direction that maps d-dimensional data onto one dimension (dimensionality reduction) such that each class is compact within itself and the classes are separated from each other. After the projection direction is determined, the classification decision is still not complete: we also need a cut-off point to divide the classes. In practice, Fisher linear discriminant is rarely used as a classification model by itself; its idea is more often used as a reference to guide dimensionality reduction.

For convenience of analysis, consider the two-class problem:

Since the projection direction is determined by the criterion of within-class compactness and between-class separation, we first define the class centers before and after projection:

$$\mu_k = \frac{1}{N_k}\sum_{x_i \in C_k} x_i, \qquad \tilde{\mu}_k = w^T \mu_k, \qquad k = 1, 2$$

where $\tilde{\mu}_k$ denotes the center of class $k$ after reduction to one dimension and $\mu_k$ denotes its center in the original space.

Within-class distance:

$$\tilde{s}_1^2 + \tilde{s}_2^2 = w^T S_w w, \qquad \tilde{s}_k^2 = \sum_{x_i \in C_k} (w^T x_i - \tilde{\mu}_k)^2$$

where:

$$S_w = S_1 + S_2, \qquad S_k = \sum_{x_i \in C_k} (x_i - \mu_k)(x_i - \mu_k)^T$$

Here $S_k$ denotes the within-class scatter of class $k$ in the original space.

Between-class distance:

$$(\tilde{\mu}_1 - \tilde{\mu}_2)^2 = w^T S_b w$$

where:

$$S_b = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T$$

Here $S_b$ represents the distance between the class centers in the original sample space.
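
As a concrete illustration, here is a minimal NumPy sketch of the two scatter matrices; the function name and the convention of passing each class as a separate (n_k, d) array are assumptions made for this example:

```python
import numpy as np

def scatter_matrices(X1, X2):
    """Within-class scatter S_w and between-class scatter S_b
    for two classes given as (n_k, d) arrays."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    # S_k = sum over class k of (x_i - mu_k)(x_i - mu_k)^T
    S1 = (X1 - mu1).T @ (X1 - mu1)
    S2 = (X2 - mu2).T @ (X2 - mu2)
    S_w = S1 + S2
    # S_b = (mu1 - mu2)(mu1 - mu2)^T, a rank-one matrix
    diff = (mu1 - mu2).reshape(-1, 1)
    S_b = diff @ diff.T
    return S_w, S_b
```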

To make each class compact within itself and the classes separated from each other, we maximize the following objective function:

$$\max_w \; J(w) = \frac{w^T S_b w}{w^T S_w w}$$

Since the objective is invariant to the scale of $w$, this is equivalent to:

$$\max_w \; w^T S_b w \quad \text{s.t.} \quad w^T S_w w = 1$$

Using Lagrange multipliers, we obtain:

$$L(w, \lambda) = w^T S_b w - \lambda (w^T S_w w - 1)$$

The objective is a quadratic (convex) program in $w$, so the extremum is found by setting the derivative to zero:

$$\frac{\partial L}{\partial w} = 2 S_b w - 2 \lambda S_w w = 0 \;\Longrightarrow\; S_b w = \lambda S_w w$$

If $S_w$ is invertible, then:

$$S_w^{-1} S_b w = \lambda w$$

That is, $w$ is an eigenvector of $S_w^{-1} S_b$. Substituting $S_b = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T$ gives:

$$S_w^{-1} (\mu_1 - \mu_2)(\mu_1 - \mu_2)^T w = \lambda w$$

Since $(\mu_1 - \mu_2)^T w$ is a scalar, the direction of $w$ satisfies:

$$w \propto S_w^{-1} (\mu_1 - \mu_2)$$

Although we have determined the projection direction, the actual decision function is still not fixed. The simplest approach is to pick a threshold value directly on the one-dimensional projection to separate the classes, but determining that threshold requires defining a classification loss function (a class-consistency criterion).
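
Putting the derivation together, the following is a minimal sketch of the closed-form Fisher direction with a simple midpoint cut-off between the projected class centers; the midpoint rule is only one heuristic choice of threshold, not prescribed by the derivation above:

```python
import numpy as np

def fisher_fit(X1, X2):
    """Fisher direction w proportional to S_w^{-1}(mu1 - mu2), plus a midpoint threshold."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    S_w = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)
    # Solve S_w w = mu1 - mu2 rather than forming the inverse explicitly.
    w = np.linalg.solve(S_w, mu1 - mu2)
    # Heuristic cut-off: midpoint of the two projected class centers.
    threshold = 0.5 * (w @ mu1 + w @ mu2)
    return w, threshold

def fisher_predict(X, w, threshold):
    # Class 1 projects above the threshold by construction, since
    # w^T (mu1 - mu2) = (mu1 - mu2)^T S_w^{-1} (mu1 - mu2) > 0.
    return np.where(X @ w > threshold, 1, 2)
```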

For example, if we directly use the 0-1 loss, the decision boundary correctly classifies as many samples as possible; if instead we use the log loss of logistic regression, the decision boundary will be different. From this point of view, both linear discriminant analysis and logistic regression map the data onto one dimension for classification. Is there a way to classify directly, without dimensionality reduction? That is exactly the classification idea of the perceptron, described below.
 

B. Linear Perceptron
 

The linear perceptron is likewise based on the idea of linear classification. Its core is to find, directly in the high-dimensional space, a hyperplane that separates the two classes as well as possible. That is, one defines the distance from a point to the hyperplane and, assuming linear separability, minimizes the point-to-hyperplane distances (equivalently, makes the hardest-to-separate samples lie as far from the hyperplane as possible). Since linear separability cannot be guaranteed in general, however, the perceptron's objective function minimizes the sum of distances from misclassified samples to the hyperplane (a linear loss), while correctly classified samples incur no loss.
 

First define the linear hyperplane:

$$w^T x + b = 0$$

The distance from a point to the hyperplane is:

$$d = \frac{|w^T x + b|}{\|w\|}$$

That is, for a misclassified sample the signed distance to the hyperplane must be negative:

$$y_i (w^T x_i + b) < 0$$


The perceptron considers only the misclassified samples; the objective function minimizes the sum of distances from misclassified samples to the hyperplane (the constant factor $1/\|w\|$ is dropped since it does not affect which samples are misclassified):

$$\min_{w,b} \; L(w, b) = -\sum_{x_i \in M} y_i (w^T x_i + b)$$

where M is the set of misclassified samples. Because the perceptron optimizes by continually adjusting the parameters against the set of misclassified samples, the gradient computation in gradient descent depends only on the misclassified samples:

$$\nabla_w L = -\sum_{x_i \in M} y_i x_i, \qquad \nabla_b L = -\sum_{x_i \in M} y_i$$
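
Below is a minimal sketch of the primal perceptron trained by stochastic gradient descent, where each update uses a single misclassified sample; the learning rate eta and the epoch cap are illustrative assumptions:

```python
import numpy as np

def perceptron_fit(X, y, eta=1.0, max_epochs=100):
    """Primal perceptron. X: (n, d) array; y: labels in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            # Sample i is misclassified when y_i (w^T x_i + b) <= 0.
            if y[i] * (X[i] @ w + b) <= 0:
                # Step along the negative gradient for this single sample:
                # dL/dw = -y_i x_i and dL/db = -y_i.
                w += eta * y[i] * X[i]
                b += eta * y[i]
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly
            break
    return w, b
```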

Once the hyperplane is found, we can define the decision function based on it as:

$$f(x) = \mathrm{sign}(w^T x + b)$$

Looking back at the loss function, we see that the perceptron's loss differs from the log loss of logistic regression: the perceptron's loss is taken directly over the misclassified samples' values, while logistic regression's loss is the log loss over all samples (logistic regression's loss, of course, has a probabilistic basis).
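
To make the contrast concrete, here is a small sketch of both losses side by side; the labels are assumed to be in {-1, +1} and the function names are illustrative:

```python
import numpy as np

def perceptron_loss(w, b, X, y):
    # Sum of -y_i (w^T x_i + b) over misclassified samples only.
    margins = y * (X @ w + b)
    return -margins[margins <= 0].sum()

def logistic_log_loss(w, b, X, y):
    # Log loss over ALL samples: log(1 + exp(-y_i (w^T x_i + b))).
    margins = y * (X @ w + b)
    return np.logaddexp(0.0, -margins).sum()  # numerically stable form
```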
 

The dual form of the perceptron:

The dual form of the perceptron is consistent with that of logistic regression, and its derivation parallels the derivation of the SVM dual form.
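
As a sketch of what the dual form looks like in practice, the version below expresses w as a weighted sum of training samples, so the data enter only through inner products (the Gram matrix); replacing that matrix with a kernel matrix is exactly what the SVM generalizes. Details such as the learning rate are illustrative assumptions:

```python
import numpy as np

def dual_perceptron_fit(X, y, eta=1.0, max_epochs=100):
    """Dual perceptron: w = sum_j alpha_j y_j x_j; updates touch alpha only."""
    n = X.shape[0]
    alpha, b = np.zeros(n), 0.0
    G = X @ X.T  # Gram matrix; swap in a kernel matrix to kernelize
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            f_i = (alpha * y) @ G[:, i] + b
            if y[i] * f_i <= 0:
                alpha[i] += eta  # equivalent to w += eta * y_i * x_i in the primal
                b += eta * y[i]
                mistakes += 1
        if mistakes == 0:
            break
    return alpha, b
```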

One can say that both neural networks and SVMs are extensions of the linear perceptron: neural networks introduce nonlinear activation functions, while SVMs use kernel functions and additionally propose soft-margin classification. Although neural networks grew out of the perceptron, once nonlinear activations were introduced they lost interpretability and drifted ever further from the perceptron. In the author's view, the SVM is more like the perceptron: it improves accuracy while retaining good interpretability.

The End

 
