Machine Learning (Andrew Ng) Study Notes (Chapters 13~15)

Support Vector Machines

The SVM Cost Function

First, recall the cost function of logistic regression without regularization:

\[J(\theta)=\frac 1 m \sum_{i=1}^m[-y^{(i)}log(h_\theta(X^{(i)}))-(1-y^{(i)})log(1-h_\theta(X^{(i)}))]\]

\[J(\theta)=\frac 1 m \sum_{i=1}^m[-y^{(i)}log(Sigmoid(\theta^TX^{(i)}))-(1-y^{(i)})log(1-Sigmoid(\theta^TX^{(i)}))]\]
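As a quick sanity check, this cost can be evaluated directly. A minimal NumPy sketch, assuming a design matrix `X` of shape (m, n+1) with a leading column of ones, labels `y` in {0, 1}, and a parameter vector `theta` (all names here are illustrative, not from the original notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y):
    """Unregularized logistic regression cost J(theta)."""
    h = sigmoid(X @ theta)                      # h_theta(x^(i)) for every example
    # average of -y*log(h) - (1-y)*log(1-h) over the m training examples
    return np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))
```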

Let \(z=\theta^TX^{(i)}\). The lecture plots the two curves \(-log(Sigmoid(z))\) and \(-log(1-Sigmoid(z))\) against \(z\): the first decreases monotonically toward 0 as \(z\) grows, while the second increases monotonically from 0 as \(z\) grows.

For the SVM, these two curves are replaced by piecewise-linear approximations: \(cost_1(\theta^TX^{(i)})\approx-log(Sigmoid(\theta^TX^{(i)}))\), which is zero once \(\theta^TX^{(i)}\ge 1\), and \(cost_0(\theta^TX^{(i)})\approx-log(1-Sigmoid(\theta^TX^{(i)}))\), which is zero once \(\theta^TX^{(i)}\le -1\). They represent the loss when \(y^{(i)}=1\) and \(y^{(i)}=0\) respectively.
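The lecture only draws these piecewise-linear costs rather than giving them in closed form; a minimal sketch of one common way to write them (the unit slope and the ±1 breakpoints are assumptions), compared side by side with the original log loss:

```python
import numpy as np

def cost1(z):
    """Piecewise-linear stand-in for -log(Sigmoid(z)): zero for z >= 1, linear below."""
    return np.maximum(0.0, 1.0 - z)

def cost0(z):
    """Piecewise-linear stand-in for -log(1-Sigmoid(z)): zero for z <= -1, linear above."""
    return np.maximum(0.0, 1.0 + z)

z = np.linspace(-3.0, 3.0, 7)
log_loss_1 = -np.log(1.0 / (1.0 + np.exp(-z)))  # the original -log(Sigmoid(z)) curve
print(np.c_[z, cost1(z), log_loss_1])           # hinge vs. log loss, column by column
```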

The optimization objective is then:

\[\arg \min _{\theta} J(\theta)=\frac 1 m \sum_{i=1}^m[y^{(i)}cost_1(\theta^TX^{(i)})+(1-y^{(i)})cost_0(\theta^TX^{(i)})]\]

Adding the regularization term:

\[\arg \min _{\theta} J(\theta)=\frac 1 m \sum_{i=1}^m[y^{(i)}cost_1(\theta^TX^{(i)})+(1-y^{(i)})cost_0(\theta^TX^{(i)})]+\frac \lambda {2m}\sum_{j=1}^n \theta_j^2\]

Here \(m\) is a constant; dividing the objective by a positive constant does not change the minimizing \(\theta\), so it can be dropped:

\[\arg \min _{\theta} J(\theta)=\sum_{i=1}^m[y^{(i)}cost_1(\theta^TX^{(i)})+(1-y^{(i)})cost_0(\theta^TX^{(i)})]+\frac \lambda {2}\sum_{j=1}^n \theta_j^2\]

Letting \(C=\frac 1 \lambda\) and multiplying the whole objective by \(C\) (which again leaves the minimizer unchanged), the objective can be rewritten as:

\[\arg \min _{\theta} J(\theta)=C\sum_{i=1}^m[y^{(i)}cost_1(\theta^TX^{(i)})+(1-y^{(i)})cost_0(\theta^TX^{(i)})]+\frac 1 {2}\sum_{j=1}^n \theta_j^2\]
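Putting the pieces together, this final objective can be evaluated directly. A minimal sketch, using the same hypothetical piecewise-linear `cost1`/`cost0` as above and assuming `X` has a leading column of ones so that `theta[0]` is the intercept (which, matching the \(\sum_{j=1}^n\) range above, is left out of the regularization term):

```python
import numpy as np

def svm_objective(theta, X, y, C):
    """C * sum of per-example hinge losses + (1/2) * sum_{j>=1} theta_j^2."""
    z = X @ theta
    # y in {0, 1}: cost1 applies where y = 1, cost0 where y = 0
    hinge = y * np.maximum(0.0, 1.0 - z) + (1 - y) * np.maximum(0.0, 1.0 + z)
    reg = 0.5 * np.sum(theta[1:] ** 2)          # intercept theta_0 is not regularized
    return C * np.sum(hinge) + reg
```

Since \(C=\frac 1 \lambda\), a large \(C\) corresponds to a small \(\lambda\): margin violations are penalized heavily and regularization is effectively weak, while a small \(C\) behaves like strong regularization.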


Reposted from www.cnblogs.com/qpswwww/p/9291626.html