Week 7 Study Notes
Main work this week
1. CS229
[Course link]
Lecture 19: Differential Dynamic Programming
Main topics
Debugging reinforcement learning algorithms. Suppose we want to build a reinforcement-learning controller for a helicopter:
1. Build a helicopter simulator.
2. Choose a reward function R.
3. Run an RL algorithm to obtain a policy. If the resulting policy performs poorly, what should we do?
Diagnostics:
If the learned policy does well in the simulator but poorly in reality, the simulator is at fault: improve the simulator.
If the value of the human pilot's policy is greater than that of the learned policy, the RL algorithm is at fault: improve the RL algorithm.
If the learned policy's value is greater than the human's but it still performs poorly in reality, the reward function is at fault: improve R.
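The three diagnostics can be summarized as a small decision procedure. This is only an illustrative sketch; all names here (`rl_diagnostic`, `V_pilot`, etc.) are hypothetical, and `V` denotes an estimated value (expected total reward) of a policy:

```python
def rl_diagnostic(sim_perf_ok, real_perf_ok, V_pilot, V_learned_real):
    """Suggest which component to fix, following the lecture's diagnostic.

    sim_perf_ok:    learned policy flies well in the simulator
    real_perf_ok:   learned policy flies well in reality
    V_pilot:        value of the human pilot's policy
    V_learned_real: value of the learned policy, measured in reality
    """
    if sim_perf_ok and not real_perf_ok:
        return "improve the simulator"          # sim-to-real gap
    if V_pilot > V_learned_real:
        return "improve the RL algorithm"       # RL failed to maximize reward
    if V_learned_real >= V_pilot and not real_perf_ok:
        return "improve the reward function R"  # R does not capture good flying
    return "policy looks fine"

print(rl_diagnostic(sim_perf_ok=True, real_perf_ok=False,
                    V_pilot=0.0, V_learned_real=0.0))
# → improve the simulator
```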
Linear-Quadratic Regulation (LQR) control
Differential dynamic programming (DDP)
Kalman filtering; Linear-Quadratic Gaussian (LQG) control
Lecture 20: Policy Search
Main topics
POMDPs
Policy search
Pegasus: fixing the pseudo-random sequences of a stochastic simulator
2. Problem Set #1
1. (a)
$$J(\theta)=\dfrac{1}{2}(X\theta-y)^T(X\theta-y)$$
$$\nabla=\dfrac{\partial J}{\partial\theta}=X^T(X\theta-y)$$
$$H=\dfrac{\partial^2 J}{\partial\theta^2}=X^TX$$
(b) For any initial value
$\theta_0$,
$$\theta_1=\theta_0-H^{-1}\nabla=\theta_0-(X^TX)^{-1}X^T(X\theta_0-y)=(X^TX)^{-1}X^Ty$$
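As a sanity check, a single Newton step from a random $\theta_0$ can be compared against the normal-equations solution. A minimal sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))          # synthetic design matrix
y = rng.normal(size=20)
theta0 = rng.normal(size=3)           # arbitrary starting point

grad = X.T @ (X @ theta0 - y)         # gradient of J at theta0
H = X.T @ X                           # Hessian (constant in theta)
theta1 = theta0 - np.linalg.solve(H, grad)       # one Newton step

theta_star = np.linalg.solve(X.T @ X, X.T @ y)   # normal-equations solution
print(np.allclose(theta1, theta_star))           # → True
```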
Newton's method therefore converges in a single iteration.
2. Already completed in an earlier experiment; the code is there.
3. (a)
$$J(\Theta)=\dfrac{1}{2}\left\|\Theta^TX-y\right\|^2$$
(b)
$$\Theta=(X^TX)^{-1}X^TY$$
(c) For any $j$,
$$\theta_j=(X^TX)^{-1}X^Ty_j$$
In matrix form,
$$\Theta=(X^TX)^{-1}X^TY$$
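This equivalence is easy to verify numerically. A sketch on synthetic data with two target columns:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))     # synthetic data: m=30 examples, n=4 features
Y = rng.normal(size=(30, 2))     # two target columns y_1, y_2

# joint solution: Theta = (X^T X)^{-1} X^T Y
Theta = np.linalg.solve(X.T @ X, X.T @ Y)

# separate regressions, one per target column
thetas = [np.linalg.solve(X.T @ X, X.T @ Y[:, j]) for j in range(Y.shape[1])]

print(np.allclose(Theta, np.column_stack(thetas)))  # → True
```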
That is, regressing each target separately and regressing them jointly give the same result.
4. (a)
$$\begin{aligned} \ell(\phi)&=\log\prod_{i=1}^{m}\bigl[(1-\phi_y)p(x^{(i)}|y^{(i)}=0)\bigr]^{1-y^{(i)}}\bigl[\phi_{y}p(x^{(i)}|y^{(i)}=1)\bigr]^{y^{(i)}} \\ &=\sum_{i=1}^{m}(1-y^{(i)})\Bigl(\log(1-\phi_y)+\sum_{j=1}^{n}\bigl(x_j^{(i)}\log\phi_{j|y=0}+(1-x_j^{(i)})\log(1-\phi_{j|y=0})\bigr)\Bigr)\\ &\quad+y^{(i)}\Bigl(\log\phi_y+\sum_{j=1}^{n}\bigl(x_j^{(i)}\log\phi_{j|y=1}+(1-x_j^{(i)})\log(1-\phi_{j|y=1})\bigr)\Bigr) \end{aligned}$$
(b)
$$\dfrac{\partial\ell(\phi)}{\partial\phi_{j|y=0}}=\sum_{i=1}^{m}\dfrac{(1-y^{(i)})x_j^{(i)}}{\phi_{j|y=0}}-\dfrac{(1-y^{(i)})(1-x_j^{(i)})}{1-\phi_{j|y=0}}$$
$$\dfrac{\partial\ell(\phi)}{\partial\phi_{j|y=1}}=\sum_{i=1}^{m}\dfrac{y^{(i)}x_j^{(i)}}{\phi_{j|y=1}}-\dfrac{y^{(i)}(1-x_j^{(i)})}{1-\phi_{j|y=1}}$$
$$\dfrac{\partial\ell(\phi)}{\partial\phi_y}=\sum_{i=1}^{m}\dfrac{y^{(i)}}{\phi_y}-\dfrac{1-y^{(i)}}{1-\phi_y}$$
Setting all of these partial derivatives to zero gives
$$\phi_{j|y=0}=\dfrac{\sum_{i=1}^{m}1\{y^{(i)}=0\}x^{(i)}_j}{\sum_{i=1}^{m}1\{y^{(i)}=0\}}$$
$$\phi_{j|y=1}=\dfrac{\sum_{i=1}^{m}1\{y^{(i)}=1\}x^{(i)}_j}{\sum_{i=1}^{m}1\{y^{(i)}=1\}}$$
$$\phi_y=\dfrac{\sum_{i=1}^{m}y^{(i)}}{m}$$
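These closed-form maximum-likelihood estimates are just per-class feature frequencies, which makes them one-liners in NumPy. A sketch on synthetic binary data (in practice one would usually also add Laplace smoothing, which this sketch omits):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 200, 5
X = rng.integers(0, 2, size=(m, n))    # synthetic binary feature matrix
y = rng.integers(0, 2, size=m)         # binary labels

phi_y   = y.mean()                     # sum_i y^{(i)} / m
phi_j_1 = X[y == 1].mean(axis=0)       # sum_i 1{y=1} x_j / sum_i 1{y=1}
phi_j_0 = X[y == 0].mean(axis=0)       # sum_i 1{y=0} x_j / sum_i 1{y=0}

print(phi_y, phi_j_1.shape, phi_j_0.shape)
```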
(c)
$$\begin{aligned} p(y=1|x)\geq p(y=0|x)&\Longleftrightarrow p(x|y=1)p(y=1)\geq p(x|y=0)p(y=0)\\ &\Longleftrightarrow p(x|y=1)\phi_y\geq p(x|y=0)(1-\phi_y)\\ &\Longleftrightarrow \prod_{j=1}^{n}p(x_j|y=1)\phi_y\geq\prod_{j=1}^{n}p(x_j|y=0)(1-\phi_y)\\ &\Longleftrightarrow \prod_{j=1}^{n}\phi_{j|y=1}^{x_j}(1-\phi_{j|y=1})^{1-x_j}\phi_y\geq\prod_{j=1}^{n}\phi_{j|y=0}^{x_j}(1-\phi_{j|y=0})^{1-x_j}(1-\phi_y)\\ &\Longleftrightarrow \sum_{j=1}^{n}\bigl(x_j\log\phi_{j|y=1}+(1-x_j)\log(1-\phi_{j|y=1})\bigr)+\log\dfrac{\phi_y}{1-\phi_y}\geq\sum_{j=1}^{n}\bigl(x_j\log\phi_{j|y=0}+(1-x_j)\log(1-\phi_{j|y=0})\bigr)\\ &\Longleftrightarrow \sum_{j=1}^{n}x_j\log\dfrac{\phi_{j|y=1}}{\phi_{j|y=0}}+(1-x_j)\log\dfrac{1-\phi_{j|y=1}}{1-\phi_{j|y=0}}+\log\dfrac{\phi_y}{1-\phi_y}\geq 0\\ &\Longleftrightarrow \sum_{j=1}^{n}x_j\log\dfrac{\phi_{j|y=1}(1-\phi_{j|y=0})}{\phi_{j|y=0}(1-\phi_{j|y=1})}+\sum_{j=1}^{n}\log\dfrac{1-\phi_{j|y=1}}{1-\phi_{j|y=0}}+\log\dfrac{\phi_y}{1-\phi_y}\geq 0 \end{aligned}$$
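The last equivalence can be verified numerically: the naive Bayes log-odds equal $w^Tx+c$ for the weights read off from the final line. A sketch with synthetic parameters:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
phi_y = 0.4                             # synthetic class prior
phi1 = rng.uniform(0.05, 0.95, n)       # phi_{j|y=1}
phi0 = rng.uniform(0.05, 0.95, n)       # phi_{j|y=0}
x = rng.integers(0, 2, n)               # a binary feature vector

# direct log-odds: log p(y=1|x) - log p(y=0|x)
log_odds = (np.log(phi_y) + (x * np.log(phi1) + (1 - x) * np.log(1 - phi1)).sum()
            - np.log(1 - phi_y) - (x * np.log(phi0) + (1 - x) * np.log(1 - phi0)).sum())

# linear form derived above: w^T x + c
w = np.log(phi1 * (1 - phi0) / (phi0 * (1 - phi1)))
c = np.log((1 - phi1) / (1 - phi0)).sum() + np.log(phi_y / (1 - phi_y))
print(np.allclose(log_odds, w @ x + c))  # → True
```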
The decision rule is therefore linear in $x$, so naive Bayes yields a linear classifier.
5. (a)
$$p(y;\phi)=(1-\phi)^{y-1}\phi=\exp\Bigl\{y\log(1-\phi)+\log\dfrac{\phi}{1-\phi}\Bigr\}$$
so
$$b(y)=1,\qquad \eta=\log(1-\phi),\qquad T(y)=y,\qquad a(\eta)=\eta-\log(1-e^{\eta})$$
(b)
$$g(\eta)=\dfrac{1}{1-e^{\eta}},\qquad h_{\theta}(x)=\dfrac{1}{1-e^{\theta^Tx}}$$
(c)
The canonical response function gives
$$E[y\mid x]=\dfrac{1}{1-e^{\theta^{T}x}}$$
and for a single example $(x,y)$ the likelihood is
$$l(\theta)=e^{\theta^Tx(y-1)}(1-e^{\theta^Tx})$$
$$\log l(\theta)=\theta^Tx(y-1)+\log(1-e^{\theta^Tx})$$
$$\dfrac{\partial\log l(\theta)}{\partial\theta}=\Bigl(y-\dfrac{1}{1-e^{\theta^Tx}}\Bigr)x$$
With $J=-\log l(\theta)$, the weight-update rule is
$$\theta_{k+1}=\theta_{k}-\alpha\dfrac{\partial J}{\partial\theta}$$
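Taking $J=-\log l(\theta)$, this update is just gradient ascent on the log-likelihood. A minimal sketch on synthetic geometric data (assumed setup: an intercept plus one feature, with $\theta^Tx<0$ so that $\phi=1-e^{\theta^Tx}\in(0,1)$):

```python
import numpy as np

rng = np.random.default_rng(4)
m = 500
X = np.column_stack([np.ones(m), rng.uniform(0.5, 1.5, m)])  # intercept + feature
theta_true = np.array([-1.0, -0.5])        # keeps theta^T x < 0, so phi in (0, 1)
phi = 1 - np.exp(X @ theta_true)
y = rng.geometric(phi).astype(float)       # y ~ Geometric(phi), support {1, 2, ...}

def loglik(theta):
    eta = X @ theta
    return np.sum(eta * (y - 1) + np.log(1 - np.exp(eta)))

theta0 = np.array([-0.5, -0.5])            # starting point, also with theta^T x < 0
theta = theta0.copy()
alpha = 1e-3
for _ in range(500):                       # gradient ascent on the log-likelihood
    h = 1 / (1 - np.exp(X @ theta))        # E[y | x] under the current theta
    theta += alpha * X.T @ (y - h) / m

print(loglik(theta) > loglik(theta0))      # → True
```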
3. Problem Set #2
1. (a)
$$J(\theta)=\dfrac{1}{2}(X\theta-y)^T(X\theta-y)+\dfrac{\lambda}{2}\theta^T\theta$$
$$\dfrac{\partial J}{\partial\theta}=X^T(X\theta-y)+\lambda\theta$$
$J$ is convex, so setting the gradient to zero gives
$$\theta=(X^TX+\lambda I)^{-1}X^Ty$$
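A quick numerical check that this $\theta$ is indeed the stationary point of the regularized objective, on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(50, 4))
y = rng.normal(size=50)
lam = 0.1

theta = np.linalg.solve(X.T @ X + lam * np.eye(4), X.T @ y)

# the gradient of the regularized objective vanishes at theta
grad = X.T @ (X @ theta - y) + lam * theta
print(np.allclose(grad, 0))   # → True
```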
(b)
$$\theta^Tx=y^TX(\lambda I+X^TX)^{-1}x=y^T(\lambda I+XX^T)^{-1}Xx$$
Substituting $\phi(x^{(i)})$ for each row of $X$, we never need to compute any $\phi(x^{(i)})$ explicitly; it suffices to compute the inner products between the $\phi(x^{(i)})$. Moreover,
$$(\lambda I+BA)^{-1}B=\dfrac{1}{\lambda}\Bigl(I-\dfrac{BA}{\lambda}+\dfrac{BABA}{\lambda^2}-\cdots\Bigr)B=\dfrac{B}{\lambda}\Bigl(I-\dfrac{AB}{\lambda}+\dfrac{ABAB}{\lambda^2}-\cdots\Bigr)=B(\lambda I+AB)^{-1}$$
(The Neumann-series expansion requires $\|BA\|<\lambda$, but the resulting identity extends to every $\lambda$ for which both inverses exist.)
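This push-through identity is easy to confirm numerically with random matrices:

```python
import numpy as np

rng = np.random.default_rng(6)
lam = 0.5
A = rng.normal(size=(3, 5))
B = rng.normal(size=(5, 3))

lhs = np.linalg.solve(lam * np.eye(5) + B @ A, B)   # (lam*I + BA)^{-1} B
rhs = B @ np.linalg.inv(lam * np.eye(3) + A @ B)    # B (lam*I + AB)^{-1}
print(np.allclose(lhs, rhs))   # → True
```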
2. (a) Suppose the solution of the optimization problem contained some $\xi_j<0$. Since
$$y^{(j)}(w^Tx^{(j)}+b)\geq 1-\xi_j\geq 1,$$
keeping all other variables fixed and setting $\xi_j=0$ leaves every constraint satisfied while strictly decreasing the objective, so such a point cannot be optimal. Hence $\xi_j\geq 0$ holds for every $j$ at the optimum, and adding explicit constraints $\xi_j\geq 0$ does not change the solution.
(b)
$$L(w,b,\xi,\alpha)=\dfrac{1}{2}\left\|w\right\|^2+\dfrac{C}{2}\sum_{i=1}^{m}\xi_i^{2}-\sum_{i=1}^{m}\alpha_i\bigl(y^{(i)}(w^Tx^{(i)}+b)-1+\xi_i\bigr),\qquad \alpha_i\geq 0,\; i=1,\dots,m$$
(c)
$$\nabla_wL=w-\sum_{i=1}^{m}\alpha_iy^{(i)}x^{(i)}=0,\qquad \nabla_bL=\sum_{i=1}^{m}\alpha_iy^{(i)}=0,\qquad \nabla_\xi L=C\xi-\alpha=0$$
(d) From (c),
$$w=\sum_{i=1}^{m}\alpha_iy^{(i)}x^{(i)},\qquad \sum_{i=1}^{m}\alpha_iy^{(i)}=0,\qquad C\xi_i=\alpha_i$$
$$\begin{aligned} L(w,b,\xi,\alpha)&=\dfrac{1}{2}\left\|w\right\|^2+\dfrac{C}{2}\sum_{i=1}^{m}\xi_i^{2}-\sum_{i=1}^{m}\alpha_i\bigl(y^{(i)}(w^Tx^{(i)}+b)-1+\xi_i\bigr)\\ &=\dfrac{1}{2}\Bigl(\sum_{i=1}^{m}\alpha_iy^{(i)}x^{(i)}\Bigr)^T\Bigl(\sum_{i=1}^{m}\alpha_iy^{(i)}x^{(i)}\Bigr)+\dfrac{1}{2C}\sum_{i=1}^{m}\alpha_i^2-\sum_{i=1}^{m}\alpha_iy^{(i)}\Bigl(\sum_{j=1}^{m}\alpha_jy^{(j)}x^{(j)}\Bigr)^Tx^{(i)}+\sum_{i=1}^{m}\alpha_i-\dfrac{1}{C}\sum_{i=1}^m\alpha_i^2\\ &=-\dfrac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i\alpha_jy^{(i)}y^{(j)}(x^{(i)})^Tx^{(j)}-\dfrac{1}{2C}\sum_{i=1}^{m}\alpha_{i}^2+\sum_{i=1}^{m}\alpha_i \end{aligned}$$
The dual of the primal problem is
$$\begin{aligned}\max_{\alpha}\quad&-\dfrac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i\alpha_jy^{(i)}y^{(j)}(x^{(i)})^Tx^{(j)}-\dfrac{1}{2C}\sum_{i=1}^{m}\alpha_{i}^2+\sum_{i=1}^{m}\alpha_i\\ \text{s.t.}\quad&\sum_{i=1}^{m}\alpha_iy^{(i)}=0\\ &\alpha_i\geq 0,\quad i=1,\dots,m\end{aligned}$$
3. (a)
Set $\alpha_i=1$ for all $i$ and $b=0$. Then for every $k$,
$$\left|f(x^{(k)})-y^{(k)}\right|=\Biggl|\sum_{i=1,i\neq k}^{m}y^{(i)}e^{-\frac{\left\|x^{(i)}-x^{(k)}\right\|}{\tau^2}}\Biggr|\leq\sum_{i=1,i\neq k}^{m}\Bigl|e^{-\frac{\left\|x^{(i)}-x^{(k)}\right\|}{\tau^2}}\Bigr|\leq(m-1)e^{-\frac{\epsilon}{\tau^2}}$$
so it suffices to take
$$\tau<\sqrt{\dfrac{\epsilon}{\log(m-1)}}$$
to obtain
$$\left|f(x^{(k)})-y^{(k)}\right|\leq(m-1)e^{-\frac{\epsilon}{\tau^2}}<1$$
(b) The problem statement appears to be in error; it presumably asks whether zero training error can be achieved without slack variables. Yes: part (a) already exhibits a solution attaining zero training error, and since the SVM objective is convex, the optimizer is guaranteed to find that solution.
(c) Not necessarily: by decreasing C we bias the optimization toward solutions with a larger margin, even at the cost of some training error.
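The argument in (a) can be checked numerically: with all $\alpha_i=1$, $b=0$, the kernel written as above, and $\tau$ below the threshold, every training point's prediction lies within distance 1 of its label. A sketch on synthetic data, with $\epsilon$ taken as the smallest pairwise distance:

```python
import numpy as np

rng = np.random.default_rng(7)
m, d = 40, 2
X = rng.normal(size=(m, d))                    # synthetic, pairwise-distinct points
y = rng.choice([-1.0, 1.0], size=m)

dist = np.sqrt(((X[:, None, :] - X[None, :, :])**2).sum(-1))  # pairwise distances
eps = dist[dist > 0].min()                     # smallest distance between two points
tau = 0.9 * np.sqrt(eps / np.log(m - 1))       # strictly below the threshold above

K = np.exp(-dist / tau**2)                     # kernel matrix, K[k, k] = 1
f = K @ y                                      # f(x^{(k)}) with all alpha_i = 1, b = 0
print(np.all(np.abs(f - y) < 1))               # → True
```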