Algorithm Enhancement: Boosting Tree Algorithm (Part 3)

Binary Classification

For binary classification, the original paper uses the log loss function:
$$L(y,F) = \log\left(1+\exp(-2yF)\right), \quad y \in \{-1,1\}$$
where
$$F(x) = \frac{1}{2}\log \left[\frac{\Pr(y=1|x)}{\Pr(y=-1|x)} \right]$$
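
The relationship between the loss and the half log-odds can be checked numerically. The sketch below is only an illustration: the names `log_loss` and `prob_positive` and the score `0.3` are assumptions made for this example, not taken from the paper. It shows that the loss equals the negative log-probability of the observed label when $\Pr(y=1|x)$ is recovered from $F$.

```python
import math

def log_loss(y, f):
    """L(y, F) = log(1 + exp(-2 y F)) for a label y in {-1, +1}."""
    return math.log1p(math.exp(-2.0 * y * f))

def prob_positive(f):
    """Pr(y = 1 | x) implied by F = 0.5 * log(p / (1 - p))."""
    return 1.0 / (1.0 + math.exp(-2.0 * f))

f = 0.3  # arbitrary illustrative score, not from the source
p = prob_positive(f)

# Each pair of printed values should agree up to rounding error.
print(log_loss(+1, f), -math.log(p))
print(log_loss(-1, f), -math.log(1.0 - p))
```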
Following the algorithm above step by step, first compute the negative gradient:
$$\tilde{y}_{i}=-\left[\frac{\partial L\left(y_{i}, F\left(x_{i}\right)\right)}{\partial F\left(x_{i}\right)}\right]_{F(x)=F_{m-1}(x)}=\frac{2 y_{i}}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)}$$
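
As a rough sanity check, the negative gradient can be compared with a finite-difference derivative of the loss. This is a minimal sketch; the function names, the sample labels, and the sample scores are all hypothetical values chosen for the example.

```python
import math

def log_loss(y, f):
    """L(y, F) = log(1 + exp(-2 y F))."""
    return math.log1p(math.exp(-2.0 * y * f))

def negative_gradient(y, f_prev):
    """Pseudo-residual 2y / (1 + exp(2 y F_{m-1}(x))) for y in {-1, +1}."""
    return 2.0 * y / (1.0 + math.exp(2.0 * y * f_prev))

def numeric_negative_gradient(y, f_prev, eps=1e-6):
    """Central finite difference of -dL/dF, used only for checking."""
    return -(log_loss(y, f_prev + eps) - log_loss(y, f_prev - eps)) / (2.0 * eps)

labels = [1, -1, 1, -1]        # hypothetical labels
scores = [0.0, 0.0, 1.5, 1.5]  # hypothetical values of F_{m-1}(x_i)

for y, f in zip(labels, scores):
    # The analytic and numeric values should agree closely.
    print(y, f, negative_gradient(y, f), numeric_negative_gradient(y, f))
```

Note that at $F_{m-1}(x_i) = 0$ the residual equals the label itself, and that a confidently correct prediction produces a residual close to zero.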

Next, estimate the value of each leaf node:
$$\gamma_{jm}=\operatorname{argmin}_{\gamma} \sum_{x_{i} \in R_{jm}} \log \left(1+\exp \left(-2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right)$$
The original paper approximates the solution directly with the Newton-Raphson method:
$$\gamma_{jm}=\frac{\sum_{x_{i} \in R_{jm}} \tilde{y}_{i}}{\sum_{x_{i} \in R_{jm}}\left|\tilde{y}_{i}\right|\left(2-\left|\tilde{y}_{i}\right|\right)}$$
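
The approximate leaf value is simple to compute once the residuals of the samples in a leaf are known. The following is a sketch only: `leaf_value` and the residual list are hypothetical, and it assumes the leaf is non-empty with a non-zero denominator.

```python
def leaf_value(residuals):
    """Approximate leaf value: sum(r) / sum(|r| * (2 - |r|)).

    Assumes a non-empty leaf whose denominator is not zero.
    """
    numerator = sum(residuals)
    denominator = sum(abs(r) * (2.0 - abs(r)) for r in residuals)
    return numerator / denominator

# Hypothetical residuals of the samples that fall into one leaf.
leaf_residuals = [1.0, 1.0, -1.0]
print(leaf_value(leaf_residuals))
```

Because each residual lies strictly between -2 and 2 in exact arithmetic, every term of the denominator is positive; a production implementation would still want a guard against floating-point underflow.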

How to Set the Initial Value

In the gradient boosting tree algorithm, the initial value is the constant that minimizes the total loss:
$$F_0(x) = \operatorname{argmin}_{F} \sum_{i=1}^N L(y_i,F)$$
Take the derivative of the total loss with respect to $F$ and set it to zero to find the extremum:
$$\begin{aligned} &\frac{\partial \sum_{i=1}^{N} L\left(y_{i}, F\right)}{\partial F}=0\\ &\sum_{i=1}^{N} \frac{\left(-2 y_{i}\right) e^{-2 y_{i} F}}{e^{-2 y_{i} F}+1}=0 \end{aligned}$$
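Because this is binary classification, $y_i$ takes only the values 1 and -1. Splitting the sum by class and multiplying through by -1 gives:
$$\sum_{i:y_i=1} \frac{2e^{-2F}}{e^{-2F}+1} + \sum_{i:y_i=-1} \frac{-2e^{2F}}{e^{2F}+1} = 0$$
Multiplying the numerator and denominator of the first term by $e^{2F}$ gives both terms the same denominator:
$$\sum_{i:y_i=1} \frac{2}{e^{2F}+1} + \sum_{i:y_i=-1} \frac{-2e^{2F}}{e^{2F}+1} = 0$$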
Let $m$ be the number of positive samples and $n$ the number of negative samples. Then:
$$m-ne^{2F} = 0$$
$$e^{2F} = \frac{m}{n} = \frac{1+\frac{m-n}{m+n}}{1-\frac{m-n}{m+n}} = \frac{1+\bar{y}}{1-\bar{y}}$$
Here $m+n$ is the total number of samples and $m-n$ is the sum of all $y_i$, so $\bar{y} = \frac{m-n}{m+n}$ is the mean label.
This finally gives
$$F_0(x) = \frac{1}{2}\log \frac{1+\bar{y}}{1-\bar{y}}$$
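
The closed-form initial value can be verified on a small label set. This sketch uses hypothetical names (`initial_score`, `gradient_sum`) and a made-up label list, and it assumes both classes are present, since the formula is undefined when the mean label is exactly 1 or -1.

```python
import math

def initial_score(labels):
    """F_0 = 0.5 * log((1 + y_bar) / (1 - y_bar)), with y_bar the mean label.

    Assumes labels are in {-1, +1} and that both classes occur.
    """
    y_bar = sum(labels) / len(labels)
    return 0.5 * math.log((1.0 + y_bar) / (1.0 - y_bar))

def gradient_sum(labels, f):
    """Sum of the negative gradients 2y / (1 + exp(2 y F)) at a constant F."""
    return sum(2.0 * y / (1.0 + math.exp(2.0 * y * f)) for y in labels)

labels = [1, 1, 1, -1]  # hypothetical data: m = 3 positives, n = 1 negative
f0 = initial_score(labels)

# f0 should equal 0.5 * log(m / n), and the gradient sum should vanish there.
print(f0, 0.5 * math.log(3.0 / 1.0))
print(gradient_sum(labels, f0))
```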

Solving with the Newton Approximation

How do we get from formula (1) to formula (2)?
$$\gamma_{jm}=\operatorname{argmin}_{\gamma} \sum_{x_{i} \in R_{jm}} \log \left(1+\exp \left(-2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right) \tag{1}$$
$$\gamma_{jm}=\frac{\sum_{x_{i} \in R_{jm}} \tilde{y}_{i}}{\sum_{x_{i} \in R_{jm}}\left|\tilde{y}_{i}\right|\left(2-\left|\tilde{y}_{i}\right|\right)} \tag{2}$$
Newton's method is an iterative method, and the paper performs only a single iteration. First define:
$$g(\gamma) = \sum_{x_i \in R_{jm}} \log \left(1+\exp\left(-2y_i\left(F_{m-1}(x_i)+\gamma\right)\right)\right)$$
Then apply Newton's method, starting the iteration from $\gamma_0 = 0$:
$$\gamma_{jm}=\gamma_{0}-\frac{g^{\prime}\left(\gamma_{0}\right)}{g^{\prime \prime}\left(\gamma_{0}\right)}=-\frac{g^{\prime}\left(\gamma_{0}\right)}{g^{\prime \prime}\left(\gamma_{0}\right)}$$
Next, take the first and second derivatives of $g$ with respect to $\gamma$:
$$g^{\prime}(\gamma)=\sum_{x_{i} \in R_{jm}} \frac{-2 y_{i}}{1+\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)}$$
$$g^{\prime \prime}(\gamma)=\sum_{x_{i} \in R_{jm}} \frac{4 y_{i}^{2} \exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)}{\left[1+\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right]^{2}}=\sum_{x_{i} \in R_{jm}} \frac{4 y_{i}^{2}\left(\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)+1\right)-4 y_{i}^{2}}{\left[1+\exp \left(2 y_{i}\left(F_{m-1}\left(x_{i}\right)+\gamma\right)\right)\right]^{2}}$$
Recall that
$$\tilde{y}_{i}=-\left[\frac{\partial L\left(y_{i}, F\left(x_{i}\right)\right)}{\partial F\left(x_{i}\right)}\right]_{F(x)=F_{m-1}(x)}=\frac{2 y_{i}}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)}$$
Evaluating both derivatives at the starting point $\gamma = \gamma_0 = 0$ therefore gives
$$g^{\prime}(0)=\sum_{x_{i} \in R_{jm}} \frac{-2 y_{i}}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)} = -\sum_{x_{i} \in R_{jm}} \tilde{y}_{i}$$
$$g^{\prime \prime}(0)=\sum_{x_{i} \in R_{jm}} \frac{4 y_{i}^{2} \left(\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)+1\right)-4y_i^2}{\left[1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)\right]^{2}}$$
$$=\sum_{x_{i} \in R_{jm}}\left[ \frac{2 \cdot 2y_i^2}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)} -\tilde{y}_i^{2}\right]$$
Since $y_i$ is either +1 or -1, $y_i^2 = |y_i| = 1$, so $\frac{2y_i^2}{1+\exp \left(2 y_{i} F_{m-1}\left(x_{i}\right)\right)} = |\tilde{y}_i|$ and $\tilde{y}_i^{2} = |\tilde{y}_i|^2$. Hence:
$$g^{\prime \prime}(0) = \sum_{x_{i} \in R_{jm}} |\tilde{y}_i|\left(2-|\tilde{y}_i|\right)$$
Substituting both results into the Newton step $-g^{\prime}(0)/g^{\prime \prime}(0)$ yields
$$\gamma_{jm}=\frac{\sum_{x_{i} \in R_{jm}} \tilde{y}_{i}}{\sum_{x_{i} \in R_{jm}}\left|\tilde{y}_{i}\right|\left(2-\left|\tilde{y}_{i}\right|\right)}$$
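
The derivation can be checked numerically by computing the Newton step from the two derivatives and comparing it with the closed form written in terms of the residuals. This is only a sketch under stated assumptions: the function names are hypothetical, and the labels and scores are made-up values for a single leaf.

```python
import math

def g_prime(gamma, labels, scores):
    """First derivative of the leaf loss g with respect to gamma."""
    return sum(-2.0 * y / (1.0 + math.exp(2.0 * y * (f + gamma)))
               for y, f in zip(labels, scores))

def g_double_prime(gamma, labels, scores):
    """Second derivative of the leaf loss g with respect to gamma."""
    total = 0.0
    for y, f in zip(labels, scores):
        e = math.exp(2.0 * y * (f + gamma))
        total += 4.0 * y * y * e / (1.0 + e) ** 2
    return total

def closed_form(labels, scores):
    """Leaf value expressed through the residuals: sum(r) / sum(|r| (2 - |r|))."""
    r = [2.0 * y / (1.0 + math.exp(2.0 * y * f)) for y, f in zip(labels, scores)]
    return sum(r) / sum(abs(v) * (2.0 - abs(v)) for v in r)

labels = [1, 1, -1, 1]          # hypothetical labels in one leaf
scores = [0.2, -0.4, 0.1, 0.7]  # hypothetical values of F_{m-1}(x_i)

# One Newton step from gamma_0 = 0.
newton = -g_prime(0.0, labels, scores) / g_double_prime(0.0, labels, scores)

# The two printed values should agree up to rounding error.
print(newton, closed_form(labels, scores))
```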

Binary Classification

Once $F(x)$ has been obtained, how is it used for classification? Recall that
$$F(x) = \frac{1}{2}\log \left(\frac{p}{1-p} \right)$$
where $p = \Pr(y=1|x)$. Rearranging slightly gives
$$e^{2F(x)} = \frac{p}{1-p}$$
and solving for $p$ gives
$$P_{+}(x) = p = \frac{e^{2F(x)}}{1+e^{2F(x)}} = \frac{1}{1+e^{-2F(x)}}$$
$$P_{-}(x) = 1-p = \frac{1}{1+e^{2F(x)}}$$
A sample is assigned to the positive class when $P_{+}(x) > P_{-}(x)$, that is, when $F(x) > 0$, and to the negative class otherwise. This completes the binary classification.
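
Turning a score into class probabilities and a label takes only a few lines. In this sketch the names `predict_proba` and `predict_label` are hypothetical, and sending a tie at $F(x) = 0$ to the negative class is an arbitrary assumption, since the source does not say how ties are handled.

```python
import math

def predict_proba(f):
    """Return (P_plus, P_minus) for a score F(x) = f."""
    p_pos = 1.0 / (1.0 + math.exp(-2.0 * f))
    return p_pos, 1.0 - p_pos

def predict_label(f):
    """Predict +1 when P_plus > P_minus, i.e. when F(x) > 0.

    A tie at F(x) = 0 goes to -1 here; that choice is an assumption.
    """
    return 1 if f > 0 else -1

for f in [-0.8, 0.0, 0.8]:  # hypothetical scores
    print(f, predict_proba(f), predict_label(f))
```

For scores of very large magnitude, `math.exp` can overflow, so a real implementation would clip the score or use a numerically stable sigmoid.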

Reposted from blog.csdn.net/qq_33357094/article/details/105060431