An Introduction to Linear Regression and Logistic Regression

Overview

Linear regression and logistic regression are two of the most fundamental models in machine learning. Linear regression is generally used for prediction problems, while logistic regression is generally used for classification problems; the two models are distinct but closely related.

The Linear Regression Model

Assume the training dataset is
$$T = \{(x_1,y_1),(x_2,y_2),\dots,(x_n,y_n)\}$$
and the function to be fitted is
$$f(x_i) = wx_i + b,\quad i=1,2,\dots,n$$
The method of least squares finds the line that minimizes the sum of squared vertical distances (residuals) from all sample points to the line, so the loss function is
$$J(w,b)=\sum_{i=1}^n(f(x_i)-y_i)^2=\sum_{i=1}^n(y_i-wx_i-b)^2$$
We seek the parameters that minimize the loss function:
$$(w^*,b^*)=\arg\min_{w,b} J(w,b)=\arg\min_{w,b}\sum_{i=1}^n(f(x_i)-y_i)^2=\arg\min_{w,b}\sum_{i=1}^n(y_i-wx_i-b)^2$$
Taking the partial derivatives:
$$\frac{\partial J(w,b)}{\partial w}=\frac{\partial}{\partial w}\sum_{i=1}^n\left(w^2x_i^2+(y_i-b)^2-2wx_i(y_i-b)\right)=2\sum_{i=1}^n\left(wx_i^2-x_i(y_i-b)\right)$$
$$\frac{\partial J(w,b)}{\partial b}=\frac{\partial}{\partial b}\sum_{i=1}^n\left(w^2x_i^2+y_i^2-2by_i+b^2-2wx_iy_i+2wx_ib\right)=2\sum_{i=1}^n(b+wx_i-y_i)=2nb-2\sum_{i=1}^n(y_i-wx_i)$$
Setting both partial derivatives to zero and solving for $w$ and $b$: from $\partial J/\partial b=0$ we get $b=\bar y-w\bar x$ (derived below), and substituting this into $\partial J/\partial w=0$ gives
$$w=\frac{\sum_{i=1}^n x_i(y_i-b)}{\sum_{i=1}^n x_i^2}=\frac{\sum_{i=1}^n x_i(y_i-\bar y+w\bar x)}{\sum_{i=1}^n x_i^2}=\frac{\sum_{i=1}^n x_i(y_i-\bar y)+w\sum_{i=1}^n x_i\bar x}{\sum_{i=1}^n x_i^2}$$

Since $\sum_{i=1}^n x_i\bar x=n\bar x^2$, collecting the $w$ terms on the left-hand side yields

$$w=\frac{\sum_{i=1}^n x_i(y_i-\bar y)}{\sum_{i=1}^n x_i^2-n\bar x^2}=\frac{\sum_{i=1}^n x_iy_i-n\bar x\bar y}{\sum_{i=1}^n x_i^2-n\bar x^2}$$

$$b=\frac{1}{n}\sum_{i=1}^n y_i-w\cdot\frac{1}{n}\sum_{i=1}^n x_i=\bar y-w\bar x$$
Substituting these values of $w$ and $b$ back into $f(x)=wx+b$ gives the fitted linear regression function.
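
As a check on the closed-form solution, here is a minimal NumPy sketch (the toy data, seed, and variable names are illustrative assumptions, not from the original post):

```python
import numpy as np

# Toy data (illustrative only): y is roughly 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=50)

n = len(x)
x_bar, y_bar = x.mean(), y.mean()

# Closed-form least-squares estimates derived above:
#   w = (sum x_i y_i - n*x_bar*y_bar) / (sum x_i^2 - n*x_bar^2)
#   b = y_bar - w * x_bar
w = (np.sum(x * y) - n * x_bar * y_bar) / (np.sum(x ** 2) - n * x_bar ** 2)
b = y_bar - w * x_bar

print(f"w = {w:.3f}, b = {b:.3f}")  # should come out close to 2 and 1
```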

The Logistic Regression Model

For a binary classification problem with output $y\in\{0,1\}$, a logistic regression model can be obtained by applying the Sigmoid activation function to the linear regression model. The Sigmoid function is:
$$y=\frac{1}{1+e^{-z}}$$
It maps $z$ to a value $y$ close to $0$ or $1$, with a steep transition near $z=0$. Adding the Sigmoid activation to the linear regression model gives
$$y=\frac{1}{1+e^{-(wx+b)}}$$
Rearranging, its log-odds (logit) form is
$$\ln\frac{y}{1-y}=wx+b$$
If $y$ is interpreted as the class posterior probability $p(y=1\mid x)$, the equation above can be written as
$$\ln\frac{p(y=1\mid x)}{p(y=0\mid x)}=wx+b$$
This yields the binomial logistic regression model:
$$\begin{cases}p(y=1\mid x)=\dfrac{\exp(wx+b)}{1+\exp(wx+b)}\\[2ex]p(y=0\mid x)=\dfrac{1}{1+\exp(wx+b)}\end{cases}\qquad x\in\Bbb R^n,\;y\in\{0,1\}$$
Logistic regression compares the two conditional probabilities and assigns the instance $x$ to the class with the larger one.
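
A short Python sketch of this model and decision rule (the function names and the single-feature form are my own illustrative choices):

```python
import numpy as np

def sigmoid(z):
    # Maps z to (0, 1); steepest around z = 0. Can overflow for very
    # negative z in float arithmetic, which is acceptable for a sketch.
    return 1.0 / (1.0 + np.exp(-z))

def binomial_model(x, w, b):
    # The two conditional probabilities of the binomial model above.
    p1 = sigmoid(w * x + b)   # p(y=1|x) = exp(wx+b) / (1 + exp(wx+b))
    return p1, 1.0 - p1       # p(y=0|x) = 1 / (1 + exp(wx+b))

def predict(x, w, b):
    # Assign x to the class with the larger conditional probability,
    # which is the same as thresholding p(y=1|x) at 0.5.
    p1, p0 = binomial_model(x, w, b)
    return 1 if p1 > p0 else 0
```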
Let:
$$P(y=1\mid x)=\pi(x),\qquad P(y=0\mid x)=1-\pi(x)$$
Then the likelihood function is:
$$\prod_{i=1}^n[\pi(x_i)]^{y_i}[1-\pi(x_i)]^{1-y_i}$$
and the log-likelihood function is:
$$L(w)=\sum_{i=1}^n\left[y_i\log\pi(x_i)+(1-y_i)\log(1-\pi(x_i))\right]$$

$$=\sum_{i=1}^n\left[y_i\log\frac{\pi(x_i)}{1-\pi(x_i)}+\log(1-\pi(x_i))\right]$$

$$=\sum_{i=1}^n\left[y_i(w\cdot x_i+b)-\log\left(1+\exp(w\cdot x_i+b)\right)\right]$$
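
The last line is the form that is convenient to compute directly; a minimal sketch using NumPy's `logaddexp` for numerical stability (names are illustrative assumptions):

```python
import numpy as np

def log_likelihood(w, b, x, y):
    # L(w) = sum_i [ y_i*(w*x_i + b) - log(1 + exp(w*x_i + b)) ]
    z = w * x + b
    # np.logaddexp(0, z) evaluates log(exp(0) + exp(z)) = log(1 + exp(z))
    # without overflowing for large z.
    return np.sum(y * z - np.logaddexp(0.0, z))
```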

L ( w ) L(w) 求极大值,得 w w 的估计值,这样问题就变成了以对数似然函数为目标函数的最优化问题,逻辑回归通常采用梯度下降或拟牛顿法
L ( W ) L(W) 为目标函数,最优化为:

$$\hat w=\arg\max_{w}L(w)$$

$$\frac{\partial L(w)}{\partial w}=\sum_{i=1}^n y_ix_i-\sum_{i=1}^n\frac{\exp(wx_i+b)}{1+\exp(wx_i+b)}x_i=\sum_{i=1}^n\left(y_i-\frac{1}{1+\exp(-(wx_i+b))}\right)x_i$$
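
The gradient with respect to $b$ has the same form with $x_i$ replaced by $1$. A minimal gradient-ascent sketch built on this formula (the learning rate, iteration count, and toy data are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(x, y, lr=0.5, n_iter=2000):
    # Gradient ascent on the log-likelihood L(w, b).
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(n_iter):
        p = sigmoid(w * x + b)              # current p(y=1|x_i)
        grad_w = np.sum((y - p) * x) / n    # dL/dw, averaged for stability
        grad_b = np.sum(y - p) / n          # dL/db (x_i replaced by 1)
        w += lr * grad_w
        b += lr * grad_b
    return w, b

# Toy 1-D data: class 1 tends to have larger x.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-1.0, 1.0, 100), rng.normal(2.0, 1.0, 100)])
y = np.concatenate([np.zeros(100), np.ones(100)])
w, b = fit_logistic(x, y)
print(f"w = {w:.3f}, b = {b:.3f}")  # decision boundary at x = -b/w
```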

Reposted from blog.csdn.net/fegang2002/article/details/84961068