Mathematical Derivation of the BP Algorithm

The BP (backpropagation) algorithm follows from the chain rule of differentiation. Here we walk through the backward derivation for a 2-layer neural network.

I. Structure of the 2-layer neural network (input layer not counted)

The network feeds an input x through one hidden layer (activation a_1) and one output layer (activation a_2), both using the sigmoid activation; the forward pass below spells this out.

II. Mathematical derivation of the BP algorithm

1. Forward computation

Input layer:  a_{0}=x

Layer 1:  z_{1}=W_{1}\, a_{0}

          a_{1}=\mathrm{sigmoid}(z_{1})

Layer 2:  z_{2}=W_{2}\, a_{1}

          a_{2}=\mathrm{sigmoid}(z_{2})

Cost function (m is the number of training examples):

J(W_{1},W_{2})=\frac{1}{2m}\sum_{i=1}^{m}\left(a_{2}^{(i)}-y^{(i)}\right)^{2}
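To make the forward pass concrete, here is a minimal NumPy sketch; the function names, the column-per-example data layout, and the omission of bias terms mirror the formulas above but are otherwise illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, W2):
    """Forward pass a0 -> z1 -> a1 -> z2 -> a2, returning all intermediates."""
    a0 = x               # input layer: a0 = x
    z1 = W1 @ a0         # layer 1: z1 = W1 a0
    a1 = sigmoid(z1)     # layer 1: a1 = sigmoid(z1)
    z2 = W2 @ a1         # layer 2: z2 = W2 a1
    a2 = sigmoid(z2)     # layer 2: a2 = sigmoid(z2)
    return a0, z1, a1, z2, a2

def cost(a2, y):
    """J(W1, W2) = (1 / 2m) * sum of squared errors over the m examples."""
    m = y.shape[1]       # assumption: each column of x and y is one example
    return np.sum((a2 - y) ** 2) / (2.0 * m)
```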

2. Backward derivation

Partial derivative with respect to W2:  \frac{\partial J(W_{1},W_{2})}{\partial W_{2}}=\frac{1}{m}\,(a_{2}-y)\cdot \mathrm{sigmoid}'(z_{2})\cdot a_{1}

Partial derivative with respect to W1 (applying the chain rule through one more layer):  \frac{\partial J(W_{1},W_{2})}{\partial W_{1}}=\frac{1}{m}\,(a_{2}-y)\cdot \mathrm{sigmoid}'(z_{2})\cdot W_{2}\cdot \mathrm{sigmoid}'(z_{1})\cdot a_{0}

Splitting this into intermediate steps:

For W2:    \delta_{2}=a_{2}-y

           \Delta_{2}=\delta_{2}\cdot \mathrm{sigmoid}'(z_{2})

           \frac{\partial J(W_{1},W_{2})}{\partial W_{2}}=\frac{1}{m}\,\Delta_{2}\cdot a_{1}

For W1:    \delta_{1}=\Delta_{2}\cdot W_{2}

           \Delta_{1}=\delta_{1}\cdot \mathrm{sigmoid}'(z_{1})

           \frac{\partial J(W_{1},W_{2})}{\partial W_{1}}=\frac{1}{m}\,\Delta_{1}\cdot a_{0}
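This decomposition maps line by line onto code. Continuing the sketch above, here is one way to write the backward pass; the transposes are an assumption needed to make the matrix shapes agree, since the derivation above is written in element-wise form:

```python
def sigmoid_prime(z):
    """sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)); see note (2) below."""
    s = sigmoid(z)
    return s * (1.0 - s)

def backward(x, y, W1, W2):
    """Gradients of J with respect to W1 and W2 via the delta/Delta steps."""
    a0, z1, a1, z2, a2 = forward(x, W1, W2)
    m = y.shape[1]
    delta2 = a2 - y                        # delta_2 = a2 - y
    Delta2 = delta2 * sigmoid_prime(z2)    # Delta_2 = delta_2 * sigmoid'(z2)
    dW2 = (Delta2 @ a1.T) / m              # dJ/dW2 = (1/m) Delta_2 a1^T
    delta1 = W2.T @ Delta2                 # delta_1 = Delta_2 * W2 (transposed)
    Delta1 = delta1 * sigmoid_prime(z1)    # Delta_1 = delta_1 * sigmoid'(z1)
    dW1 = (Delta1 @ a0.T) / m              # dJ/dW1 = (1/m) Delta_1 a0^T
    return dW1, dW2
```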

Gradient descent weight update:

W_{2}=W_{2}-\alpha\cdot \frac{\partial J(W_{1},W_{2})}{\partial W_{2}}

W_{1}=W_{1}-\alpha\cdot \frac{\partial J(W_{1},W_{2})}{\partial W_{1}}
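Put together, the update rule becomes a short training loop. This is a sketch: the learning rate, epoch count, layer sizes, and random initialization below are arbitrary illustrative choices:

```python
def train(x, y, W1, W2, alpha=0.5, epochs=1000):
    for _ in range(epochs):
        dW1, dW2 = backward(x, y, W1, W2)
        W2 = W2 - alpha * dW2    # W2 := W2 - alpha * dJ/dW2
        W1 = W1 - alpha * dW1    # W1 := W1 - alpha * dJ/dW1
    return W1, W2

# Example usage on random data (shapes: 3 inputs, 4 hidden units, 1 output).
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 5))           # 5 training examples as columns
y = rng.random((1, 5))                    # targets in (0, 1)
W1 = 0.1 * rng.standard_normal((4, 3))
W2 = 0.1 * rng.standard_normal((1, 4))
W1, W2 = train(x, y, W1, W2)
```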

Notes: (1) \alpha is the learning rate, which controls the step size of each update.

       (2) \mathrm{sigmoid}'(x)=\mathrm{sigmoid}(x)\left[1-\mathrm{sigmoid}(x)\right]
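One way to verify the whole derivation is a numerical gradient check: perturb one weight at a time and compare a central-difference estimate of dJ/dW1 against the analytic gradient from `backward`. This checker is an addition for illustration, not part of the original derivation:

```python
def grad_check(x, y, W1, W2, eps=1e-6):
    """Max absolute gap between analytic dW1 and a numerical estimate."""
    dW1, _ = backward(x, y, W1, W2)
    num = np.zeros_like(W1)
    for i in range(W1.shape[0]):
        for j in range(W1.shape[1]):
            Wp, Wm = W1.copy(), W1.copy()
            Wp[i, j] += eps                    # J(W1 + eps, W2)
            Wm[i, j] -= eps                    # J(W1 - eps, W2)
            Jp = cost(forward(x, Wp, W2)[-1], y)
            Jm = cost(forward(x, Wm, W2)[-1], y)
            num[i, j] = (Jp - Jm) / (2.0 * eps)
    return np.max(np.abs(dW1 - num))           # should be near round-off level
```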

        


Reposted from blog.csdn.net/cxzgood/article/details/120834778