An illustrated walk-through of the back-propagation algorithm

 

Recently I started looking into deep learning, beginning with Andrew Ng's UFLDL tutorial. I read the Chinese version first and found that some passages were not explained very clearly, so I turned to the English version and some other material, and discovered that the translator of the Chinese version had filled in a derivation the original omitted, but filled it in incorrectly, which explains the confusion. Back-propagation is really the foundation of neural networks, yet many people run into trouble when learning it, or see a page full of formulas and feel like giving up. It is not difficult: it is just the chain rule applied over and over again. If you do not feel like staring at formulas, you can plug in actual values and work through the computation by hand; once you understand the process, going back to the derivation becomes much easier.

  When it comes to neural networks, the figure below should look familiar:

  This is the basic structure of a typical three-layer neural network: Layer L1 is the input layer, Layer L2 is the hidden layer, and Layer L3 is the output layer. Suppose we have a set of input data {x1, x2, x3, ..., xn} and a set of output data {y1, y2, y3, ..., yn}, and we want the hidden layer to transform the inputs so that they are mapped to the desired outputs. If you want the output to be identical to the original input, you get the most common auto-encoder (Auto-Encoder) model. Some may ask: why would we want the input and output to be the same, and what is that good for? In fact it is widely used in image recognition, text classification and so on; I will write a separate article about Auto-Encoders and some of their variants. If the original input and output are different, it is an ordinary artificial neural network, which amounts to mapping the raw data through a transformation to the output data we want, and that is the topic of today.

  This article goes straight to a concrete example and demonstrates the back-propagation procedure with actual values plugged in; the formula derivation will wait until the Auto-Encoder article. It is really quite simple, and interested readers can try the derivation themselves :) (Note: this article assumes you already know the basic structure of a neural network; if you are not familiar with it, you can refer to the notes written by Poll: [Machine Learning & Algorithm] Neural network fundamentals.)

  Suppose we have the following network:

  The first layer is the input layer, containing two neurons i1, i2 and a bias term b1; the second layer is the hidden layer, containing two neurons h1, h2 and a bias term b2; the third layer is the output layer, containing o1 and o2. The wi labelled on each connection is the weight between the layers, and the activation function is taken to be the sigmoid function by default.

  Now we assign them initial values, as shown in the figure below:

  Input data:  i1 = 0.05, i2 = 0.10;

     Target output:  o1 = 0.01, o2 = 0.99;

     Initial weights:  w1 = 0.15, w2 = 0.20, w3 = 0.25, w4 = 0.30;

                       w5 = 0.40, w6 = 0.45, w7 = 0.50, w8 = 0.55

 

  Goal: given the input data i1, i2 (0.05 and 0.10), make the network's output as close as possible to the target output o1, o2 (0.01 and 0.99).

 

  Step 1: Forward propagation

  1. Input layer ----> hidden layer:

  Compute the weighted input sum of neuron h1:

  net_h1 = w1*i1 + w2*i2 + b1*1

The output of neuron h1 (the activation function used here is the sigmoid):

out_h1 = sigmoid(net_h1) = 1 / (1 + e^(-net_h1))

 

  Similarly, we can compute the output of neuron h2:

  net_h2 = w3*i1 + w4*i2 + b1*1,  out_h2 = sigmoid(net_h2)
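To make the arithmetic easy to check, here is a minimal Python sketch of this input-to-hidden step. The bias values appear only in the figure, which is not reproduced in this text, so b1 = 0.35 (and b2 = 0.60 used later) are assumptions; they are the values that reproduce the total error of 0.298371109 quoted further down.

```python
import math

def sigmoid(x):
    """Logistic activation function used throughout this example."""
    return 1.0 / (1.0 + math.exp(-x))

# Inputs and input->hidden weights from the example.
i1, i2 = 0.05, 0.10
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30
b1 = 0.35  # assumed: the bias value is only shown in the (missing) figure

# Weighted input sum and output of h1.
net_h1 = w1 * i1 + w2 * i2 + b1 * 1   # 0.3775
out_h1 = sigmoid(net_h1)              # ~0.593270

# Weighted input sum and output of h2.
net_h2 = w3 * i1 + w4 * i2 + b1 * 1   # 0.3925
out_h2 = sigmoid(net_h2)              # ~0.596884

print(out_h1, out_h2)
```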

  

 

  2. Hidden layer ----> output layer:

  Compute the values of the output-layer neurons o1 and o2:

  net_o1 = w5*out_h1 + w6*out_h2 + b2*1,  out_o1 = sigmoid(net_o1)

  net_o2 = w7*out_h1 + w8*out_h2 + b2*1,  out_o2 = sigmoid(net_o2)

This completes the forward pass. The output we obtain is [0.75136079, 0.772928465], which is still far from the target values [0.01, 0.99]. We now back-propagate the error, update the weights, and recompute the output.
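Continuing the same sketch for the hidden-to-output step, with the hidden outputs computed above and the assumed output-layer bias b2 = 0.60:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hidden-layer outputs from the previous sketch (under the assumed bias b1 = 0.35).
out_h1, out_h2 = 0.593269992, 0.596884378

# Hidden->output weights from the example, plus the assumed output bias.
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55
b2 = 0.60  # assumed, like b1

net_o1 = w5 * out_h1 + w6 * out_h2 + b2 * 1   # ~1.105906
out_o1 = sigmoid(net_o1)                      # ~0.751365

net_o2 = w7 * out_h1 + w8 * out_h2 + b2 * 1   # ~1.224921
out_o2 = sigmoid(net_o2)                      # ~0.772928

print(out_o1, out_o2)
```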

 

Step 2: Back propagation

1. Compute the total error

Total error (squared error):

E_total = Σ 1/2 * (target - output)^2

There are two outputs, however, so we compute the error of o1 and of o2 separately; the total error is their sum:

E_total = E_o1 + E_o2,  where  E_o1 = 1/2 * (target_o1 - out_o1)^2  and  E_o2 = 1/2 * (target_o2 - out_o2)^2
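As a quick check, the error computation in Python, using the forward-pass outputs obtained in the sketches above (these follow from the assumed biases):

```python
# Targets and forward-pass outputs from the sketches above (assumed biases b1 = 0.35, b2 = 0.60).
target_o1, target_o2 = 0.01, 0.99
out_o1, out_o2 = 0.751365070, 0.772928465

E_o1 = 0.5 * (target_o1 - out_o1) ** 2   # ~0.274811
E_o2 = 0.5 * (target_o2 - out_o2) ** 2   # ~0.023560
E_total = E_o1 + E_o2                    # ~0.298371
print(E_total)
```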

 

2. Hidden layer ----> output layer weight update:

Take the weight w5 as an example. If we want to know how much influence w5 has on the total error, we take the partial derivative of the total error with respect to w5 (chain rule):

∂E_total/∂w5 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 * ∂net_o1/∂w5

The figure below shows more intuitively how the error propagates backwards:

Now let us compute the value of each factor in turn:

Compute ∂E_total/∂out_o1:

∂E_total/∂out_o1 = ∂E_o1/∂out_o1 = -(target_o1 - out_o1)

Compute ∂out_o1/∂net_o1:

∂out_o1/∂net_o1 = out_o1 * (1 - out_o1)

(This step is simply differentiating the sigmoid function; it is easy enough to derive yourself.)

 

Compute ∂net_o1/∂w5:

∂net_o1/∂w5 = out_h1   (since net_o1 = w5*out_h1 + w6*out_h2 + b2*1)

Finally, multiply the three factors together:

∂E_total/∂w5 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 * ∂net_o1/∂w5

In this way we have computed the partial derivative of the total error E_total with respect to w5.

Looking back at the formula above, we notice:

∂E_total/∂w5 = -(target_o1 - out_o1) * out_o1 * (1 - out_o1) * out_h1

For convenience of notation, we use δ_o1 to denote the error of the output-layer unit o1:

δ_o1 = ∂E_total/∂net_o1 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 = -(target_o1 - out_o1) * out_o1 * (1 - out_o1)

The partial derivative of the total error E_total with respect to w5 can therefore be written as:

∂E_total/∂w5 = δ_o1 * out_h1

If the output-layer error is instead taken with a negative sign, i.e. δ_o1 = (target_o1 - out_o1) * out_o1 * (1 - out_o1), this can also be written as:

∂E_total/∂w5 = -δ_o1 * out_h1

Finally, we update the value of w5:

w5_new = w5 - η * ∂E_total/∂w5

(where η is the learning rate; here we take η = 0.5)

Similarly, we can update w6, w7 and w8.
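Here is the same chain of factors for w5 written out as a small Python sketch, together with the update step (learning rate 0.5); the numbers follow from the forward-pass values above, which in turn rest on the assumed biases:

```python
# Forward-pass values from the sketches above (assumed biases b1 = 0.35, b2 = 0.60).
out_h1 = 0.593269992
out_o1 = 0.751365070
target_o1 = 0.01
w5, eta = 0.40, 0.5  # eta is the learning rate

# Chain rule: dE_total/dw5 = dE_total/dout_o1 * dout_o1/dnet_o1 * dnet_o1/dw5
dE_dout_o1   = -(target_o1 - out_o1)      # ~0.741365
dout_dnet_o1 = out_o1 * (1.0 - out_o1)    # sigmoid derivative, ~0.186816
dnet_dw5     = out_h1                     # ~0.593270

delta_o1 = dE_dout_o1 * dout_dnet_o1      # output-layer error, ~0.138499
dE_dw5   = delta_o1 * dnet_dw5            # ~0.082167

w5_new = w5 - eta * dE_dw5                # ~0.358916
print(w5_new)
```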

3. Input layer ----> hidden layer weight update:

 The method is essentially the same as above, but one thing changes. When we computed the partial derivative of the total error with respect to w5, the path was out(o1) ----> net(o1) ----> w5; when updating the weights between the input layer and the hidden layer, the path is out(h1) ----> net(h1) ----> w1, and out(h1) receives error from both E(o1) and E(o2), so both contributions have to be included:

∂E_total/∂w1 = ∂E_total/∂out_h1 * ∂out_h1/∂net_h1 * ∂net_h1/∂w1

 

Compute ∂E_total/∂out_h1:

∂E_total/∂out_h1 = ∂E_o1/∂out_h1 + ∂E_o2/∂out_h1

First compute ∂E_o1/∂out_h1:

∂E_o1/∂out_h1 = ∂E_o1/∂net_o1 * ∂net_o1/∂out_h1 = δ_o1 * w5

 

 

Similarly, compute ∂E_o2/∂out_h1:

∂E_o2/∂out_h1 = ∂E_o2/∂net_o2 * ∂net_o2/∂out_h1 = δ_o2 * w7

Adding the two gives the total:

∂E_total/∂out_h1 = δ_o1 * w5 + δ_o2 * w7

Next compute ∂out_h1/∂net_h1:

∂out_h1/∂net_h1 = out_h1 * (1 - out_h1)

 

Then compute ∂net_h1/∂w1:

∂net_h1/∂w1 = i1

Finally, multiply the three factors together:

∂E_total/∂w1 = ∂E_total/∂out_h1 * ∂out_h1/∂net_h1 * ∂net_h1/∂w1

 To simplify the formula, we use δ_h1 to denote the error of hidden unit h1:

δ_h1 = (δ_o1 * w5 + δ_o2 * w7) * out_h1 * (1 - out_h1),  so that  ∂E_total/∂w1 = δ_h1 * i1

Finally, update the weight w1:

w1_new = w1 - η * ∂E_total/∂w1

In the same way, we can update the weights w2, w3 and w4.
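And the corresponding sketch for w1, where the error flowing back into h1 is collected from both output units (again using the values that follow from the assumed biases):

```python
# Values carried over from the earlier sketches (assumed biases b1 = 0.35, b2 = 0.60).
i1 = 0.05
out_h1 = 0.593269992
out_o1, out_o2 = 0.751365070, 0.772928465
target_o1, target_o2 = 0.01, 0.99
w1, w5, w7, eta = 0.15, 0.40, 0.50, 0.5

# Output-layer errors: delta_o = -(target - out) * out * (1 - out).
delta_o1 = -(target_o1 - out_o1) * out_o1 * (1.0 - out_o1)   # ~0.138499
delta_o2 = -(target_o2 - out_o2) * out_o2 * (1.0 - out_o2)   # ~-0.038098

# Error reaching out_h1 from both output units.
dE_dout_h1   = delta_o1 * w5 + delta_o2 * w7                 # ~0.036350
dout_dnet_h1 = out_h1 * (1.0 - out_h1)                       # ~0.241301
dnet_dw1     = i1                                            # 0.05

dE_dw1 = dE_dout_h1 * dout_dnet_h1 * dnet_dw1                # ~0.000439
w1_new = w1 - eta * dE_dw1                                   # ~0.149781
print(w1_new)
```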

  This completes error back-propagation. We then recompute with the updated weights and keep iterating. In this example, after the first iteration the total error E_total drops from 0.298371109 to 0.291027924. After 10000 iterations the total error is 0.000035085 and the output is [0.015912196, 0.984065734] (the target was [0.01, 0.99]), which shows the method works quite well.
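Finally, a compact sketch that runs the whole procedure in a loop, using the weights, targets and learning rate from the example. The biases b1 = 0.35 and b2 = 0.60 are still assumptions (the text does not list them), and this sketch keeps them fixed during training, so the final numbers should be read as being in the same ballpark as the figures quoted above rather than a digit-for-digit reproduction.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Setup from the example; the bias values are assumed.
x = [0.05, 0.10]                       # inputs i1, i2
target = [0.01, 0.99]                  # target outputs
w_ih = [[0.15, 0.20], [0.25, 0.30]]    # w1..w4: row = hidden unit, column = input
w_ho = [[0.40, 0.45], [0.50, 0.55]]    # w5..w8: row = output unit, column = hidden unit
b1, b2, eta = 0.35, 0.60, 0.5

for _ in range(10000):
    # Forward pass.
    out_h = [sigmoid(sum(w_ih[h][k] * x[k] for k in range(2)) + b1) for h in range(2)]
    out_o = [sigmoid(sum(w_ho[o][h] * out_h[h] for h in range(2)) + b2) for o in range(2)]

    # Output-layer and hidden-layer errors (deltas).
    delta_o = [(out_o[o] - target[o]) * out_o[o] * (1 - out_o[o]) for o in range(2)]
    delta_h = [sum(delta_o[o] * w_ho[o][h] for o in range(2)) * out_h[h] * (1 - out_h[h])
               for h in range(2)]

    # Gradient-descent weight updates (biases kept fixed in this sketch).
    for o in range(2):
        for h in range(2):
            w_ho[o][h] -= eta * delta_o[o] * out_h[h]
    for h in range(2):
        for k in range(2):
            w_ih[h][k] -= eta * delta_h[h] * x[k]

# Report the error and output after training.
out_h = [sigmoid(sum(w_ih[h][k] * x[k] for k in range(2)) + b1) for h in range(2)]
out_o = [sigmoid(sum(w_ho[o][h] * out_h[h] for h in range(2)) + b2) for o in range(2)]
E_total = sum(0.5 * (target[o] - out_o[o]) ** 2 for o in range(2))
print(E_total, out_o)
```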
————————————————
Original article: https://blog.csdn.net/Super_Json/article/details/85019124

