Summary of the steps to build a shallow neural network model

1. Construct a data set
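For concreteness, here is a minimal sketch of step 1 with random toy data (the 2-feature, 400-sample set and its labels are illustrative assumptions, not from the original; later sketches continue from this one):

```python
import numpy as np

# Toy binary-classification set: X holds one column per sample and
# Y holds the 0/1 label for each sample, following the
# (features, samples) convention used in the shape summary below.
np.random.seed(1)
X = np.random.randn(2, 400)                               # 2 features, 400 samples
Y = (X[0, :] * X[1, :] > 0).astype(float).reshape(1, -1)  # labels in {0, 1}
```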
2. Initialize the four variables
   - the number of samples
   - the number of input-layer neurons (the number of features per sample)
   - the number of hidden-layer neurons
   - the number of output-layer neurons
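As a sketch, the four variables can be read directly off the data shapes (the function name and the choice of 4 hidden neurons are illustrative assumptions):

```python
def layer_sizes(X, Y, n_h=4):
    m = X.shape[1]    # number of samples
    n_x = X.shape[0]  # input-layer neurons = features per sample
    n_y = Y.shape[0]  # output-layer neurons
    return m, n_x, n_h, n_y  # n_h (hidden neurons) is a free design choice
```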
3. Initialize the network parameters W and b
 The number of layers in the network (excluding the input layer) determines how many pairs of parameters W and b need to be initialized (a 2-layer network is used as the example here).
 The shapes of the parameters are summarized as follows:
       - W1: (number of hidden-layer neurons, number of input-layer neurons)
       - b1: (number of hidden-layer neurons, 1)
       - W2: (number of output-layer neurons, number of hidden-layer neurons)
       - b2: (number of output-layer neurons, 1)

Note: When initializing, b can typically be set to all zeros, but W must not be initialized to all zeros; instead, W is generated randomly, e.g. with numpy.random.randn().
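A minimal initialization sketch following that note (the 0.01 scaling factor is a common convention, assumed here rather than taken from the original):

```python
def initialize_parameters(n_x, n_h, n_y):
    np.random.seed(2)                      # for reproducibility only
    W1 = np.random.randn(n_h, n_x) * 0.01  # small random values break symmetry
    b1 = np.zeros((n_h, 1))                # b may start at zero
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
```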

4. Forward propagation (FP)
The forward-propagation process consists of a linear part and a non-linear part; its purpose is to compute the model's predicted value y. The results of the linear part are denoted Z[1] and Z[2], and the results of the non-linear part A[1] and A[2]. The non-linear part can also be understood as the activation function (AF); there are four common activation functions:
- Sigmoid 函数
- tanh 函数     
- ReLU 函数
- Leaky ReLU 函数

NOTE: A standard way to choose activation functions is: tanh for the hidden layer and sigmoid for the output layer.
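Following that note, a forward-propagation sketch for the 2-layer example (tanh hidden layer, sigmoid output), continuing from the earlier snippets:

```python
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagation(X, params):
    # Linear parts Z[1], Z[2]; non-linear (activation) parts A[1], A[2].
    Z1 = params["W1"] @ X + params["b1"]
    A1 = np.tanh(Z1)                        # tanh for the hidden layer
    Z2 = params["W2"] @ A1 + params["b2"]
    A2 = sigmoid(Z2)                        # sigmoid for the output layer
    return Z1, A1, Z2, A2                   # A2 is the predicted y
```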

5. The loss function
The loss function is computed from the result of forward propagation; its purpose is to measure the gap between the true y values and the predicted y values. First, a loss function must be chosen; three common loss functions are:
 - hinge loss (the SVM loss)
 - softmax loss
 - cross-entropy loss

Compute the value of the chosen loss function; the smaller the loss, the better the model. The next step is therefore to optimize the loss function so that its value is minimized.
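As a sketch, the binary cross-entropy loss for the sigmoid output above (the eps guard against log(0) is an added safeguard, not in the original):

```python
def compute_loss(A2, Y):
    # Average binary cross-entropy over the m samples.
    m = Y.shape[1]
    eps = 1e-8  # avoids log(0) for saturated predictions
    loss = -np.sum(Y * np.log(A2 + eps) + (1 - Y) * np.log(1 - A2 + eps)) / m
    return float(loss)
```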

6. Back-propagation (BP)

Take the partial derivatives of the loss function with respect to each parameter using the chain rule, then substitute in the concrete values to obtain dW1, db1, dW2, db2.
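For the tanh/sigmoid network with cross-entropy loss, the chain rule yields the closed-form gradients below; this sketch assumes the shapes and names from the earlier snippets:

```python
def backward_propagation(X, Y, params, A1, A2):
    m = X.shape[1]
    dZ2 = A2 - Y                                # sigmoid + cross-entropy simplification
    dW2 = dZ2 @ A1.T / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = params["W2"].T @ dZ2 * (1 - A1 ** 2)  # tanh'(Z1) = 1 - A1^2
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}
```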

7. Update the network parameters by gradient descent
According to the results of back-propagation, update W1, b1, W2, b2. The gradient-descent update formulas are as follows:
  W1 = W1 - n * dW1
  b1 = b1 - n * db1
  W2 = W2 - n * dW2
  b2 = b2 - n * db2

Here n is the learning rate (learning factor); an appropriate value for n should be chosen beforehand.
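The update formulas translate directly into code (the default learning rate 0.5 is an arbitrary example value, to be tuned for the problem):

```python
def update_parameters(params, grads, n=0.5):
    # One gradient-descent step: parameter = parameter - n * d(parameter).
    for key in ("W1", "b1", "W2", "b2"):
        params[key] = params[key] - n * grads["d" + key]
    return params
```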

8. Loop and iterate
 Repeat the loop from step 4 until the loss function reaches its minimum or the number of iterations reaches the upper limit.
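Putting the sketches together, a minimal training loop under the same assumptions (here the loop simply runs to the iteration cap):

```python
def train(X, Y, n_h=4, num_iterations=10000, n=0.5):
    m, n_x, n_h, n_y = layer_sizes(X, Y, n_h)               # step 2
    params = initialize_parameters(n_x, n_h, n_y)           # step 3
    for i in range(num_iterations):                         # step 8: loop from step 4
        Z1, A1, Z2, A2 = forward_propagation(X, params)     # step 4
        loss = compute_loss(A2, Y)                          # step 5
        grads = backward_propagation(X, Y, params, A1, A2)  # step 6
        params = update_parameters(params, grads, n)        # step 7
        if i % 1000 == 0:
            print(f"iteration {i}: loss {loss:.4f}")
    return params

params = train(X, Y)  # train on the toy data from step 1
```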