Use one-dimensional NumPy arrays to represent these vectors and parameters, so each array prints with only one pair of square brackets
W1_1: the weight vector w of the first neuron of layer 1
Z1_1: the dot product of W1_1 and the input X, plus the bias b1_1
a1_1: the result of applying the sigmoid function to Z1_1
a1: a1_1, a1_2, a1_3 combined into one one-dimensional array, which is the output of layer 1
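The notation above can be sketched in NumPy. The input and weight values here are made-up placeholders, and a sigmoid activation is assumed:

```python
import numpy as np

def sigmoid(z):
    # elementwise sigmoid
    return 1 / (1 + np.exp(-z))

x = np.array([200.0, 17.0])     # input features (placeholder values)

w1_1 = np.array([1.0, 2.0])     # weights of neuron 1 in layer 1 (made up)
b1_1 = -1.0                     # bias of neuron 1 in layer 1 (made up)
z1_1 = np.dot(w1_1, x) + b1_1   # scalar pre-activation
a1_1 = sigmoid(z1_1)            # scalar activation

# a1_2 and a1_3 are computed the same way from w1_2/b1_2 and w1_3/b1_3;
# the three activations are then stacked into one 1-D array:
a1 = np.array([a1_1, 0.0, 0.0])  # 0.0 entries stand in for a1_2, a1_3
```

Note that x, w1_1, and a1 are all one-dimensional arrays, so each prints with a single pair of square brackets.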
Implementation of the dense() function
W can be regarded as a 2 x 3 matrix: the first column is the parameter vector w1_1, the second column is w1_2, and the third column is w1_3. b can be viewed as a one-dimensional array of the three biases.
The dense() function takes the parameters W and b, the activation function g(), and the activations of the previous layer, and outputs the activations of the current layer
shape[0] is the number of rows and shape[1] is the number of columns; here the number of columns of the W matrix equals the number of neurons (units) in the layer
Initialize a_out as an array of zeros with one element per unit in the layer
j is the index, running from 0 to the number of units in the layer minus one, i.e., 0, 1, 2
W[ : , j ] is a two-dimensional array slice: it takes the j-th entry of each row, i.e., the j-th column of the matrix
Use the dot() function to compute the dot product of w and a_in, add b[j], and pass the resulting z into g() to obtain the unit's activation a_out[j]
Return a_out, the activations of this layer
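The steps above can be sketched as a minimal dense() implementation, assuming sigmoid as the activation g(); the example W, b, and a_in values are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def dense(a_in, W, b, g=sigmoid):
    # W: 2-D array whose j-th COLUMN holds the weights of unit j,
    # so W.shape[1] is the number of units in this layer.
    units = W.shape[1]
    a_out = np.zeros(units)          # one activation per unit, initialized to 0
    for j in range(units):           # j = 0, 1, ..., units-1
        w = W[:, j]                  # j-th column = weights of unit j
        z = np.dot(w, a_in) + b[j]   # pre-activation of unit j
        a_out[j] = g(z)              # activation of unit j
    return a_out

# Example: a layer with 2 inputs and 3 units, so W is 2 x 3
W = np.array([[1.0, -3.0,  5.0],
              [2.0,  4.0, -6.0]])
b = np.array([-1.0, 1.0, 2.0])
a_in = np.array([-2.0, 4.0])
a1 = dense(a_in, W, b)               # 1-D array of 3 activations
```

Each column of W pairs with one entry of b, which is why the loop indexes both by the same j.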
Implementation of the sequential() function
Given the input features X, compute the activations of each layer in turn; W and b are sometimes also called the weights of the layer. Return the activations of the last layer, f_x = a4, which is the output of the entire neural network model
Uppercase W is used here because, by convention, uppercase letters denote matrices and lowercase letters denote vectors or scalars
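A minimal sketch of sequential() chaining four dense layers, assuming the dense() function described above; the layer sizes and all-ones weights in the usage example are placeholders, not trained values:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def dense(a_in, W, b, g=sigmoid):
    units = W.shape[1]
    a_out = np.zeros(units)
    for j in range(units):
        a_out[j] = g(np.dot(W[:, j], a_in) + b[j])
    return a_out

def sequential(x, W1, b1, W2, b2, W3, b3, W4, b4):
    # Chain four dense layers; each layer's output feeds the next.
    a1 = dense(x, W1, b1)
    a2 = dense(a1, W2, b2)
    a3 = dense(a2, W3, b3)
    a4 = dense(a3, W4, b4)
    f_x = a4                 # last layer's activation = model prediction
    return f_x

# Placeholder weights (all ones) just to show the shapes lining up:
# 2 inputs -> 3 units -> 2 units -> 2 units -> 1 unit
x = np.array([0.5, -0.5])
W1 = np.ones((2, 3)); b1 = np.zeros(3)
W2 = np.ones((3, 2)); b2 = np.zeros(2)
W3 = np.ones((2, 2)); b3 = np.zeros(2)
W4 = np.ones((2, 1)); b4 = np.zeros(1)
f_x = sequential(x, W1, b1, W2, b2, W3, b3, W4, b4)
```

Each W matrix has one row per input to the layer and one column per unit, so the column count of one layer's W matches the row count of the next.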