Getting started with TensorFlow - what is Param in model.summary() and how to calculate it

Calling model.summary() prints a table that describes each layer of a neural network model. The last column of that table is the Param column.

Table of contents

1. What is param?

2. Give an example of how to calculate param

  1 Take a three-layer fully connected neural network as an example

  2 param in convolutional neural network


       

1. What is param?

param is the number of parameters that need to be trained in each layer. In a fully connected layer, it is the number of connection weights (plus the biases). In a convolutional layer, it is the number of parameters in the convolution kernels.

2. Give an example of how to calculate param

  1 Take a three-layer fully connected neural network as an example

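import tensorflow as tf
from tensorflow import keras

# Define a three-layer structure: input layer, hidden layer, output layer
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),       # each image is 28x28
    keras.layers.Dense(128, activation=tf.nn.relu),   # hidden layer with 128 neurons; this number is your own choice
    keras.layers.Dense(10, activation=tf.nn.softmax)  # 10 output classes, so 10 neurons
])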

The above code creates a three-layer network. The first layer flattens the 28×28 input image into one dimension (784 values). The second layer is the hidden layer with 128 neurons, and the third layer is the output layer with 10 neurons. As shown below:

Network structure 

Calling model.summary() gives the following table:

The table obtained by model.summary()

param is the number of parameters that need to be trained in each layer. Because these are fully connected layers, it corresponds to the number of entries in the weight matrix, including the bias terms.

  • The first layer only flattens the input and has no parameters that need training, so its param is 0.
  • The param of the second layer is 100480. Each of the 28×28 = 784 inputs from the first layer is connected to each of the 128 neurons, which gives 28×28×128 = 100352 weights. Why is param larger than 100352? Because of the bias: a bias neuron is effectively added to every layer except the output layer, which is the same as saying that each Dense layer in Keras stores one bias per neuron by default. So the number of parameters that need to be trained is (28×28+1)×128 = 100480.
  • In the same way, the param of the third layer is (128+1)×10 = 1290, because one bias neuron is added to the 128 neurons of the middle layer.
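
The arithmetic for the fully connected layers can be checked with a short sketch in plain Python. The helper name dense_params is hypothetical and introduced only for illustration; it is not part of Keras. It assumes each neuron has one weight per input plus one bias.

```python
def dense_params(n_inputs, n_units, use_bias=True):
    """Count the trainable parameters of a fully connected layer.

    Hypothetical helper for illustration, not a Keras function.
    """
    weights = n_inputs * n_units          # one weight per (input, neuron) pair
    biases = n_units if use_bias else 0   # one bias per neuron
    return weights + biases


flatten = 0                           # Flatten only reshapes, nothing to train
hidden = dense_params(28 * 28, 128)   # (784 + 1) * 128
output = dense_params(128, 10)        # (128 + 1) * 10
total = flatten + hidden + output

print(hidden, output, total)  # 100480 1290 101770
```

The total of the three values is the overall parameter count of this model.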

  2 param in convolutional neural network

Here too, param is the number of parameters that need to be trained. In a convolutional layer, the parameters that need to be trained are the filters, that is, the convolution kernels, so param is the total number of parameters in the convolution kernels. As with the fully connected layers, one bias is added for each convolution kernel, so remember to include it when calculating the size.
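
As a rough sketch of that calculation for a 2D convolution, the count can be written in plain Python. The helper name conv2d_params and the layer sizes below (3×3 kernels, 32 and 64 filters) are assumptions chosen for illustration and do not come from the model above. The sketch also assumes that each kernel spans all input channels and carries exactly one bias.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Count the trainable parameters of a 2D convolutional layer.

    Hypothetical helper for illustration, not a Keras function.
    """
    weights_per_filter = kernel_h * kernel_w * in_channels  # one kernel covers all input channels
    biases = filters if use_bias else 0                      # one bias per convolution kernel
    return weights_per_filter * filters + biases


# Assumed example layers: 3x3 kernels on a 1-channel image, then on 32 channels
first = conv2d_params(3, 3, in_channels=1, filters=32)    # (3*3*1 + 1) * 32
second = conv2d_params(3, 3, in_channels=32, filters=64)  # (3*3*32 + 1) * 64

print(first, second)  # 320 18496
```

Note that the count does not depend on the height or width of the input image, only on the kernel size, the number of input channels, and the number of filters.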

Origin blog.csdn.net/weixin_51286347/article/details/127739300