Learning the fully connected (Dense) layer by writing code

1. The principle of full connection

The Dense layer in TensorFlow

The fully connected neural network is the most basic neural network structure, also known as the multilayer perceptron (MLP). It simulates the way neurons in the human brain connect to one another: it consists of multiple layers of neurons, and each neuron is connected to every neuron in the previous layer and the next layer.

A fully connected neural network is built from the following components:

  1. Input Layer : Accepts input data, and each input feature corresponds to an input node.

  2. Hidden Layer : Located between the input layer and the output layer, it can contain multiple layers. Each hidden layer consists of multiple neurons, and each neuron is connected to all neurons in the previous layer with weight values.

  3. Output Layer : Outputs the predictions of the neural network, usually corresponding to the categories or values of the problem.

  4. Weights : Each connection has a weight value, indicating the strength of the connection. Weight values are updated during network training to enable the neural network to learn appropriate feature representations and patterns.

  5. Biases : Each neuron has a bias term, which can be seen as the activation threshold of the neuron. Bias can adjust whether a neuron is activated or not.

  6. Activation Function : Located in each neuron, it is used to introduce nonlinearity, allowing the neural network to learn complex function mappings. Common activation functions include Sigmoid, ReLU, tanh, etc.

A fully connected neural network is generally trained with the backpropagation algorithm. Forward propagation (from input to output) computes the predictions and the error between the predicted and true values; backpropagation then computes the gradients and updates the weights and biases in the network to minimize the error function. This process is iterated until the network achieves good predictive performance.
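
As a concrete sketch of one forward and one backward pass, the snippet below (with hypothetical toy data and layer sizes) performs a single gradient-descent step using tf.GradientTape; it is an illustration of the idea, not part of the original example code:

import tensorflow as tf

# Hypothetical toy data: 4 samples, 3 features, binary labels
x = tf.random.normal((4, 3))
y = tf.constant([[1.0], [0.0], [1.0], [0.0]])

hidden = tf.keras.layers.Dense(8, activation='relu')    # hidden layer
out = tf.keras.layers.Dense(1, activation='sigmoid')    # output layer
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

with tf.GradientTape() as tape:
    # Forward propagation: input -> hidden -> output
    y_pred = out(hidden(x))
    # Error between predictions and true values
    loss = tf.reduce_mean(tf.keras.losses.binary_crossentropy(y, y_pred))

# Backpropagation: gradients of the loss w.r.t. all weights and biases
variables = hidden.trainable_variables + out.trainable_variables
grads = tape.gradient(loss, variables)
optimizer.apply_gradients(zip(grads, variables))  # update step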

Shortcomings:

When dealing with large-scale data, fully connected neural networks can suffer from overfitting and high computational cost. To address these problems, more complex architectures such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been developed, and they perform well in their specific domains.

2. Detailed introduction to tf.keras.layers.Dense and its parameters

tf.keras.layers.Dense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)

Parameters

  • units : positive integer, the dimension of the output space.
  • activation : The activation function to use. If you don't specify anything, no activation will be applied (i.e. "linear" activation: a(x) = x).
  • use_bias : Boolean, whether the layer uses a bias vector.
  • kernel_initializer : The initializer for the kernel weight matrix.
  • bias_initializer : The initializer for the bias vector.
  • kernel_regularizer : A regularization function to apply to the kernel weight matrix.
  • bias_regularizer : A regularization function to apply to the bias vector.
  • activity_regularizer : A regularization function to apply to the output of the layer (its "activations").
  • kernel_constraint : Constraint function to apply to the kernel weight matrix.
  • bias_constraint : Constraint function to apply to the bias vector.

Dense is just your regular densely-connected NN layer.

Dense implements the operation output = activation(dot(input, kernel) + bias), where activation is the element-wise activation function passed as the activation argument, kernel is the weight matrix created by the layer, and bias is the bias vector created by the layer (only applicable if use_bias is True). These are all attributes of Dense.
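
This formula can be verified directly. The following sketch (shapes chosen arbitrarily) compares the layer's output against a manual computation of activation(dot(input, kernel) + bias):

import tensorflow as tf

layer = tf.keras.layers.Dense(units=3, activation='relu')
x = tf.random.normal((2, 4))        # batch of 2 samples, 4 features each
y_layer = layer(x)                  # builds the layer and computes its output

kernel, bias = layer.get_weights()  # kernel shape (4, 3), bias shape (3,)
y_manual = tf.nn.relu(tf.matmul(x, kernel) + bias)  # activation(dot(input, kernel) + bias)

print(tf.reduce_max(tf.abs(y_layer - y_manual)))    # prints ~0.0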

Note: If the layer's input has a rank greater than 2, Dense computes the dot product between the input and the kernel along the last axis of the input and axis 0 of the kernel (using tf.tensordot). For example, if the input has shape (batch_size, d0, d1), the layer creates a kernel of shape (d1, units), and the kernel operates along axis 2 of the input, on every sub-tensor of shape (1, 1, d1) (there are batch_size * d0 such sub-tensors). The output in this case will have shape (batch_size, d0, units).
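
A small sketch (example shapes chosen arbitrarily) illustrating this higher-rank behaviour:

import tensorflow as tf

x = tf.random.normal((8, 5, 16))       # (batch_size, d0, d1) = (8, 5, 16)
layer = tf.keras.layers.Dense(units=4)
y = layer(x)

print(layer.kernel.shape)  # (16, 4): the kernel is built from the last input axis only
print(y.shape)             # (8, 5, 4): (batch_size, d0, units)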

Also, the attributes of a layer cannot be modified once the layer has been called (except the trainable attribute). When the popular kwarg input_shape is passed, Keras will create an input layer to insert before the current layer, which is equivalent to explicitly defining an InputLayer.
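
For example, the following two definitions build equivalent models (a sketch for illustration):

import tensorflow as tf

# Passing input_shape to the first Dense layer ...
m1 = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_shape=(20,))
])

# ... is equivalent to defining an explicit InputLayer first
m2 = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(20,)),
    tf.keras.layers.Dense(10)
])

print(m1.count_params(), m2.count_params())  # 210 210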

Input Shape

N-dimensional tensor (ND tensor), shape: (batch_size, ..., input_dim). The most common case is a 2D input of shape (batch_size, input_dim).

Output Shape

N-dimensional tensor (ND tensor), shape: (batch_size, ..., units). For example, for a 2D input of shape (batch_size, input_dim), the output will have shape (batch_size, units).

3. Example code

3.1. Single-layer Dense model

Build a model containing only a single Dense layer. The input dimension is 20 and the output dimension is 10, with ReLU as the activation function (by default no activation is applied). The weight matrix has shape (20, 10) and the bias has shape (10,), so the number of parameters is 20 x 10 + 10 = 210.

import tensorflow as tf
from tensorflow import keras

def simple_dense_layer():
    # Create a Dense layer with 10 output neurons and an input shape of (None, 20)
    model = tf.keras.Sequential([
        keras.layers.Dense(units=10, input_shape=(20,), activation='relu')
    ])
    # Print the summary of the model
    print(model.summary())


if __name__ == '__main__':
    simple_dense_layer()

output:

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 10)                210       
                                                                 
=================================================================
Total params: 210
Trainable params: 210
Non-trainable params: 0
_________________________________________________________________

3.2. Multi-layer Dense model

The model is built from three Dense layers. The activation function of the first two layers is ReLU; the last layer uses softmax.

def multi_layer_perceptron():
    input_dim = 20
    output_dim = 5

    # Create a simple MLP with 2 hidden dense layers
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(units=64, activation='relu', input_shape=(input_dim,)),
        tf.keras.layers.Dense(units=32, activation='relu'),
        tf.keras.layers.Dense(units=output_dim, activation='softmax')
    ])

    # Print the model summary
    print(model.summary())


if __name__ == '__main__':
    multi_layer_perceptron()

output


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 64)                1344      
                                                                 
 dense_1 (Dense)             (None, 32)                2080      
                                                                 
 dense_2 (Dense)             (None, 5)                 165       
                                                                 
=================================================================
Total params: 3,589
Trainable params: 3,589
Non-trainable params: 0
_________________________________________________________________
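
The parameter counts can be checked by hand: the first layer has 20 x 64 + 64 = 1,344 parameters, the second 64 x 32 + 32 = 2,080, and the third 32 x 5 + 5 = 165, for a total of 3,589.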

3.3. Viewing and modifying the weight matrix and bias

Define a Dense layer. Passing a batch of data through it causes Dense to initialize its weight matrix and biases; then print them. You will find that the weights are small random numbers (drawn by the default glorot_uniform initializer, here between -1 and 1) and the biases are all zeros.

def change_weight():
    # Create a simple Dense layer
    dense_layer = keras.layers.Dense(units=5, activation='relu', input_shape=(10,))

    # Simulate input data (batch size of 1 for demonstration)
    input_data = tf.ones((1, 10))

    # Pass the input data through the layer to initialize the weights and biases
    _ = dense_layer(input_data)

    # Access the weights and biases of the dense layer
    weights, biases = dense_layer.get_weights()

    # Print the initial weights and biases
    print("Initial Weights:")
    print(weights)
    print("Initial Biases:")
    print(biases)

output

Initial Weights:
[[-0.11511135  0.32900262 -0.1294617  -0.03869444 -0.03002286]
 [-0.24887764  0.20832229  0.48636192  0.09694523 -0.0915786 ]
 [-0.22499037 -0.1025297   0.25898546  0.5259896  -0.19001997]
 [-0.28182945 -0.38635993  0.39958888  0.44975716 -0.21765932]
 [ 0.418611   -0.56121594  0.27648276 -0.5158085   0.5256552 ]
 [ 0.34709007 -0.10060292  0.4056484   0.6316313   0.12976009]
 [ 0.40947527 -0.2114836   0.38547724 -0.1086036  -0.29271656]
 [-0.30581984 -0.14133212 -0.11076003  0.36882895  0.3007568 ]
 [-0.45729238  0.16293162  0.11780071 -0.31189078 -0.00128847]
 [-0.46115184  0.18393213 -0.08268476 -0.5187934  -0.608922  ]]
Initial Biases:
[0. 0. 0. 0. 0.]

Based on the shapes of the weight matrix and the bias, set all weights to 1 and all biases to 0, then write them back into the Dense layer with set_weights. After that, a batch of data is fed in and an output is obtained; you can compute it by hand according to the formula above and verify that the results match.

    # Modify the weights and biases (for demonstration purposes)
    new_weights = tf.ones_like(weights)  # Set all weights to 1
    new_biases = tf.zeros_like(biases)  # Set all biases to 0

    # Set the modified weights and biases back to the dense layer
    dense_layer.set_weights([new_weights, new_biases])

    # Access the weights and biases again after modification
    weights, biases = dense_layer.get_weights()

    # Print the modified weights and biases
    print("Modified Weights:")
    print(weights)
    print("Modified Biases:")
    print(biases)

    input_data = tf.constant([[1., 1., 3., 1., 2., 1., 1., 1., 1., 2.]])

    output = dense_layer(input_data)

    print(output)

output

Modified Weights:
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
Modified Biases:
[0. 0. 0. 0. 0.]
tf.Tensor([[14. 14. 14. 14. 14.]], shape=(1, 5), dtype=float32)
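
This matches a manual calculation: with all weights set to 1 and all biases set to 0, each output unit is simply the sum of the inputs, 1 + 1 + 3 + 1 + 2 + 1 + 1 + 1 + 1 + 2 = 14, which is positive, so ReLU leaves it unchanged.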

3.4. Add a custom activation function

Define a custom activation function; this one is very simple, it just squares its input. Then set a custom weight matrix and bias on the Dense layer, feed in some data of your own, and finally verify that the result matches the theory introduced above.

def custom_activation_function():
    # Custom activation: square the input element-wise
    def custom_activation(x):
        return tf.square(x)

    dense_layer = keras.layers.Dense(units=2, activation=custom_activation, input_shape=(4,))

    weights = tf.ones((4, 2))
    biases = tf.ones((2,))

    # Build the layer once so that set_weights can be called
    input_data = tf.ones((1, 4))
    _ = dense_layer(input_data)
    dense_layer.set_weights([weights, biases])

    # Print the modified weights and biases
    print("Modified Weights:")
    print(dense_layer.get_weights()[0])
    print("Modified Biases:")
    print(dense_layer.get_weights()[1])

    input_data = tf.constant([[1., 2., 3., 1.]])
    output = dense_layer(input_data)

    print(output)


if __name__ == '__main__':
    custom_activation_function()

output

Modified Weights:
[[1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]]
Modified Biases:
[1. 1.]
tf.Tensor([[64. 64.]], shape=(1, 2), dtype=float32)
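
A quick manual check: each unit's weighted sum is 1 + 2 + 3 + 1 = 7, adding the bias of 1 gives 8, and the custom activation squares it to 8^2 = 64, matching the printed output.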

3.5. Training a model to implement a specific function

Suppose we have the function y = 2*x1 + 3*x2 + 4 and we want our model to learn it. Below is the code.

def certain_function_implementation():
    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense

    # Generate random data for training
    np.random.seed(42)
    x_train = np.random.rand(1000, 2)  # 1000 samples with 2 features (x1 and x2)
    y_train = 2 * x_train[:, 0] + 3 * x_train[:, 1] + np.random.randn(1000) * 0.1 + 4

    # Build the neural network
    model = Sequential()
    model.add(Dense(1, input_shape=(2,), name = 'dense_layer'))

    # Compile the model
    model.compile(optimizer='adam', loss='mean_squared_error')

    # Train the model
    epochs = 400
    model.fit(x_train, y_train, epochs=epochs)

    # Generate random data for testing
    x_test = np.array([[1, 1], [2, 3], [3, 4], [4, 5], [5, 6]])

    # Test the model with the new data
    y_pred = model.predict(x_test)
    print("Predicted outputs:")
    print(y_pred.flatten())
    print(model.get_layer("dense_layer").get_weights())


if __name__ == '__main__':
    certain_function_implementation()

The output is shown below. The learned parameters of the Dense layer are the weights 2.0018072 and 2.989778 and the bias 4.005859, already close to the target values 2, 3 and 4. The predictions are also very good; the error does not exceed 0.1.

.....
Epoch 398/400
32/32 [==============================] - 0s 3ms/step - loss: 0.0095
Epoch 399/400
32/32 [==============================] - 0s 2ms/step - loss: 0.0095
Epoch 400/400
32/32 [==============================] - 0s 3ms/step - loss: 0.0096
1/1 [==============================] - 0s 34ms/step
Predicted outputs:
[ 8.997444 16.978807 21.970394 26.961979 31.953564]
[array([[2.0018072],
       [2.989778 ]], dtype=float32), array([4.005859], dtype=float32)]
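
As a quick check, for the first test sample [1, 1] the learned model computes 2.0018 x 1 + 2.9898 x 1 + 4.0059 ≈ 8.997, which matches the first predicted output.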

Origin blog.csdn.net/keeppractice/article/details/131927920