Deep Neural Networks - An Introduction to Neural Networks

Activation functions

An activation function introduces nonlinearity into the neural network, which is what lets the network fit arbitrary curves. Without an activation function, each layer's output is just a linear function of the previous layer's input, so no matter how many layers the network has, the overall output is still a linear combination of the inputs.
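A one-line check of this claim: stacking two linear layers without an activation collapses into a single linear map,

$$h = W_1 x + b_1, \qquad y = W_2 h + b_2 = (W_2 W_1)\,x + (W_2 b_1 + b_2),$$

which is just one linear layer with weights $W_2 W_1$.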

Sigmoid/logistic function

The sigmoid function is $f(x) = \frac{1}{1 + e^{-x}}$; it squashes any real input into the range $(0, 1)$.

tf.nn.sigmoid
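A minimal usage sketch (the input values are arbitrary examples):

import tensorflow as tf

# sigmoid squashes any real input into (0, 1)
x = tf.constant([-5.0, 0.0, 5.0])
y = tf.nn.sigmoid(x)
print(y)  # approximately [0.0067, 0.5, 0.9933]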

Tanh (hyperbolic tangent)

Compared with sigmoid, its output is zero-centered: $f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$, with outputs in $(-1, 1)$.

tf.nn.tanh

ReLU

$f(x) = \max(0, x)$: negative inputs are zeroed out, positive inputs pass through unchanged.

tf.nn.relu

Leaky ReLU

Alleviates ReLU's "dying neuron" problem by giving negative inputs a small nonzero slope: $f(x) = x$ for $x > 0$ and $f(x) = \alpha x$ for $x \le 0$, where $\alpha$ is a small constant.

tf.nn.leaky_relu
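A minimal usage sketch; the alpha argument is the negative slope (tf.nn.leaky_relu defaults to 0.2):

import tensorflow as tf

# negative inputs are scaled by alpha instead of being zeroed out
x = tf.constant([-2.0, 0.0, 2.0])
y = tf.nn.leaky_relu(x, alpha=0.1)
print(y)  # [-0.2, 0.0, 2.0]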

Softmax

Softmax is used for multi-class classification; it extends the binary-classification sigmoid to multiple classes, presenting the multi-class results as probabilities:

$\mathrm{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j} e^{z_j}}$

Every output lies in $(0, 1)$ and the outputs sum to 1.

tf.nn.softmax
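A minimal sketch of the probability property (the logits are made-up scores):

import tensorflow as tf

# softmax turns raw scores (logits) into a probability distribution
logits = tf.constant([2.0, 1.0, 0.1])
probs = tf.nn.softmax(logits)
print(probs)                 # approximately [0.659, 0.242, 0.099]
print(tf.reduce_sum(probs))  # 1.0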

Other activation functions

[Figure: other common activation functions]

How to choose an activation function
As a rule of thumb: for hidden layers, prefer ReLU, and if many neurons die during training, switch to Leaky ReLU; for the output layer, use sigmoid for binary classification, softmax for multi-class classification, and the identity (no activation) for regression.

Parameter initialization

For a given neuron, two kinds of parameters need to be initialized: the weights w and the bias b. The bias b is usually initialized to 0; initializing the weights w well is the more important problem.

Random initialization

Sample from a normal distribution with mean 0 and standard deviation 1, and scale the samples down so that the weights W are initialized with small values.
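A minimal sketch using the Keras RandomNormal initializer (the 0.01 scale is an illustrative choice):

import tensorflow as tf

# sample weights from a Gaussian with mean 0; the small stddev keeps them small
init = tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.01)
W = init(shape=(9, 1))
print(W)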

Standard initialization

The weights are initialized uniformly from the interval $\left(-\frac{1}{\sqrt{d}}, \frac{1}{\sqrt{d}}\right)$, where d is the number of inputs to the neuron.
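A minimal sketch, assuming a neuron with d = 9 inputs:

import math
import tensorflow as tf

d = 9  # number of inputs to the neuron
limit = 1 / math.sqrt(d)
# sample uniformly from (-1/sqrt(d), 1/sqrt(d))
init = tf.keras.initializers.RandomUniform(minval=-limit, maxval=limit)
W = init(shape=(d, 1))
print(W)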

Xavier initialization (Glorot initialization)

Xavier (Glorot) normal initialization draws weights from a truncated normal distribution centered on 0 with standard deviation $\sqrt{\frac{2}{fan\_in + fan\_out}}$, where fan_in is the number of input units and fan_out the number of output units of the layer.

import tensorflow as tf

# Glorot/Xavier normal initializer
init = tf.keras.initializers.glorot_normal()
values = init(shape=(9, 1))
print(values)

Xavier (Glorot) uniform initialization draws weights from a uniform distribution within $[-\mathrm{limit}, \mathrm{limit}]$, where $\mathrm{limit} = \sqrt{\frac{6}{fan\_in + fan\_out}}$.

import tensorflow as tf

# Glorot/Xavier uniform initializer
init = tf.keras.initializers.glorot_uniform()
values = init(shape=(9, 1))
print(values)

He initialization

He normal initialization draws weights from a truncated normal distribution centered on 0 with standard deviation $\sqrt{\frac{2}{fan\_in}}$.

import tensorflow as tf

# He normal initializer
init = tf.keras.initializers.he_normal()
values = init(shape=(9, 1))
print(values)

He uniform initialization draws weights from a uniform distribution within $[-\mathrm{limit}, \mathrm{limit}]$, where $\mathrm{limit} = \sqrt{\frac{6}{fan\_in}}$.

import tensorflow as tf

# He uniform initializer
init = tf.keras.initializers.he_uniform()
values = init(shape=(9, 1))
print(values)

Building the neural network

Build with Sequential

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    # first hidden layer
    layers.Dense(3, activation='relu', kernel_initializer='he_normal', name='layer1', input_shape=(3,)),
    # second hidden layer
    layers.Dense(2, activation='relu', kernel_initializer='he_normal', name='layer2'),
    # output layer
    layers.Dense(2, activation='sigmoid', kernel_initializer='he_normal', name='layer3')
],
    name='sequential'
)
model.summary()

[Figure: model.summary() output]
Param # is the number of weights plus biases in each layer; for example, layer1 has 3 × 3 weights and 3 biases, giving 12 parameters.

keras.utils.plot_model(model)
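Note that keras.utils.plot_model requires the pydot and graphviz packages to be installed.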

[Figure: network graph drawn by keras.utils.plot_model]

Build with the functional API

# define the model's input
inputs = keras.Input(shape=(3,), name='input')
# first hidden layer
x = layers.Dense(3, activation='relu', name='layer1')(inputs)  # this layer takes inputs and outputs x
# second hidden layer
x = layers.Dense(2, activation='relu', name='layer2')(x)
# output layer
outputs = layers.Dense(2, activation='sigmoid', name='output')(x)

# create the model
model = keras.Model(inputs=inputs, outputs=outputs, name='Functional API Model')
model.summary()

[Figure: model.summary() output]

keras.utils.plot_model(model,show_shapes=True)

[Figure: network graph drawn by plot_model with show_shapes=True]

Build by subclassing Model

# define a subclass of keras.Model
class MyModel(keras.Model):
    # define the network's layer structure
    def __init__(self):
        super(MyModel, self).__init__()  # initialize the parent class keras.Model
        # first hidden layer
        self.layers1 = layers.Dense(3, activation='relu', name='layer1')
        # second hidden layer
        self.layers2 = layers.Dense(2, activation='relu', name='layer2')
        # output layer
        self.layers3 = layers.Dense(2, activation='sigmoid', name='layer3')
    # define the network's forward pass
    def call(self, inputs):
        x = self.layers1(inputs)
        x = self.layers2(x)
        outputs = self.layers3(x)
        return outputs

# instantiate the model
model = MyModel()
# run a dummy input through the model so its weights are built
x = tf.ones((1, 3))
y = model(x)
model.summary()
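A subclassed model has no defined input shape until it is called on data, which is why a dummy tensor is passed through the model before model.summary(); calling summary() on an unbuilt model raises an error.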

[Figure: model.summary() output]

Origin: blog.csdn.net/qq_40527560/article/details/131489542