Activation functions
Activation functions introduce nonlinearity into a neural network, which lets it fit arbitrary curves. Without an activation function, each layer's output is a linear function of the previous layer's output, so no matter how many layers the network has, the overall output is still just a linear combination of the inputs (the sketch below demonstrates this).
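A minimal sketch of this point (layer sizes chosen arbitrarily): two Dense layers without activations compute exactly what a single linear layer computes.

import tensorflow as tf

# Two Dense layers with no activation collapse into one linear map:
# x @ (W1 @ W2) + (b1 @ W2 + b2)
x = tf.random.normal((1, 4))
l1 = tf.keras.layers.Dense(3, activation=None)
l2 = tf.keras.layers.Dense(2, activation=None)
y = l2(l1(x))  # calling the layers also builds their weights

# Build the equivalent single linear layer by hand
W = tf.matmul(l1.kernel, l2.kernel)
b = tf.matmul(tf.expand_dims(l1.bias, 0), l2.kernel) + l2.bias
print(tf.reduce_max(tf.abs(y - (tf.matmul(x, W) + b))))  # ~0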
Sigmoid/logistic function
tf.nn.sigmoid
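A small example; outputs are squashed into (0, 1):

import tensorflow as tf

x = tf.constant([-5.0, 0.0, 5.0])
print(tf.nn.sigmoid(x))  # ~[0.0067, 0.5, 0.9933]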
Tanh (hyperbolic tangent)
Compared with sigmoid, its output is centered at 0
tf.nn.tanh
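A small example; outputs lie in (-1, 1) and are symmetric around 0:

import tensorflow as tf

x = tf.constant([-5.0, 0.0, 5.0])
print(tf.nn.tanh(x))  # ~[-0.9999, 0.0, 0.9999]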
ReLU
tf.nn.relu
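A small example of the clipping behavior:

import tensorflow as tf

x = tf.constant([-3.0, 0.0, 3.0])
print(tf.nn.relu(x))  # negative inputs become 0: [0., 0., 3.]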
LeakyReLU
Mitigates ReLU's "dying neuron" problem by giving negative inputs a small non-zero slope
tf.nn.leaky_relu
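A small example; alpha is the slope applied to negative inputs (tf.nn.leaky_relu defaults to alpha=0.2):

import tensorflow as tf

x = tf.constant([-3.0, 0.0, 3.0])
print(tf.nn.leaky_relu(x, alpha=0.1))  # [-0.3, 0.0, 3.0]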
Softmax
Used for multi-class classification: it extends the binary sigmoid to multiple classes, presenting the result for each class as a probability
tf.nn.softmax
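A small example showing the probability interpretation (values rounded):

import tensorflow as tf

logits = tf.constant([[1.0, 2.0, 3.0]])
probs = tf.nn.softmax(logits)
print(probs)                 # ~[[0.09, 0.24, 0.67]]
print(tf.reduce_sum(probs))  # the class probabilities sum to 1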
Other activation functions
How to choose an activation function
A common rule of thumb: use ReLU in hidden layers (and try LeakyReLU if neurons die), then pick the output activation by task: sigmoid for binary classification, softmax for multi-class classification, and no activation for regression.
Parameter initialization
For a given neuron there are two kinds of parameters to initialize: the weights w and the bias b. The bias b is usually initialized to 0; initializing the weights w well matters more.
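As a quick check (assuming TF 2.x defaults), a Keras Dense layer already follows this convention: the bias starts at zero and only the kernel gets a non-trivial scheme.

import tensorflow as tf

layer = tf.keras.layers.Dense(3)
print(layer.bias_initializer)    # Zeros: biases start at 0
print(layer.kernel_initializer)  # GlorotUniform: the default weight scheme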
Random initialization
Initialize the weights W with small values sampled from a normal distribution with mean 0 and standard deviation 1.
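A sketch of this scheme using a built-in initializer, mirroring the Xavier/He examples below:

import tensorflow as tf

# Random initialization: sample W from N(0, 1); in practice the
# standard deviation is often scaled down to keep values small
init = tf.keras.initializers.RandomNormal(mean=0.0, stddev=1.0)
values = init(shape=(9, 1))
print(values)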
Standard initialization
The weights are initialized uniformly from the interval $\left(\frac{-1}{\sqrt{d}}, \frac{1}{\sqrt{d}}\right)$, where d is the number of inputs to each neuron.
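A sketch of this scheme; Keras has no built-in initializer for exactly this interval, so the limit is computed by hand here:

import tensorflow as tf

# Standard initialization: uniform in (-1/sqrt(d), 1/sqrt(d))
d = 9  # number of inputs to the neuron
limit = 1.0 / d ** 0.5
init = tf.keras.initializers.RandomUniform(minval=-limit, maxval=limit)
values = init(shape=(d, 1))
print(values)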
Xavier initialization (Glorot initialization)
Xavier initialization scales the weight variance by the layer's fan-in and fan-out so that activations keep a roughly constant variance across layers; it comes in normal and uniform variants.
import tensorflow as tf

# Xavier initialization, normal-distribution variant
init = tf.keras.initializers.glorot_normal()
values = init(shape=(9, 1))  # draw a 9x1 weight tensor
print(values)

import tensorflow as tf

# Xavier initialization, uniform-distribution variant
init = tf.keras.initializers.glorot_uniform()
values = init(shape=(9, 1))
print(values)
He initialization
He (Kaiming) initialization is designed for ReLU-family activations: it scales the weight variance by 2/fan-in to compensate for ReLU zeroing out the negative half of its inputs; it also comes in normal and uniform variants.
import tensorflow as tf

# He initialization, normal-distribution variant
init = tf.keras.initializers.he_normal()
values = init(shape=(9, 1))
print(values)

import tensorflow as tf

# He initialization, uniform-distribution variant
init = tf.keras.initializers.he_uniform()
values = init(shape=(9, 1))
print(values)
Building a neural network
Keras offers three ways to build a model: the Sequential container, the functional API, and subclassing keras.Model.
Build with Sequential
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    # First hidden layer: 3 units, ReLU, He-normal weight initialization
    layers.Dense(3, activation='relu', kernel_initializer='he_normal', name='layer1', input_shape=(3,)),
    # Second hidden layer
    layers.Dense(2, activation='relu', kernel_initializer='he_normal', name='layer2'),
    # Output layer
    layers.Dense(2, activation='sigmoid', kernel_initializer='he_normal', name='layer3')
],
    name='sequential'
)
model.summary()
In the summary, the Param # column counts each layer's weights and biases.

# Visualize the model as a diagram; requires the pydot and graphviz packages
keras.utils.plot_model(model)
Build with the functional API
# Define the model input
inputs = keras.Input(shape=(3,), name='input')
# First hidden layer: consumes inputs, produces x
x = layers.Dense(3, activation='relu', name='layer1')(inputs)
# Second hidden layer
x = layers.Dense(2, activation='relu', name='layer2')(x)
# Output layer
outputs = layers.Dense(2, activation='sigmoid', name='output')(x)
# Create the model from its input and output tensors
# (underscores in the name: spaces can break graph scopes and file names)
model = keras.Model(inputs=inputs, outputs=outputs, name='functional_api_model')
model.summary()
keras.utils.plot_model(model, show_shapes=True)
Build by subclassing keras.Model
# Define a subclass of keras.Model
class MyModel(keras.Model):
    # Define the network's layer structure
    def __init__(self):
        super(MyModel, self).__init__()  # initialize the parent class keras.Model
        # First hidden layer
        self.layers1 = layers.Dense(3, activation='relu', name='layer1')
        # Second hidden layer
        self.layers2 = layers.Dense(2, activation='relu', name='layer2')
        # Output layer
        self.layers3 = layers.Dense(2, activation='sigmoid', name='layer3')

    # Define the network's forward pass
    def call(self, inputs):
        x = self.layers1(inputs)
        x = self.layers2(x)
        outputs = self.layers3(x)
        return outputs

# Instantiate the model
model = MyModel()
# Run one dummy input through the model so its weights are built;
# summary() cannot report shapes or parameter counts before this
x = tf.ones((1, 3))
y = model(x)
model.summary()