"TensorFlow 2.0 depth learning algorithm real materials" study notes (seven, Kears high-level interface)

Keras is an open-source neural network computing library developed mainly in Python. Keras is split into a front end and a back end: the back end can be implemented on top of an existing deep learning framework such as Theano, CNTK, or TensorFlow, while the front end exposes a unified, abstract API.

So what is the difference and connection between Keras and tf.keras? Keras can be understood as a set of high-level protocols for building and training neural networks. Keras itself implements this protocol and can conveniently call the TensorFlow or CNTK back end to accelerate computation. TensorFlow also implements the Keras protocol, as tf.keras, which runs only on the TensorFlow back end and supports TensorFlow better. For TensorFlow developers, tf.keras can be treated as an ordinary submodule, no different from submodules such as tf.math or tf.data.

Common functional modules

Keras provides a series of high-level neural network classes and functions, such as dataset loading functions, network layer classes, model containers, loss function classes, optimizer classes, and classic model classes.

Common network layer classes

  • tf.nn module: low-level, tensor-style functional interfaces for common neural network layers; these interface functions generally live in the tf.nn module
  • tf.keras.layers namespace: provides a large number of common layer classes, such as fully connected layers, activation layers, pooling layers, convolutional layers, and recurrent neural network layers

Take the Softmax layer as an example: you can either use tf.nn.softmax to perform the softmax forward computation functionally, or create a Softmax network layer via layers.Softmax(axis), where the axis parameter specifies the dimension along which softmax is computed.

import tensorflow as tf
# Import keras from tensorflow; do not use "import keras", which imports the standalone Keras library
from tensorflow import keras
from tensorflow.keras import layers # import common network layer classes

Then create a Softmax layer and call it (its __call__ method) to complete the forward computation:

In [1]:
x = tf.constant([2.,1.,0.1])
layer = layers.Softmax(axis=-1) # create a Softmax layer
layer(x) # run the softmax forward computation

The output of the Softmax layer is a probability distribution:
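
For reference, a quick check by hand (values rounded): softmax([2., 1., 0.1]) ≈ [0.659, 0.242, 0.099], which sums to 1. The functional interface gives the same result:

tf.nn.softmax(x, axis=-1) # functional-style call, equivalent to the layer call above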

Network containers

For a common network, manually calling each layer class instance to complete the forward propagation becomes very cumbersome as the network gets deeper. Keras provides the Sequential network container, which encapsulates multiple network layers into one large network model; a single call on the network model instance then runs the data sequentially from the first layer to the last.

For example, a network of 2 fully connected layers with separate activation layers can be wrapped in a Sequential container:

# import the Sequential container
from tensorflow.keras import layers, Sequential
network = Sequential([ # wrap the layers into one network
    layers.Dense(3, activation=None), # fully connected layer
    layers.ReLU(), # activation layer
    layers.Dense(2, activation=None), # fully connected layer
    layers.ReLU() # activation layer
])
x = tf.random.normal([4,3])
network(x) # the input propagates from the first layer through to the last

A Sequential container can keep appending new layers through its add() method, so a network can be created dynamically:

In [2]:
layers_num = 2 # stack twice
network = Sequential([]) # create an empty network first
for _ in range(layers_num):
    network.add(layers.Dense(3)) # add a fully connected layer
    network.add(layers.ReLU()) # add an activation layer
network.build(input_shape=(None, 4)) # build the network parameters
network.summary()

The summary() function conveniently prints the network structure and parameter counts.
In the output, Layer is each layer's name, which is maintained internally by TensorFlow and is not the same as the Python object name; Param # is the layer's parameter count; Total params is the total number of parameters; Trainable params is the number of parameters to be optimized; and Non-trainable params is the number of parameters that are not optimized.

When multiple layers are wrapped in a Sequential container, the parameter lists of all layers are automatically merged into the Sequential container's parameter list; no manual merging is needed. Sequential's trainable_variables and variables attributes hold the lists of all layers' trainable tensors and of all tensors, respectively:

In [3]: # print the names and shapes of the network's trainable parameters
for p in network.trainable_variables:
	print(p.name, p.shape)
Out[3]:
dense_2/kernel:0 (4, 3)
dense_2/bias:0 (3,)
dense_3/kernel:0 (3, 3)
dense_3/bias:0 (3,)

Model assembly, training, and testing

When training a network, the general flow is: compute the output with a forward pass, compute the error with a loss function, compute the gradients with automatic differentiation and update the parameters, and periodically test the network's performance.

Model assembly

In Keras, there are two important classes: keras.Model and keras.layers.Layer. The Layer class is the parent class of network layers and defines common layer functionality such as adding weights and managing weight lists. The Model class is the parent class of networks; in addition to the functionality of the Layer class, it adds handy features for saving and loading models and for training and testing. Sequential is a subclass of Model and therefore has all the features of the Model class.

Next we introduce the model assembly and training functionality of the Model class and its subclasses, taking a Sequential-wrapped network as an example. First create a 5-layer fully connected network for MNIST handwritten digit recognition:

# create a 5-layer fully connected network
network = Sequential([layers.Dense(256, activation='relu'),
                      layers.Dense(128, activation='relu'),
                      layers.Dense(64, activation='relu'),
                      layers.Dense(32, activation='relu'),
                      layers.Dense(10)])
network.build(input_shape=(None, 28*28))
network.summary()
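
As a sanity check on the summary() output, each Dense layer's parameter count is in_dim × out_dim + out_dim: 784×256+256 = 200,960 for the first layer, then 32,896, 8,256, 2,080, and 330 for the following layers, giving 244,522 trainable parameters in total.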

After creating the network, the normal flow is to iterate over the dataset for multiple epochs, and for each batch of training data: run the forward pass, compute the loss, back-propagate to compute the gradients automatically, and update the network parameters. Since this logic is very common, Keras provides the compile() and fit() functions to implement it conveniently. First specify the network's optimizer object, loss function, and evaluation metrics with the compile() function:

# import the optimizer and loss function modules
from tensorflow.keras import optimizers,losses
# use the Adam optimizer with learning rate 0.01 and the cross-entropy loss (Softmax included via from_logits)
network.compile(optimizer=optimizers.Adam(learning_rate=0.01),
                loss=losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'] # set the metric to accuracy
)

The optimizer, loss function, and other parameters specified in compile() are exactly the objects we would use when writing the training loop ourselves; there is nothing special about them. Keras simply implements this common training logic internally, which improves development efficiency.
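
For comparison, here is a minimal sketch of the manual loop that compile()/fit() automate (train_db is a hypothetical tf.data.Dataset of (x, y) batches, not shown in the original text):

optimizer = optimizers.Adam(learning_rate=0.01)
criterion = losses.CategoricalCrossentropy(from_logits=True)
for x, y in train_db:
    with tf.GradientTape() as tape:
        out = network(x) # forward pass
        loss = criterion(y, out) # compute the loss
    grads = tape.gradient(loss, network.trainable_variables)
    optimizer.apply_gradients(zip(grads, network.trainable_variables))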

Model training

After the model is assembled, the training data and validation data are fed in through the fit() function:

# use train_db as the training set and val_db as the validation set; train for 5 epochs, validating every 2 epochs
# the returned training record is saved in history
history = network.fit(train_db, epochs=5, validation_data=val_db,
                      validation_freq=2)

Here train_db is a tf.data.Dataset object; NumPy arrays can also be passed. epochs specifies the number of training epochs, validation_data specifies the validation (test) data, and validation_freq the validation frequency in epochs.
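
For completeness, a minimal sketch (with assumed preprocessing, which the text does not show) of how train_db and val_db could be built for this model:

(x, y), (x_val, y_val) = keras.datasets.mnist.load_data()
def preprocess(x, y):
    # flatten images to 784-dim float vectors and one-hot encode the labels
    x = tf.reshape(tf.cast(x, tf.float32) / 255., [-1, 28*28])
    y = tf.one_hot(tf.cast(y, tf.int32), depth=10)
    return x, y
train_db = tf.data.Dataset.from_tensor_slices((x, y)).shuffle(10000).batch(128).map(preprocess)
val_db = tf.data.Dataset.from_tensor_slices((x_val, y_val)).batch(128).map(preprocess)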

Running this code trains and validates the network. fit() returns a History object that records the training process; its history.history attribute is a dictionary containing entries for the training loss and the metrics:

In [4]: history.history # print the training record
Out[4]: # training accuracy
{'accuracy': [0.00011666667, 0.0, 0.0, 0.010666667, 0.02495],
'loss': [2465719710540.5845, # training loss
78167808898516.03,
404488834518159.6,
1049151145155144.4,
1969370184858451.0],
'val_accuracy': [0.0, 0.0], # validation accuracy
# validation loss
'val_loss': [197178788071657.3, 1506234836955706.2]}

As you can see, the compile & fit approach is simple and efficient, greatly reducing development time. But because the interface is so high-level, flexibility is reduced, so users should decide for themselves whether it fits their needs.

Model testing

Prediction is done with the Model.predict(x) method:

# load one batch of test data
x,y = next(iter(db_test))
print('predict x:', x.shape)
out = network.predict(x) # model prediction
print(out)

Here out is the output of the network.

If you only need a simple performance test of the model on a dataset db, Model.evaluate(db) loops over all samples in db and prints the performance metrics:

network.evaluate(db_test) # model test

Saving and loading models

After training is complete, the model needs to be saved to the file system to facilitate subsequent testing and deployment. In fact, periodically saving the model state during training is also a very good habit.

In Keras, there are three commonly used ways to save and load a model.

Tensor method

The state of a network is mainly held in its structure and the parameter tensors inside its layers, so directly saving the parameter tensors to a file is the most lightweight approach, subject to the availability of the source code defining the network structure. Taking the MNIST handwritten digit recognition model as an example, the current network parameters can be saved to a file at path by calling the Model.save_weights(path) method:

network.save_weights('weights.ckpt')

This code saves the network parameters to the weights.ckpt file. When they are needed again, first create a network object with the same structure, then call the network object's load_weights(path) method to write the tensor values saved in the model file into the current network's parameters:

# save the model parameters to a file
network.save_weights('weights.ckpt')
print('saved weights.')
del network # delete the network object
# re-create the same network structure
network = Sequential([layers.Dense(256, activation='relu'),
                      layers.Dense(128, activation='relu'),
                      layers.Dense(64, activation='relu'),
                      layers.Dense(32, activation='relu'),
                      layers.Dense(10)])
network.compile(optimizer=optimizers.Adam(learning_rate=0.01),
                loss=tf.losses.CategoricalCrossentropy(from_logits=True),
                metrics=['accuracy']
)
# read the data from the parameter file and write it into the current network
network.load_weights('weights.ckpt')
print('loaded weights!')

This way of saving and loading a network is the most lightweight: the file stores only the parameter tensor values, with no additional structural information. However, it requires an identical network structure to restore the network state, so it is generally used when the network source code is available.

Network method

If we do not want to rely on the network source code and need only a model file to restore the network, the Model.save(path) function can save both the model's structure and its parameters to a file at path; keras.models.load_model(path) then restores the network structure and parameters without any network source code.

We first save digital pictures MNIST handwriting recognition model to the file and delete network objects:

# save the model structure and parameters to a file
network.save('model.h5')
print('saved total model.')
del network # delete the network object

The network structure and state can then be restored from the model.h5 file:

# restore the network structure and parameters from the file
network = tf.keras.models.load_model('model.h5')

As we can see, the model.h5 file saves the network structure information in addition to the model parameters, so the network object can be restored directly from the file without creating the model in advance.

SavedModel method

When the model needs to be deployed to other platforms, the SavedModel format provided by TensorFlow is more platform-independent. A model can be saved in SavedModel format to a directory at path via tf.keras.experimental.export_saved_model(network, path):

# save the model structure and parameters in SavedModel format
tf.keras.experimental.export_saved_model(network, 'model-savedmodel')
print('export saved model.')
del network # delete the network object

A model-savedmodel directory containing the saved network files then appears on the file system.
Users do not need to care about the internal file format; the network structure and parameters can be restored simply with:

# restore the network structure and parameters from the files
network = tf.keras.experimental.load_from_saved_model('model-savedmodel')

This makes it convenient to deploy a well-trained network model seamlessly to each platform.

Custom classes

Although Keras provides many common layer classes, the layers used in deep learning go far beyond the classic ones; when a layer with custom logic is needed, it can be implemented as a custom class. A custom layer class must inherit from the layers.Layer base class, and a custom network class from the keras.Model base class. Custom classes built this way can conveniently reuse the parameter-management features provided by the Layer/Model base classes, and they also interoperate with other standard layer classes.

Custom network layers

A custom network layer needs to implement the __init__ initialization method and the call forward-propagation method:

class MyDense(layers.Layer):
    # custom network layer
    def __init__(self, inp_dim, outp_dim):
        super(MyDense, self).__init__()
        # create the weight tensor, add it to the class-managed list, and mark it trainable
        self.kernel = self.add_variable('w', [inp_dim, outp_dim], trainable=True)

    def call(self, inputs, training=None):
        # forward pass: a simple linear transform (bias omitted here for brevity)
        return inputs @ self.kernel

self.add_variable returns a Python reference to the created tensor; the variable name ('w' here) is maintained internally by TensorFlow and is rarely used directly. We can inspect the layer's parameter lists:

In [5]: net = MyDense(4,3) # create a custom layer with 4 input and 3 output nodes
net.variables,net.trainable_variables
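
Because MyDense subclasses layers.Layer, it interoperates with the standard Keras machinery; for example (a sketch), custom layers can be placed directly inside a Sequential container:

network = Sequential([MyDense(784, 256),
                      MyDense(256, 128),
                      MyDense(128, 64),
                      MyDense(64, 32),
                      MyDense(32, 10)])
network.build(input_shape=(None, 784))
network.summary()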

Custom networks

To create a custom network class, first inherit from the Model base class, then create the layer objects the network needs in __init__:

class MyModel(keras.Model):
    # custom network class, inheriting from the Model base class
    def __init__(self):
        super(MyModel, self).__init__()
        # create the network layers needed inside the network
        self.fc1 = MyDense(28*28, 256)
        self.fc2 = MyDense(256, 128)
        self.fc3 = MyDense(128, 64)
        self.fc4 = MyDense(64, 32)
        self.fc5 = MyDense(32, 10)

Then implement the custom network's forward computation logic:

    def call(self, inputs, training=None):
        # custom forward computation logic
        x = self.fc1(inputs)
        x = self.fc2(x)
        x = self.fc3(x)
        x = self.fc4(x)
        x = self.fc5(x)
        return x
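
A quick usage sketch: instantiate the custom network and run a forward pass on random data to check the output shape.

network = MyModel()
x = tf.random.normal([4, 28*28])
out = network(x) # forward pass through the five custom layers
print(out.shape) # expected: (4, 10)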

Model zoo

For commonly used network models such as ResNet and VGG, there is no need to build the network manually: they can be created directly with a single line of code from the keras.applications submodule, and pre-trained weights can be loaded by setting the weights parameter, which is very convenient.

Loading a model

Take transfer learning with ResNet50 as an example. Generally, ResNet50 with its last layer removed serves as the feature-extraction sub-network for the new task; that is, the feature-extraction ability pre-trained on ImageNet is transferred to our custom dataset, and a fully connected classification layer matching the number of categories in the new task is appended. In this way, new tasks can be learned quickly and efficiently on top of the pre-trained network.

# load the ImageNet pre-trained network model, without the last (classification) layer
resnet = keras.applications.ResNet50(weights='imagenet',include_top=False)
resnet.summary()
# test the network output
x = tf.random.normal([4,224,224,3])
out = resnet(x)
out.shape

This code automatically downloads the model structure and the ImageNet pre-trained network parameters from the server; because the last layer is removed, the output size of the network is [b, 7, 7, 2048].

In [6]:
# create a new pooling layer
global_average_layer = layers.GlobalAveragePooling2D()
In [7]:
# create a new fully connected layer
fc = layers.Dense(100)
# re-wrap everything into our own network model
mynet = Sequential([resnet, global_average_layer, fc])
mynet.summary()

By setting resnet.trainable = False you can freeze the ResNet part of the network and train only the newly added layers, which makes training the network model fast and efficient, as sketched below.
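
A concrete sketch of this freezing workflow (train_db here is a hypothetical dataset of 224×224 RGB image batches, not shown in the original text):

resnet.trainable = False # freeze the pre-trained backbone
mynet.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
              loss=losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
# mynet.fit(train_db, epochs=5) # only the pooling and Dense layer parameters are updated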

Measurement tools

During training it is often necessary to collect statistics such as accuracy and recall. Besides computing running averages manually, Keras provides common measurement tools in the keras.metrics module, designed specifically for the statistics needed during training.

Using a Keras measurement tool generally involves four basic steps: create the metric, write data, read the statistic, and clear the metric.

Creating a metric

The keras.metrics module provides many common metric classes, such as Mean for averages, Accuracy for accuracy, and CosineSimilarity for cosine similarity. Take tracking the loss as an example: the forward computation gives us the average loss of each batch, but what we want is the average loss over an epoch, so the Mean metric is the right choice:

from tensorflow.keras import metrics # import the metrics module
# create an averaging metric, suitable for loss data
loss_meter = metrics.Mean()

Writing data

New data is written to the metric with the update_state function:

# record the sampled data
loss_meter.update_state(float(loss))

The sampling code above is placed after the computation of each batch; the metric then automatically averages the sampled data.

Reading the statistic

After several samples have been written, the metric's result() function returns the statistic:

# print the tracked average loss
print(step, 'loss:', loss_meter.result())

Clearing the metric

Since the metric accumulates statistics over its entire history, the state must be cleared at an appropriate time, which is done with reset_states(). For example, after each read of the average loss, clear the statistics so the next round can start fresh:

if step % 100 == 0:
    # print the tracked average loss
    print(step, 'loss:', loss_meter.result())
    loss_meter.reset_states() # clear the metric
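
Putting the four steps together, a minimal sketch of how the metric is used in a training loop (train_step is a hypothetical function that runs one batch and returns its loss; train_db is a hypothetical dataset):

loss_meter = metrics.Mean() # 1. create the metric
for step, (x, y) in enumerate(train_db):
    loss = train_step(x, y) # hypothetical per-batch training step
    loss_meter.update_state(float(loss)) # 2. write the sampled loss
    if step % 100 == 0:
        print(step, 'loss:', float(loss_meter.result())) # 3. read the average
        loss_meter.reset_states() # 4. clear for the next round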

Visualization

TensorFlow provides a dedicated visualization tool called TensorBoard. The training program writes monitoring data to the file system, and a web back end monitors the corresponding file directory, which lets users view the monitoring data remotely from a browser.

Using TensorBoard requires the model-side code and the browser to work together.

Model side

On the model side, we need to create a writer for the monitoring data and write the data when needed. First create the monitoring object via tf.summary.create_file_writer and specify the directory the monitoring data is written to:

# create the summary writer; monitoring data will be written to the log_dir directory
log_dir = 'logs' # an assumed example path; the original leaves log_dir unspecified
summary_writer = tf.summary.create_file_writer(log_dir)

Taking the monitoring of loss values and the visualization of images as examples, we show how to write monitoring data. After the forward computation, the loss is scalar data, so we record it with the tf.summary.scalar function, specifying the timestamp step:

with summary_writer.as_default():
    # write the loss at the current timestamp step into the object with ID 'train-loss'
    tf.summary.scalar('train-loss', float(loss), step=step)

Note that TensorBoard distinguishes different categories of monitoring data by their string IDs; for the loss data we therefore name the ID 'train-loss'. Data of other categories must not be written under this ID, to avoid contaminating it.

For image data, write it with the tf.summary.image monitoring function:

with summary_writer.as_default():
    # write the test accuracy
    tf.summary.scalar('test-acc', float(total_correct/total), step=step)
    # visualize the test images, at most 9 of them
    tf.summary.image("val-onebyone-images:", val_images, max_outputs=9, step=step)

After the program runs, the corresponding monitoring data is written to the specified file directory.

Browser side

While the program is running, start the web monitoring back end by running tensorboard --logdir path, where path is the monitored file directory.

Then open a browser and visit http://localhost:6006 (the page can also be accessed remotely via the machine's IP; the port may vary, see the command-line output) to monitor the network's training progress.

At the top of the monitoring page you can switch between pages for different categories of data: for example, the SCALARS page for scalar monitoring, the IMAGES page for image visualization, and the HISTOGRAMS page for tensor histograms.

Besides scalar and image data, TensorBoard also supports viewing the histogram distribution of tensors via tf.summary.histogram and printing text information via tf.summary.text:

with summary_writer.as_default():
    # write the loss at the current timestamp step into the object with ID 'train-loss'
    tf.summary.scalar('train-loss', float(loss), step=step)
    # visualize the histogram distribution of the ground-truth labels
    tf.summary.histogram('y-hist', y, step=step)
    # view text information
    tf.summary.text('loss-text', str(float(loss)), step=step)

Origin blog.csdn.net/wiborgite/article/details/104266533