How does Keras build complex models?

Preface

In Keras, the Sequential model is the most common way to build a neural network.
Building a model with Sequential is simple and straightforward, but it can only express a linear stack of layers: it cannot build a model with a non-linear topology (such as a residual network), nor a model with multiple inputs or multiple outputs.
This post shows how to use the Keras functional API to build neural networks with a non-linear topology and with multiple inputs and outputs.

Keras functional API

The Keras functional API treats every layer or model created in Keras as a function: it accepts tensors as arguments and returns the processed tensor.

For example:

dense = layers.Dense(64, activation="relu")

Then dense can be used as a function, so the following call is legal:

x = dense(inputs)

Here inputs is also a tensor.

Keras treats each such layer call as a node in the neural network's computation graph. Each call adds a new node to the current graph, and the tensors passed from one call to the next are the edges that connect the nodes.

Once the computation graph has been built, call keras.Model to specify the inputs and outputs of the model. Wrapping the graph in keras.Model is mainly a matter of convenience: the Model object provides the fit function for feeding data into the model.
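Putting these steps together, a minimal end-to-end sketch might look like the following. The input size of 784, the 10 output units and the names features, logits and tiny_mlp are assumptions chosen only for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Symbolic input tensor: the entry point of the graph (784 features is an assumed size).
inputs = keras.Input(shape=(784,), name="features")

# A layer is a callable object: calling it on a tensor adds a node to the graph.
dense = layers.Dense(64, activation="relu")
x = dense(inputs)

# Layers can also be created and called in a single expression.
outputs = layers.Dense(10, name="logits")(x)

# keras.Model ties the graph together by naming its inputs and outputs.
model = keras.Model(inputs=inputs, outputs=outputs, name="tiny_mlp")
model.summary()
```

The resulting model object supports compile, fit and predict in the same way as a Sequential model.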

Next, we give a few examples to illustrate how to use the Keras functional API to build complex models.

Build a simple residual network

First, build the computation graph:

inputs = keras.Input(shape=(32, 32, 3), name="img")
x = layers.Conv2D(32, 3, activation="relu")(inputs)
x = layers.Conv2D(64, 3, activation="relu")(x)
block_1_output = layers.MaxPooling2D(3)(x)

x = layers.Conv2D(64, 3, activation="relu", padding="same")(block_1_output)
x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
block_2_output = layers.add([x, block_1_output]) 
# block_1_output skips the two convolutional layers directly (skip connection)

x = layers.Conv2D(64, 3, activation="relu", padding="same")(block_2_output)
x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
block_3_output = layers.add([x, block_2_output])
# block_2_output skips the two convolutional layers directly (skip connection)

x = layers.Conv2D(64, 3, activation="relu")(block_3_output)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(10)(x)

Call keras.Model to specify the input and output of the model:

model = keras.Model(inputs, outputs, name="toy_resnet")
model.summary()
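The element-wise layers.add only works if both tensors have the same shape. As a rough check, the spatial sizes can be traced by hand with the usual output-size formulas, assuming stride 1 for the convolutions and a pooling stride equal to the pool size (the Keras defaults for these calls). The helper names conv_out and pool_out are made up for this sketch.

```python
def conv_out(size, kernel, padding="valid"):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def pool_out(size, pool):
    """Spatial size after max pooling with stride == pool size, 'valid' padding."""
    return (size - pool) // pool + 1


size = 32                      # input image is 32x32
size = conv_out(size, 3)       # Conv2D(32, 3)
size = conv_out(size, 3)       # Conv2D(64, 3)
block_1 = pool_out(size, 3)    # MaxPooling2D(3)

# The two convolutions inside each residual block use padding="same",
# so they leave the spatial size unchanged.
block_2 = conv_out(conv_out(block_1, 3, "same"), 3, "same")
block_3 = conv_out(conv_out(block_2, 3, "same"), 3, "same")

final = conv_out(block_3, 3)   # last Conv2D(64, 3) before global pooling

print(block_1, block_2, block_3, final)
```

All three block outputs are 9x9 feature maps, and every layer feeding the additions has 64 channels, which is why x and the skipped tensor can be added directly.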

Computation graph:
[Figure: computation graph of the toy_resnet model]
When training the model, call the fit function to feed in the data.

Multiple input, multiple output model

Suppose we have the following traffic flow classification model:
[Figure: structure of the model]
This model has two inputs: one is the historical information of the traffic flow, and the other is the length information of the traffic flow.
The two inputs go through different neural networks for feature extraction: the length information goes through the flow_length_classification branch, and the historical information goes through the flow_client_net branch. Finally, the two feature vectors are concatenated to make the final classification.
In addition, we also want to use the output of flow_length_classification as an output of the model.

How should we build it?
First, build the computation graph:

        # Assumes Input and Concatenate (keras.layers) and Model (keras.models) are imported
        with self.session.as_default():
            with self.graph.as_default():
                # Two inputs: the packet-length sequence and the client history of a flow
                self.flow_pkt_length = Input(shape=(self.flow_length, 1), name='flow_pkt_length')
                self.flow_client_history = Input(shape=(self.frame_length, self.nb_classes), name='flow_client_history')

                # Each input goes through its own sub-network
                self.flow_length_vector = self.flow_sidechannel_extractor(self.flow_pkt_length)
                self.flow_client_history_vector = self.flow_client_extractor(self.flow_client_history)

                # Merge the two branches and run the final classifier
                vectors = Concatenate()([self.flow_length_vector, self.flow_client_history_vector])
                self.flow_label = self.flow_final_classification(vectors)

Call keras.Model to tie the inputs and outputs together:

                self.model = Model(inputs=[self.flow_pkt_length,self.flow_client_history],
                                   outputs=[self.flow_length_vector,self.flow_label],name='FP-Net')
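The snippets above are fragments of a class and do not show the sub-networks. As a self-contained sketch of the same wiring, the model could be assembled as follows. Only the names of the two inputs and of the sub-networks come from the text; the sizes (FLOW_LENGTH, FRAME_LENGTH, NB_CLASSES), the layers inside each sub-network and the model name fp_net are hypothetical.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical sizes, chosen only so that the example runs.
FLOW_LENGTH, FRAME_LENGTH, NB_CLASSES = 20, 5, 4

# Branch 1: classifies a flow from its packet-length sequence (layers are assumed).
flow_length_net = keras.Sequential(
    [
        layers.Conv1D(8, 3, activation="relu"),
        layers.GlobalMaxPooling1D(),
        layers.Dense(NB_CLASSES, activation="softmax"),
    ],
    name="flow_length_classification",
)

# Branch 2: extracts features from the client history (layers are assumed).
flow_client_net = keras.Sequential(
    [layers.Flatten(), layers.Dense(16, activation="relu")],
    name="flow_client_net",
)

# Final classifier applied to the concatenated features (layers are assumed).
final_net = keras.Sequential(
    [
        layers.Dense(16, activation="relu"),
        layers.Dense(NB_CLASSES, activation="softmax"),
    ],
    name="final_classification",
)

flow_pkt_length = keras.Input(shape=(FLOW_LENGTH, 1), name="flow_pkt_length")
flow_client_history = keras.Input(
    shape=(FRAME_LENGTH, NB_CLASSES), name="flow_client_history"
)

# A Sequential model can be called on a tensor just like a single layer.
length_vector = flow_length_net(flow_pkt_length)
history_vector = flow_client_net(flow_client_history)
merged = layers.Concatenate()([length_vector, history_vector])
flow_label = final_net(merged)

# Two inputs, two outputs: the branch output is exposed next to the final label.
model = keras.Model(
    inputs=[flow_pkt_length, flow_client_history],
    outputs=[length_vector, flow_label],
    name="fp_net",
)
model.summary()
```

Because length_vector is both fed into the concatenation and listed in outputs, the same tensor serves as an intermediate feature and as a model output.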

Because there are two outputs, you need to specify the corresponding loss and metrics for each output when compiling the model:

        with self.session.as_default():
            with self.graph.as_default():
                # Train with the Adamax optimizer
                # (newer Keras versions name the first argument learning_rate instead of lr)
                OPTIMIZER = Adamax(lr=1e-4,
                                   beta_1=0.9,
                                   beta_2=0.99,
                                   epsilon=1e-8,
                                   decay=0)
                self.model.compile(optimizer=OPTIMIZER,
                                   # Evaluation metric: accuracy for both outputs
                                   metrics={
                                       'flow_length_classification': "accuracy",
                                       'final_classification': "accuracy"
                                   },
                                   # Two losses: the first is based on the flow's packet-length sequence,
                                   # the second on the packet-length sequence plus the client history
                                   loss={
                                       'flow_length_classification': "categorical_crossentropy",
                                       'final_classification': "categorical_crossentropy"
                                   },
                                   # Weights of the two losses
                                   loss_weights={
                                       'flow_length_classification': 0.3,
                                       'final_classification': 0.7,
                                   })

Note that loss and metrics are specified with dictionaries whose keys are the names of the model's outputs, which come from the model we built. Here 'flow_length_classification' and 'final_classification' are the names of the two sub-networks, rather than the names of the last layers inside them, because the two sub-networks in my model are built with Sequential.
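The loss_weights entries control how the per-output losses are combined: the quantity being minimised is their weighted sum. A small hand calculation for a single sample illustrates this, using the 0.3 and 0.7 weights from the compile call; the label and the two predicted distributions are invented for the example.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


# One sample with three classes (invented numbers).
y_true = [0.0, 1.0, 0.0]
pred_length = [0.2, 0.5, 0.3]   # output of the length branch
pred_final = [0.1, 0.8, 0.1]    # output of the final classifier

loss_length = categorical_crossentropy(y_true, pred_length)
loss_final = categorical_crossentropy(y_true, pred_final)

# Weighted sum, with the weights passed as loss_weights.
total_loss = 0.3 * loss_length + 0.7 * loss_final

print(loss_length, loss_final, total_loss)
```

With these weights, an error in the final classification moves the total loss more than the same error in the length branch. In training the per-sample values are additionally averaged over the batch.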

When training, we feed the data like this:

self.model.fit(
    {
        'flow_pkt_length': X_train_pkt_length,
        'flow_client_history': X_train_client_history
    },
    {
        'flow_length_classification': y_train,
        'final_classification': y_train
    },
    validation_data=(
        {
            'flow_pkt_length': X_valid_pkt_length,
            'flow_client_history': X_valid_client_history
        },
        {
            'flow_length_classification': y_valid,
            'final_classification': y_valid
        }
    ),
    epochs=50,
    batch_size=128,
    verbose=2
)

The input data is likewise fed with a dictionary keyed by tensor name: 'flow_pkt_length' and 'flow_client_history' are the names of the two keras.Input tensors.
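The naming rules can be tried out on a much smaller model. The sketch below is not the author's network: the layer names in_a, in_b, aux_out and main_out, the sizes and the random data are all assumptions, and the outputs come from named Dense layers rather than Sequential sub-networks. It only shows which names end up as dictionary keys in compile and fit.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Two named inputs (names and sizes are hypothetical).
in_a = keras.Input(shape=(4,), name="in_a")
in_b = keras.Input(shape=(6,), name="in_b")

# Two named outputs: an auxiliary head and the main head.
aux = layers.Dense(3, activation="softmax", name="aux_out")(in_a)
merged = layers.Concatenate()([aux, in_b])
main = layers.Dense(3, activation="softmax", name="main_out")(merged)

model = keras.Model(inputs=[in_a, in_b], outputs=[aux, main])

# The keys accepted by compile and fit are the names of the output layers.
print(model.output_names)

model.compile(
    optimizer="adam",
    loss={
        "aux_out": "categorical_crossentropy",
        "main_out": "categorical_crossentropy",
    },
    loss_weights={"aux_out": 0.3, "main_out": 0.7},
)

# Random stand-in data: 16 samples, 3 classes.
rng = np.random.default_rng(0)
x_a = rng.normal(size=(16, 4)).astype("float32")
x_b = rng.normal(size=(16, 6)).astype("float32")
y = keras.utils.to_categorical(rng.integers(0, 3, size=16), num_classes=3)

# Inputs are keyed by keras.Input names, targets by output layer names.
history = model.fit(
    {"in_a": x_a, "in_b": x_b},
    {"aux_out": y, "main_out": y},
    epochs=1,
    batch_size=8,
    verbose=0,
)
print(sorted(history.history))
```

Keying the data by name keeps the call independent of the order in which inputs and outputs were passed to keras.Model.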

Origin blog.csdn.net/jmh1996/article/details/111089236