Using PaddlePaddle to Implement the Boston House Price Prediction Task

Main points:

  • Follow the official example

PaddlePaddle (飞桨): an open-source deep learning platform derived from industrial practice


1 Load the PaddlePaddle libraries

# Load PaddlePaddle, NumPy, and related libraries
import paddle
from paddle.nn import Linear
import paddle.nn.functional as F
import numpy as np
import os
import random

Paddle supports two ways of writing deep learning models: dynamic graph mode, which is easier to debug, and static graph mode, which performs better and is easier to deploy.

  • Dynamic graph mode (imperative programming paradigm, analogous to Python): code is interpreted and executed line by line. Users do not need to define the complete network structure in advance, and each line of network code returns its result as soon as it runs;
  • Static graph mode (declarative programming paradigm, analogous to C++): compile first, then execute. Users define the complete network structure in advance, and the framework compiles and optimizes it before executing it to obtain results.

Paddle 2.0 and later use dynamic graph mode by default and support converting dynamic graphs to static graphs. Developers only need to add a decorator (to_static), and Paddle automatically converts the dynamic graph program into a static graph program, which can then be used for training and for saving a static model for inference deployment.
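
The difference between the two modes can be illustrated without any framework. The sketch below is a toy analogy in plain Python, not Paddle's actual mechanism; the names run_eagerly, build_graph, and run_graph are hypothetical and exist only for this illustration.

```python
# Toy analogy only: this is not how Paddle implements either mode.

def run_eagerly(x):
    """Dynamic-graph style: each line runs immediately and yields a value."""
    h = x * 2          # the intermediate result is available right here
    y = h + 1
    return y


def build_graph():
    """Static-graph style: first describe the whole computation as data."""
    return [("mul", 2), ("add", 1)]


def run_graph(graph, x):
    """Execute a previously defined graph on a concrete input."""
    ops = {"mul": lambda a, b: a * b, "add": lambda a, b: a + b}
    for name, constant in graph:
        x = ops[name](x, constant)
    return x


graph = build_graph()              # nothing has been computed yet
eager_result = run_eagerly(3)
static_result = run_graph(graph, 3)
print(eager_result, static_result)
```

Because the static version holds the whole computation as data before running it, a framework has the chance to optimize or export it, which is the advantage the text attributes to static graphs.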

2.1 Data processing

The data-processing code does not depend on the framework. It is the same code used when building the house price prediction task with plain Python and NumPy.

def load_data():
    # Load the data from the file
    datafile = './work/housing.data'
    data = np.fromfile(datafile, sep=' ', dtype=np.float32)

    # Each record has 14 items: the first 13 are influencing factors,
    # and the 14th is the corresponding median house price
    feature_names = [ 'CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', \
                      'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV' ]
    feature_num = len(feature_names)

    # Reshape the raw data into the shape [N, 14]
    data = data.reshape([data.shape[0] // feature_num, feature_num])

    # Split the data set into a training set and a test set:
    # 80% of the data is used for training and 20% for testing.
    # The two sets must not overlap.
    ratio = 0.8
    offset = int(data.shape[0] * ratio)
    training_data = data[:offset]

    # Compute the maximum, minimum, and mean of the training set
    maximums, minimums, avgs = training_data.max(axis=0), training_data.min(axis=0), \
                                 training_data.sum(axis=0) / training_data.shape[0]

    # Record the normalization parameters so they can be reused at prediction time
    global max_values
    global min_values
    global avg_values
    max_values = maximums
    min_values = minimums
    avg_values = avgs

    # Normalize the data
    for i in range(feature_num):
        data[:, i] = (data[:, i] - avgs[i]) / (maximums[i] - minimums[i])

    # Split the normalized data into the training set and the test set
    training_data = data[:offset]
    test_data = data[offset:]
    return training_data, test_data

Verification code:

# Verify that the data-loading code works correctly
training_data, test_data = load_data()
print(training_data.shape)
print(training_data[1,:])
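
The normalization inside load_data, and the inverse mapping that is needed when a prediction is turned back into a price, can be checked on a single column. The following is a minimal sketch in plain Python; the price values and the helper names normalize and denormalize are illustrative assumptions, not part of the original code.

```python
# Hypothetical column of house prices (MEDV), used only for illustration.
prices = [24.0, 21.6, 34.7, 33.4, 36.2]

maximum = max(prices)
minimum = min(prices)
average = sum(prices) / len(prices)


def normalize(value):
    """Same formula as load_data: (x - avg) / (max - min)."""
    return (value - average) / (maximum - minimum)


def denormalize(value):
    """Inverse mapping: turn a normalized value back into a price."""
    return value * (maximum - minimum) + average


normalized = [normalize(p) for p in prices]
restored = [denormalize(n) for n in normalized]
print(normalized)
print(restored)
```

After normalization the column is centered on zero and spans a range of exactly 1, and applying the inverse recovers the original prices up to floating-point error.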

2.2 Model design

Defining the model amounts to defining the network structure of linear regression. Paddle recommends defining a model network by creating a Python class. The class must inherit from paddle.nn.Layer and define an __init__ function and a forward function. The forward function is the function the framework designates for implementing the forward computation logic, and it runs automatically when the model instance is called. The network layers used in forward must be declared in __init__.

  • Define the __init__ function: declare each network layer in the class's initialization function. The house price prediction task needs only one fully connected layer, and the model structure is the same as in the chapter "Using Python and NumPy to Build a Neural Network Model";
  • Define the forward function: build the neural network structure, implement the forward computation, and return the prediction. In this task, the predicted house price is returned.

class Regressor(paddle.nn.Layer):

    # self refers to the instance of the class
    def __init__(self):
        # Initialize the parameters of the parent class
        super(Regressor, self).__init__()

        # Define one fully connected layer with input dimension 13 and output dimension 1
        self.fc = Linear(in_features=13, out_features=1)

    # Forward computation of the network
    def forward(self, inputs):
        x = self.fc(inputs)
        return x
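
To make concrete what a fully connected layer with 13 inputs and 1 output computes, here is a minimal sketch in plain Python. The function linear_forward and the randomly chosen weights are hypothetical; Paddle creates and initializes the real parameters itself.

```python
import random


def linear_forward(features, weights, bias):
    """Compute y = w1*x1 + ... + w13*x13 + b for one sample."""
    return sum(w * x for w, x in zip(weights, features)) + bias


random.seed(0)
# Hypothetical parameters, chosen only so the example has something to compute.
weights = [random.uniform(-0.1, 0.1) for _ in range(13)]
bias = 0.0
sample = [0.5] * 13                    # one normalized sample with 13 features

prediction = linear_forward(sample, weights, bias)
print(prediction)
```

Training adjusts the 13 weights and the bias so that this weighted sum approaches the normalized house price.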

2.3 Training configuration

The training configuration process, shown in Figure 2, consists of the following steps:

  • Create an instance of the Regressor model defined above and set it to training mode with train();
  • Use the load_data function to load the training data and test data;
  • Set the optimization algorithm and learning rate: the optimizer is stochastic gradient descent (SGD), and the learning rate is 0.01.

# Create an instance of the linear regression model defined above
model = Regressor()
# Switch the model to training mode
model.train()
# Load the data
training_data, test_data = load_data()
# Define the optimization algorithm: stochastic gradient descent (SGD)
# with the learning rate set to 0.01
opt = paddle.optimizer.SGD(learning_rate=0.01, parameters=model.parameters())

Note:

A model instance has two states: the training state, set with .train(), and the prediction state, set with .eval(). Training performs both forward computation and gradient backpropagation, while prediction needs only forward computation. There are two reasons to specify the running state explicitly:

  1. Some advanced operators behave differently in the two states, for example Dropout and BatchNorm (covered in detail in the later "Computer Vision" chapter);
  2. Performance and memory: prediction needs no backward gradients, so it can use less memory and run faster. Note that eval() itself only switches layer behavior; gradient recording is turned off separately, for example with paddle.no_grad().
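
The first reason can be made concrete with a toy layer whose output depends on the running state. This is a minimal sketch in plain Python: ToyDropout is a hypothetical class that implements inverted dropout, one common variant, and it is not Paddle's Dropout implementation.

```python
import random


class ToyDropout:
    """Hypothetical stand-in for a layer that depends on the running state."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = True       # mirrors the state set by train()/eval()

    def train(self):
        self.training = True

    def eval(self):
        self.training = False

    def __call__(self, values):
        if not self.training:
            return list(values)    # prediction: pass the input through unchanged
        keep = 1.0 - self.p
        # Training: drop each value with probability p and rescale the rest.
        return [v / keep if random.random() < keep else 0.0 for v in values]


random.seed(0)
layer = ToyDropout(p=0.5)
inputs = [1.0, 2.0, 3.0, 4.0]

layer.train()
train_out = layer(inputs)
layer.eval()
eval_out = layer(inputs)
print(train_out)
print(eval_out)
```

If such a layer were left in the training state during prediction, the same input could produce different outputs on every call, which is why the state has to be set explicitly.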

2.4 Training process

The training process uses two nested loops:

  • Inner loop: traverses the entire data set once, batch by batch. If the data set has 1000 samples and each batch contains 10 samples, one pass takes 1000/10 = 100 batches, so the inner loop runs 100 times.

  for iter_id, mini_batch in enumerate(mini_batches):

  • Outer loop: sets the number of passes over the data set, given by the parameter EPOCH_NUM.

  for epoch_id in range(EPOCH_NUM):
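
The loop arithmetic above can be checked with a few lines of plain Python. In this sketch, num_batches is a hypothetical helper, and the figure of 404 training rows assumes the standard 506-row housing file split at 80%.

```python
import math


def num_batches(num_samples, batch_size):
    """Inner-loop iterations needed to traverse the data set once."""
    return math.ceil(num_samples / batch_size)


EPOCH_NUM = 10                       # number of outer-loop passes
per_epoch = num_batches(1000, 10)    # the example from the text
total_steps = EPOCH_NUM * per_epoch
print(per_epoch, total_steps)        # 100 1000

# Assumption: 506 rows in the file, so int(506 * 0.8) = 404 training rows.
# 404 is not a multiple of 10, so the last batch holds only 4 samples.
housing_batches = num_batches(404, 10)
print(housing_batches)               # 41
```

Rounding up matters because slicing the data in steps of the batch size still yields a final, smaller batch when the sample count is not an exact multiple.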

Note:

The batch size affects how well the model trains. If the batch is too large, memory consumption and computation time increase while the training effect does not improve significantly (each update moves the parameters only a small step against the gradient, so the direction does not need to be especially accurate). If the batch is too small, the samples in each batch are not statistically representative, and the computed gradient direction may deviate considerably. Since the training set of the house price prediction model is small, the batch size is set to 10.

Each iteration of the inner loop performs the steps shown in Figure 3, and the calculation process is exactly the same as in the model written in plain Python.

  • Data preparation: first convert a batch of data into a NumPy array, and then into a Tensor;
  • Forward calculation: feed a batch of samples into the network and compute the output;
  • Loss calculation: take the forward result and the actual house prices as input, and compute the loss value through the square_error_cost API;
  • Backpropagation: call the backward function to compute the gradient of each layer from back to front, then update the parameters with the configured optimization algorithm (the opt.step function).

EPOCH_NUM = 10   # Number of outer-loop iterations (epochs)
BATCH_SIZE = 10  # Batch size

# Define the outer loop
for epoch_id in range(EPOCH_NUM):
    # Shuffle the training data before each epoch
    np.random.shuffle(training_data)
    # Split the training data into batches of BATCH_SIZE (10) samples
    mini_batches = [training_data[k:k+BATCH_SIZE] for k in range(0, len(training_data), BATCH_SIZE)]
    # Define the inner loop
    for iter_id, mini_batch in enumerate(mini_batches):
        x = np.array(mini_batch[:, :-1]) # Features of the current batch
        y = np.array(mini_batch[:, -1:]) # Labels of the current batch (actual house prices)
        # Convert the NumPy data to Paddle dynamic-graph tensors
        house_features = paddle.to_tensor(x)
        prices = paddle.to_tensor(y)

        # Forward computation
        predicts = model(house_features)

        # Compute the loss
        loss = F.square_error_cost(predicts, label=prices)
        avg_loss = paddle.mean(loss)
        if iter_id % 20 == 0:
            print("epoch: {}, iter: {}, loss is: {}".format(epoch_id, iter_id, avg_loss.numpy()))

        # Backpropagation: compute the gradient of each layer's parameters
        avg_loss.backward()
        # Update the parameters by one step at the configured learning rate
        opt.step()
        # Clear the gradients for the next iteration
        opt.clear_grad()
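
What one pass of loss calculation, backward, and opt.step amounts to can be worked out by hand for a one-feature linear model. The following is a minimal sketch in plain Python, assuming mean squared error (which is what square_error_cost followed by paddle.mean computes) and plain SGD; the function sgd_step and the toy data are hypothetical, and the gradients are derived manually rather than by automatic differentiation.

```python
def sgd_step(w, b, xs, ys, learning_rate=0.01):
    """One parameter update for y = w*x + b with mean squared error."""
    n = len(xs)
    predictions = [w * x + b for x in xs]
    loss = sum((p - y) ** 2 for p, y in zip(predictions, ys)) / n
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(predictions, ys, xs)) / n
    grad_b = sum(2 * (p - y) for p, y in zip(predictions, ys)) / n
    # Move against the gradient, scaled by the learning rate.
    return w - learning_rate * grad_w, b - learning_rate * grad_b, loss


# Hypothetical mini-batch generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0
losses = []
for _ in range(200):
    w, b, loss = sgd_step(w, b, xs, ys, learning_rate=0.05)
    losses.append(loss)
print(w, b, losses[0], losses[-1])
```

The loss shrinks as w and b approach 2 and 1. In the Paddle loop, the gradients are recomputed from scratch for each batch, which is why they are cleared after every step.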

2.5 Save and test the model

2.5.1 Save the model

Use the paddle.save API to save the model's current parameters, model.state_dict(), to a file so that a program can load them later for prediction or validation.

# Save the model parameters to the file LR_model.pdparams
paddle.save(model.state_dict(), 'LR_model.pdparams')
print("Model saved successfully; the parameters are stored in LR_model.pdparams")

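The following snippet saves the model and the parameters separately. It appears to come from a separate static-graph OCR example: the ocr object is not defined in this tutorial, and the calls rely on the legacy paddle.fluid API.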

model_path = 'crnn'
ocr.save_inference_model(model_path)

params_path = 'crnn_params'
paddle.fluid.io.save_params(ocr.exe, params_path, ocr.exe.get_inference_program())

2.5.2 Test the model

Next, select one data sample to test the model's predictions. The testing process matches how the model is used in a real application and can be divided into three steps:

  1. Configure the machine resources for prediction. This example uses the local machine by default, so no code is needed to specify it.
  2. Load the trained parameters into the model instance. This takes two statements: the first reads the parameters from the file, and the second loads them into the model. After loading, switch the model to eval() (evaluation) mode. As mentioned above, a model in the training state must support both forward computation and gradient backpropagation, which makes it heavier, while a model in the evaluation or prediction state only needs forward computation, so it is simpler and performs better.
  3. Feed the features of the sample to be predicted into the model and print the prediction.

Use the load_one_example function to draw one sample from the data set as the test sample. The implementation is as follows.

def load_one_example():
    # Pick one record from the test set loaded above. A fixed index (the 10th
    # record from the end) keeps the result reproducible; to pick at random,
    # use idx = np.random.randint(0, test_data.shape[0]) instead.
    idx = -10
    one_data, label = test_data[idx, :-1], test_data[idx, -1]
    # Reshape the record to [1, 13]
    one_data = one_data.reshape([1, -1])

    return one_data, label

# The argument is the path of the file holding the saved model parameters
model_dict = paddle.load('LR_model.pdparams')
model.load_dict(model_dict)
model.eval()

# Draw one test sample
one_data, label = load_one_example()
# Convert the data to a Paddle tensor
one_data = paddle.to_tensor(one_data)
predict = model(one_data)

# Denormalize the prediction
predict = predict * (max_values[-1] - min_values[-1]) + avg_values[-1]
# Denormalize the label
label = label * (max_values[-1] - min_values[-1]) + avg_values[-1]

print("Inference result is {}, the corresponding label is {}".format(predict.numpy(), label))

Origin blog.csdn.net/March_A/article/details/130047199