Analyzing the PyTorch-based residual neural network (ResNet18) and using the CIFAR10 dataset for training and prediction

1.0, What is a residual neural network

Note: I am still learning, so if there are any mistakes, please feel free to point them out.

A residual neural network is really an extension of the convolutional neural network. We know that a convolutional neural network can be built from many convolutional layers, activation layers, and pooling layers, and in principle there is no limit to how many. But as the number of layers grows, the computation needed for one round of training grows too, and that is not even the worst part: the real problem is that as the network gets deeper, its performance can actually degrade. The reasons are explained below.

We know that the job of a convolution kernel is to extract features, and training a convolutional neural network means optimizing the kernels so that the gap between the predicted value and the true value (the loss) gets smaller. Each kernel can focus on extracting one kind of feature, and a convolutional layer usually contains several kernels. For a simple, roughly linear problem, a single convolutional layer with well-trained kernels may already be enough. The reason we stack multiple convolutional layers is that we hope the later layers will extract higher-level features from the outputs of the earlier ones (it's fine if this paragraph isn't fully clear; I don't understand it perfectly myself and may update it later).

As layers are stacked, each layer's input generally depends on the previous layer's output, and that is where the problem appears: if the previous layer fails to extract useful features, everything after it suffers. For example, suppose a kernel is supposed to detect whether there is a horizontal line in the middle of the image, but it is trained so poorly that it cannot pick the line out; its output is then essentially useless, and no matter what the following layers do with that output, they cannot recover the lost information. To deal with this, the following trick is used:

Suppose there are three convolutional layers A, B, and C. Layer A produces output A1, and layer B takes A1 as its input and produces B1. If B1 turns out to be useless, we should not feed B1 alone into layer C; instead we feed A1 + B1 into layer C. The advantage is that even if B1 contributes little, A1 is still there, so layer C can still extract meaningful features.

In short, a residual network adds each block's input back onto its output (a skip connection), so that later layers never receive less information than earlier ones.
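
To make the idea concrete, here is a minimal sketch (my own toy example, not the official implementation) of a residual connection in PyTorch: the block's output is its own transformation of the input plus the unchanged input.

import torch
import torch.nn as nn

class TinyResidualBlock(nn.Module):
    # Toy residual block: output = relu(F(x) + x), where F is a conv + batch norm that keeps the shape
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.bn(self.conv(x))  # F(x): the block's own transformation (like B1)
        out = out + x                # add the unchanged input (like A1 + B1)
        return self.relu(out)

x = torch.randn(2, 64, 8, 8)
print(TinyResidualBlock(64)(x).shape)  # torch.Size([2, 64, 8, 8]) -> same shape as the input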

2.0, How to write a residual neural network with PyTorch

Here I take the ResNet18 model as an example. It is a residual network architecture provided by PyTorch (torchvision). The 18 refers to the number of layers with trainable weights along the main path: the initial convolution, sixteen 3×3 convolutions inside the residual blocks, and the final fully connected layer. Larger numbers (ResNet34, ResNet50, ...) simply mean deeper networks.

2.1, The official model structure

We can use the following code to get the official model (we will reproduce this model below)

import torchvision
# pretrained=True downloads the officially trained parameters as well. The model itself is just a skeleton;
# the parameters are its "organs", filled in by training. Whether the pretrained parameters fit depends on your
# own dataset; for generic image classification they are a reasonable choice, so we can use True here.
resnet = torchvision.models.resnet18(pretrained=True)
print(resnet)
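
A side note: newer versions of torchvision (roughly 0.13 and later) deprecate the pretrained argument in favor of a weights argument. If the call above prints a deprecation warning for you, the equivalent call looks like this:

import torchvision
# Same pretrained ResNet18, using the newer weights API (assumes torchvision >= 0.13)
resnet = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.DEFAULT)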

Either way, running the code we see this output:

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=512, out_features=1000, bias=True)
)
Process finished with exit code 0

Let's break it down:

1. It consists of an initial convolutional layer, four residual stages (layer1 to layer4, each containing two BasicBlocks), and a final fully connected layer.

A convolutional layer is needed at the beginning because the residual blocks need an initial output like A1 to use as their input.

The fully connected layer takes the flattened one-dimensional features and maps them to the classification output (or whatever output you need); we will come back to it below.

ResNet(
    initial convolutional layer
    (layer1): Sequential()  # residual stage
    (layer2): Sequential()
    (layer3): Sequential()
    (layer4): Sequential()
    fully connected layer
)

2.2, Use the ResNet18 model to process the CIFAR10 training set

What is the CIFAR10 training set? You can directly read my previous blog about convolutional neural networks. There are notes in it: Blog

The following is the official website explanation of the data set: Official website

In short, each image is 32×32 pixels with 3 channels (RGB, one channel per color), and the dataset contains 10 classes.
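
For reference, here is roughly how I would obtain the dataset in step 1 below (a minimal sketch; the root path "./dataset" and the batch size 64 are placeholders I chose, adjust them to your own setup):

import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader

# ToTensor() converts each 32x32 RGB image into a (3, 32, 32) float tensor with values in [0, 1]
transform = transforms.ToTensor()

train_data = torchvision.datasets.CIFAR10(root="./dataset", train=True, transform=transform, download=True)
test_data = torchvision.datasets.CIFAR10(root="./dataset", train=False, transform=transform, download=True)

train_loader = DataLoader(train_data, batch_size=64, shuffle=True)
test_loader = DataLoader(test_data, batch_size=64, shuffle=False)

img, label = train_data[0]
print(len(train_data), len(test_data))  # 50000 10000
print(img.shape, label)                 # torch.Size([3, 32, 32]) and a class index in 0..9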

Code writing steps:

1. Obtain the data set and process the data set into the form required by the model

2. Create a residual neural network model

3. Create a loss function

4. Create an optimizer

5. Training:

   1. Read a batch of images

   2. Feed the images into the model to get predictions

   3. Compare the predictions with the ground-truth labels (compute the loss)

   4. Zero the gradients, backpropagate, and update the parameters with the optimizer

6. Testing:

   1. Read a batch of images

   2. Feed the images into the model to get predictions

   3. Take the class with the highest predicted score as the output and compare it with the ground truth

   4. Collect all the comparisons to get the accuracy for this round of testing

I am not writing out the full training code here, because it is the same as in my convolutional neural network blog except that the model is replaced with the residual network; a minimal sketch of the loop is given below.
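
For reference, this is a minimal sketch of steps 5 and 6 above (not the full code from that blog; it assumes the train_loader and test_loader from the data-loading sketch above and the ResNet18 class defined later in this post, and the learning rate is just a placeholder):

import torch
import torch.nn as nn

model = ResNet18()                                          # the model defined later in this post
loss_fn = nn.CrossEntropyLoss()                             # step 3: loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)    # step 4: optimizer

for epoch in range(10):
    # ---- training (step 5) ----
    model.train()
    for imgs, targets in train_loader:        # 1. read a batch of images
        outputs = model(imgs)                 # 2. feed the images into the model
        loss = loss_fn(outputs, targets)      # 3. compare predictions with the ground truth
        optimizer.zero_grad()                 # 4. zero the gradients,
        loss.backward()                       #    backpropagate,
        optimizer.step()                      #    and update the parameters

    # ---- testing (step 6) ----
    model.eval()
    correct = 0
    with torch.no_grad():
        for imgs, targets in test_loader:
            outputs = model(imgs)
            correct += (outputs.argmax(dim=1) == targets).sum().item()  # highest-score class vs. ground truth
    print(f"epoch {epoch}, test accuracy: {correct / len(test_loader.dataset)}")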

Let's look at how the model of the residual neural network is written:

1. Write the residual block: you can think of a residual block as a small model of its own.

It is split into definition and execution order: in the __init__() method we define everything we need (convolutions, pooling, activation, normalization, etc.), and in forward(self, x) we describe the order in which convolution, pooling, activation, and normalization are applied.

2. Write the model of the residual neural network

It works almost the same way as the residual block, so I will not repeat the explanation.

code:

# ==================== Define the residual block ====================
import torch
import torch.nn as nn

'''
stride: the sliding step of the convolution kernel. With stride 1 the output feature map has the same size as the input;
with stride 2 the output feature map is halved and the number of channels is doubled (so the overall amount of information stays roughly the same).
downsample=None: the default None means the shortcut can be added directly, with no reshaping before the A1 + B1 addition; this is the case in the first stage.
'''
class ResidualBlock(nn.Module):
    def __init__(self, conv, bn, planes, stride=1, downsample=None):
        super(ResidualBlock, self).__init__()
        self.conv1 = conv  # a convolutional layer, passed in from outside as a parameter
        self.bn1 = bn  # batch normalization: standardizes the values in each channel, which stabilizes training and helps against overfitting (its symptom: small training loss but large test loss)
        self.relu = nn.ReLU(inplace=True)  # activation function; see my convolutional neural network blog for details
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample  # passed in from outside; decides whether the shortcut needs reshaping before the A1 + B1 addition
        self.stride = stride  # convolution stride

    def forward(self, x):
        identity = x  # store the input (A1) first, so it can be added back to the output later
        out = self.conv1(x)  # the next steps, in order: convolution, normalization, activation, convolution, normalization
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)

        # Decide whether the shortcut needs adjusting before the A1 + B1 addition. When the first block of a stage
        # uses stride 2, the feature map is halved and the number of channels is doubled (e.g. 64 -> 128),
        # so the stored input x no longer matches out in shape. The downsample branch (a 1x1 convolution plus
        # batch norm) brings identity to the same shape as out so the two can be added.
        if self.downsample is not None:
            identity = self.downsample(x)

        out = out + identity  # equivalent to A1 + B1
        out = self.relu(out)  # activation

        return out  # the next residual block therefore receives A1 + B1 as its input
    
    
# ==================== Define the ResNet18 model ====================
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

'''
bias=False: no bias term is used here
num_classes=10: number of output classes
nn.ReLU(inplace=True): inplace=True computes directly on the existing tensor instead of allocating a new one, which saves some memory and is slightly faster.
'''
# Define the ResNet18 model
class ResNet18(nn.Module):
    def __init__(self, num_classes=10):
        super(ResNet18, self).__init__()
        # The method parameters are explained in detail in my convolutional neural network blog; here I only explain why these values are used.
        # The images have 3 channels and the output has 64 channels; the kernel is 7x7 with stride 2, and padding=3 adds a 3-pixel border around the image so edge features are captured better.
        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
        
        self.bn1 = nn.BatchNorm2d(64)  # normalization: the shape stays the same, only the values change
        self.relu = nn.ReLU(inplace=True)
        # pooling
        self.pool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
        # planes: number of output channels; blocks: number of residual blocks in this stage (2 everywhere in ResNet18)
        self.layer1 = self._make_layer(planes=64, blocks=2)
        self.layer2 = self._make_layer(planes=128, blocks=2, stride=2)
        self.layer3 = self._make_layer(planes=256, blocks=2, stride=2)
        self.layer4 = self._make_layer(planes=512, blocks=2, stride=2)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512, num_classes)

    def _make_layer(self, planes, blocks, stride=1):
        downsample = None  # initialized to None
        # layer1 keeps the input shape, so its shortcut needs no adjustment; layer2-4 halve the feature map
        # and double the channels, so the first block of each of those stages needs a downsample branch on its shortcut.
        if stride != 1 or planes != 64:
            # This prepares the shortcut so that A1 can be added to B1: after this 1x1 convolution and batch norm, A1 has the same shape as B1.
            downsample = nn.Sequential(
                nn.Conv2d(int(planes/stride), planes, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes),
            )
        # layers collects the residual blocks that make up one stage (what self.layer1(x) etc. will run);
        # ResNet18 uses two blocks per stage, and the blocks argument keeps this extensible for deeper variants.
        layers = []
        '''
        Why the first block's convolution is written as nn.Conv2d(int(planes/stride), planes, kernel_size=3, stride=stride, ...):
        normally, with stride=1, the input channel count equals the output channel count, but when stride=2 the channel count
        doubles, so the input channel count is planes divided by stride. The division gives a float and the parameter must be
        an int, hence int(). This matches the calls made above:
            self.layer1 = self._make_layer(planes=64, blocks=2)
            self.layer2 = self._make_layer(planes=128, blocks=2, stride=2)
        planes is defined as the output channel count, so the input count has to be derived from it. You can of course write this differently.
        '''
        layers.append(ResidualBlock(nn.Conv2d(int(planes/stride), planes, kernel_size=3, stride=stride, padding=1, bias=False), nn.BatchNorm2d(planes), planes, 1, downsample))
        for i in range(1, blocks):
            layers.append(ResidualBlock(nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False),
                                        nn.BatchNorm2d(planes), planes))

        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.pool(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        # Global average pooling: averages all elements of each feature map. In ResNet18 the output size is (batch_size, 512, 1, 1), used for the final classification.
        x = self.avgpool(x)
        # Flatten the tensor from dimension 1 onward, turning (batch_size, 512, 1, 1) into a (batch_size, 512) vector.
        x = torch.flatten(x, 1)
        # Corresponds to self.fc = nn.Linear(512, num_classes): maps the 512-dimensional vector down to 10 scores, because CIFAR10 is a 10-class problem.
        x = self.fc(x)

        return x
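
As a quick sanity check (my own addition, not in the original post), we can instantiate the model and push a dummy CIFAR10-sized batch through it to confirm that the output has one score per class:

import torch

model = ResNet18(num_classes=10)
dummy = torch.randn(4, 3, 32, 32)  # a fake batch of 4 CIFAR10-sized images
out = model(dummy)
print(out.shape)  # torch.Size([4, 10]) -> one score per class for each image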

Then, swap this model in place of the model from my previous convolutional neural network blog and run training and testing.
Result:
You can see that after only 700 training iterations, the test accuracy already reaches about 50%.

PyDev console: starting.
Python 3.9.16 (main, Mar  8 2023, 10:39:24) [MSC v.1916 64 bit (AMD64)] on win32
runfile('C:\\Users\\11606\\PycharmProjects\\pythonProject\\src\\train.py', wdir='C:\\Users\\11606\\PycharmProjects\\pythonProject\\src')
Files already downloaded and verified
Files already downloaded and verified
Training set size: 50000
Test set size: 10000
ResNet18(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (pool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): ResidualBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): ResidualBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer2): Sequential(
    (0): ResidualBlock(
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): ResidualBlock(
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer3): Sequential(
    (0): ResidualBlock(
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): ResidualBlock(
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer4): Sequential(
    (0): ResidualBlock(
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): ResidualBlock(
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=512, out_features=10, bias=True)
)
---------- Epoch 0 training starts
Iterations: 100, loss: 1.7689638137817383
Iterations: 200, loss: 1.4394694566726685
Iterations: 300, loss: 1.338126301765442
Iterations: 400, loss: 1.412974238395691
Iterations: 500, loss: 1.241265892982483
Iterations: 600, loss: 1.250759482383728
Iterations: 700, loss: 1.3602595329284668
Total loss on the test set: 226.82027339935303
Accuracy on the test set: 0.505899965763092
Model saved
---------- Epoch 1 training starts
Iterations: 800, loss: 1.1243025064468384

Reprinted from: blog.csdn.net/qq_43483251/article/details/130263126