Deep Learning Practical Chapter (11) -- TensorFlow Learning Road (8)

Scientific knowledge

MindSpore is an all-scenario AI computing framework. Its hallmarks are that it can significantly reduce training time and cost (development state), run with fewer resources at the best energy-efficiency ratio (running state), and adapt to every environment, including device, edge, and cloud (deployment state).

For different operating environments, the MindSpore framework architecture can scale up or down, supporting independent deployment in all scenarios. MindSpore achieves cross-scenario collaboration while protecting user privacy: it coordinates processed gradient and model information that contains no private data, rather than the data itself. Beyond privacy protection, MindSpore also builds model protection into the framework so that models remain safe and trustworthy.

MindSpore makes the development environment friendlier by letting AI algorithms be expressed directly as code, which can significantly reduce model development time. Taking a typical NLP (Natural Language Processing) network as an example, compared with other frameworks MindSpore can reduce the amount of core code by 20%, greatly lowering the barrier to entry and raising overall efficiency by more than 50%.

Through technical innovation in the framework itself and co-optimization with the Ascend processor, MindSpore overcomes the complexity of AI computing and the diversity of compute hardware, achieving an efficient running state and greatly improved computing performance. Besides the Ascend processor, MindSpore also supports other processors such as GPUs and CPUs.

Review

After studying the previous articles, you should now know how to build a common image recognition (classification) deep learning project. However, you may not yet be familiar with every detail of the project code. Today we will summarize the places that were not explained in detail before, and we hope it helps everyone.

1. Summary

The summary is mainly divided into the following sections:

1. Data processing

2. Model building

3. Model training and storage

4. Model testing

Data Processing

The data processing part exists because the original image data (jpg, png; grayscale or multi-channel color) must first be decoded into raw matrix data, and the matrix must then be converted into the tensor type (floating-point data) supported by the deep learning framework. At the same time, following the way neural networks are trained, all the images in the training set are fed into the network in batches of a certain size (16, 32, 64, etc.) for training.

Code snippet 1: read all the images in the training set and save them into two lists, holding the image file paths and the corresponding labels respectively (prepared in advance).
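The original snippet is not reproduced here; the following is a minimal sketch of the same idea. The directory layout is an assumption: one sub-folder per class under a root folder, e.g. `root/<class_name>/*.jpg`.

```python
import os

def list_images_with_labels(root):
    """Return (paths, labels): image file paths and integer class labels,
    one label per class sub-folder, sorted for reproducibility."""
    paths, labels = [], []
    class_names = sorted(d for d in os.listdir(root)
                         if os.path.isdir(os.path.join(root, d)))
    for label, name in enumerate(class_names):
        class_dir = os.path.join(root, name)
        for fname in sorted(os.listdir(class_dir)):
            if fname.lower().endswith(('.jpg', '.png')):
                paths.append(os.path.join(class_dir, fname))
                labels.append(label)
    return paths, labels
```

The two returned lists stay aligned by index, which is exactly what the batching step below relies on.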

Code snippet 2: call the batch-processing function that comes with TensorFlow to split the lists read in the previous step into batches, return the batch data, and pass it directly to the network for training.
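The original snippet used TensorFlow's built-in batching (`tf.train.batch` in TF 1.x); a plain-Python stand-in makes the behavior explicit: split the path/label lists into fixed-size batches, dropping the last incomplete batch as `tf.train.batch` does by default.

```python
def make_batches(paths, labels, batch_size):
    """Yield (path_batch, label_batch) pairs of exactly batch_size items."""
    for start in range(0, len(paths) - batch_size + 1, batch_size):
        yield (paths[start:start + batch_size],
               labels[start:start + batch_size])
```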

Model Building

The model-building part is responsible for processing the incoming data. Each layer of the network must match the dimension changes of its input exactly so that the forward-propagation computation can be completed.

The initial data has dimensions [B, H, W, C]; this is the shape of the data fed into the network. The first dimension B is the batch dimension, i.e., how many samples there are, and the remaining three dimensions describe the true shape of each sample. For example, [32, 224, 224, 3] means 32 images, each with 3 channels and a height and width of 224. Within the network, the height and width of the images must be fixed: the network's dimensions are fixed at construction time, so if the input size keeps changing, a dimension error will occur. Therefore, the data processing stage usually resizes all training-set images to a fixed height and width, and a line of code along these lines usually appears:
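The original line is not shown; in TF 1.x it is typically a call such as `tf.image.resize_images(image, [224, 224])`. A small NumPy nearest-neighbour stand-in illustrates the point, which is simply that every image ends up with one fixed spatial size:

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize: pick the source row/column nearest to
    each output position. A stand-in for the framework's resize call."""
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return img[rows][:, cols]

img = np.zeros((300, 400, 3))          # arbitrary input size
fixed = resize_nearest(img, 224, 224)  # fixed spatial size (224, 224, 3)
```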

The convolutional layer custom function in the network:
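The original function is not reproduced here; as a sketch of what `tf.nn.conv2d(..., padding='SAME')` computes, the naive NumPy version below makes the weight shape [kernel_size, kernel_size, in_channel, out_channel] explicit (assuming an odd kernel size):

```python
import numpy as np

def conv2d_same(x, weights, stride=1):
    """Naive NumPy equivalent of tf.nn.conv2d with 'SAME' padding.
    x: [B, H, W, in_channel]; weights: [k, k, in_channel, out_channel]."""
    b, h, w, _ = x.shape
    k, out_c = weights.shape[0], weights.shape[3]
    pad = k // 2                                   # odd k assumed
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad), (0, 0)))
    out_h, out_w = -(-h // stride), -(-w // stride)  # ceil(h/stride)
    out = np.zeros((b, out_h, out_w, out_c))
    for i in range(out_h):
        for j in range(out_w):
            patch = xp[:, i*stride:i*stride + k, j*stride:j*stride + k, :]
            # contract each [k, k, in_channel] window against all kernels
            out[:, i, j, :] = np.einsum('bklc,klcd->bd', patch, weights)
    return out
```

Note how the last axis of the output equals out_channel: one feature map per kernel.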

In the tf.nn.conv2d() function, the shapes of weights and strides need attention. weights has shape [kernel_size, kernel_size, in_channel, out_channel], and strides is usually given as [1, stride, stride, 1].

kernel_size is the size of the convolution kernel (commonly 3x3, 1x1, or 5x5) and can be chosen according to what works best. in_channel is the channel dimension of the input data, i.e., the output dimension of the previous layer: if the previous layer is the original image, it follows the image's channel count (3); if the previous layer is a convolutional layer, then that layer's output dimension is the current convolution's input dimension. out_channel is the dimension the current convolution should output, which is also the number of convolution kernels: the number of output feature maps equals out_channel. The kernel dimensions therefore have to be set accordingly, otherwise errors will be reported again and again.

Model Training and Storage

The most important parts of this step are how to feed data into the network and how to compute the loss function on the network's output. In TensorFlow (1.x), data is usually fed into the network through the feed mechanism (feed_dict), and the data is only supplied at the actual run. Besides participating in the loss computation, the model's output is also used to compute the current accuracy so that we can observe how well the network is learning (usually the test set is evaluated after each round of training). The output of the network is typically a probability distribution over the number of classes (the distribution sums to 1). For two classes, the network's final output for a sample might be [0.8, 0.2]; the index of the maximum value (0.8) is taken as the predicted value and compared with the true label, counting 1 if they match and 0 otherwise. This gives the number of correct predictions in a batch, and once all batches are computed, the accuracy over the entire test set is obtained.
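The accuracy computation described above can be sketched in a few lines of NumPy (the probabilities and labels here are made-up examples):

```python
import numpy as np

# Per-sample class probabilities for a toy 3-sample, 2-class batch.
probs = np.array([[0.8, 0.2],
                  [0.3, 0.7],
                  [0.6, 0.4]])
labels = np.array([0, 1, 1])            # the true labels

preds = np.argmax(probs, axis=1)        # index of the max probability
correct = (preds == labels)             # elementwise match: 1 or 0
accuracy = correct.mean()               # fraction correct in the batch
```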

The model is saved so that the best version can be reused later for inference and experiments.

Model Testing

The model testing part is relatively simple. It mainly reads the previously saved model structure and the corresponding weight parameters to make actual predictions. At this point only the forward-propagation part runs, i.e., only the network's predicted values are needed, and the data processing must simply stay consistent with that used during training.

Epilogue

This issue summarized the earlier image classification projects. A complete deep learning project requires you to interpret and debug the details yourself, especially the network-construction part: network parameters cannot be set arbitrarily, but must be set with the dimensions of the actual input data in mind.

At the same time, there are some details of the code that were not explained, and we hope you can work them out yourselves. For example, the output of the convolutions is four-dimensional, i.e., [B, H, W, C], while the final output of the network is [B, N], where N is the number of classes: each sample has N outputs, and the index of the maximum value is the predicted label. So how are the four dimensions converted into two? We hope you can find out and learn from it.
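As a hint, the usual answer is a reshape ("flatten") followed by a fully connected layer; the shapes and layer size below are illustrative assumptions:

```python
import numpy as np

# The 4-D conv output [B, H, W, C] is flattened into [B, H*W*C] so a
# fully connected layer can map it to [B, N] class scores.
features = np.random.rand(32, 7, 7, 64)          # e.g. last pooling output
flat = features.reshape(features.shape[0], -1)   # [32, 7*7*64] = [32, 3136]
w_fc = np.random.rand(7 * 7 * 64, 10)            # hypothetical 10-class layer
logits = flat @ w_fc                             # [32, 10]
```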

Happy weekend and see you next time!  

Editor: Layman Yueyi|Review: Layman Xiaoquanquan

Highlights from past issues

Deep Learning Practical Chapter (10) -- TensorFlow Learning Road (7)

Deep Learning Practical Chapter (9) -- TensorFlow Learning Road (6)

Deep Learning Practical Chapter (8) -- TensorFlow Learning Road (5)



Origin blog.csdn.net/xyl666666/article/details/118470691