Neural Network Learning Notes 70 - Using Google Colab for Deep Learning with PyTorch
Precautions
This article uses the VOC dataset as an example, so parameters such as classes_path are left unmodified during training. If you are training on your own dataset, be sure to modify classes_path and the other relevant parameters!
Foreword
Colab is a cloud computing platform provided by Google, and it is very nice. Recently I have been short of GPUs, so I decided to take advantage of the free resources. This blog only explains how to use Colab to train an existing deep learning repository; it will not cover how to access the site or how to register an account.
This blog is only a demonstration of Colab, mainly intended to familiarize you with its operation. Specific problems need specific analysis: incorrect steps or version changes may cause errors along the way. If an error occurs, it is recommended to search for it, read the code and the error message, and track down the cause. Colab is best suited to students who already have some basic knowledge.
What is Google Colab
Google Colab is a free Jupyter notebook environment provided by Google. It can be used without any setup or environment configuration, runs entirely in the cloud, and does not interfere with your local machine.
Google Colab gives researchers a certain amount of free GPU time to write and execute code, all available through a browser. You can easily run deep learning frameworks such as TensorFlow and PyTorch on it.
Although Google Colab provides certain free resources, the amount of resources is limited and all Colab runtimes will reset after a period of time. Colab Pro subscribers will still have limited usage, but roughly double the limit that non-subscribers can enjoy. Colab Pro+ subscribers also get increased stability.
Related Links
Colab official website: https://colab.research.google.com/
(access may require a VPN or proxy in some regions)
ipynb Github: https://github.com/bubbliiiing/Colab
Training with Colab
This article takes the training of the YoloV4-Tiny-Pytorch version as an example to demonstrate the use of Colab.
1. Upload of datasets and pre-training weights
1. Data set upload
Colab integrates closely with Google Drive, so we first need to upload the dataset to Drive. The upload process is simple: prepare the dataset locally first.
Since the repositories I have uploaded all use the VOC dataset format, we need to arrange the files according to the VOC layout. This article uses the VOC07+12 dataset directly as the example.
JPEGImages stores the image files, Annotations stores the label files, and ImageSets stores the txt files that split the data into training, validation, and test sets.
Then package the entire VOCdevkit folder. Note that you should not archive the three subfolders individually; archive the VOCdevkit folder itself, so that the unpacked data matches the expected processing format.
After obtaining the archive, upload it to Google Drive. I created a new VOC_datasets folder on Google Drive to store it.
At this point, the upload of the dataset is complete.
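The packaging step above can be sketched in Python as well. This is a minimal illustration using the standard library; the folder and file names here are dummies standing in for the real dataset, and `shutil.make_archive` with `base_dir="VOCdevkit"` reproduces the key point: the archive contains VOCdevkit itself, not its subfolders.

```python
import os
import shutil

# Build a dummy VOCdevkit layout for demonstration; replace with your real dataset.
for sub in ("JPEGImages", "Annotations", "ImageSets"):
    os.makedirs(os.path.join("VOCdevkit", "VOC2007", sub), exist_ok=True)
open(os.path.join("VOCdevkit", "VOC2007", "JPEGImages", "000001.jpg"), "wb").close()

# Archive the VOCdevkit folder itself (not its three subfolders), so the zip
# unpacks to VOCdevkit/... as the training code expects.
archive_path = shutil.make_archive("VOC07+12+test", "zip",
                                   root_dir=".", base_dir="VOCdevkit")
print(archive_path)
```

Every entry in the resulting zip starts with the `VOCdevkit/` prefix, which is exactly what the decompression step later relies on.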
2. Upload of pre-trained weights
Create the folders on Google Drive: first create Models, then yolov4-tiny-pytorch inside Models, and then logs and model_data inside yolov4-tiny-pytorch.
model_data holds the pre-trained weight files.
logs holds the weights generated during network training.
Since we are using the YoloV4-Tiny-Pytorch repository this time, we upload its pre-trained weights to the model_data folder.
2. Open Colab and configure the environment
1. Create a notebook
In this step, we first open the official website of Colab.
Then click File and create a notebook; this creates a Jupyter notebook.
After it is created, rename the file to something clearer.
Then click Runtime, click Change runtime type, and select GPU under the hardware accelerator section. Colab will allocate a machine with a GPU, and the notebook is ready.
2. Simple configuration of the environment
Colab already comes with a PyTorch environment, so there is no need to install PyTorch yourself; note, however, that the preinstalled torch version is relatively new.
Since our dataset is on Google Cloud Disk, we also need to mount the cloud disk.
from google.colab import drive
drive.mount('/content/gdrive')
Enter the above code in the notebook and run it to mount the cloud disk to the server.
Then click the folder icon in the left column to open the file browser and see the file layout; gdrive is the Google Drive we just mounted. If it does not appear, click refresh on the left.
Open gdrive, and our dataset is inside.
3. Download the deep learning library
In this step we download the deep learning repository, using the git clone command. After executing the commands below, the yolov4-tiny-pytorch folder appears in the file browser on the left (refresh if it does not).
Then we change the working directory into the yolov4-tiny-pytorch folder with the cd command.
!git clone https://github.com/bubbliiiing/yolov4-tiny-pytorch.git
%cd yolov4-tiny-pytorch/
4. Data set copying and decompression
Reading the dataset directly from Google Drive causes a large amount of Drive data transfer and is far slower than local files, so we need to copy the dataset to the local disk for processing.
Enter the following code to copy and decompress the file. The first command deletes the original empty VOCdevkit folder; then the archive is copied and decompressed.
Since the archive here is a zip file, the unzip command is used; for other archive formats, the command must be adjusted accordingly (search for the appropriate command for your format). After executing the commands below, you will find the VOC dataset decompressed in the file browser on the left (refresh if it does not appear).
!rm -rf ./VOCdevkit
!cp /content/gdrive/MyDrive/VOC_datasets/VOC07+12+test.zip ./
!unzip ./VOC07+12+test.zip -d ./
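If the dataset is not a .zip, an alternative to picking the right shell command is Python's `shutil.unpack_archive`, which infers the format (zip, tar, gztar, etc.) from the file extension. A small self-contained sketch, using a throwaway tar.gz built on the spot instead of the real dataset archive:

```python
import os
import shutil

# Create a small tar.gz archive to demonstrate; in practice this would be
# the dataset archive copied over from Google Drive.
os.makedirs("demo_data", exist_ok=True)
with open("demo_data/sample.txt", "w") as f:
    f.write("hello")
shutil.make_archive("dataset", "gztar", root_dir=".", base_dir="demo_data")

# unpack_archive detects the archive format from the extension,
# so the same call works for .zip, .tar, .tar.gz, and so on.
shutil.unpack_archive("dataset.tar.gz", extract_dir="unpacked")
print(os.listdir("unpacked"))
```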
5. Setting the save path
The default save path in the code provided here is the logs folder, but Colab is unstable and disconnects after running for a while.
If the weights are saved in the logs folder under the original root directory, a disconnection makes the training useless and wastes a lot of time.
Google Drive can be flexibly linked into the root directory, so even if the connection drops, the weights remain on the Drive.
Earlier we created a logs folder on the Drive; now link that folder.
!rm -rf logs
!ln -s /content/gdrive/MyDrive/Models/yolov4-tiny-pytorch/logs logs
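Why the symlink protects the weights can be shown with a tiny local demo: anything written through the link actually lands in the target folder, which on Colab is the Drive-backed logs directory. The folder names below are made up for the demonstration.

```python
import os

# Demo of the symlink trick: 'logs_link' stands in for ./logs,
# 'drive_logs_demo' stands in for the Drive-backed logs folder.
os.makedirs("drive_logs_demo", exist_ok=True)
if os.path.lexists("logs_link"):
    os.remove("logs_link")
os.symlink("drive_logs_demo", "logs_link")

# Writing through the link puts the file in the Drive-backed folder,
# so it survives even if the Colab runtime is reset.
open("logs_link/checkpoint.txt", "w").close()
print(os.path.exists("drive_logs_demo/checkpoint.txt"))
```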
3. Start training
1. Processing of annotation files
Open the voc_annotation.py file. Since we are using the VOC dataset directly, the training, validation, and test sets are already split, so we set annotation_mode to 2.
Then run the command to process the labels and generate 2007_train.txt and 2007_val.txt.
!python voc_annotation.py
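Roughly speaking, this step reads each VOC XML annotation and writes one line per image of the form `image_path x1,y1,x2,y2,class_id ...`. The simplified sketch below illustrates the conversion for a single annotation; the class list, file paths, and function name are illustrative, not the repository's exact code.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of the 20 VOC classes (normally read from classes_path).
classes = ["aeroplane", "bicycle", "bird"]

def convert_annotation(image_id, xml_text):
    """Turn one VOC XML annotation into a '<path> x1,y1,x2,y2,cls ...' line."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.find("name").text
        if name not in classes:
            continue  # skip classes outside the configured list
        b = obj.find("bndbox")
        coords = [int(b.find(k).text) for k in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append(",".join(map(str, coords + [classes.index(name)])))
    return f"VOCdevkit/VOC2007/JPEGImages/{image_id}.jpg " + " ".join(boxes)

sample_xml = """
<annotation>
  <object><name>bird</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""
print(convert_annotation("000001", sample_xml))
# -> VOCdevkit/VOC2007/JPEGImages/000001.jpg 48,240,195,371,2
```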
2. Processing of training files
Processing the training files mainly involves two parts:
1. The use of pre-trained weight files.
2. The setting of the save period. This matters because Drive storage is limited, and saving every epoch will quickly fill it up.
a. Use of pre-training files
First modify model_path to point to the weights file we uploaded to Google Drive. In the file browser on the left, find Models/yolov4-tiny-pytorch/model_data and copy the path of the weights file.
Use it to replace the value of model_path in the training configuration.
b. Save the setting of the cycle
Some repositories have been updated to add a save_period parameter controlling how many epochs pass between saves; you can modify save_period directly. In this article we set save_period to 4, i.e. save every 4 epochs.
Repositories that have not been updated can only save every epoch, so remember to delete old weights from Google Drive occasionally.
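The effect of save_period is simple to sketch. This minimal loop (the function and file names are illustrative, not the repository's code) saves a checkpoint every save_period epochs, plus the final epoch:

```python
def training_loop(total_epochs, save_period, save_fn):
    """Minimal sketch of periodic checkpointing: save every `save_period` epochs."""
    for epoch in range(1, total_epochs + 1):
        # ... one epoch of training would happen here ...
        if epoch % save_period == 0 or epoch == total_epochs:
            save_fn(f"logs/ep{epoch:03d}.pth")

saved = []
training_loop(total_epochs=10, save_period=4, save_fn=saved.append)
print(saved)  # -> ['logs/ep004.pth', 'logs/ep008.pth', 'logs/ep010.pth']
```

With save_period=4 instead of 1, Drive holds a quarter as many checkpoints, which is why this setting matters on limited storage.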
3. Start training
At this point in the notebook enter:
!python train.py
to start training.
What if the connection drops?
1. Anti-disconnection measures
It is said that auto-clicking the connect button can reduce the frequency of disconnections.
Press F12 in Google Colab, open the browser console, and paste the following code:
function ConnectButton() {
    console.log("Connect pushed");
    document.querySelector("#top-toolbar > colab-connect-button")
        .shadowRoot.querySelector("#connect").click();
}
setInterval(ConnectButton, 60000);
2. Has training finished, or did it disconnect?
There is no way around it: free resources come with their drawbacks.
If training was interrupted, follow the steps above again, but set the pre-trained weights to the latest trained weights file in the logs folder.
In addition, parameters such as Init_Epoch also need to be adjusted to match the number of epochs already completed.
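Picking the right checkpoint to resume from can be automated by scanning the logs folder for the highest epoch number. The sketch below assumes checkpoint filenames embed the epoch as `epNNN` (e.g. `ep004-loss3.2.pth`); the helper name and demo folder are made up for illustration.

```python
import os
import re

def latest_checkpoint(log_dir):
    """Return (epoch, path) for the highest-numbered checkpoint in log_dir.

    Assumes filenames embed the epoch as 'epNNN'; adjust the regex if your
    repository uses a different naming convention.
    """
    best_epoch, best_path = -1, None
    for name in os.listdir(log_dir):
        m = re.search(r"ep(\d+)", name)
        if m and int(m.group(1)) > best_epoch:
            best_epoch, best_path = int(m.group(1)), os.path.join(log_dir, name)
    return best_epoch, best_path

# Demonstration with dummy files; on Colab, log_dir would be the linked logs folder.
os.makedirs("logs_demo", exist_ok=True)
for f in ("ep004-loss3.2.pth", "ep008-loss2.1.pth", "ep012-loss1.7.pth"):
    open(os.path.join("logs_demo", f), "w").close()

epoch, path = latest_checkpoint("logs_demo")
print(epoch, path)
```

The returned epoch is also the value to resume Init_Epoch from, so training continues where it stopped.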
Summarize
The most important thing when training with Colab is managing the paths: know which file lives where and which directory your commands execute in, and the program will run without trouble. Colab does have a disconnection problem, however, so we need to keep our files safe at all times; that is why I save the weights directly to the cloud disk, so they will not be lost.