Neural Network Learning Notes 70 - Using Google Colab for Deep Learning with PyTorch

Precautions

This article takes the VOC dataset as an example, so classes_path, etc. are not modified during training. If you are training your own dataset, you must pay attention to modifying classes_path and other parameters!

study foreword

Colab is a cloud computing platform provided by Google, and it is very nice. Recently I have not had enough GPUs, so I decided to make use of Colab's free resources. This blog will only explain how to use Colab to train an existing deep learning repository; it will not cover how to access the site, how to register, and so on.

This blog only demonstrates how to use Colab and is mainly intended to familiarize you with its operation. Specific problems need specific analysis: improper operation or version changes may cause individual steps to fail. If an error occurs, it is recommended to search for it, read the code and the error message, and find the cause. Colab is best suited to students who already have some basic experience.

What is Google Colab

Google Colab is a free Jupyter notebook environment provided by Google. It can be used without any setup or environment configuration and runs entirely in the cloud, so it does not affect your local machine.

Google Colab gives researchers a certain amount of free GPU time to write and execute code, all through a browser. You can easily run deep learning frameworks such as TensorFlow and PyTorch on it.

Although Google Colab provides certain free resources, the amount of resources is limited and all Colab runtimes will reset after a period of time. Colab Pro subscribers will still have limited usage, but roughly double the limit that non-subscribers can enjoy. Colab Pro+ subscribers also get increased stability.

Related Links

Colab official website: https://colab.research.google.com/
(requires network access that can reach Google services)
ipynb Github: https://github.com/bubbliiiing/Colab

Training with Colab

This article takes the training of the YoloV4-Tiny-Pytorch version as an example to demonstrate the use of Colab.

1. Upload of datasets and pre-training weights

1. Data set upload

Colab integrates very well with Google Drive, so we first need to upload the dataset to Google Drive. The upload process is simple: prepare the dataset locally first.
Since the repositories I upload all use the VOC dataset format, we need to arrange the data according to the VOC layout. This article directly uses the VOC07+12 dataset as an example.
JPEGImages stores the image files, Annotations stores the label files, and ImageSets stores the txt files that split the data into training, validation, and test sets.
Then package the entire VOCdevkit folder. Note that we do not zip the three subfolders individually; we zip the VOCdevkit folder itself, so that the archive matches the expected data-processing format.
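For reference, the layout before packaging looks roughly like this (a sketch assuming the usual VOCdevkit/VOC2007 structure; match it to whatever your dataset actually contains), and the zip command afterwards is one way to create the archive locally, with the archive name simply matching the one used later in this article.

VOCdevkit
    VOC2007
        Annotations     (label .xml files)
        ImageSets       (train/val/test split txt files)
        JPEGImages      (image files)

zip -r VOC07+12+test.zip VOCdevkit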
After obtaining the zip archive, upload it to Google Drive. I created a new VOC_datasets folder on Google Drive to store the archive.
At this point, the upload of the dataset is complete.

2. Upload of pre-trained weights

Create the folders on Google Drive: first create Models, then create yolov4-tiny-pytorch inside Models, and then create logs and model_data inside yolov4-tiny-pytorch.

model_data holds the pre-trained weight files.
logs holds the weights generated during network training.

Since we are using the YoloV4-Tiny-Pytorch library this time, we upload its pretrained weights to the model_data folder.

2. Open Colab and configure the environment

1. Create a notebook

In this step, we first open the official website of Colab.
Then click File and create a new notebook, which creates a Jupyter notebook.
After the creation is complete, rename the file to something more readable.
Then click Runtime, click Change runtime type, and select GPU under Hardware accelerator. Colab will allocate a machine with a GPU, and the notebook is ready.
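To confirm that the GPU runtime is active, a quick check in a notebook cell might look like the following minimal sketch; torch ships with Colab and nvidia-smi is available on GPU machines.

import torch
print(torch.__version__)          # Colab ships a fairly recent torch version
print(torch.cuda.is_available())  # should print True on a GPU runtime

!nvidia-smi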

2. Simple configuration of the environment

Colab already integrates the PyTorch environment, so there is no need to configure PyTorch separately, although the torch version it ships is relatively new.
Since our dataset is on Google Drive, we also need to mount the drive.

from google.colab import drive
drive.mount('/content/gdrive')

Enter the above code into a notebook cell and run it to mount Google Drive onto the Colab server.
Then click the folder icon in the left column to open the file browser and see the file layout. gdrive is the Google Drive we just mounted. If it does not appear, click refresh on the left.
Open gdrive, which has our dataset.
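To double-check the mount from the notebook itself, you can simply list the dataset folder; the path below assumes the VOC_datasets folder created on Google Drive earlier.

!ls /content/gdrive/MyDrive/VOC_datasets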

3. Download the deep learning library

In this step we download the deep learning repository using the git clone command. After executing the following commands, the yolov4-tiny-pytorch folder appears in the file browser on the left. If it does not, click refresh.

Then we change the working directory to the yolov4-tiny-pytorch folder with the cd command.

!git clone https://github.com/bubbliiiing/yolov4-tiny-pytorch.git
%cd yolov4-tiny-pytorch/


4. Data set copying and decompression

Reading the dataset directly from Google Drive causes a large amount of Drive I/O, which is far slower than using local files, so we need to copy the dataset to the local disk before processing.

We enter the following code to copy and decompress the file. The first command deletes the original empty VOCdevkit folder; then we copy the archive over and decompress it.

Since a zip file is used here, the unzip command is used. If the archive is in another format, the command needs to be adjusted accordingly (please look up the appropriate command). After executing the following commands, you will find the VOC dataset decompressed in the file browser on the left. If not, click refresh.

!rm -rf ./VOCdevkit
!cp /content/gdrive/MyDrive/VOC_datasets/VOC07+12+test.zip ./
!unzip ./VOC07+12+test.zip -d ./
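If you prefer to confirm the extraction from the notebook rather than the file browser, a quick listing is enough as a sanity check.

!ls ./VOCdevkit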


5. Setting the save path

The default save path in the code provided by this article is the logs folder, but Colab is not fully stable and will disconnect after running for a while.
If the weights are saved only in the logs folder under the original working directory, the training is wasted whenever a disconnection happens, and a lot of time is lost.
Google Drive can be linked into the working directory, so even if the connection drops, the weights remain on the drive.
The logs folder was already created on Google Drive earlier in this article; we now link that folder here.

!rm -rf logs
!ln -s /content/gdrive/MyDrive/Models/yolov4-tiny-pytorch/logs logs


3. Start training

1. Processing of annotation files

Open the voc_annotation.py file. Since we are using the VOC dataset directly, the training set, validation set, and test set have already been split, so we set annotation_mode to 2.
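Inside voc_annotation.py the relevant line would then look like the sketch below; the rest of the file is left untouched, and classes_path keeps its default VOC value as noted in the precautions at the top of this article.

annotation_mode = 2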

Then enter the command to process the labels and generate 2007_train.txt and 2007_val.txt.

!python voc_annotation.py


2. Processing of training files

Processing the training file mainly involves two parts:
1. The use of the pre-trained weights.
2. The setting of the save period. This matters because Google Drive storage is limited, and saving weights every epoch would quickly fill it up.

a. Use of pre-training files

First modify model_path to point to the weight file we uploaded to Google Drive. In the file browser on the left, find Models/yolov4-tiny-pytorch/model_data and copy the path of the weight file.
Replace model_path with this path.
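The modified line in train.py would then look something like the sketch below; the weight file name is only an illustration, so copy the actual name of the file you uploaded to model_data.

model_path = '/content/gdrive/MyDrive/Models/yolov4-tiny-pytorch/model_data/yolov4_tiny_weights_coco.pth'  # illustrative file name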

b. Setting the save period

Some repositories have been updated with a save_period parameter that controls how often weights are saved; you can modify save_period directly. In this article we set save_period to 4, i.e. save once every 4 epochs.

Repositories that have not been updated can only save every epoch, so remember to go to Google Drive and delete old weights occasionally.
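In train.py this is a single parameter, roughly as in the sketch below (assuming the repository exposes save_period as described above).

save_period = 4   # write a checkpoint to the linked logs folder every 4 epochs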

3. Start training

At this point in the notebook enter:

!python train.py

to start training.

What should I do about disconnections?

1. Measures against disconnection

It is said that you can reduce the frequency of disconnections by clicking the connect button automatically.
Press F12 in Google Colab, open the browser console, and paste the following code:

function ConnectButton(){
    console.log("Connect pushed");
    document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click();
}
setInterval(ConnectButton, 60000);

2. What if the connection does drop?

There is no way around it; free things inevitably have their drawbacks.

Follow the steps again, then set the pre-trained weights to the trained weight file saved in the logs folder.

In addition, parameters such as Init_epoch also need to be adjusted.
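Concretely, resuming might look roughly like the sketch below in train.py; the checkpoint name and epoch number are placeholders for whatever your last saved state actually was, and the exact parameter spelling should be checked in your copy of train.py.

model_path = 'logs/your_latest_checkpoint.pth'  # placeholder: pick the newest .pth file in the linked logs folder
Init_epoch = 60                                 # placeholder: the epoch at which training was interrupted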

Summary

The most important thing when training on Colab is to keep track of the paths: know where each file lives and from which directory the code is executed, and you can run the program without trouble. Colab does have a disconnection problem, so we need to keep the weight files safe at all times; that is why I save the weights directly to Google Drive, so they are not lost.

Original article: blog.csdn.net/weixin_44791964/article/details/123659637