Quick introduction to deep learning with PyTorch

1. Install Anaconda

Download the installer that matches your operating system and run it.

After installation, you can see the following content

2. Use Anaconda to create a development environment

This is the main reason to use Anaconda: it lets you create separate development environments. Each environment can have its own set of packages, and the environments do not interfere with each other.

Open a command prompt window.

The (base) prefix on the prompt shows that we are currently in the initial base environment.

Use commands to create the required development environment

For example, create a development environment named pytorch with Python 3.6:

conda create -n pytorch python=3.6

After creation, enter this development environment:

conda activate pytorch

You can see that the environment has been switched to pytorch.

3. Install PyTorch

Official website download address

One thing to note: check whether your computer has an NVIDIA GPU. If it does not, select CPU in the CUDA row.

Also check the driver version. If it is too old for the CUDA version you selected, download a newer driver for your graphics card from the NVIDIA official website.

Check driver version

Install PyTorch

After selecting your options on the official website, copy the generated command and run it in the pytorch environment:

conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia

Possible errors:

failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.

These messages usually mean that conda could not resolve the requested packages from the current repodata.json index and is retrying with another source. If the solve still fails, one option is to change conda's channel configuration so that package information comes from another mirror. Another option, which is what worked here, is to request a different CUDA build.

Solution:

Change the CUDA version to 11.8:

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

Check whether the installation is successful:

python
import torch
torch.cuda.is_available()

Output:

True
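A slightly fuller check can also be saved as a script. The sketch below is only an illustration: the helper name describe_device is hypothetical, and it falls back to the CPU when no usable GPU is found, so it should also run on a CPU-only install.

```python
import torch


def describe_device():
    """Summarise the PyTorch install and the device it would use (hypothetical helper)."""
    cuda_ok = torch.cuda.is_available()
    info = {
        "torch_version": torch.__version__,
        "cuda_available": cuda_ok,
        # torch.version.cuda is None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        "device": "cuda" if cuda_ok else "cpu",
    }
    if cuda_ok:
        # Name of the first visible GPU.
        info["gpu_name"] = torch.cuda.get_device_name(0)
    return info


info = describe_device()
for key, value in info.items():
    print(f"{key}: {value}")

# Tiny smoke test: create a tensor on the selected device and reduce it.
x = torch.ones(2, 3, device=info["device"])
print(x.sum().item())  # 6.0
```

If cuda_available prints False on a machine that has an NVIDIA GPU, the driver version and the CUDA build chosen at install time are the first things to recheck.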

4. A first look at loading data in PyTorch

First, assume that we have a massive data pool, which is filled with all kinds of data.

Dataset

Its job is to fetch a sample and its corresponding label from the data pool above.

Dataset in PyTorch is an abstract class used to represent datasets. We can define our own dataset by inheriting from the Dataset class. A custom Dataset needs to implement the __len__ and __getitem__ methods.

__len__ method: Returns the size of the dataset, that is, the number of samples it contains.

__getitem__ method: Returns the sample at the specified index. In this method, we read the corresponding data from the dataset according to the index and convert it into a PyTorch tensor.

The advantage of a custom Dataset is that it can flexibly handle various types of data, such as images, text, and audio. We can also perform data augmentation, preprocessing and similar operations inside the Dataset to improve the performance of the model.
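As a minimal sketch of the two methods described above, the following defines a small in-memory dataset. The class name ToyPointsDataset and its data (random 2-D points labelled by the sign of x + y) are made up for illustration. A real dataset would typically read files from disk inside __getitem__.

```python
import torch
from torch.utils.data import Dataset


class ToyPointsDataset(Dataset):
    """Hypothetical dataset: random 2-D points, label 1 if x + y > 0 else 0."""

    def __init__(self, num_samples=100, seed=0):
        # A seeded generator keeps the toy data reproducible.
        generator = torch.Generator().manual_seed(seed)
        self.features = torch.randn(num_samples, 2, generator=generator)
        self.labels = (self.features.sum(dim=1) > 0).long()

    def __len__(self):
        # Number of samples in the dataset.
        return self.features.shape[0]

    def __getitem__(self, index):
        # Return one (sample, label) pair as tensors.
        return self.features[index], self.labels[index]


dataset = ToyPointsDataset(num_samples=100)
print(len(dataset))  # 100
sample, label = dataset[0]
print(sample.shape, label.item())
```

Any preprocessing or augmentation would go inside __getitem__, just before the sample is returned.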

DataLoader

It packages the data and delivers it to the downstream network in the form the network needs.

DataLoader in PyTorch loads data from a given dataset. The dataset can be images in a folder, tabular data in a CSV file, or other forms of data. DataLoader is responsible for loading data in batches. It supports shuffling, parallel loading with multiple worker processes, and batch-level preprocessing through a custom collate function, which helps when training neural network models.
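To make the batching concrete, here is a small sketch. It uses the built-in TensorDataset with made-up values instead of a custom Dataset, purely to keep the example short. The batch size of 4 is an arbitrary choice.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten made-up samples with two features each, and a 0/1 label per sample.
features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10) % 2
dataset = TensorDataset(features, labels)

# shuffle=True reorders samples every epoch; num_workers=0 loads in the main process.
loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)

batch_shapes = []
for batch_features, batch_labels in loader:
    # Each iteration yields one batch; the last batch may be smaller.
    batch_shapes.append((tuple(batch_features.shape), tuple(batch_labels.shape)))
    print(batch_features.shape, batch_labels.shape)
```

With 10 samples and a batch size of 4, the loader yields three batches of sizes 4, 4 and 2. Passing drop_last=True would discard the final, smaller batch.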