Deep Learning - Natural Language Processing (NLP) - Third-Party Libraries (Toolkits): torchtext [PyTorch's own text-related API], torchvision [PyTorch's own image-related API]

The datasets that come with PyTorch are provided through two higher-level packages: torchvision and torchtext.

Among them:

  1. torchvision provides APIs and datasets related to image data processing
    • Data location: torchvision.datasets, for example torchvision.datasets.MNIST (handwritten digit image data)
  2. torchtext provides APIs and datasets related to text data processing
    • Data location: torchtext.datasets, for example torchtext.datasets.IMDB (movie review text data); a short import sketch follows this list
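As a quick orientation (a minimal sketch, not from the original text; torchtext is a separate install and its dataset API differs between versions), both namespaces can be imported and inspected directly:

import torchvision.datasets
import torchtext.datasets  # requires torchtext to be installed; its API changes across versions

print(torchvision.datasets.MNIST)  # image dataset class
print(torchtext.datasets.IMDB)     # text dataset (class or builder, depending on the torchtext version)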

Let's take the MNIST handwritten digits as an example to see how PyTorch loads its own datasets.

The method of use is the same as before:

  1. Prepare the Dataset instance
  2. Hand the dataset to the DataLoader to shuffle it and form batches (a minimal sketch follows below)
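A minimal sketch of these two steps, assuming MNIST as the dataset; the batch size and the ToTensor transform are arbitrary illustrative choices (the transform is needed so the PIL images can be stacked into batches):

import torchvision
from torch.utils.data import DataLoader

# step 1: prepare the Dataset instance
dataset = torchvision.datasets.MNIST(root="./data", train=True, download=True,
                                     transform=torchvision.transforms.ToTensor())

# step 2: hand the dataset to the DataLoader to shuffle it and form batches
data_loader = DataLoader(dataset, batch_size=64, shuffle=True)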

4.1 torchvision.datasets

The dataset classes in torchvision.datasets (for example torchvision.datasets.MNIST) all inherit from Dataset.

This means that instantiating torchvision.datasets.MNIST directly yields a Dataset instance.
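This can be verified with a quick check (a small sketch, not from the original text):

import torchvision
from torch.utils.data import Dataset

print(issubclass(torchvision.datasets.MNIST, Dataset))  # True: MNIST inherits from Dataset

dataset = torchvision.datasets.MNIST(root="./data", train=True, download=True)
print(isinstance(dataset, Dataset))  # True: instantiating it yields a Dataset instance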

But the parameters of the MNIST API need some attention:

torchvision.datasets.MNIST(root='/files/', train=True, download=True, transform=None)

  1. root: where the data is stored
  2. train: bool, whether to load the training set (True) or the test set (False)
  3. download: bool, whether to download the data into the root directory
  4. transform: a function that preprocesses the image, e.g. converts it to a tensor (a sketch follows below)
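As a sketch of how transform is typically used (the Compose pipeline and the (0.1307,), (0.3081,) mean/std values below are common choices for MNIST, not something stated in this text):

import torchvision
from torchvision import transforms

# convert the PIL image to a [0, 1] float tensor, then normalize it
mnist_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),  # commonly used MNIST mean/std (assumed values)
])

dataset = torchvision.datasets.MNIST(root="./data", train=True, download=True,
                                     transform=mnist_transform)
print(dataset[0][0].shape)  # torch.Size([1, 28, 28])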

4.2 Introduction to the MNIST data set

The original address of the dataset: http://yann.lecun.com/exdb/mnist/

MNIST is a free image recognition dataset provided by Yann LeCun et al. It includes 60,000 training samples and 10,000 test samples. The image sizes have been normalized: they are all black-and-white images of size 28x28.
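These figures are easy to check once the data is downloaded (a small sketch; with no transform the samples stay PIL images, whose .size attribute is (width, height)):

import torchvision

train_set = torchvision.datasets.MNIST(root="./data", train=True, download=True)
test_set = torchvision.datasets.MNIST(root="./data", train=False, download=True)

print(len(train_set))        # 60000 training samples
print(len(test_set))         # 10000 test samples
print(train_set[0][0].size)  # (28, 28)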

Run the following code to download the data and observe its type:

import torchvision

# instantiate the training split of MNIST; transform=None keeps the raw PIL images
dataset = torchvision.datasets.MNIST(root="./data", train=True, download=True, transform=None)
print(dataset[0])

The downloaded data ends up under the root directory given above:
[Figure: the downloaded MNIST data files]

The output of the code is as follows:

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Processing...
Done!
(<PIL.Image.Image image mode=L size=28x28 at 0x18D303B9C18>, tensor(5))

Indexing the dataset returns a pair of values, which we can guess are the image data and its target value.

The 0th element of the pair is a PIL Image; calling its show() method opens it, and it turns out to be the handwritten digit 5.

import torchvision

dataset = torchvision.datasets.MNIST(root="./data", train=True, download=True, transform=None)
print(dataset[0])

img = dataset[0][0]  # the PIL image of the first sample
img.show()  # open the picture

The picture looks like this:

[Figure: the handwritten digit 5 rendered from dataset[0][0]]

From the above we can see that the return value is (image, target); the same conclusion can be reached by reading the source code.
