ImageFolder for dataset loading

ImageFolder is a general data loader for datasets organized with one subfolder per class, for example:

root/cat/cat.12484.jpg
root/dog/dog.12496.jpg

Its signature is as follows:
ImageFolder(root, transform=None, target_transform=None, loader=default_loader)
Parameter explanation:

- root: the path from which to load the images
- transform: transformation applied to the PIL Image; the input of transform is the object returned by using loader to read the image
- target_transform: transformation applied to the label
- loader: how to read an image given its path; the default reads it as a PIL Image object in RGB format
The labels are stored in a dictionary built by sorting the folder names, i.e. {class name: class index (starting from 0)}. Generally speaking, it is best to name the folders directly with numbers starting from 0, so that they match the actual labels assigned by ImageFolder. If this naming convention is not followed, it is recommended to inspect the self.class_to_idx attribute to understand the mapping between labels and folder names.
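As a sketch of that mapping, the following uses only the standard library and a hypothetical miniature folder layout (standing in for ./data/dogcat_2) to reproduce how ImageFolder numbers the classes: it sorts the class-folder names and indexes them from 0, which is what self.class_to_idx stores.

```python
import os
import tempfile

# Hypothetical miniature dataset root mimicking the dogcat_2 layout.
root = tempfile.mkdtemp()
for cls in ["dog", "cat"]:
    os.makedirs(os.path.join(root, cls))

# ImageFolder sorts the class-folder names and numbers them from 0;
# the resulting dictionary is what self.class_to_idx contains.
classes = sorted(d for d in os.listdir(root)
                 if os.path.isdir(os.path.join(root, d)))
class_to_idx = {cls: i for i, cls in enumerate(classes)}
print(class_to_idx)  # {'cat': 0, 'dog': 1}
```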
In the example below, the dataset ./data/dogcat_2 is divided into two categories: cat and dog.
import torchvision.datasets as dset
dataset = dset.ImageFolder('./data/dogcat_2')  # no transform yet; first look at the raw image data
print(dataset.classes)       # the classes, determined from the folder names
print(dataset.class_to_idx)  # the classes indexed 0, 1, ... in sorted order
print(dataset.imgs)          # the paths of all images found in the folders, together with their class indices
Output:
['cat', 'dog']
{'cat': 0, 'dog': 1}
[('./data/dogcat_2/cat/cat.12484.jpg', 0), ('./data/dogcat_2/cat/cat.12485.jpg', 0), ('./data/dogcat_2/cat/cat.12486.jpg', 0), ('./data/dogcat_2/cat/cat.12487.jpg', 0), ('./data/dogcat_2/dog/dog.12496.jpg', 1), ('./data/dogcat_2/dog/dog.12497.jpg', 1), ('./data/dogcat_2/dog/dog.12498.jpg', 1), ('./data/dogcat_2/dog/dog.12499.jpg', 1)]
View the obtained image data:
# from the output it can be seen that the data is still a PIL Image object
print(dataset[0])
print(dataset[0][0])
print(dataset[0][1])  # gives class index 0, i.e. cat
Output:
(<PIL.Image.Image image mode=RGB size=497x500 at 0x11D99A9B0>, 0)
<PIL.Image.Image image mode=RGB size=497x500 at 0x11DD24278>
0
Then apply a transform to convert the images into tensors:
data_transform = {
"train": transforms.Compose([transforms.RandomResizedCrop(224),
transforms.RandomHorizontalFlip(),
transforms.ToTensor()
]),
"val": transforms.Compose([transforms.Resize((224, 224)), # must be (224, 224): a single int only resizes the shorter side
transforms.ToTensor()
])}
train_dataset = datasets.ImageFolder(root=os.path.join(data_root, "train"),
transform=data_transform["train"])
print(train_dataset[0])
print(train_dataset[0][0])
print(train_dataset[0][1])  # gives class index 0, i.e. cat
Output:
(tensor([[[0.3765, 0.3765, 0.3765, ..., 0.3765, 0.3765, 0.3765],
[0.3765, 0.3765, 0.3765, ..., 0.3765, 0.3765, 0.3765],
[0.3804, 0.3804, 0.3804, ..., 0.3765, 0.3765, 0.3765],
...,
[0.4941, 0.4941, 0.4902, ..., 0.3804, 0.3765, 0.3765],
[0.5098, 0.5098, 0.5059, ..., 0.3804, 0.3765, 0.3765],
[0.5098, 0.5098, 0.5059, ..., 0.3804, 0.3765, 0.3765]],
[[0.5686, 0.5686, 0.5686, ..., 0.5843, 0.5804, 0.5804],
[0.5686, 0.5686, 0.5686, ..., 0.5843, 0.5804, 0.5804],
[0.5725, 0.5725, 0.5725, ..., 0.5843, 0.5804, 0.5804],
...,
[0.5686, 0.5686, 0.5686, ..., 0.5961, 0.5922, 0.5922],
[0.5686, 0.5686, 0.5686, ..., 0.5961, 0.5922, 0.5922],
[0.5686, 0.5686, 0.5686, ..., 0.5961, 0.5922, 0.5922]],
[[0.4824, 0.4824, 0.4824, ..., 0.4902, 0.4902, 0.4902],
[0.4824, 0.4824, 0.4824, ..., 0.4902, 0.4902, 0.4902],
[0.4863, 0.4863, 0.4863, ..., 0.4902, 0.4902, 0.4902],
...,
[0.3647, 0.3686, 0.3804, ..., 0.4824, 0.4784, 0.4784],
[0.3451, 0.3490, 0.3608, ..., 0.4824, 0.4784, 0.4784],
[0.3451, 0.3490, 0.3608, ..., 0.4824, 0.4784, 0.4784]]]), 0)
(abridged)
image tensor [........]
label tensor [0]
It can be seen that train_dataset[0] is a tuple: one element is a tensor and the other is a scalar (the label).
print(train_dataset[0][0].shape)
print(train_dataset[0][1])
Output:
torch.Size([3, 224, 224])
0
Using DataLoader to generate batches of training data
train_loader = DataLoader(train_dataset, batch_size=1, shuffle=True)  # batch_size=1, matching the output below
for epoch in range(epoch_num):
    for images, labels in train_loader:
        print("labels: ", labels, labels.dtype)
        print("images: ", images, images.dtype)
Output:
labels: tensor([0]) torch.int64
images: tensor([[[[0.2275, 0.2275, 0.2235, ..., 0.2196, 0.2196, 0.2196],
[0.2275, 0.2275, 0.2235, ..., 0.2196, 0.2196, 0.2196],
[0.2235, 0.2235, 0.2235, ..., 0.2157, 0.2157, 0.2157],
...,
[0.2392, 0.2392, 0.2392, ..., 0.2235, 0.2235, 0.2235],
[0.2353, 0.2353, 0.2353, ..., 0.2235, 0.2275, 0.2275],
[0.2353, 0.2353, 0.2353, ..., 0.2235, 0.2275, 0.2275]],
[[0.2392, 0.2392, 0.2353, ..., 0.2275, 0.2275, 0.2275],
[0.2392, 0.2392, 0.2353, ..., 0.2275, 0.2275, 0.2275],
[0.2353, 0.2353, 0.2353, ..., 0.2235, 0.2235, 0.2235],
...,
[0.2275, 0.2275, 0.2314, ..., 0.2196, 0.2196, 0.2196],
[0.2235, 0.2235, 0.2275, ..., 0.2196, 0.2196, 0.2196],
[0.2235, 0.2235, 0.2275, ..., 0.2196, 0.2196, 0.2196]],
[[0.1961, 0.1961, 0.1961, ..., 0.1765, 0.1765, 0.1765],
[0.1961, 0.1961, 0.1961, ..., 0.1765, 0.1765, 0.1765],
[0.1922, 0.1922, 0.1922, ..., 0.1725, 0.1725, 0.1725],
...,
[0.1529, 0.1529, 0.1529, ..., 0.1686, 0.1686, 0.1686],
[0.1490, 0.1490, 0.1490, ..., 0.1686, 0.1686, 0.1686],
[0.1490, 0.1490, 0.1490, ..., 0.1686, 0.1686, 0.1686]]]]) torch.float32
enumerate loading

The enumerate() function combines a traversable data object (such as a list, tuple, or string) into an indexed sequence, yielding both the index and the data at the same time; it is generally used in a for loop. This loop can be used to load one batch of size batch_size at a time, where each batch is a tuple from which the corresponding images and labels are obtained.
Multi-element assignment

When Python detects multiple variables on the left of the equals sign and a list or tuple on the right, it automatically unpacks the list or tuple and assigns its elements to the variables in turn:
l = [1, 2]
a, b = l
So you can write:
for step, data in enumerate(train_bar):
images, labels = data
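As a self-contained sketch of this loop, with plain lists standing in for the image and label tensors (the names img_a, img_b, ... are made up for illustration): each element yielded by enumerate is a (step, data) pair, and data unpacks exactly like the a, b = l example above.

```python
# Hypothetical batches standing in for a DataLoader: each element is an
# (images, labels) tuple, as yielded by the training loop above.
batches = [(["img_a", "img_b"], [0, 1]),
           (["img_c", "img_d"], [1, 0])]

for step, data in enumerate(batches):  # step is the batch index
    images, labels = data              # tuple unpacking, as with a, b = l
    print(step, images, labels)
```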
Finally, nn.CrossEntropyLoss() calculates the classification loss.

nn.CrossEntropyLoss() combines nn.LogSoftmax() and nn.NLLLoss() into a single operation, so it can be applied directly to the raw network outputs instead of using the two separately. Its calculation is

loss(x, class) = -log( exp(x[class]) / Σ_j exp(x[j]) ) = -x[class] + log( Σ_j exp(x[j]) )

where x is the output vector of the network and class is the true label.

For example, if a three-class network outputs [-0.7715, -0.6205, -0.2562] for an input sample whose true label is 0, the loss computed by nn.CrossEntropyLoss() is 0.7715 + log(e^(-0.7715) + e^(-0.6205) + e^(-0.2562)) ≈ 1.3447.
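nn.CrossEntropyLoss() computes -x[class] + log(Σ_j exp(x[j])), so the number for this example can be checked without torch, using only the standard library:

```python
import math

def cross_entropy(logits, target):
    # loss(x, class) = -x[class] + log(sum_j exp(x[j]))
    return -logits[target] + math.log(sum(math.exp(v) for v in logits))

# The three-class example from the text: true label 0.
loss = cross_entropy([-0.7715, -0.6205, -0.2562], 0)
print(round(loss, 4))  # 1.3447
```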