Conversion of PIL Image and tensor during PyTorch image preprocessing

Foreword: When using the deep learning framework PyTorch to preprocess image data, you may have run into all kinds of problems, as I have. You can usually find similar problems online, but the code environment differs from article to article, so those answers may not solve your problem directly. In that case, you need to understand the general principles behind the problem and work from the exact location of the error (read the whole traceback, not just the final error message; the intermediate call chain matters) in order to solve your own problem quickly and accurately.

1. Principle overview

PIL (Python Imaging Library) is the most basic image processing library in Python. When PyTorch is used to preprocess raw input images into neural network input, three formats come up often: PIL Image, NumPy array, and Tensor. Preprocessing includes, but is not limited to, image cropping, image rotation, and image data normalization. These steps can be packaged together and executed in one call, generally by using transforms.Compose(transforms) to combine multiple transforms, as follows:

from torchvision import transforms 

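transform = transforms.Compose([
    # Resize so that the shorter side is 255 pixels
    transforms.Resize(255),
    # Crop the central 224 x 224 region
    transforms.CenterCrop(224),
    # Randomly flip the image horizontally (default probability 0.5)
    transforms.RandomHorizontalFlip(),
    # Convert the PIL Image to a float tensor with values in [0, 1]
    transforms.ToTensor(),
    # Normalize each channel with the given mean and standard deviation
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])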

Different transforms expect different input formats. For example, Resize() and RandomHorizontalFlip() expect a PIL Image (older torchvision releases accept only PIL Images here, while newer ones also accept tensors), and the normalization operation Normalize() works on image data in tensor format. So, depending on what each operation requires, we need to convert the image to the right format before that operation. With this understanding, the bugs that come up can be handled with ease and precision.

2. Conversion between PIL Image and tensor

2.1 Convert tensor to PIL Image

from torchvision import transforms
PIL_img = transforms.ToPILImage()(tensor_img)

2.2 Convert PIL Image to tensor

This is generally placed just before the normalization operation in transforms.Compose(transforms):

transforms.ToTensor()

2.3 Convert Numpy to PIL Image

from PIL import Image
PIL_img = Image.fromarray(array)
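One thing to watch for here is the array's dtype. For an ordinary RGB image, Image.fromarray expects an H x W x 3 array of uint8 values, so a float array in [0, 1] has to be scaled and cast first. A minimal sketch, assuming random data in place of a real image:

```python
import numpy as np
from PIL import Image

# Float image data in [0, 1), H x W x C (hypothetical example data)
float_array = np.random.rand(64, 48, 3)

# Scale to 0..255 and cast to uint8 before handing it to PIL
uint8_array = (float_array * 255).astype(np.uint8)
pil_img = Image.fromarray(uint8_array)
print(pil_img.mode, pil_img.size)   # RGB (48, 64)

# Going back from PIL Image to NumPy
back = np.asarray(pil_img)
print(back.shape, back.dtype)       # (64, 48, 3) uint8
```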

3. Possible problems

3.1 img should be PIL Image. Got <class 'torch.Tensor'>

TypeError: img should be PIL Image. Got <class 'torch.Tensor'>

For this error, most blog posts and even Stack Overflow answers focus on the order of the operations inside transforms.Compose(transforms). I rearranged the order as they suggested and still had the problem. Only after understanding the principle above and looking at where the error was actually raised did I finally solve it.

In my traceback, the error was raised at the line marked with a red box in the figure below. Unlike most blog posts, I first convert the image to grayscale and only then crop and rotate it, so the transforms.Compose(transforms) pipeline runs after that line of code. Naturally, reordering the pipeline makes no difference. The location of the error shows that the problem has nothing to do with the order of the combined operations, and the final TypeError shows that the image passed to this line is expected to be a PIL Image but is actually a tensor. Converting the tensor to a PIL Image before this line fixes the bug.

[Figure: screenshot of the traceback, with the failing line marked by a red box]

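The solution is to change

import torchvision.transforms as T

transform = T.Grayscale()
img = transform(img)

to

transform = T.Grayscale()
# Convert the tensor to a PIL Image before applying the transform
img = T.ToPILImage()(img)
img = transform(img)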

3.2 tensor should be a torch tensor. Got <class 'PIL.Image.Image'>.

TypeError: tensor should be a torch tensor. Got <class 'PIL.Image.Image'>.

This means a PIL Image was passed to an operation that requires a tensor, so convert the PIL Image to a tensor at the appropriate position before that operation.

The solution is to change

transform = transforms.Compose([
    transforms.Resize(255),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

to

transform = transforms.Compose([
    transforms.Resize(255),
    # Convert the PIL Image to a tensor before normalizing
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])



Origin blog.csdn.net/SL_World/article/details/114149076