1. OpenCV
OpenCV's basic image type is a NumPy array, so torch.from_numpy(img) can be called on it directly to convert the image into a tensor.
- Read:
img=cv2.imread(path)
cv2.imread returns a numpy.ndarray representing the image, in (H,W,C) layout, with channel order BGR, value range [0,255], and dtype=uint8.
import cv2
def read_img_cv(path):
    img_cv=cv2.imread(path)
    return img_cv
- Show:
cv2.imshow(name,img)
import cv2
def show_img_cv(img_cv):
    cv2.imshow("Image", img_cv)
    cv2.waitKey(0) # keep the window open; 0 means wait indefinitely for a key press
- Save:
cv2.imwrite(path, img)
import cv2
def save_img_cv(img_cv,path):
    cv2.imwrite(path, img_cv) # save the image
2. Matplotlib
Matplotlib is a Python plotting library modeled on MATLAB's plotting interface. When plotting with Matplotlib, both tensor and NumPy data types can be passed in.
- Read:
img=mpimg.imread(path)
For a grayscale image, it returns an array of shape (H, W); for an RGB image, an array of shape (H, W, 3) with channel order RGB; for an RGBA image, an array of shape (H, W, 4) with channel order RGBA.
Additionally, PNG images are returned as float arrays in the range 0-1 (dtype=float32); all other formats are returned as integer arrays (dtype=uint8), with a bit depth determined by the specific image.
import matplotlib.image as mpimg
def read_img_mat(path):
    img_mat=mpimg.imread(path)
    return img_mat
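The dtype difference between PNG and other formats is easy to overlook. The sketch below assumes Matplotlib, Pillow, and NumPy are installed, and writes the same synthetic image (with made-up file names) as both PNG and JPEG before reading each back with mpimg.imread.

```python
import os
import tempfile

import matplotlib.image as mpimg
import numpy as np
from PIL import Image

# Synthetic 4x6 RGB image: pure red.
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
rgb[:, :, 0] = 255

with tempfile.TemporaryDirectory() as tmp_dir:
    png_path = os.path.join(tmp_dir, "red.png")  # hypothetical file names
    jpg_path = os.path.join(tmp_dir, "red.jpg")
    Image.fromarray(rgb).save(png_path)
    Image.fromarray(rgb).save(jpg_path)
    img_png = mpimg.imread(png_path)  # float32 array with values in [0, 1]
    img_jpg = mpimg.imread(jpg_path)  # uint8 array with values in [0, 255]

print(img_png.dtype, img_png.shape)
print(img_jpg.dtype, img_jpg.shape)
```

Code that mixes the two formats may want to normalize to one range up front, for example by dividing uint8 data by 255.0.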
- Show:
plt.imshow(img)
plt.show()
- Show a color image
import matplotlib.pyplot as plt
# When displaying in a Jupyter notebook, add the following line (an IPython magic, not valid in a plain .py script)
%matplotlib inline
def show_img_mat(img_mat):
    plt.imshow(img_mat)
    plt.axis('off')
    plt.show()
- Show a grayscale image
By default, Matplotlib renders a single-channel image through a color colormap rather than in gray. To display it as grayscale, pass cmap='gray' to plt.imshow().
def show_img_gray(img_gray):
    plt.imshow(img_gray,cmap='gray')
    plt.axis('off')
    plt.show()
- Show a PIL Image
def show_img_pil(img_pil):
    plt.imshow(img_pil)
    plt.axis('off')
    plt.show()
- Save:
plt.imsave(name,img)
def save_img_pil(img_pil,name):
    plt.imsave(name,img_pil)
3. PIL
PIL is Python's basic library for image processing.
Image modes include, for example:
- 1: binary image
- L: grayscale image
- P: 8-bit color image
- RGB: 24-bit color image (8 bits per channel), such as a jpg image
- RGBA: RGB plus an alpha (opacity) channel, such as a png image
Use img.convert(mode) to convert an image to another mode.
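As a rough illustration of the modes and of img.convert(mode), the sketch below builds a small RGBA image in memory and converts it. It assumes Pillow and NumPy are installed; the image size and color are arbitrary. Note that PIL reports size as (W, H), while the NumPy view of the same image has shape (H, W, C).

```python
import numpy as np
from PIL import Image

# 6 pixels wide, 4 pixels high; PIL takes size as (W, H).
img_rgba = Image.new("RGBA", (6, 4), (255, 0, 0, 128))

img_rgb = img_rgba.convert("RGB")   # drops the alpha channel
img_gray = img_rgba.convert("L")    # single-channel grayscale

for img in (img_rgba, img_rgb, img_gray):
    # The array shape is (H, W, C) for color and (H, W) for grayscale.
    print(img.mode, img.size, np.array(img).shape)
```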
- Read: img=Image.open(path)
returns a PIL.xxxImageFile object (for example, PIL.JpegImagePlugin.JpegImageFile for a jpg).
import PIL
from PIL import Image
def read_img_pil(path):
    img_pil=Image.open(path) # PIL Image type
    return img_pil
- Show:
image.show()
def show_img_pil(img_pil):
    img_pil.show()
- Save:
image.save(path)
def save_img_pil(img_pil,path):
    img_pil.save(path)
4. Differences and conversion among the three
Differences among the three
- OpenCV's data type is a NumPy array, and the channel order is BGR
- Matplotlib's data type is a NumPy array, and the channel order is RGB
- PIL's data type is the PIL.Image class, and the channel order is RGB
Conversion among the three image processing libraries
Conversion between OpenCV and Matplotlib
# cv->mat
def cv2mat(img_cv):
    img_mat=cv2.cvtColor(img_cv,cv2.COLOR_BGR2RGB) # change the channel order from BGR to RGB
    # an equivalent alternative
    # img_mat=img_cv[:,:,::-1]
    return img_mat
# mat->cv
def mat2cv(img_mat): # change the channel order from RGB to BGR
    img_cv=img_mat[:,:,::-1]
    return img_cv
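One detail of the slicing form is worth spelling out: img[:,:,::-1] returns a view with a negative stride, not a new array. The NumPy-only sketch below, using a made-up 2x2 image, shows the channel swap and how np.ascontiguousarray yields an independent, contiguous copy for consumers that need one.

```python
import numpy as np

# Tiny BGR image whose channel values identify the channel: B=10, G=20, R=30.
img_bgr = np.zeros((2, 2, 3), dtype=np.uint8)
img_bgr[..., 0] = 10
img_bgr[..., 1] = 20
img_bgr[..., 2] = 30

img_rgb_view = img_bgr[:, :, ::-1]                  # view, negative stride
img_rgb_copy = np.ascontiguousarray(img_rgb_view)   # independent contiguous copy

print(img_rgb_view[0, 0])                            # channels now ordered R, G, B
print(np.shares_memory(img_bgr, img_rgb_view))       # the view shares the buffer
print(img_rgb_view.flags["C_CONTIGUOUS"], img_rgb_copy.flags["C_CONTIGUOUS"])
```

The copy matters for consumers such as torch.from_numpy, which does not accept arrays with negative strides.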
Conversion between Matplotlib and PIL
np.asarray(img): img->array
Image.fromarray(array): array->img
# mat->PIL
import numpy as np
# Method 1: three-channel conversion
def mat2PIL_RGB(img_mat):
    img_pil=Image.fromarray(img_mat.astype('uint8'))
    # uint8 is an unsigned 8-bit integer; values should already lie in [0,255], since astype does not clip
    # an alternative
    # img_pil= Image.fromarray(np.uint8(img_mat))
    return img_pil
# Method 2: four-channel conversion (RGBA input converted to RGB)
def mat2PIL_RGBA(img_mat):
    img_pil=Image.fromarray(img_mat.astype('uint8')).convert('RGB')
    return img_pil
# Method 3: use the torchvision library function
from torchvision import transforms
def mat2PIL_trans(img_mat):
    trans=transforms.ToPILImage()
    img_pil=trans(img_mat)
    return img_pil
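'''PIL->mat'''
def PIL2mat(img_pil):
    img_mat=np.array(img_pil) # copies the pixel data into a new array
    # For a jpg, the channel order is RGB, shape (H,W,3)
    # For a png with an alpha channel, the channel order is RGBA, shape (H,W,4)
    # In both cases the result is a `numpy.ndarray` with `dtype=uint8` and values in [0,255]
    # np.asarray avoids an extra copy where possible (the result may be read-only)
    # img_mat=np.asarray(img_pil)
    return img_mat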
'''Range conversion'''
# [0,255]->[0,1]
def PIL2mat_norm(img_pil):
    img_mat=np.asarray(img_pil)/255.0
    return img_mat
# [0,1]->[0,255]
def mat_255(img_mat):
    img_mat=(np.maximum(img_mat, 0) / img_mat.max()) * 255.0 # rescale so the maximum maps to 255
    img_mat=np.uint8(img_mat)
    return img_mat
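mat_255 divides by the image maximum, so it stretches the brightest pixel to 255 even when the input does not reach 1.0. Where that is not wanted, clipping is one alternative. The sketch below is NumPy-only and assumes the input is nominally in [0,1]; to_uint8_clipped is a hypothetical helper name, not part of any library.

```python
import numpy as np

def to_uint8_clipped(img_float):
    """Map a float image nominally in [0, 1] to uint8 in [0, 255].

    Out-of-range values are clipped rather than rescaled, so the overall
    brightness of the image is preserved.
    """
    clipped = np.clip(img_float, 0.0, 1.0)
    return np.round(clipped * 255.0).astype(np.uint8)

# Sample values below, inside, and above the nominal range.
img_float = np.array([[-0.2, 0.0, 0.4], [0.999, 1.0, 1.3]])
img_u8 = to_uint8_clipped(img_float)
print(img_u8)
```

Rounding before the cast avoids the downward bias that plain truncation would introduce.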
Conversion between OpenCV and PIL
# cv->PIL
# Method 1: three-channel conversion
def cv2PIL_RGB(img_cv):
    img_rgb = img_cv[:,:,::-1] # OpenCV's channel order is BGR; convert to RGB
    # NumPy array
    img_pil= Image.fromarray(np.uint8(img_rgb))
    return img_pil
# Method 2: four-channel conversion
def cv2PIL_RGBA(img_cv):
    img_rgb = img_cv[:,:,::-1]
    img_pil=Image.fromarray(img_rgb.astype('uint8')).convert('RGB')
    return img_pil
# Method 3: use the torchvision library function
from torchvision import transforms
def cv2PIL_trans(img_cv):
    img_rgb = img_cv[:,:,::-1]
    trans=transforms.ToPILImage()
    img_pil=trans(img_rgb)
    return img_pil
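# PIL->cv
def PIL2cv(img_pil):
    img_ary=np.array(img_pil) # copy; the channel order is RGB, (H,W,C)
    # np.asarray can be used instead to avoid an extra copy
    # img_ary=np.asarray(img_pil)
    img_cv=img_ary[:,:,::-1] # reverse the channel axis: RGB -> BGR
    return img_cv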
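A quick way to sanity-check the two directions is a round trip on a synthetic image. The sketch below assumes Pillow and NumPy only (OpenCV itself is not needed, since its images are plain BGR arrays), and the pixel color is arbitrary.

```python
import numpy as np
from PIL import Image

# 3 pixels wide, 2 pixels high, with R=255, G=128, B=0.
img_pil = Image.new("RGB", (3, 2), (255, 128, 0))

# PIL -> OpenCV-style array: reverse the channel axis and make it contiguous.
img_bgr = np.ascontiguousarray(np.array(img_pil)[:, :, ::-1])

# OpenCV-style array -> PIL: reverse the channels back to RGB.
img_back = Image.fromarray(np.ascontiguousarray(img_bgr[:, :, ::-1]))

print(img_bgr.shape, img_bgr[0, 0])   # (H, W, C) with channels ordered B, G, R
print(img_back.mode, img_back.size)
```

Note that selecting the whole channel axis in reverse (::-1) is what swaps RGB and BGR; indexing with -1 alone would keep only the last channel.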
Conversion between the three formats and Tensor
- Convert NumPy format to Tensor
import torch
def nparray2tensor(npary):
    ts=torch.from_numpy(npary)
    # if a floating-point type is needed
    # ts=torch.from_numpy(npary).float()
    return ts
- Convert PIL and NumPy formats to Tensor
Use transforms.ToTensor() from torchvision to convert a PIL Image or a numpy.ndarray (dtype=uint8) of shape (H,W,C) and range [0,255] into a torch.FloatTensor of shape (C,H,W) and range [0.0,1.0].
from torchvision import transforms
# img_pil: PIL Image
trans=transforms.ToTensor()
tens=trans(img_pil) # (C,H,W) [0.0,1.0]
# tens_hwc=tens.permute(1,2,0) # back to (H,W,C)
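To make the layout and range change concrete, the following sketch reproduces the same transformation by hand with torch.from_numpy. It assumes PyTorch and NumPy are installed and uses a random synthetic image; it is an illustration of the described behavior, not torchvision's actual implementation.

```python
import numpy as np
import torch

# Random uint8 image in (H, W, C) layout with values in [0, 255].
rng = np.random.default_rng(0)
img_hwc = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

tens_u8 = torch.from_numpy(img_hwc)   # shares memory with img_hwc; (H, W, C), uint8

# (H, W, C) -> (C, H, W), then scale to float32 in [0.0, 1.0].
tens_chw = tens_u8.permute(2, 0, 1).float().div(255.0)

# permute (not a two-argument transpose) restores the (H, W, C) layout.
tens_hwc = tens_chw.permute(1, 2, 0)

print(tens_u8.dtype, tuple(tens_u8.shape))
print(tens_chw.dtype, tuple(tens_chw.shape))
print(tuple(tens_hwc.shape))
```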
5. Related transforms in Torchvision
5.1 ToPILImage([mode])
CLASS
torchvision.transforms.ToPILImage(mode=None)
- Function
Convert a tensor or an ndarray to a PIL Image; this does not scale the values. This transform does not support torchscript.
Converts a torch.*Tensor of shape C x H x W or a numpy ndarray of shape H x W x C to a PIL Image while preserving the value range.
- Parameter
mode (PIL.Image mode)
The color space and pixel depth of the input data (optional). When mode is None (default), the following assumptions are made about the input data:
- If the input has 4 channels, the mode is assumed to be RGBA.
- If the input has 3 channels, the mode is assumed to be RGB.
- If the input has 2 channels, the mode is assumed to be LA.
- If the input has 1 channel, the mode is determined by the data type (i.e. int, float, short).
5.2 ToTensor
CLASS
torchvision.transforms.ToTensor
- Function
Convert a PIL Image or an ndarray to a tensor and scale the values accordingly. This transform does not support torchscript.
Converts a PIL Image or a numpy.ndarray (H x W x C) in the range [0,255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0,1.0] if the PIL Image belongs to one of the modes (L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK, 1) or if the numpy.ndarray has dtype = np.uint8. In other cases, the tensor is returned without scaling.
5.3 PILToTensor
CLASS
torchvision.transforms.PILToTensor
- Function
Convert a PIL Image to a tensor of the same type; this does not scale the values. This transform does not support torchscript.
Converts a PIL Image (H x W x C) to a Tensor of shape (C x H x W).