Image reading and basic attribute operations
- RGB: a color image has three channels (red, green, blue).
Reading data: images
- cv2.imread('filepath', flags): reads an image from disk.
- filepath is the path to the image file
- flags controls how the image is read
- cv2.IMREAD_COLOR: color image, ignores the alpha channel (the default; integer value 1)
- cv2.IMREAD_GRAYSCALE: grayscale image (integer value 0)
- cv2.IMREAD_UNCHANGED: reads the complete image, including the alpha channel (integer value -1)
- Alpha channel: more about it later
Code:
import cv2  # OpenCV reads images in BGR order (not RGB)
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
img = cv2.imread('cat.jpg')  # read the image data
img
array([[[184, 196, 200],
[184, 196, 200],
[184, 196, 200],
…,
[201, 203, 204],
[201, 203, 204],
[203, 202, 204]],
[[184, 196, 200],
[184, 196, 200],
[184, 196, 200],
…,
[201, 203, 204],
[201, 203, 204],
[204, 203, 205]],
[[185, 196, 200],
[185, 196, 200],
[185, 196, 200],
…,
[202, 204, 205],
[202, 204, 205],
[204, 203, 205]],
…,
[[203, 205, 205],
[203, 205, 205],
[203, 205, 205],
…,
[226, 225, 227],
[226, 225, 227],
[225, 224, 226]],
[[204, 206, 206],
[204, 206, 206],
[204, 206, 206],
…,
[226, 225, 227],
[226, 225, 227],
[225, 224, 226]],
[[201, 203, 203],
[202, 204, 204],
[202, 204, 204],
…,
[228, 226, 226],
[228, 226, 226],
[229, 226, 228]]], dtype=uint8)
Notice
- dtype is uint8: each value is an unsigned 8-bit integer (1 byte), so values range from 0 to 255.
- The number of dimensions equals the depth of the bracket nesting; here there are 3. img is indexed as [h, w, c], i.e. (height, width, channels): the first dimension is the height, the second is the width, and the third (the innermost brackets) holds the channel values, in BGR order.
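The points above can be checked on a small synthetic array. This sketch uses a made-up 2x3 image instead of cat.jpg, so the sizes are illustrative only.

```python
import numpy as np

# Synthetic stand-in for an image loaded by cv2.imread: 2 rows, 3 columns, 3 channels.
img = np.zeros((2, 3, 3), dtype=np.uint8)
img[0, 1] = (184, 196, 200)   # pixel at row 0, column 1, in B, G, R order

height, width, channels = img.shape
print(img.ndim)                  # 3
print(height, width, channels)   # 2 3 3
print(img.dtype.itemsize)        # 1  (one byte, i.e. 8 bits, per value)
print(img[0, 1, 0])              # 184  (the blue component)
```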
Image display and wait time setting
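- Main function: cv2.imshow()
Code example:
# Display the image; several windows can be created
cv2.imshow('image', img)  # the first argument is the window name, the second is the image
# cv2.imshow('image2', img)  # create another window
# Wait time in milliseconds; 0 waits indefinitely until any key is pressed
cv2.waitKey(10000)
# Destroy all the windows we created
cv2.destroyAllWindows()
# cv2.destroyWindow('image')  # destroy one specific window, identified by its name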
- Note that the image is not rescaled when you enlarge the window. To get a resizable window, use the function below.
- Function for creating a resizable window: cv2.namedWindow()
- The default flag is cv2.WINDOW_AUTOSIZE, which fits the window to the image and does not allow resizing
- With the flag cv2.WINDOW_NORMAL, the window can be resized
cv2.namedWindow('image',cv2.WINDOW_NORMAL)
cv2.imshow('image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
To keep the code tidy, wrap these three calls in a helper function.
# Helper that shows an image, waits for a key press, then closes the windows
def cv_show(name, img):
    cv2.imshow(name, img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
img.shape
(225, 225, 3)
Grayscale
As mentioned in the cv2.imread flags above, cv2.IMREAD_GRAYSCALE reads the image as grayscale. Since there is no channel dimension, the array is two-dimensional.
# Read the image as grayscale
img = cv2.imread('cat.jpg', cv2.IMREAD_GRAYSCALE)
img
array([[196, 196, 196, …, 203, 203, 203],
[196, 196, 196, …, 203, 203, 204],
[196, 196, 196, …, 204, 204, 204],
…,
[205, 205, 205, …, 226, 226, 225],
[206, 206, 206, …, 226, 226, 225],
[203, 204, 204, …, 226, 226, 227]], dtype=uint8)
img.shape
(225, 225)
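The gray values printed above are consistent with the common luma weighting 0.299 R + 0.587 G + 0.114 B applied to the BGR values shown earlier. The sketch below checks this on two pixels copied from the first row of the color output. It is an illustration only: the exact conversion used when decoding with IMREAD_GRAYSCALE can depend on the image codec.

```python
import numpy as np

# Two BGR pixels copied from the first row of the color array printed earlier.
bgr = np.array([[[184, 196, 200], [201, 203, 204]]], dtype=np.uint8)

# Luma weights listed in B, G, R order to match the channel layout.
weights = np.array([0.114, 0.587, 0.299])

# Weighted sum over the channel axis, rounded to the nearest integer.
gray = np.rint(bgr.astype(np.float64) @ weights).astype(np.uint8)
print(gray)        # [[196 203]]
print(gray.shape)  # (1, 2)
```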
# Display the image; several windows can be created
cv2.imshow('image', img)
# Wait time in milliseconds; 0 waits indefinitely until any key is pressed
cv2.waitKey(10000)
cv2.destroyAllWindows()
Other functions
Save
- cv2.imwrite writes the image to disk and returns True on success; a new file, mycat.png, appears in the working directory
cv2.imwrite("mycat.png",img)
True
Format
type(img)  # the image is a NumPy ndarray
numpy.ndarray
Number of pixels
img.size  # size: number of array elements (equal to the pixel count for this grayscale image)
50625
Data type
img.dtype  # data type
dtype('uint8')
Using Matplotlib
img = cv2.imread('cat.jpg')
plt.imshow(img,cmap = 'gray',interpolation='bicubic')
plt.xticks([]),plt.yticks([])
plt.show()
- The colors look wrong because cv2.imread() returns the image in BGR order (values 0 to 255), while Matplotlib expects RGB. Reordering the channels to RGB fixes it:
img_2 = img[:,:,[2,1,0]]
plt.imshow(img_2)
<matplotlib.image.AxesImage at 0x2608f2faef0>
And just like that, the colors display correctly.
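As a small aside, the index list [2,1,0] and a reversed slice give the same channel order. This sketch uses a made-up two-pixel image to show the equivalence. cv2.cvtColor(img, cv2.COLOR_BGR2RGB) is another common way to do the same conversion.

```python
import numpy as np

# Tiny synthetic BGR image: one pure-blue pixel and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

rgb_fancy = bgr[:, :, [2, 1, 0]]   # index list, as in the text (makes a copy)
rgb_slice = bgr[:, :, ::-1]        # reversed slice (a view, no copy)

print(np.array_equal(rgb_fancy, rgb_slice))  # True
print(rgb_slice[0, 0])                       # [  0   0 255]  blue is now last
```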
Data Reading - Video
A video is a sequence of images, which are called frames.
- cv2.VideoCapture can capture from a camera; an integer index such as 0 or 1 selects the device
Run it, and don't be startled when your own face fills the window. Press q to quit.
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:  # stop if no frame could be read
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('frame', gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
- Saving video
- The codec is specified with a FourCC code
- Capture video from the webcam, flip each frame vertically, and save it.
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
# Define the codec and create the VideoWriter object
# (cv2.cv.FOURCC is the OpenCV 2.x name; OpenCV 3 and later use cv2.VideoWriter_fourcc)
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, 20.0, (640, 480))
while cap.isOpened():
    ret, frame = cap.read()
    if ret:
        frame = cv2.flip(frame, 0)  # flip code 0 flips the frame vertically
        # write the flipped frame
        out.write(frame)
        cv2.imshow('frame', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        break
# Release everything once the job is finished
cap.release()
out.release()
cv2.destroyAllWindows()
- For a video file, pass the file path instead of a device index
vc = cv2.VideoCapture('test.mp4')
# Check whether the video opened correctly
if vc.isOpened():
    is_open, frame = vc.read()  # read the first frame
else:
    is_open = False
while is_open:
    ret, frame = vc.read()
    if frame is None:
        break
    if ret:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # convert to grayscale
        cv2.imshow('result', gray)
        if cv2.waitKey(1) & 0xFF == 27:  # 27 is Esc; a larger waitKey delay slows playback
            break
vc.release()
cv2.destroyAllWindows()
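Since the waitKey delay sets the playback speed, one option is to derive it from the frame rate. The helper below is a hypothetical sketch, not part of OpenCV. In real code the frame rate could be queried with vc.get(cv2.CAP_PROP_FPS), and the time spent processing each frame is ignored here.

```python
def frame_delay_ms(fps: float) -> int:
    """Delay to pass to cv2.waitKey so playback runs at roughly `fps` (hypothetical helper)."""
    if fps <= 0:      # the frame rate may be reported as 0 when it is unknown
        return 1      # fall back to the shortest delay
    return max(1, round(1000 / fps))

print(frame_delay_ms(25))   # 40
print(frame_delay_ms(30))   # 33
print(frame_delay_ms(0))    # 1
```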
Cropping part of the image
You can extract any rectangular region by slicing the array: rows (height) first, then columns (width).
img = cv2.imread('cat.jpg')
cat = img[0:200,0:20]
cv_show('cat',cat)
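One detail worth knowing about the crop above: a NumPy slice is a view, so writing into the cropped region also changes the original image. A minimal sketch with a synthetic 225x225 array standing in for cat.jpg:

```python
import numpy as np

# Synthetic 225x225 BGR image standing in for cat.jpg.
img = np.zeros((225, 225, 3), dtype=np.uint8)

roi = img[0:200, 0:20]          # rows 0..199, columns 0..19
print(roi.shape)                # (200, 20, 3)

roi[:] = 255                    # a slice is a view: this writes into img too
print(img[0, 0])                # [255 255 255]

safe = img[0:200, 0:20].copy()  # independent copy
safe[:] = 0
print(img[0, 0])                # [255 255 255]  (unchanged)
```

Use .copy() whenever the crop should be edited independently of the source image.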
b,g,r = cv2.split(img)
b
array([[184, 184, 184, …, 201, 201, 203],
[184, 184, 184, …, 201, 201, 204],
[185, 185, 185, …, 202, 202, 204],
…,
[203, 203, 203, …, 226, 226, 225],
[204, 204, 204, …, 226, 226, 225],
[201, 202, 202, …, 228, 228, 229]], dtype=uint8)
r.shape
(225, 225)
img = cv2.merge((b,g,r))
img.shape
(225, 225, 3)
Keep only a single channel
- B, G, R correspond to indices 0, 1, 2 respectively (OpenCV stores channels in BGR order)
- Set the other two channels to zero
# Keep only the R channel
cur_img = img.copy()
cur_img[:,:,0] = 0
cur_img[:,:,1] = 0
cv_show('R',cur_img)
# Keep only the G channel
cur_img = img.copy()
cur_img[:,:,0] = 0
cur_img[:,:,2] = 0
cv_show('G',cur_img)
# Keep only the B channel
cur_img = img.copy()
cur_img[:,:,1] = 0
cur_img[:,:,2] = 0
cv_show('B',cur_img)
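The three near-identical blocks above could be folded into one helper. keep_channel is a hypothetical name introduced here for illustration, and the one-pixel image is synthetic.

```python
import numpy as np

def keep_channel(img_bgr, index):
    """Return a copy of a BGR image with every channel except `index` zeroed."""
    out = np.zeros_like(img_bgr)
    out[:, :, index] = img_bgr[:, :, index]
    return out

# Synthetic one-pixel image with B=10, G=20, R=30.
pixel = np.array([[[10, 20, 30]]], dtype=np.uint8)

print(keep_channel(pixel, 2)[0, 0])  # [ 0  0 30]  only R (index 2) remains
print(keep_channel(pixel, 0)[0, 0])  # [10  0  0]  only B (index 0) remains
```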
Border padding
- For an introduction to convolution, see the guide at https://mlnotebook.github.io/post/CNN1/
The code comments explain each border type.
top_size, bottom_size, left_size, right_size = (50,50,50,50)  # padding for top, bottom, left, right
# Replicate: repeat the outermost edge pixels
replicate = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, borderType = cv2.BORDER_REPLICATE)
# Reflect: mirror the pixels across the border, duplicating the edge pixel. Example: fedcba|abcdefgh|hgfedcb
reflect = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, borderType = cv2.BORDER_REFLECT)
# Reflect 101: mirror around the edge pixel, which is not duplicated. Example: gfedcb|abcdefgh|gfedcba
reflect101 = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, borderType = cv2.BORDER_REFLECT_101)
# Wrap: continue from the opposite side of the image. Example: cdefgh|abcdefgh|abcdefg
wrap = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, borderType = cv2.BORDER_WRAP)
# Constant: fill with a constant value
constant = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, borderType = cv2.BORDER_CONSTANT, value = 700)  # a fill value must be supplied for the constant border
import matplotlib.pyplot as plt
plt.subplot(231), plt.imshow(img, 'gray'), plt.title('ORIGINAL')
plt.subplot(232), plt.imshow(replicate, 'gray'), plt.title('REPLICATE')
plt.subplot(233), plt.imshow(reflect, 'gray'), plt.title('REFLECT')
plt.subplot(234), plt.imshow(reflect101, 'gray'), plt.title('REFLECT_101')
plt.subplot(235), plt.imshow(wrap, 'gray'), plt.title('WRAP')
plt.subplot(236), plt.imshow(constant, 'gray'), plt.title('CONSTANT')
plt.show()
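The letter patterns in the comments are easier to verify on a one-dimensional array. NumPy's np.pad offers modes that behave analogously to the OpenCV border types. This is an analogy for illustration, not how cv2.copyMakeBorder is implemented. The numbers 1 to 8 play the role of "abcdefgh".

```python
import numpy as np

row = np.arange(1, 9)   # 1..8 plays the role of "abcdefgh"
pad = 3

modes = {
    "REPLICATE":   np.pad(row, pad, mode="edge"),       # 1 1 1 | 1..8 | 8 8 8
    "REFLECT":     np.pad(row, pad, mode="symmetric"),  # 3 2 1 | 1..8 | 8 7 6
    "REFLECT_101": np.pad(row, pad, mode="reflect"),    # 4 3 2 | 1..8 | 7 6 5
    "WRAP":        np.pad(row, pad, mode="wrap"),       # 6 7 8 | 1..8 | 1 2 3
    "CONSTANT":    np.pad(row, pad, mode="constant", constant_values=0),
}
for name, padded in modes.items():
    print(f"{name:12s}", padded)
```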
Numerical operations
img_cat = cv2.imread('cat.jpg')
img_dog = cv2.imread('dog.jpg')
- Add a constant
img_cat2 = img_cat + 10  # add 10 to every pixel value
img_cat[:5,:,0]
array([[184, 184, 184, …, 201, 201, 203],
[184, 184, 184, …, 201, 201, 204],
[185, 185, 185, …, 202, 202, 204],
[185, 185, 185, …, 202, 202, 205],
[185, 185, 185, …, 203, 203, 206]], dtype=uint8)
img_cat2[:5,:,0]
array([[194, 194, 194, …, 211, 211, 213],
[194, 194, 194, …, 211, 211, 214],
[195, 195, 195, …, 212, 212, 214],
[195, 195, 195, …, 212, 212, 215],
[195, 195, 195, …, 213, 213, 216]], dtype=uint8)
- Add two images
(img_cat + img_cat2)[:5,:,0]  # NumPy addition wraps around: the sum is taken modulo 256 (pixel values range from 0 to 255)
array([[122, 122, 122, …, 156, 156, 160],
[122, 122, 122, …, 156, 156, 162],
[124, 124, 124, …, 158, 158, 162],
[124, 124, 124, …, 158, 158, 164],
[124, 124, 124, …, 160, 160, 166]], dtype=uint8)
- cv2.add also adds the two images, but it saturates instead of wrapping around: any sum above 255 is clipped to 255
cv2.add(img_cat,img_cat2)[:5,:,0]  # add() saturates: out-of-range sums become 255
array([[255, 255, 255, …, 255, 255, 255],
[255, 255, 255, …, 255, 255, 255],
[255, 255, 255, …, 255, 255, 255],
[255, 255, 255, …, 255, 255, 255],
[255, 255, 255, …, 255, 255, 255]], dtype=uint8)
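The difference between the two kinds of addition can be reproduced with NumPy alone. In this sketch the first two value pairs come from the outputs above and the third is made up. The clipped sum is a NumPy imitation of saturating addition, not a call to cv2.add.

```python
import numpy as np

a = np.array([184, 201, 10], dtype=np.uint8)
b = np.array([194, 211, 20], dtype=np.uint8)

wrapped = a + b   # uint8 arithmetic wraps around modulo 256

# Widen to int16 first so the true sum is available, then clip back into 0..255.
saturated = np.clip(a.astype(np.int16) + b.astype(np.int16), 0, 255).astype(np.uint8)

print(wrapped)     # [122 156  30]
print(saturated)   # [255 255  30]
```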
Image fusion
- The shapes must match; images with different shapes cannot be added
img_cat + img_dog  # fails: the two images have different shapes
ValueError Traceback (most recent call last)
in
----> 1 img_cat + img_dog #Not operable, the two images have different shape values and cannot be added
ValueError: operands could not be broadcast together with shapes (225,225,3) (678,1024,3)
img_cat.shape
(225, 225, 3)
- The resize() function changes the shape; the target size is given as (width, height)
img_dog = cv2.resize(img_dog, (225,225))
img_dog.shape
(225, 225, 3)
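Blend the two images with cv2.addWeighted, which computes img_cat * 0.7 + img_dog * 0.3 + 2 for each pixel and saturates the result: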
res = cv2.addWeighted(img_cat, 0.7, img_dog, 0.3, 2)
plt.imshow(res)
<matplotlib.image.AxesImage at 0x20032e7fc18>
It looks pretty good, doesn't it?
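The weighted sum behind the blend can be sketched in NumPy. add_weighted below is a hypothetical stand-in written for illustration, and the pixel values are made up. It follows the formula src1 * alpha + src2 * beta + gamma with rounding and clipping to the uint8 range, and may differ from cv2.addWeighted in rounding details.

```python
import numpy as np

def add_weighted(src1, alpha, src2, beta, gamma):
    """NumPy sketch of src1*alpha + src2*beta + gamma, rounded and clipped to uint8."""
    mixed = src1.astype(np.float64) * alpha + src2.astype(np.float64) * beta + gamma
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)

cat = np.array([200, 255], dtype=np.uint8)   # synthetic pixel values
dog = np.array([100, 255], dtype=np.uint8)

print(add_weighted(cat, 0.7, dog, 0.3, 2))   # [172 255]
```

The second value shows the saturation: 255 * 0.7 + 255 * 0.3 + 2 would be 257, which is clipped to 255.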
- cv2.resize() can also scale by a factor: pass (0, 0) as the size and set fx (horizontal scale) and fy (vertical scale)
res = cv2.resize(img,(0,0),fx = 1, fy = 3)
plt.imshow(res)
<matplotlib.image.AxesImage at 0x20032d1f518>
res = cv2.resize(img,(0,0),fx = 3, fy = 1)
plt.imshow(res)
<matplotlib.image.AxesImage at 0x20032e1bcc0>
That's all for now. I may add more later, and discussion is welcome.