Image reading and basic attribute operations

RGB: a color image has three channels (red, green, blue), as you all know.

Reading data: images

cv2.imread('filepath', flags): reads an image from disk.
- filepath is the path to the image file
- flags controls how the image is read
  - cv2.IMREAD_COLOR (value 1, the default): color image, alpha channel ignored
  - cv2.IMREAD_GRAYSCALE (value 0): grayscale image
  - cv2.IMREAD_UNCHANGED (value -1): the image as stored, including the alpha channel
Alpha channel: more about it later

Code:

import cv2    # OpenCV stores color images in BGR order (not RGB)
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

img = cv2.imread('cat.jpg')    # read the image

img

array([[[184, 196, 200],
[184, 196, 200],
[184, 196, 200],
…,
[201, 203, 204],
[201, 203, 204],
[203, 202, 204]],

[[184, 196, 200],
[184, 196, 200],
[184, 196, 200],
…,
[201, 203, 204],
[201, 203, 204],
[204, 203, 205]],

[[185, 196, 200],
[185, 196, 200],
[185, 196, 200],
…,
[202, 204, 205],
[202, 204, 205],
[204, 203, 205]],

…,

[[203, 205, 205],
[203, 205, 205],
[203, 205, 205],
…,
[226, 225, 227],
[226, 225, 227],
[225, 224, 226]],

[[204, 206, 206],
[204, 206, 206],
[204, 206, 206],
…,
[226, 225, 227],
[226, 225, 227],
[225, 224, 226]],

[[201, 203, 203],
[202, 204, 204],
[202, 204, 204],
…,
[228, 226, 226],
[228, 226, 226],
[229, 226, 228]]], dtype=uint8)

Notice

dtype is uint8: each value is an 8-bit unsigned integer (1 byte, range 0 to 255).
The number of dimensions equals the nesting depth of the brackets; here there are 3. img is indexed as [h, w, c], so img.shape is (height, width, channels): the first dimension is the height (rows), the second is the width (columns), and the third (the innermost brackets) holds the channel values, in B, G, R order.
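
The indexing order is easy to check on a tiny array. The sketch below uses a synthetic 2x3 array in place of the real image (NumPy only, no file needed) and reads one pixel back as B, G, R.

```python
import numpy as np

# Synthetic 2x3 BGR image standing in for the result of cv2.imread.
img = np.zeros((2, 3, 3), dtype=np.uint8)
img[0, 1] = (10, 20, 30)    # row 0, column 1: B=10, G=20, R=30

height, width, channels = img.shape
b, g, r = img[0, 1]         # one pixel is a length-3 vector in B, G, R order
print(height, width, channels)
print(int(b), int(g), int(r))
print(img.dtype.itemsize)   # bytes used by each value
```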

Image display and wait time setting

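Main function: cv2.imshow()

Code example:

# Display the image; several windows can be open at once
cv2.imshow('image', img)    # first argument is the window name, second is the image

# cv2.imshow('image2', img)    # opens a second window

# Wait time in milliseconds; 0 waits until any key is pressed
cv2.waitKey(10000)

# Destroy every window we created
cv2.destroyAllWindows()

# cv2.destroyWindow('image')    # destroys one window, identified by its name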

Note that the displayed image does not scale when the window is enlarged. To change that, create the window with the function below.
Function to create a resizable window: cv2.namedWindow()
- The default flag is cv2.WINDOW_AUTOSIZE: the window is sized to fit the image
- With the flag cv2.WINDOW_NORMAL, the window can be resized

cv2.namedWindow('image',cv2.WINDOW_NORMAL)
cv2.imshow('image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

To keep the code clean, wrap these three calls in a helper function.

# Helper that runs the three calls above
def cv_show(name,img):
    cv2.imshow(name,img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

img.shape

(225, 225, 3)

Grayscale

As mentioned under the cv2.imread flags, cv2.IMREAD_GRAYSCALE reads the image as grayscale. Each pixel then holds a single intensity value instead of three color channels, so the array is two-dimensional.

# Reading a grayscale image
img = cv2.imread('cat.jpg',cv2.IMREAD_GRAYSCALE)
img

array([[196, 196, 196, …, 203, 203, 203],
[196, 196, 196, …, 203, 203, 204],
[196, 196, 196, …, 204, 204, 204],
…,
[205, 205, 205, …, 226, 226, 225],
[206, 206, 206, …, 226, 226, 225],
[203, 204, 204, …, 226, 226, 227]], dtype=uint8)

img.shape

(225, 225)

# Display the image; several windows can be open at once
cv2.imshow('image', img)
# Wait time in milliseconds; 0 waits until any key is pressed
cv2.waitKey(10000)
cv2.destroyAllWindows()

Other functions

Save

cv2.imwrite writes the image to disk and returns True on success. Note that a new image file then appears in the working directory.

cv2.imwrite("mycat.png",img)

True

Format

type(img)    # the image is a NumPy ndarray

numpy.ndarray

Number of pixels

img.size    # size: number of array elements (equal to the pixel count for this grayscale image)

50625

Data type

img.dtype    # data type of each value

dtype('uint8')
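
One detail worth checking: size counts array elements, not pixels. The two agree for a grayscale image, but a color image has three elements per pixel. A minimal sketch with blank arrays of the same shapes as the cat image:

```python
import numpy as np

gray = np.zeros((225, 225), dtype=np.uint8)        # same shape as the grayscale cat
color = np.zeros((225, 225, 3), dtype=np.uint8)    # same shape as the color cat

print(gray.size)                          # 225 * 225
print(color.size)                         # 225 * 225 * 3
print(color.shape[0] * color.shape[1])    # pixel count of the color image
```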

Using Matplotlib

img = cv2.imread('cat.jpg')
plt.imshow(img,cmap = 'gray',interpolation='bicubic')
plt.xticks([]),plt.yticks([])
plt.show()


The colors look wrong. The reason is that cv2.imread() returns the image in BGR channel order (values 0 to 255), while Matplotlib expects RGB, so just reorder the channels to RGB.

img_2 = img[:,:,[2,1,0]]
plt.imshow(img_2)

<matplotlib.image.AxesImage at 0x2608f2faef0>

Hey, it's amazing.

Reading data: video

A video is a sequence of images, called frames, as you probably know.

cv2.VideoCapture can capture from a camera; an integer index such as 0 or 1 selects the device.

Run it, and don't be intimidated by your big face (not talking about myself).

import numpy as np
import cv2

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:    # no frame returned (camera missing or disconnected)
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    cv2.imshow('frame', gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):    # press q to quit
        break

cap.release()
cv2.destroyAllWindows()

Saving video
FourCC is a four-character code that identifies the video codec.
Capture video from the webcam, flip each frame vertically, and save it.

# import numpy as np
# import cv2

# cap = cv2.VideoCapture(0)

# # Define the codec and create VideoWriter object
# fourcc = cv2.VideoWriter_fourcc(*'XVID')    # cv2.cv.FOURCC only exists in OpenCV 2.x
# out = cv2.VideoWriter('output.avi',fourcc, 20.0, (640,480))

# while(cap.isOpened()):
#     ret, frame = cap.read()
#     if ret==True:
#         frame = cv2.flip(frame,0)

#         # write the flipped frame
#         out.write(frame)

#         cv2.imshow('frame',frame)
#         if cv2.waitKey(1) & 0xFF == ord('q'):
#             break
#     else:
#         break

# # Release everything if job is finished
# cap.release()
# out.release()
# cv2.destroyAllWindows()

For a video file, pass the file path directly.

vc = cv2.VideoCapture('test.mp4')

# Check that the video opened correctly
if vc.isOpened():
    is_open, frame = vc.read()    # read the first frame
else:
    is_open = False

while is_open:
    ret, frame = vc.read()
    if frame is None:    # end of the video
        break
    if ret:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)    # convert to grayscale
        cv2.imshow('result', gray)
        if cv2.waitKey(1) & 0xFF == 27:    # Esc quits; a larger waitKey delay slows playback
            break
vc.release()
cv2.destroyAllWindows()

Extracting part of the image

You can crop any region of the image yourself by slicing the array.

img = cv2.imread('cat.jpg')
cat = img[0:200,0:20]    # rows 0 to 199, columns 0 to 19
cv_show('cat',cat)
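
One thing to keep in mind with slicing: the slice is a view that shares memory with the original array, so writing into the crop also changes the image. A minimal sketch on a synthetic array (NumPy only):

```python
import numpy as np

img = np.arange(5 * 6, dtype=np.uint8).reshape(5, 6)   # synthetic 5x6 grayscale image

roi_view = img[1:3, 2:5]          # rows 1-2, columns 2-4; shares memory with img
roi_copy = img[1:3, 2:5].copy()   # independent copy

roi_view[:] = 0                   # also zeroes that region of img
print(roi_view.shape)
print(int(img[1, 2]), int(roi_copy[0, 0]))
```

Calling .copy() on the slice is the usual way to get a crop that can be edited safely.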

b,g,r = cv2.split(img)    # split into the blue, green and red channels
b

array([[184, 184, 184, …, 201, 201, 203],
[184, 184, 184, …, 201, 201, 204],
[185, 185, 185, …, 202, 202, 204],
…,
[203, 203, 203, …, 226, 226, 225],
[204, 204, 204, …, 226, 226, 225],
[201, 202, 202, …, 228, 228, 229]], dtype=uint8)

r.shape

(225, 225)

img = cv2.merge((b,g,r))
img.shape

(225, 225, 3)

Keeping a single channel

Because OpenCV uses BGR order, B, G, R correspond to channel indices 0, 1, 2 respectively.
To keep one channel, set the other two to zero.

# Keep only the R channel
cur_img = img.copy()
cur_img[:,:,0] = 0
cur_img[:,:,1] = 0
cv_show('R',cur_img)

# Keep only the G channel
cur_img = img.copy()
cur_img[:,:,0] = 0
cur_img[:,:,2] = 0
cv_show('G',cur_img)

# Keep only the B channel
cur_img = img.copy()
cur_img[:,:,1] = 0
cur_img[:,:,2] = 0
cv_show('B',cur_img)
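
The three blocks above differ only in which index survives, so they can be folded into one function. A sketch, where keep_channel is a helper name made up for this example:

```python
import numpy as np

def keep_channel(img, index):
    """Return a copy of a BGR image with every channel except `index` zeroed (hypothetical helper)."""
    out = np.zeros_like(img)
    out[:, :, index] = img[:, :, index]
    return out

img = np.full((2, 2, 3), (10, 20, 30), dtype=np.uint8)   # every pixel is B=10, G=20, R=30
only_r = keep_channel(img, 2)    # index 2 is R in BGR order
print(only_r[0, 0])
```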

Border padding

For an introduction to convolution, see the guide at https://mlnotebook.github.io/post/CNN1/ to learn more.

The code comments explain each border type.

top_size, bottom_size, left_size, right_size = (50,50,50,50)    # padding widths: top, bottom, left, right

# Replicate: repeat the outermost pixel. Example: aaaaaa|abcdefgh|hhhhhhh
replicate = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, borderType = cv2.BORDER_REPLICATE)

# Reflect: mirror the pixels at the border, edge pixel included. Example: fedcba|abcdefgh|hgfedcb
reflect = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, borderType = cv2.BORDER_REFLECT)

# Reflect 101: mirror around the edge pixel, which is not repeated. Example: gfedcb|abcdefgh|gfedcba
reflect101 = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, borderType = cv2.BORDER_REFLECT_101)

# Wrap: continue with pixels from the opposite side. Example: cdefgh|abcdefgh|abcdefg
wrap = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, borderType = cv2.BORDER_WRAP)

# Constant: fill the border with a constant value
constant = cv2.copyMakeBorder(img, top_size, bottom_size, left_size, right_size, borderType = cv2.BORDER_CONSTANT, value = 700)    # value must be set; it is the fill constant

import matplotlib.pyplot as plt
plt.subplot(231), plt.imshow(img, 'gray'), plt.title('ORIGINAL')
plt.subplot(232), plt.imshow(replicate, 'gray'), plt.title('REPLICATE')
plt.subplot(233), plt.imshow(reflect, 'gray'), plt.title('REFLECT')
plt.subplot(234), plt.imshow(reflect101, 'gray'), plt.title('REFLECT_101')
plt.subplot(235), plt.imshow(wrap, 'gray'), plt.title('WRAP')
plt.subplot(236), plt.imshow(constant, 'gray'), plt.title('CONSTANT')

plt.show()


Numerical operations

img_cat = cv2.imread('cat.jpg')
img_dog = cv2.imread('dog.jpg')

Adding a constant

img_cat2 = img_cat + 10    # add 10 to every pixel value
img_cat[:5,:,0]

array([[184, 184, 184, …, 201, 201, 203],
[184, 184, 184, …, 201, 201, 204],
[185, 185, 185, …, 202, 202, 204],
[185, 185, 185, …, 202, 202, 205],
[185, 185, 185, …, 203, 203, 206]], dtype=uint8)

img_cat2[:5,:,0]

array([[194, 194, 194, …, 211, 211, 213],
[194, 194, 194, …, 211, 211, 214],
[195, 195, 195, …, 212, 212, 214],
[195, 195, 195, …, 212, 212, 215],
[195, 195, 195, …, 213, 213, 216]], dtype=uint8)

Adding two images

(img_cat + img_cat2)[:5,:,0]    # the sum wraps around modulo 256 (pixel values range from 0 to 255)

array([[122, 122, 122, …, 156, 156, 160],
[122, 122, 122, …, 156, 156, 162],
[124, 124, 124, …, 158, 158, 162],
[124, 124, 124, …, 158, 158, 164],
[124, 124, 124, …, 160, 160, 166]], dtype=uint8)

The cv2.add function also adds the two images, but note the difference: a sum that goes out of range does not wrap around; it is clipped to the maximum value, 255.

cv2.add(img_cat,img_cat2)[:5,:,0]    # add(): out-of-range sums saturate at 255

array([[255, 255, 255, …, 255, 255, 255],
[255, 255, 255, …, 255, 255, 255],
[255, 255, 255, …, 255, 255, 255],
[255, 255, 255, …, 255, 255, 255],
[255, 255, 255, …, 255, 255, 255]], dtype=uint8)

Image blending

Check the shapes first: images with different shapes cannot be added.

img_cat + img_dog    # fails: the two images have different shapes and cannot be added

ValueError Traceback (most recent call last)

in
----> 1 img_cat + img_dog #Not operable, the two images have different shape values and cannot be added

ValueError: operands could not be broadcast together with shapes (225,225,3) (678,1024,3)

img_cat.shape

(225, 225, 3)

Use cv2.resize() to change the shape. The target size is given as (width, height).

img_dog = cv2.resize(img_dog, (225,225))   
img_dog.shape

(225, 225, 3)

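Blend them. cv2.addWeighted computes res = 0.7 * img_cat + 0.3 * img_dog + 2: each image gets a weight, and the last argument is a constant added to every value.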

res = cv2.addWeighted(img_cat, 0.7, img_dog, 0.3, 2)

plt.imshow(res)

<matplotlib.image.AxesImage at 0x20032e7fc18>


It's pretty pretty, isn't it~

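cv2.resize() can also scale an image by a factor: with the target size set to (0,0), fx and fy give the scale factors along the width and the height.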

res = cv2.resize(img,(0,0),fx = 1, fy = 3)
plt.imshow(res)

<matplotlib.image.AxesImage at 0x20032d1f518>

res = cv2.resize(img,(0,0),fx = 3, fy = 1)
plt.imshow(res)

<matplotlib.image.AxesImage at 0x20032e1bcc0>


That's it for now. There may be additions later, and discussion is welcome.
