ONNX Runtime deployment of a pretrained ImageNet image classification model (study notes from Tongji Zihao's tutorials)

Installing and configuring the environment
Cloud GPU platform for running the code: reply "gpu" to the official account "Artificial Intelligence Tips"
Tongji Zihao 2022-8-22 2023-4-28 2023-5-8
Install PyTorch

pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu113

Install ONNX

pip install onnx -i https://pypi.tuna.tsinghua.edu.cn/simple

Install the ONNX Runtime inference engine

pip install onnxruntime -i https://pypi.tuna.tsinghua.edu.cn/simple

Install other third-party packages

pip install numpy pandas matplotlib tqdm opencv-python pillow -i https://pypi.tuna.tsinghua.edu.cn/simple

Verify that the installation succeeded

import torch
import onnx
import onnxruntime as ort
# Verify that the installation succeeded
print('PyTorch version:', torch.__version__)
print('ONNX version:', onnx.__version__)
print('ONNX Runtime version:', ort.__version__)
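
The provider lists used later in these notes (for example 'CUDAExecutionProvider') only take effect if the installed package actually ships that provider; the CUDA provider comes with the onnxruntime-gpu package rather than the plain onnxruntime package. The following is a minimal sketch for checking this. The helper choose_providers is a hypothetical name introduced here for illustration; only onnxruntime.get_available_providers() is a real ONNX Runtime call.

```python
def choose_providers(available,
                     preferred=("CUDAExecutionProvider", "CPUExecutionProvider")):
    """Hypothetical helper: keep the preferred providers that are actually available."""
    chosen = [name for name in preferred if name in available]
    # Fall back to the CPU provider if none of the preferred names is present
    return chosen or ["CPUExecutionProvider"]


try:
    import onnxruntime as ort
    available = ort.get_available_providers()
    print("Available providers:", available)
    print("Will use:", choose_providers(available))
except ImportError:
    print("onnxruntime is not installed in this environment")
```

The returned list can be passed as the providers argument of onnxruntime.InferenceSession.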

Download the sample files

https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20220716-mmclassification/dataset/imagenet/imagenet_class_index.csv
https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20220716-mmclassification/test/banana1.jpg
https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20220716-mmclassification/test/video_4.mp4

Download the ONNX model file

https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20220716-mmclassification/dataset/fruit30/onnx/resnet18_imagenet.onnx
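
The files above can be fetched by hand or with any download tool. As one possible approach, here is a small sketch using only the Python standard library. The helper names local_name and download_all are assumptions made for this example, and the actual download call is left commented out because it needs network access.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def local_name(url):
    """Return the file name at the end of a URL path."""
    return os.path.basename(urlparse(url).path)


def download_all(urls, dest_dir=".", fetch=urlretrieve):
    """Download each URL into dest_dir, skipping files that already exist.

    `fetch` defaults to urllib's urlretrieve and can be swapped out for testing.
    """
    saved = []
    for url in urls:
        target = os.path.join(dest_dir, local_name(url))
        if not os.path.exists(target):
            fetch(url, target)
        saved.append(target)
    return saved


BASE = "https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20220716-mmclassification"
URLS = [
    BASE + "/dataset/imagenet/imagenet_class_index.csv",
    BASE + "/test/banana1.jpg",
    BASE + "/test/video_4.mp4",
    BASE + "/dataset/fruit30/onnx/resnet18_imagenet.onnx",
]

# Uncomment to actually download the files (requires network access):
# download_all(URLS)
```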

ONNX Runtime deployment: predicting a single image
Use the ONNX Runtime inference engine to load a model file in ONNX format and run prediction on a single image file.
Tongji Zihao https://space.bilibili.com/1900783
2022-8-22 2023-5-8
Application scenario
The following code runs on the hardware you are deploying to (local PC, embedded development board, Raspberry Pi, Jetson Nano, server).
You only need to copy the ONNX model file to the deployment hardware and install the ONNX Runtime environment; the few lines of code below are then enough to run the model.
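
The code in these notes feeds the model through a tensor named 'input' and reads a tensor named 'output'; those names have to match whatever was chosen when the model was exported. When deploying a model you did not export yourself, it can help to list the names first. A minimal sketch, assuming a hypothetical helper describe_io; get_inputs() and get_outputs() are the real InferenceSession methods, and the usage lines are commented out because they need the model file.

```python
def describe_io(session):
    """Return (kind, name, shape, type) tuples for a session's inputs and outputs."""
    rows = []
    for kind, nodes in (("input", session.get_inputs()),
                        ("output", session.get_outputs())):
        for node in nodes:
            rows.append((kind, node.name, node.shape, node.type))
    return rows


# Usage on the deployment machine (assumes the model file has been downloaded):
# import onnxruntime
# session = onnxruntime.InferenceSession('resnet18_imagenet.onnx',
#                                        providers=['CPUExecutionProvider'])
# for row in describe_io(session):
#     print(row)
```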

import onnxruntime
import torch
import torch.nn.functional as F
import pandas as pd
from PIL import Image  # load the image with Pillow
from torchvision import transforms


# Load the ONNX model and create the ONNX Runtime inference session.
# CPUExecutionProvider is listed as a fallback for machines without CUDA.
ort_session = onnxruntime.InferenceSession(
    'resnet18_imagenet.onnx',
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
# # Build a random input and get the output
# x = torch.rand(1, 3, 256, 256).numpy()
# print('random:', x.shape)
# # ONNX Runtime input
# ort_inputs = {'input': x}
# # ONNX Runtime output
# ort_output = ort_session.run(['output'], ort_inputs)[0]
# # Note: the input and output tensor names must match the names set in torch.onnx.export
# print('random ort_output:', ort_output.shape)

# Load a real test image
img_path = 'banana1.jpg'
img_pil = Image.open(img_path).convert('RGB')  # force three channels
# img_pil.show()  # display the image
# Test-set preprocessing (RCTN): resize, crop, convert to Tensor, normalize
test_transform = transforms.Compose([transforms.Resize(256),
                                     transforms.CenterCrop(256),
                                     transforms.ToTensor(),
                                     transforms.Normalize(
                                         mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])
                                     ])
# Run the preprocessing
input_img = test_transform(img_pil)
# print('input_img_shape:', input_img.shape)
input_tensor = input_img.unsqueeze(0).numpy()
# print('input_img_tensor:', input_tensor.shape)

# Inference
# ONNX Runtime input
ort_inputs = {'input': input_tensor}
# ONNX Runtime output
pred_logits = ort_session.run(['output'], ort_inputs)[0]
pred_logits = torch.tensor(pred_logits)
# print('pred_logits:', pred_logits.shape)
# Apply softmax to the logits to obtain confidence probabilities
pred_softmax = F.softmax(pred_logits, dim=1)
# print('pred_softmax:', pred_softmax.shape)
# Parse the prediction results
# Keep the n most confident results
n = 3
top_n = torch.topk(pred_softmax, n)
# print('top_n:', top_n)
# Predicted class IDs
pred_ids = top_n.indices.numpy()[0]
# print('pred_ids:', pred_ids)
# Prediction confidences
confs = top_n.values.numpy()[0]
# print('confs:', confs)

# Load the mapping from class ID to class name
df = pd.read_csv('imagenet_class_index.csv')
idx_to_labels = {}
for idx, row in df.iterrows():
    idx_to_labels[row['ID']] = row['class']  # English
    # idx_to_labels[row['ID']] = row['Chinese']  # Chinese
# print('idx_to_labels:', idx_to_labels)

# Print the top-n predictions
for i in range(n):
    class_name = idx_to_labels[pred_ids[i]]  # look up the class name
    confidence = confs[i] * 100
    text = '{:<20}{:>.3f}%'.format(class_name, confidence)
    print(text)
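
The post-processing above uses PyTorch only for softmax and top-k. On small deployment targets it may be preferable not to install PyTorch just for that, so here is a dependency-free sketch of the same two steps in plain Python. The function names softmax and top_k and the toy logits are assumptions made for illustration; real logits would come from ort_session.run.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a 1-D sequence of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_k(probs, k):
    """Return the k (index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy logits for illustration only; real logits would come from
# ort_session.run(['output'], ort_inputs)[0][0]
toy_logits = [2.0, 0.5, 1.0, -1.0]
for class_id, prob in top_k(softmax(toy_logits), 3):
    print('{:<20}{:>.3f}%'.format('class ' + str(class_id), prob * 100))
```

Subtracting the maximum logit before exponentiating does not change the result but avoids overflow for large logits.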

ImageNet ONNX Runtime deployment: camera and video (English labels)
Use the ONNX Runtime inference engine to load the pretrained ImageNet image classification model in ONNX format and run real-time prediction on camera frames.
Tongji Zihao: https://space.bilibili.com/1900783
Test environment: MacBook Pro
Notes
This code must run locally on the machine the camera is connected to; it cannot run on the cloud GPU platform.
Run locally

pip install onnxruntime

Install ONNX Runtime and prepare the ONNX model file.
Frame-by-frame video processing template (English labels)

from PIL import Image
import onnxruntime
import torch
import torch.nn.functional as F
from torchvision import transforms
import pandas as pd
import cv2
import time
from tqdm import tqdm

# Load the ONNX model once and create the ONNX Runtime inference session
cuda = torch.cuda.is_available()
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if cuda else ['CPUExecutionProvider']
ort_session = onnxruntime.InferenceSession('resnet18_imagenet.onnx', None, providers=providers)

# Load the 1000 ImageNet class labels
df = pd.read_csv('imagenet_class_index.csv')
idx_to_labels = {}
for idx, row in df.iterrows():
    idx_to_labels[row['ID']] = row['class']

# Test-set preprocessing (RCTN): resize, crop, convert to Tensor, normalize
test_transform = transforms.Compose([transforms.Resize(256),
                                     transforms.CenterCrop(256),
                                     transforms.ToTensor(),
                                     transforms.Normalize(
                                         mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])
                                     ])


# Frame-processing function
def process_frame(img_bgr):
    '''
    Takes a BGR array captured from the camera or video and returns a BGR array
    annotated with the classification results. The model, labels and transform
    are created once at module level instead of being reloaded for every frame.
    '''
    # Record the time at which processing of this frame starts
    start_time = time.time()

    ## Convert the frame to an RGB Pillow image
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)  # BGR to RGB
    img_pil = Image.fromarray(img_rgb)  # array to PIL

    ## Preprocess
    input_img = test_transform(img_pil)
    input_tensor = input_img.unsqueeze(0).numpy()

    ## ONNX Runtime prediction
    ort_inputs = {'input': input_tensor}  # ONNX Runtime input
    pred_logits = ort_session.run(['output'], ort_inputs)[0]  # ONNX Runtime output
    pred_logits = torch.tensor(pred_logits)
    pred_softmax = F.softmax(pred_logits, dim=1)  # apply softmax to the logits

    ## Parse the classes and confidences of the top-n predictions
    top_n = torch.topk(pred_softmax, 5)  # keep the 5 most confident results
    pred_ids = top_n[1].cpu().detach().numpy().squeeze()  # predicted class IDs
    confs = top_n[0].cpu().detach().numpy().squeeze()  # confidences

    # Draw the English labels on the image
    for i in range(len(confs)):
        pred_class = idx_to_labels[pred_ids[i]]

        # Arguments: image, text, bottom-left corner of the text, font, font scale, color, thickness, line type
        text = '{:<15} {:>.3f}'.format(pred_class, confs[i])
        img_bgr = cv2.putText(img_bgr, text, (50, 160 + 80 * i), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 255), 4,
                              cv2.LINE_AA)

    # Record the time at which processing of this frame ends
    end_time = time.time()
    # Compute the frames per second (FPS), guarding against a zero time difference
    FPS = 1 / max(end_time - start_time, 1e-6)
    # Arguments: image, text, bottom-left corner of the text, font, font scale, color, thickness, line type
    img_bgr = cv2.putText(img_bgr, 'FPS  ' + str(int(FPS)), (50, 80), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 0, 255), 4,
                          cv2.LINE_AA)

    return img_bgr

def generate_video(input_path):
    filehead = input_path.split('/')[-1]
    output_path = 'out-' + filehead

    print('Processing video:', input_path)

    # First pass: count the frames so the progress bar has a total
    cap = cv2.VideoCapture(input_path)
    frame_count = 0
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break
        frame_count += 1
    cap.release()
    print('Total frames:', frame_count)

    cap = cv2.VideoCapture(input_path)
    frame_size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))

    # fourcc = cv2.VideoWriter_fourcc(*'XVID')
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    fps = cap.get(cv2.CAP_PROP_FPS)

    out = cv2.VideoWriter(output_path, fourcc, fps, frame_size)

    # Bind the progress bar to the total number of frames
    with tqdm(total=frame_count) as pbar:
        try:
            while cap.isOpened():
                success, frame = cap.read()
                if not success:
                    break

                # Process the frame; on failure, report it and write the frame unmodified
                try:
                    frame = process_frame(frame)
                except Exception as error:
                    print('Error while processing frame:', error)

                # cv2.imshow('Video Processing', frame)
                out.write(frame)

                # Advance the progress bar by one frame
                pbar.update(1)
        except (Exception, KeyboardInterrupt) as error:
            print('Processing stopped early:', repr(error))

    cv2.destroyAllWindows()
    out.release()
    cap.release()
    print('Video saved to', output_path)

generate_video(input_path='video_4.mp4')
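
generate_video derives the output file name with input_path.split('/')[-1], which works for POSIX-style paths but not for Windows paths written with backslashes. One possible alternative, shown here as a sketch, is the standard library's ntpath.basename, which splits on both separators. The helper name output_name and the example paths are assumptions made for illustration.

```python
import ntpath


def output_name(input_path, prefix="out-"):
    """Hypothetical helper: build the output file name from an input path.

    ntpath.basename splits on both '/' and '\\', so it handles Windows-style
    and POSIX-style paths alike.
    """
    return prefix + ntpath.basename(input_path)


print(output_name('videos/robot.mp4'))
print(output_name(r'C:\clips\video_4.mp4'))
```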

Template for capturing and processing each camera frame (English labels)

from PIL import Image
import onnxruntime
import torch
import torch.nn.functional as F
from torchvision import transforms
import pandas as pd
import cv2
import time

# Load the ONNX model once and create the ONNX Runtime inference session
cuda = torch.cuda.is_available()
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if cuda else ['CPUExecutionProvider']
ort_session = onnxruntime.InferenceSession('resnet18_imagenet.onnx', None, providers=providers)

# Load the 1000 ImageNet class labels
df = pd.read_csv('imagenet_class_index.csv')
idx_to_labels = {}
for idx, row in df.iterrows():
    idx_to_labels[row['ID']] = row['class']

# Test-set preprocessing (RCTN): resize, crop, convert to Tensor, normalize
test_transform = transforms.Compose([transforms.Resize(256),
                                     transforms.CenterCrop(256),
                                     transforms.ToTensor(),
                                     transforms.Normalize(
                                         mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])
                                     ])


# Frame-processing function
def process_frame(img_bgr):
    '''
    Takes a BGR array captured from the camera and returns a BGR array
    annotated with the classification results. The model, labels and transform
    are created once at module level instead of being reloaded for every frame.
    '''
    # Record the time at which processing of this frame starts
    start_time = time.time()

    ## Convert the frame to an RGB Pillow image
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)  # BGR to RGB
    img_pil = Image.fromarray(img_rgb)  # array to PIL

    ## Preprocess
    input_img = test_transform(img_pil)
    input_tensor = input_img.unsqueeze(0).numpy()

    ## ONNX Runtime prediction
    ort_inputs = {'input': input_tensor}  # ONNX Runtime input
    pred_logits = ort_session.run(['output'], ort_inputs)[0]  # ONNX Runtime output
    pred_logits = torch.tensor(pred_logits)
    pred_softmax = F.softmax(pred_logits, dim=1)  # apply softmax to the logits

    ## Parse the classes and confidences of the top-n predictions
    top_n = torch.topk(pred_softmax, 5)  # keep the 5 most confident results
    pred_ids = top_n[1].cpu().detach().numpy().squeeze()  # predicted class IDs
    confs = top_n[0].cpu().detach().numpy().squeeze()  # confidences

    # Draw the English labels on the image
    for i in range(len(confs)):
        pred_class = idx_to_labels[pred_ids[i]]

        # Arguments: image, text, bottom-left corner of the text, font, font scale, color, thickness, line type
        text = '{:<15} {:>.3f}'.format(pred_class, confs[i])
        img_bgr = cv2.putText(img_bgr, text, (50, 160 + 80 * i), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 0, 255), 4,
                              cv2.LINE_AA)

    # Record the time at which processing of this frame ends
    end_time = time.time()
    # Compute the frames per second (FPS), guarding against a zero time difference
    FPS = 1 / max(end_time - start_time, 1e-6)
    # Arguments: image, text, bottom-left corner of the text, font, font scale, color, thickness, line type
    img_bgr = cv2.putText(img_bgr, 'FPS  ' + str(int(FPS)), (50, 80), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 0, 255), 4,
                          cv2.LINE_AA)

    return img_bgr


# Open the system default camera (index 0)
cap = cv2.VideoCapture(0)
# Loop until a break is triggered
while cap.isOpened():
    # Grab a frame
    success, frame = cap.read()
    if not success:  # stop if no frame could be read
        print('Failed to read a frame, exiting')
        break
    ## Process the frame
    frame = process_frame(frame)
    # Show the processed three-channel image
    cv2.imshow('my_window', frame)
    key_pressed = cv2.waitKey(60)  # wait up to 60 ms and get the key that was pressed, if any
    # print('Key pressed:', key_pressed)
    if key_pressed in [ord('q'), 27]:  # press q or Esc to quit (with an English input method active)
        break
# Release the camera
cap.release()
# Close the image window
cv2.destroyAllWindows()
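
process_frame derives its FPS figure from the time spent on a single frame, so the displayed number can jump around from frame to frame. If a steadier reading is wanted, one option is to smooth it with an exponential moving average. The class below is a sketch under that assumption; FpsMeter is a hypothetical name, and unlike the original it measures the time between consecutive calls (the whole loop iteration), not only the time inside process_frame.

```python
import time


class FpsMeter:
    """Hypothetical helper: exponentially smoothed frames-per-second estimate."""

    def __init__(self, smoothing=0.9, clock=time.perf_counter):
        self.smoothing = smoothing  # weight given to the previous estimate
        self.clock = clock          # injectable for testing
        self.last = None
        self.fps = 0.0

    def tick(self):
        """Call once per processed frame; returns the current FPS estimate."""
        now = self.clock()
        if self.last is not None:
            elapsed = max(now - self.last, 1e-6)  # guard against division by zero
            instant = 1.0 / elapsed
            if self.fps == 0.0:
                self.fps = instant
            else:
                self.fps = self.smoothing * self.fps + (1 - self.smoothing) * instant
        self.last = now
        return self.fps


meter = FpsMeter()
for _ in range(5):
    time.sleep(0.01)  # stands in for process_frame(frame)
    fps = meter.tick()
print('FPS estimate:', int(fps))
```

In the camera loop, meter.tick() would be called once per iteration and its return value drawn with cv2.putText in place of the per-frame figure.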

ImageNet ONNX Runtime deployment: camera and video (Chinese labels)
Use the pretrained ImageNet image classification model to run real-time prediction on camera frames.
This code must run locally on the machine the camera is connected to; it cannot run on the cloud GPU platform.
Tongji Zihao: https://space.bilibili.com/1900783
Test environment: MacBook Pro
Download the Chinese font

https://zihao-openmmlab.obs.cn-east-3.myhuaweicloud.com/20220716-mmclassification/dataset/SimHei.ttf

Frame-by-frame video processing template (Chinese labels)

from PIL import Image, ImageFont, ImageDraw
import onnxruntime
import torch
import torch.nn.functional as F
from torchvision import transforms
import pandas as pd
import cv2
import numpy as np
import time
from tqdm import tqdm

# Load the Chinese font once and set the font size
font = ImageFont.truetype('SimHei.ttf', 32)

# Load the ONNX model once and create the ONNX Runtime inference session
cuda = torch.cuda.is_available()
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if cuda else ['CPUExecutionProvider']
ort_session = onnxruntime.InferenceSession('resnet18_imagenet.onnx', None, providers=providers)

# Load the 1000 ImageNet class labels (Chinese names)
df = pd.read_csv('imagenet_class_index.csv')
idx_to_labels = {}
for idx, row in df.iterrows():
    idx_to_labels[row['ID']] = row['Chinese']

# Test-set preprocessing (RCTN): resize, crop, convert to Tensor, normalize
test_transform = transforms.Compose([transforms.Resize(256),
                                     transforms.CenterCrop(256),
                                     transforms.ToTensor(),
                                     transforms.Normalize(
                                         mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])
                                     ])


# Frame-processing function
def process_frame(img_bgr):
    '''
    Takes a BGR array captured from the camera or video and returns a BGR array
    annotated with the classification results. The font, model, labels and
    transform are created once at module level instead of on every frame.
    '''
    # Record the time at which processing of this frame starts
    start_time = time.time()

    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)  # BGR to RGB
    img_pil = Image.fromarray(img_rgb)  # array to PIL

    ## Preprocess
    input_img = test_transform(img_pil)
    input_tensor = input_img.unsqueeze(0).numpy()

    ## ONNX Runtime prediction
    ort_inputs = {'input': input_tensor}  # ONNX Runtime input
    pred_logits = ort_session.run(['output'], ort_inputs)[0]  # ONNX Runtime output
    pred_logits = torch.tensor(pred_logits)
    pred_softmax = F.softmax(pred_logits, dim=1)  # apply softmax to the logits

    ## Parse the classification results
    n = 5
    top_n = torch.topk(pred_softmax, n)  # keep the n most confident results
    pred_ids = top_n[1].cpu().detach().numpy().squeeze()  # predicted class IDs
    confs = top_n[0].cpu().detach().numpy().squeeze()  # confidences

    ## Draw the Chinese labels on the image
    draw = ImageDraw.Draw(img_pil)
    for i in range(len(confs)):
        pred_class = idx_to_labels[pred_ids[i]]

        # Draw Chinese text: position, string, font, RGBA color
        text = '{:<15} {:>.3f}'.format(pred_class, confs[i])  # Chinese string
        draw.text((50, 100 + 50 * i), text, font=font, fill=(255, 0, 0, 1))

    img_rgb = np.array(img_pil)  # PIL to array
    img_bgr = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2BGR)  # RGB to BGR

    # Record the time at which processing of this frame ends
    end_time = time.time()
    # Compute the frames per second (FPS), guarding against a zero time difference
    FPS = 1 / max(end_time - start_time, 1e-6)
    # Arguments: image, text, bottom-left corner of the text, font, font scale, color, thickness, line type
    img_bgr = cv2.putText(img_bgr, 'FPS  ' + str(int(FPS)), (50, 80), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 0, 255), 4,
                          cv2.LINE_AA)
    return img_bgr

def generate_video(input_path='videos/robot.mp4'):
    filehead = input_path.split('/')[-1]
    output_path = 'out-' + filehead

    print('Processing video:', input_path)

    # First pass: count the frames so the progress bar has a total
    cap = cv2.VideoCapture(input_path)
    frame_count = 0
    while cap.isOpened():
        success, frame = cap.read()
        if not success:
            break
        frame_count += 1
    cap.release()
    print('Total frames:', frame_count)

    cap = cv2.VideoCapture(input_path)
    frame_size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)), int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))

    # fourcc = cv2.VideoWriter_fourcc(*'XVID')
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    fps = cap.get(cv2.CAP_PROP_FPS)

    out = cv2.VideoWriter(output_path, fourcc, fps, frame_size)

    # Bind the progress bar to the total number of frames
    with tqdm(total=frame_count) as pbar:
        try:
            while cap.isOpened():
                success, frame = cap.read()
                if not success:
                    break

                # Process the frame; on failure, report it and write the frame unmodified
                try:
                    frame = process_frame(frame)
                except Exception as error:
                    print('Error while processing frame:', error)

                # cv2.imshow('Video Processing', frame)
                out.write(frame)

                # Advance the progress bar by one frame
                pbar.update(1)
        except (Exception, KeyboardInterrupt) as error:
            print('Processing stopped early:', repr(error))

    cv2.destroyAllWindows()
    out.release()
    cap.release()
    print('Video saved to', output_path)

generate_video(input_path='video_4.mp4')
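
The English and Chinese templates differ mainly in which CSV column they read for the label text ('class' versus 'Chinese'). That choice could be made a parameter, and the lookup table can be built with the standard csv module if pandas is not wanted on the deployment machine. A minimal sketch follows; load_labels is a hypothetical helper, the sample rows are made up, only the column names ID, class and Chinese are taken from the code in these notes, and UTF-8 encoding of the real file is an assumption.

```python
import csv
import io


def load_labels(csv_file, column='class'):
    """Map each integer ID to the text in `column` ('class' or 'Chinese')."""
    reader = csv.DictReader(csv_file)
    return {int(row['ID']): row[column] for row in reader}


# Tiny in-memory stand-in for imagenet_class_index.csv; the rows are made up
# for illustration and only the column names come from the code above.
sample = io.StringIO('ID,Chinese,class\n0,示例甲,example_a\n1,示例乙,example_b\n')
print(load_labels(sample, column='class'))

# With the real file (encoding assumed to be UTF-8):
# with open('imagenet_class_index.csv', newline='', encoding='utf-8') as f:
#     idx_to_labels = load_labels(f, column='Chinese')
```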

Template for capturing and processing each camera frame (Chinese labels)

from PIL import Image, ImageFont, ImageDraw
import onnxruntime
import torch
import torch.nn.functional as F
from torchvision import transforms
import pandas as pd
import cv2
import numpy as np
import time
from tqdm import tqdm

# Load the Chinese font once and set the font size
font = ImageFont.truetype('SimHei.ttf', 32)

# Load the ONNX model once and create the ONNX Runtime inference session
cuda = torch.cuda.is_available()
providers = ['CUDAExecutionProvider', 'CPUExecutionProvider'] if cuda else ['CPUExecutionProvider']
ort_session = onnxruntime.InferenceSession('resnet18_imagenet.onnx', None, providers=providers)

# Load the 1000 ImageNet class labels (Chinese names)
df = pd.read_csv('imagenet_class_index.csv')
idx_to_labels = {}
for idx, row in df.iterrows():
    idx_to_labels[row['ID']] = row['Chinese']

# Test-set preprocessing (RCTN): resize, crop, convert to Tensor, normalize
test_transform = transforms.Compose([transforms.Resize(256),
                                     transforms.CenterCrop(256),
                                     transforms.ToTensor(),
                                     transforms.Normalize(
                                         mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])
                                     ])


# Frame-processing function
def process_frame(img_bgr):
    '''
    Takes a BGR array captured from the camera and returns a BGR array
    annotated with the classification results. The font, model, labels and
    transform are created once at module level instead of on every frame.
    '''
    # Record the time at which processing of this frame starts
    start_time = time.time()

    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)  # BGR to RGB
    img_pil = Image.fromarray(img_rgb)  # array to PIL

    ## Preprocess
    input_img = test_transform(img_pil)
    input_tensor = input_img.unsqueeze(0).numpy()

    ## ONNX Runtime prediction
    ort_inputs = {'input': input_tensor}  # ONNX Runtime input
    pred_logits = ort_session.run(['output'], ort_inputs)[0]  # ONNX Runtime output
    pred_logits = torch.tensor(pred_logits)
    pred_softmax = F.softmax(pred_logits, dim=1)  # apply softmax to the logits

    ## Parse the classification results
    n = 5
    top_n = torch.topk(pred_softmax, n)  # keep the n most confident results
    pred_ids = top_n[1].cpu().detach().numpy().squeeze()  # predicted class IDs
    confs = top_n[0].cpu().detach().numpy().squeeze()  # confidences

    ## Draw the Chinese labels on the image
    draw = ImageDraw.Draw(img_pil)
    for i in range(len(confs)):
        pred_class = idx_to_labels[pred_ids[i]]

        # Draw Chinese text: position, string, font, RGBA color
        text = '{:<15} {:>.3f}'.format(pred_class, confs[i])  # Chinese string
        draw.text((50, 100 + 50 * i), text, font=font, fill=(255, 0, 0, 1))

    img_rgb = np.array(img_pil)  # PIL to array
    img_bgr = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2BGR)  # RGB to BGR

    # Record the time at which processing of this frame ends
    end_time = time.time()
    # Compute the frames per second (FPS), guarding against a zero time difference
    FPS = 1 / max(end_time - start_time, 1e-6)
    # Arguments: image, text, bottom-left corner of the text, font, font scale, color, thickness, line type
    img_bgr = cv2.putText(img_bgr, 'FPS  ' + str(int(FPS)), (50, 80), cv2.FONT_HERSHEY_SIMPLEX, 2, (255, 0, 255), 4,
                          cv2.LINE_AA)
    return img_bgr

# Open the camera; passing 0 selects the system default camera
cap = cv2.VideoCapture(0)
# Loop until a break is triggered
while cap.isOpened():
    # Grab a frame
    success, frame = cap.read()
    if not success:  # stop if no frame could be read
        print('Failed to read a frame, exiting')
        break
    ## Process the frame
    frame = process_frame(frame)
    # Show the processed three-channel image
    cv2.imshow('my_window', frame)
    key_pressed = cv2.waitKey(60)  # wait up to 60 ms and get the key that was pressed, if any
    # print('Key pressed:', key_pressed)
    if key_pressed in [ord('q'), 27]:  # press q or Esc to quit (with an English input method active)
        break
# Release the camera
cap.release()
# Close the image window
cv2.destroyAllWindows()

Origin blog.csdn.net/qq_50993557/article/details/132854171