Batch image text recognition in Python, extracting a specified region

When you need to read numbers from the same region of many images at work, Python can do it in batches.

The libraries needed are easyocr, opencv, os, and matplotlib; the first two are the core, and pandas is used to save the results.

opencv handles the image cropping and easyocr handles the text recognition.

Installation

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

pip install opencv-contrib-python

pip install opencv-python

pip install opencv-python-headless

pip install easyocr

# -*- coding: utf-8 -*-
import easyocr
import os
import cv2
import pandas as pd

Traverse the images and crop each one

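# im_path: directory containing the original images

def clip_image(im_path):
    save_path = r'E:\pythonprograms\easyocr\img\img2'  # directory for the cropped images
    filelist = os.listdir(im_path)
    for i, file in enumerate(filelist):
        file_path = os.path.join(im_path, file)
        im = cv2.imread(file_path)
        if im is None:  # cv2.imread returns None for files it cannot read as images
            continue
        # [rows (h), columns (w)]: adjust to where the target sits in your own images
        im = im[19:38, 256:353]
        # use the running index as the output file name
        save_path_file = os.path.join(save_path, str(i) + ".jpg")
        cv2.imwrite(save_path_file, im)

im_path = r'E:\pythonprograms\easyocr\img\img'  # directory of images before cropping
clip_image(im_path)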
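The cropped files are named 0.jpg, 1.jpg, 2.jpg and so on, and they are later read back with os.listdir, which does not promise numeric order. With ten or more images, a plain alphabetical sort would put 10.jpg before 2.jpg. The following is a minimal sketch of a numeric-aware sort; the helper name natural_key is hypothetical and not part of the original post.

```python
import re

def natural_key(name):
    """Split a file name into text and integer chunks so '10.jpg' sorts after '2.jpg'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r'(\d+)', name)]

# Example names in the style written by clip_image
names = ['10.jpg', '2.jpg', '0.jpg', '1.jpg']
ordered = sorted(names, key=natural_key)
print(ordered)  # ['0.jpg', '1.jpg', '2.jpg', '10.jpg']
```

In the recognition step this could be applied as sorted(os.listdir(filepath), key=natural_key), so that row n of the results matches n.jpg.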

Batch text recognition, joining the results, and saving them to Excel

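render = easyocr.Reader(['ch_sim', 'en'])  # recognize Simplified Chinese and English
filepath = r'E:\pythonprograms\easyocr\img\img2'  # directory of cropped images
file = os.listdir(filepath)
spa = []
for f in file:
    url = os.path.join(filepath, f)
    # detail=0 returns only the recognized text, without bounding boxes or confidence scores
    content = render.readtext(url, detail=0)
    s = ' '.join(content)  # join the text fragments found in one image
    spa.append(s)
b2 = pd.DataFrame(spa)
# Writing .xls relies on the xlwt engine, which pandas 2.0 and later no longer support;
# with a newer pandas, save as "results.xlsx" instead (requires openpyxl).
b2.to_excel("results.xls")
b2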
0 4.818 ruicms
1 8.907 ruicms
2 2.556 ruicms
3 3.280 ruicms
4 3.189 rnicms
5 3.028 rnicm
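Each recognized string mixes the number with the unit text, and the unit is sometimes misread (ruicms, rnicms, rnicm). If only the number is needed, it can be separated before saving. This is a rough sketch under the assumption that every reading contains at most one decimal number; split_reading is a hypothetical helper, not something from easyocr or the original post.

```python
import re

def split_reading(text):
    """Return (value, rest) for strings like '4.818 ruicms'; value is None if no number is found."""
    match = re.search(r'[-+]?\d+(?:\.\d+)?', text)
    if match is None:
        return None, text.strip()
    value = float(match.group())
    # Everything outside the matched number is kept as the leftover text
    rest = (text[:match.start()] + text[match.end():]).strip()
    return value, rest

samples = ['4.818 ruicms', '3.189 rnicms', 'no digits']
parsed = [split_reading(s) for s in samples]
print(parsed)
```

The numeric values could then go into their own DataFrame column, which makes the misread unit text harmless.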

Related information

There are three common ways to read images: matplotlib, opencv, and PIL. For displaying them, matplotlib is the simplest.

To locate the region you want to extract, open the image in a drawing program.

Moving the mouse over the image shows the pixel coordinates under the cursor, which you can then use to crop the corresponding part of the image.

image1=r'E:\pythonprograms\easyocr\img\img\R-0148_01692_Screenshot.png'
im=cv2.imread(image1)
im2=im[19:38,256:353] 
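Drawing programs usually report a position as x (left) and y (top), while the array returned by cv2.imread is indexed as [row, column], that is [y, x]. The sketch below converts a left/top/width/height box into the two slices; box_to_slices is a hypothetical helper introduced only for illustration.

```python
def box_to_slices(x, y, width, height):
    """Convert a box given as left, top, width, height into (row_slice, column_slice)."""
    if min(x, y) < 0 or min(width, height) <= 0:
        raise ValueError("box must have a non-negative origin and a positive size")
    return slice(y, y + height), slice(x, x + width)

# The crop used in this post, im[19:38, 256:353], corresponds to this box:
rows, cols = box_to_slices(x=256, y=19, width=97, height=19)
print(rows, cols)  # slice(19, 38, None) slice(256, 353, None)
# With an image loaded by cv2.imread, the crop would be: im2 = im[rows, cols]
```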

Once cv2 has read the image, you can inspect its attributes and crop it.

https://blog.csdn.net/yukinoai/article/details/86423937

# Get the image attributes
shape = im.shape
print('Image shape: ', shape)  # rows, columns, channels
size = im.size
print('Number of array elements: ', size)  # rows * columns * channels
dtype = im.dtype
print('Image data type: ', dtype)  # data type of the pixel values

Read pictures with matplotlib 

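import matplotlib.image as mpimg  # for reading images
import matplotlib.pyplot as plt  # for displaying images
# Jupyter notebook magic; remove this line when running as a .py script
%matplotlib inline

image = mpimg.imread(image1)
plt.title('Cropped region')
plt.axis('off')  # hide the axes
# cv2 loads images in BGR order, so convert to RGB before displaying with matplotlib
plt.imshow(cv2.cvtColor(im2, cv2.COLOR_BGR2RGB))
plt.show()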

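import matplotlib.image as mpimg  # for reading images
import matplotlib.pyplot as plt  # for displaying images
# Jupyter notebook magic; remove this line when running as a .py script
%matplotlib inline

image = mpimg.imread(image1)
plt.title('Read Image by Matplotlib')
plt.axis('off')  # hide the axes
plt.imshow(image)
plt.show()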

Problems encountered along the way

easyocr import error

easyocr installs without problems, but importing it raises an error.

Running the .py script reports:

OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.

OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see Intel® Product Support

The same error appears in an .ipynb notebook.

Solution:

Delete the libiomp5md.dll file under E:\Anaconda\anacanda3\Library\bin, or rename it with a different extension so it is no longer loaded.

Then it will work.
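The error message itself names another way out: setting the environment variable KMP_DUPLICATE_LIB_OK=TRUE. It describes this as an unsafe, unsupported workaround that may cause crashes or incorrect results, so treat the lines below as a stopgap sketch and not as a fix. The variable has to be set before easyocr is imported.

```python
import os

# Must run before easyocr is imported, i.e. before an OpenMP runtime is loaded.
# The OMP message itself calls this workaround unsafe and unsupported.
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# import easyocr  # import the OCR library only after the variable is set
```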

cv2 display image error

cv2.imshow(" ", img) kept raising an error that I never managed to resolve, so I display images with matplotlib instead.
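One possible cause, which is an assumption since the post does not diagnose the error: the installation steps put opencv-python, opencv-contrib-python, and opencv-python-headless into the same environment. The headless builds ship without GUI support, and the OpenCV packaging notes advise installing only one of these packages. The sketch below lists which OpenCV distributions are installed, using only the standard library; installed_opencv_packages is a hypothetical helper name.

```python
from importlib import metadata

def installed_opencv_packages():
    """Return the names of installed distributions whose name starts with 'opencv'."""
    names = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or ""
        if name.lower().startswith("opencv"):
            names.add(name.lower())
    return sorted(names)

found = installed_opencv_packages()
print(found)
if any("headless" in name for name in found):
    print("A headless OpenCV build is installed; cv2.imshow may be unavailable.")
```

If more than one package shows up, uninstalling all of them and reinstalling a single non-headless one is worth trying before giving up on cv2.imshow.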

Origin blog.csdn.net/weixin_42984235/article/details/128003120