Basic usage of Baidu PaddlePaddle's PaddleOCR

PaddleOCR aims to create a rich, leading, and practical OCR tool library to help developers train better models and implement applications.

PaddleOCR is an image recognition library. I have only used its OCR function, which recognizes the text in a picture. Its other functions are also very powerful, but I have not used them.

Install

Update: As of December 21, 2022, paddlepaddle is 2.4.1, paddleocr is 2.6.1.2, and numpy, which is installed automatically, is 1.24.0. If recognition fails with the error AttributeError: module 'numpy' has no attribute 'int', install numpy 1.23.4 instead. That combination works normally on my server (the live production environment), which runs paddlepaddle 2.3.2, paddleocr 2.6.1.0, and numpy 1.23.4.
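The error above matches the removal of the long-deprecated np.int alias in NumPy 1.24. As a minimal sketch, the check below decides from a version string whether the pin is needed. The helper names parse_version and needs_numpy_pin are made up for illustration and are not part of any library.

```python
def parse_version(version):
    """Turn '1.24.0' into (1, 24, 0); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def needs_numpy_pin(numpy_version):
    """True if this NumPy release no longer provides the np.int alias (1.24 and later)."""
    return parse_version(numpy_version) >= (1, 24)


for v in ("1.23.4", "1.24.0"):
    print(v, "-> pin to 1.23.4" if needs_numpy_pin(v) else "-> ok")
```

In a real environment you would pass numpy.__version__ to needs_numpy_pin and, if it returns True, run pip install numpy==1.23.4.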

Update: As of November 28, 2022, paddlepaddle is 2.4.0 and paddleocr is 2.6.1.1. On Linux, shapely no longer needs to be installed manually; it is installed automatically along with paddleocr. In addition, if running the code on CentOS reports the error ImportError: libpython3.8.so.1.0: cannot open shared object file: No such file or directory, this article can solve it. I did not have this problem before, so it may be caused by the version update.

Linux

  • pip install paddlepaddle
  • pip install shapely
  • pip install paddleocr
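After installing, it can help to confirm which versions pip actually picked, since the notes above depend on exact version numbers. The following is a small sketch using only the standard library; installed_versions is a hypothetical helper name.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["paddlepaddle", "shapely", "paddleocr", "numpy"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

importlib.metadata is available from Python 3.8 onward, which matches the Python version used in this post.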

Windows

  • pip install paddlepaddle
  • Download the shapely whl file manually from here and install it with pip install xxx.whl. It is best to give the absolute path of the whl file, or to cd into the directory that contains it in cmd beforehand. Do not run pip install shapely directly, because that causes problems. To find out which file to download, run pip debug --verbose in cmd and choose the file that matches your Python version (mine is Python 3.8).

  • pip install paddleocr. This may fail with ERROR: Failed building wheel for python_Levenshtein. If it does, download that whl file manually from here as well, install it, and then install paddleocr again.
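Choosing the right whl file comes down to matching the tags in the file name against the tags your interpreter supports (the list that pip debug --verbose prints). A rough sketch of that matching follows. The helper names and the example file names are illustrative only, and the check covers only simple names of the form name-version-python-abi-platform.whl.

```python
import sys


def python_tag(version_info=sys.version_info):
    """CPython tag for an interpreter version, e.g. (3, 8) -> 'cp38'."""
    return f"cp{version_info[0]}{version_info[1]}"


def wheel_matches(filename, py_tag, platform_tag):
    """Check the python and platform tags encoded at the end of a wheel file name."""
    if not filename.endswith(".whl"):
        return False
    parts = filename[:-len(".whl")].split("-")
    if len(parts) < 5:
        return False
    file_py, _abi, file_plat = parts[-3:]
    return py_tag in file_py.split(".") and platform_tag in file_plat.split(".")


# Illustrative file names only; use the names shown on the download page.
candidates = [
    "Shapely-1.8.2-cp38-cp38-win_amd64.whl",
    "Shapely-1.8.2-cp310-cp310-win_amd64.whl",
    "Shapely-1.8.2-cp38-cp38-win32.whl",
]
for name in candidates:
    print(name, wheel_matches(name, python_tag((3, 8)), "win_amd64"))
```

For 64-bit Python 3.8 on Windows, only the first candidate matches. The output of pip debug --verbose remains the authoritative list of supported tags.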

Use

The following code adds a few things to the official sample code (version 2.6) and the official sample code (version 2.6, including parameter descriptions), such as importing packages, turning off log printing, and speeding up recognition. On my machine, recognizing one picture took 3.1 seconds before the speedup and 1.5 seconds after it. I do not use a GPU, and the exact time depends on the performance of the machine.
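For the log printing part, the standard library offers two switches: logging.disable, which is global, and a per-logger level. The sketch below only demonstrates how the two behave. The logger name "ppocr" is an assumption about what PaddleOCR uses internally and may need to be changed.

```python
import logging

# Assumption: "ppocr" is the logger name PaddleOCR uses; change it if yours differs.
ocr_logger = logging.getLogger("ppocr")
ocr_logger.setLevel(logging.DEBUG)

# Option 1: global switch. Suppresses DEBUG (and anything lower) for all loggers.
logging.disable(logging.DEBUG)
debug_enabled = ocr_logger.isEnabledFor(logging.DEBUG)
warning_enabled = ocr_logger.isEnabledFor(logging.WARNING)
print(debug_enabled, warning_enabled)

# Option 2: undo the global switch and quiet only this one logger.
logging.disable(logging.NOTSET)
ocr_logger.setLevel(logging.ERROR)
warning_after = ocr_logger.isEnabledFor(logging.WARNING)
error_after = ocr_logger.isEnabledFor(logging.ERROR)
print(warning_after, error_after)
```

The global switch is the simplest, but it also silences DEBUG messages from your own code. The per-logger level leaves other loggers untouched.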

The parameter description of PaddleOCR() is at the bottom of the official sample code page.

from paddleocr import PaddleOCR, draw_ocr, paddleocr
import logging

# paddleocr.logging.disable(logging.DEBUG)  # Turn off DEBUG log output; takes effect when using PaddleOCR(enable_mkldnn=True, use_tensorrt=True, use_angle_cls=False, lang="ch")
# Another way to turn off log output (untested): https://github.com/PaddlePaddle/PaddleOCR/issues/2467
# paddleocr.logging.disable(logging.WARNING)  # Turn off WARNING log output

# PaddleOCR currently supports Chinese and English, English, French, German, Korean, and Japanese; switch between them with the lang parameter.
# The corresponding values are `ch`, `en`, `french`, `german`, `korean`, `japan`.
ocr = PaddleOCR(use_angle_cls=True, lang="ch")
# ocr = PaddleOCR(enable_mkldnn=True, use_tensorrt=True, use_angle_cls=False, lang="ch")  # enable_mkldnn is the acceleration library for Intel chips; recognizing one ID card takes about 1.5 seconds
# Sources: https://www.cnblogs.com/newmiracle/p/15358230.html and https://www.cnblogs.com/newmiracle/p/15346284.html and https://github.com/PaddlePaddle/PaddleOCR/issues/1500
# Official documentation for the enable_mkldnn parameter: https://github.com/PaddlePaddle/PaddleOCR/blob/release%2F2.5/doc/doc_ch/FAQ.md#qpaddleocr%E4%B8%AD%E5%AF%B9%E4%BA%8E%E6%A8%A1%E5%9E%8B%E9%A2%84%E6%B5%8B%E5%8A%A0%E9%80%9Fcpu%E5%8A%A0%E9%80%9F%E7%9A%84%E9%80%94%E5%BE%84%E6%9C%89%E5%93%AA%E4%BA%9B%E5%9F%BA%E4%BA%8Etenorrt%E5%8A%A0%E9%80%9Fgpu%E5%AF%B9%E8%BE%93%E5%85%A5%E6%9C%89%E4%BB%80%E4%B9%88%E8%A6%81%E6%B1%82
# Path of the image to recognize
img_path = r"d:\Desktop\4A34A16F-6B12-4ffc-88C6-FC86E4DF6912.png"
# Run detection and recognition; result holds the recognized lines
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)

from PIL import Image
image = Image.open(img_path).convert('RGB')
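# Each line is expected to look like [box, (text, score)].
# If your paddleocr version returns one list per image (2.6.1.x appears to), unwrap it first: result = result[0]
boxes = [line[0] for line in result]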
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores)
im_show = Image.fromarray(im_show)
im_show.show()
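The indexing above (line[0], line[1][0], line[1][1]) assumes that each recognized line is shaped like [box, (text, score)]. As a sketch under that assumption, the helpers below unwrap a result that may or may not be nested per image and keep only confident text. as_lines and extract_text are hypothetical names, and the sample data is hand-written, not real OCR output.

```python
def as_lines(result):
    """Return the recognized lines for a single image.

    Assumption: a line looks like [box, (text, score)]. Some paddleocr
    releases return the lines directly, others wrap them in one list per image.
    """
    if not result:
        return []
    first = result[0]
    looks_like_line = (
        isinstance(first, (list, tuple))
        and len(first) == 2
        and isinstance(first[1], (list, tuple))
        and len(first[1]) == 2
        and isinstance(first[1][0], str)
    )
    return list(result) if looks_like_line else list(first or [])


def extract_text(lines, min_score=0.5):
    """Keep the text of lines whose confidence is at least min_score."""
    return [text for _box, (text, score) in lines if score >= min_score]


# Hand-written sample data in the assumed shape, not real OCR output.
sample = [
    [[[10, 10], [120, 10], [120, 40], [10, 40]], ("Name", 0.98)],
    [[[10, 50], [120, 50], [120, 80], [10, 80]], ("l1|I", 0.31)],
]
print(extract_text(as_lines(sample)))    # lines returned directly
print(extract_text(as_lines([sample])))  # lines wrapped in a per-image list
```

Both calls print the same list, so the rest of the code does not need to care which shape the installed version returns.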

For the rendered results, see the reference links (the official sample code also shows them). I followed their steps, but I ran into problems when installing on Windows, so I am recording them here.

If recognition is still too slow after acceleration, you can pay for the recognition API of Baidu, Alibaba Cloud, or Tencent Cloud, or buy a server with a GPU. If you know other free ways to speed it up, please let me know, thanks.
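To judge whether a setting such as enable_mkldnn really helps on your machine, it is worth measuring rather than guessing. Here is a minimal timing sketch. average_seconds is a hypothetical helper, and fake_recognize is a stand-in so the example runs without PaddleOCR installed.

```python
import time


def average_seconds(fn, *args, repeats=3, **kwargs):
    """Call fn repeats times and return the mean wall-clock seconds per call."""
    fn(*args, **kwargs)  # warm-up call, not timed
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args, **kwargs)
    return (time.perf_counter() - start) / repeats


def fake_recognize(path):
    """Stand-in for ocr.ocr so the sketch runs without PaddleOCR."""
    time.sleep(0.01)
    return path


print(f"{average_seconds(fake_recognize, 'demo.png'):.3f} s per image")
```

With the real library, the call would be something like average_seconds(ocr.ocr, img_path, cls=True), run once for each PaddleOCR configuration you want to compare. The warm-up call keeps one-time startup cost in the first call out of the average.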

Here is also an introduction to the text orientation classifier (version 2.6). I do not know how to use it in code; it may be controlled by the use_angle_cls parameter. The link with the parameter descriptions covers it, but I could not understand the explanation. For example, if an ID card is photographed vertically, the picture has to be rotated 90 degrees to the left or right to make the text horizontal. I do not know how to make PaddleOCR rotate the picture before recognition, or whether PaddleOCR rotates it automatically. In practice I set use_angle_cls to False, because recognition is slow when it is True and I only have a CPU. The version 2.6 introduction says:

The text direction classifier is mainly used in scenes where the picture is not at 0 degrees. In such scenes, the text lines detected in the picture need to be normalized. In the PaddleOCR system, the text line pictures obtained from text detection are sent to the recognition model after an affine transformation. At that point, the text only needs to be classified as 0 or 180 degrees, so PaddleOCR's built-in text direction classifier only supports classification of 0 and 180 degrees. If you want to support more angles, you can modify the algorithm yourself.
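Since the built-in classifier only distinguishes 0 and 180 degrees, one brute-force workaround for pictures taken sideways is to rotate the picture yourself, run recognition at each candidate angle, and keep the angle with the highest average confidence. This is only a sketch of that idea and not something PaddleOCR does internally. best_rotation and mean_score are hypothetical names, the scores are made up, and lines are assumed to look like [box, (text, score)].

```python
def mean_score(lines):
    """Average recognition confidence; 0.0 when nothing was recognized."""
    scores = [score for _box, (_text, score) in lines]
    return sum(scores) / len(scores) if scores else 0.0


def best_rotation(recognize, angles=(0, 90, 270)):
    """Try each angle and return (angle, lines) with the highest mean score.

    recognize(angle) must rotate the image by angle degrees, run OCR on it,
    and return lines shaped like [box, (text, score)].
    """
    best = None
    for angle in angles:
        lines = recognize(angle)
        candidate = (mean_score(lines), angle, lines)
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best[1], best[2]


# Stand-in recognizer with made-up scores, so the sketch runs without PaddleOCR.
fake_results = {
    0: [[None, ("|||", 0.20)]],
    90: [[None, ("ID 1234", 0.95)]],
    270: [[None, ("tEZI QI", 0.40)]],
}
angle, lines = best_rotation(lambda a: fake_results[a])
print(angle, [text for _box, (text, _score) in lines])
```

With real images, recognize could rotate a PIL image with image.rotate(angle, expand=True), convert it to a NumPy array, and pass that to ocr.ocr. Note that trying three angles costs roughly three times the recognition time, which matters on a CPU-only machine.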

Reference links

Several entry-level Python OCR recognition libraries suitable for beginners

Baidu OCR (Text Recognition) Service Use Guide

python, use (pip install .) Failed building wheel for python_Levenshtein solution_A Wu Guangzhi's Blog-CSDN Blog

Python syntax problem-module 'pip._internal' has no attribute 'pep425tags'. Reasons and solutions, 32-bit and 64-bit view pip support universal method_Struggling blue algae blog-CSDN blog_pip._internal

Python paddleocr method to increase recognition speed - newmiracle universe - Blog Park (cnblogs.com)

The method of paddleocr to improve recognition - newmiracle universe - Blog Park (cnblogs.com)

If you have other questions, you can also read the official FAQ 

Origin blog.csdn.net/fj_changing/article/details/126243370