Baidu PaddlePaddle OCR table recognition: an introduction with Python practice

1. Baidu PaddlePaddle

PaddlePaddle is a deep learning platform launched by Baidu that provides developers with a deep learning framework and supporting tools. Its ecosystem covers a variety of functions, including OCR (Optical Character Recognition), which lets developers add efficient text recognition to their applications. Official website: https://www.paddlepaddle.org.cn/.

For first-time use, install PaddlePaddle:

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple paddlepaddle

To verify the installation, start the Python interpreter with python, enter import paddle, and then enter paddle.utils.run_check().

python
Python 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
>>> paddle.utils.run_check()
Running verify PaddlePaddle program ...
I0904 17:11:21.570567 15712 interpretercore.cc:237] New Executor is Running.
I0904 17:11:21.702833 15712 interpreter_util.cc:518] Standalone Executor is Used.
PaddlePaddle works well on 1 CPU.
PaddlePaddle is installed successfully! Let’s start deep learning with PaddlePaddle now.
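
Before starting the interpreter, it can be handy to confirm that the packages are visible to Python at all. The following is a minimal sketch that uses only the standard library; the helper name is_installed is hypothetical and is not part of PaddlePaddle. It reports whether a module can be found on the import path without actually importing it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "paddle" is the import name of the paddlepaddle package.
    for name in ("paddle", "paddleocr"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

A "found" result only means the module is importable; paddle.utils.run_check() is still the check that the framework actually runs.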

2. PaddleOCR

PaddleOCR is a text recognition development kit that aims to be a rich, leading, and practical OCR tool library. It has open-sourced ultra-lightweight and general-purpose Chinese and English OCR models based on PP-OCR, as well as multilingual OCR models for German, French, Japanese, Korean, and other languages. It also provides training methods for these models and several inference deployment options. In addition, the text style data synthesis tool Style-Text and the semi-automatic text image annotation tool PPOCRLabel are open source.

[Figure: the basic PaddleOCR text recognition flow]

2.1. Install PaddleOCR

If your enterprise has clear, domain-specific OCR requirements, PaddleX, a one-stop, full-process development platform, is recommended to help put AI technology into production quickly.

First, download the Shapely wheel that matches your Python version (address: https://www.lfd.uci.edu/~gohlke/pythonlibs/) and install it, then install PaddleOCR.

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple e:\software\python\Shapely-1.8.2-cp38-cp38-win_amd64.whl

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple paddleocr

The first example is general-purpose OCR text recognition.

from paddleocr import PaddleOCR, draw_ocr

# PaddleOCR supports multiple languages; switch between them with the lang parameter,
# for example `ch`, `en`, `fr`, `german`, `korean`, `japan`
ocr = PaddleOCR(use_angle_cls=True, lang="ch")  # need to run only once to download and load model into memory
img_path = './imgs/11.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# Draw the results on the image and save it
from PIL import Image
result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
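
As the indexing in the sample shows (line[0] for the box, line[1][0] for the text, line[1][1] for the score), each entry of a page result has the form [box, (text, score)]. A common next step is to drop low-confidence lines. The following is a minimal sketch in plain Python; the helper filter_by_score, the 0.8 threshold, and the sample data are hypothetical, and treating a None page as empty is a defensive assumption for images where no text is detected.

```python
def filter_by_score(page_result, min_score=0.8):
    """Keep (text, score) pairs whose recognition score is at least min_score.

    page_result is one page of OCR output: a list of [box, (text, score)] entries.
    A None page (assumed to mean "no text detected") is treated as empty.
    """
    kept = []
    for box, (text, score) in page_result or []:
        if score >= min_score:
            kept.append((text, score))
    return kept


# Hypothetical data in the same shape as result[0] from ocr.ocr(...)
sample_page = [
    [[[10, 10], [120, 10], [120, 40], [10, 40]], ("PaddleOCR", 0.98)],
    [[[10, 50], [120, 50], [120, 80], [10, 80]], ("blurry text", 0.42)],
]

for text, score in filter_by_score(sample_page):
    print(f"{score:.2f}  {text}")
```

With real output, pass result[0] (the first page) instead of sample_page and tune min_score to the image quality.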

My Python environment, for reference:

  • Operating system: Windows 10 Professional, version 22H2
  • Python 3.8.10
  • Installed packages: see the appendix for the full list.
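
To compare another machine against this environment, the installed versions can be read with the standard library. This is a minimal sketch; the helper names installed_version and compare_versions are hypothetical, and the expected versions are copied from the appendix of this article.

```python
from importlib import metadata

# Versions copied from the appendix of this article.
EXPECTED = {"paddlepaddle": "2.5.1", "paddleocr": "2.7.0.2", "Shapely": "1.8.2"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected):
    """Map each package name to an (expected, installed) pair."""
    return {name: (want, installed_version(name)) for name, want in expected.items()}


if __name__ == "__main__":
    for name, (want, have) in compare_versions(EXPECTED).items():
        print(f"{name}: expected {want}, installed {have}")
```

A mismatch is not necessarily a problem; the pairs are only meant to make differences visible when an example behaves differently.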

2.2. PP-Structure Quick Start

PP-Structure is a PaddlePaddle-based toolkit for document structure analysis, including layout analysis and table structure recognition, that helps developers quickly identify and extract table structures.

Table recognition. The input image is a web table with a watermark.
Official sample code:

import os
import cv2
from paddleocr import PPStructure,draw_structure_result,save_structure_res

table_engine = PPStructure(show_log=True)

save_folder = 'output'
img_path = 'img/12.jpg'
img = cv2.imread(img_path)
result = table_engine(img)
save_structure_res(result, save_folder,os.path.basename(img_path).split('.')[0])

for line in result:
    line.pop('img')
    print(line)

from PIL import Image

font_path = r'C:\Windows\Fonts\simfang.ttf'   # raw string avoids backslash escapes; PaddleOCR also provides font files
image = Image.open(img_path).convert('RGB')
im_show = draw_structure_result(image, result,font_path=font_path)
im_show = Image.fromarray(im_show)
im_show.save('result2.jpg')
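
Each printed entry is a dict describing one detected region. The sketch below assumes, based on the PaddleOCR 2.7 quickstart, that a region has the keys type, bbox, and res, and that for a table region res contains the recognized table as an HTML string under html; check this against the output of your own version. It converts that HTML into a list of rows using only the standard library. The names TableRowParser and table_html_to_rows and the sample region are hypothetical.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of <td>/<th> cells, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_html_to_rows(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical region in the assumed shape of one PP-Structure result entry
sample_region = {
    "type": "table",
    "bbox": [0, 0, 200, 100],
    "res": {"html": "<html><body><table><tr><td>Name</td><td>Qty</td></tr>"
                    "<tr><td>Pen</td><td>3</td></tr></table></body></html>"},
}

if sample_region["type"] == "table":
    for row in table_html_to_rows(sample_region["res"]["html"]):
        print(row)
```

The sketch ignores rowspan and colspan attributes, so merged cells appear only once in their first row.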

On the first run, the table and layout models are downloaded automatically:

download https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar to C:\Users\xiaoyw/.paddleocr/whl\table\ch_ppstructure_mobile_v2.0_SLANet_infer\ch_ppstructure_mobile_v2.0_SLANet_infer.tar
100%| 10.3M/10.3M [00:01<00:00, 6.69MiB/s]
download https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar to C:\Users\xiaoyw/.paddleocr/whl\layout\picodet_lcnet_x1_0_fgd_layout_cdla_infer\picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar
100%|| 10.1M/10.1M [00:00<00:00, 10.2MiB/s]
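
The log shows the models being stored under .paddleocr/whl in the user's home directory. To see what has already been downloaded, the directory can be walked with pathlib. This is a minimal sketch; the helper list_cached_models is hypothetical, and the cache location is inferred from the log rather than from PaddleOCR documentation.

```python
from pathlib import Path


def list_cached_models(cache_root):
    """Return sorted relative paths of directories under cache_root that contain files."""
    root = Path(cache_root)
    if not root.is_dir():
        return []
    found = set()
    for path in root.rglob("*"):
        if path.is_file():
            # Record the directory holding the file, relative to the cache root.
            found.add(path.parent.relative_to(root).as_posix())
    return sorted(found)


if __name__ == "__main__":
    # Location inferred from the download log: <home>/.paddleocr/whl
    for name in list_cached_models(Path.home() / ".paddleocr" / "whl"):
        print(name)
```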

References:

VipSoft. Baidu PaddlePaddle - PaddleHub OCR text recognition is simple to use. Blog Park. 2023.05
Autobot. What are the differences between the three frameworks PyTorch, TensorFlow and PaddlePaddle? Zhihu. 2022.08
https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/ppstructure/docs/quickstart.md

Appendix (installed packages, pip list output):

Package                   Version
------------------------- -----------
anyio                     4.0.0
argon2-cffi               23.1.0
argon2-cffi-bindings      21.2.0
arrow                     1.2.3
astor                     0.8.1
asttokens                 2.3.0
async-lru                 2.0.4
attrdict                  2.0.1
attrs                     23.1.0
Babel                     2.12.1
backcall                  0.2.0
bce-python-sdk            0.8.90
beautifulsoup4            4.12.2
bleach                    6.0.0
blinker                   1.6.2
cachetools                5.3.1
certifi                   2023.7.22
cffi                      1.15.1
charset-normalizer        3.2.0
click                     8.1.7
colorama                  0.4.6
comm                      0.1.4
contourpy                 1.1.0
cssselect                 1.2.0
cssutils                  2.7.1
cycler                    0.11.0
Cython                    3.0.2
debugpy                   1.6.7.post1
decorator                 5.1.1
defusedxml                0.7.1
dnspython                 2.4.2
et-xmlfile                1.1.0
exceptiongroup            1.1.3
executing                 1.2.0
fastjsonschema            2.18.0
fire                      0.5.0
flask                     2.3.3
flask-babel               3.1.0
fonttools                 4.42.1
fqdn                      1.5.1
future                    0.18.3
h11                       0.14.0
httpcore                  0.17.3
httpx                     0.24.1
idna                      3.4
imageio                   2.31.3
imgaug                    0.4.0
importlib-metadata        6.8.0
importlib-resources       6.0.1
ipykernel                 6.25.1
ipython                   8.12.2
ipython-genutils          0.2.0
ipywidgets                8.1.0
isoduration               20.11.0
itsdangerous              2.1.2
jedi                      0.19.0
Jinja2                    3.1.2
joblib                    1.3.2
json5                     0.9.14
jsonpointer               2.4
jsonschema                4.19.0
jsonschema-specifications 2023.7.1
kiwisolver                1.4.5
lazy-loader               0.3
lmdb                      1.4.1
lxml                      4.9.3
MarkupSafe                2.1.3
matplotlib                3.7.2
matplotlib-inline         0.1.6
mistune                   3.0.1
nbclient                  0.8.0
nbconvert                 7.8.0
nbformat                  5.9.2
nest-asyncio              1.5.7
networkx                  3.1
notebook                  7.0.3
notebook-shim             0.2.3
numpy                     1.24.4
opencv-contrib-python     4.6.0.66
opencv-python             4.6.0.66
openpyxl                  3.1.2
opt-einsum                3.3.0
overrides                 7.4.0
packaging                 23.1
paddle-bfloat             0.1.7
paddleocr                 2.7.0.2
paddlepaddle              2.5.1
pandas                    2.0.3
pandocfilters             1.5.0
parso                     0.8.3
pdf2docx                  0.5.6
pickleshare               0.7.5
Pillow                    10.0.0
pip                       21.1.1
pkgutil-resolve-name      1.3.10
platformdirs              3.10.0
premailer                 3.10.0
prometheus-client         0.17.1
prompt-toolkit            3.0.39
protobuf                  3.20.2
psutil                    5.9.5
pure-eval                 0.2.2
pyclipper                 1.3.0.post4
pycparser                 2.21
pycryptodome              3.18.0
Pygments                  2.16.1
pymongo                   4.5.0
PyMuPDF                   1.20.2
pyparsing                 3.0.9
python-dateutil           2.8.2
python-docx               0.8.11
python-json-logger        2.0.7
pytz                      2023.3
PyWavelets                1.4.1
pywin32                   306
pywinpty                  2.0.11
PyYAML                    6.0.1
pyzmq                     25.1.1
qtconsole                 5.4.4
QtPy                      2.4.0
rapidfuzz                 3.2.0
rarfile                   4.0
referencing               0.30.2
requests                  2.31.0
rfc3339-validator         0.1.4
rfc3986-validator         0.1.1
rpds-py                   0.10.0
scikit-image              0.21.0
scikit-learn              1.3.0
scipy                     1.10.1
Send2Trash                1.8.2
setuptools                56.0.0
Shapely                   1.8.2
six                       1.16.0
sniffio                   1.3.0
soupsieve                 2.5
stack-data                0.6.2
termcolor                 2.3.0
terminado                 0.17.1
threadpoolctl             3.2.0
tifffile                  2023.7.10
tinycss2                  1.2.1
tomli                     2.0.1
tornado                   6.3.3
tqdm                      4.66.1
traitlets                 5.9.0
typing-extensions         4.7.1
tzdata                    2023.3
uri-template              1.3.0
urllib3                   2.0.4
visualdl                  2.5.3
wcwidth                   0.2.6
webcolors                 1.13
webencodings              0.5.1
websocket-client          1.6.2
werkzeug                  2.3.7
widgetsnbextension        4.0.8
zipp                      3.16.2

Origin blog.csdn.net/xiaoyw/article/details/132673587