1. Baidu PaddlePaddle
PaddlePaddle is a deep learning platform launched by Baidu that provides developers with a deep learning framework and supporting tools. Its ecosystem covers a range of capabilities, including OCR (Optical Character Recognition), which helps developers add efficient text recognition to their applications. Official website: https://www.paddlepaddle.org.cn/.
For first-time use, install it with pip (the -i option selects the Tsinghua PyPI mirror):
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple paddlepaddle
To verify the installation, run python to start the Python interpreter, enter import paddle, and then enter paddle.utils.run_check():
python
Python 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
>>> paddle.utils.run_check()
Running verify PaddlePaddle program ...
I0904 17:11:21.570567 15712 interpretercore.cc:237] New Executor is Running.
I0904 17:11:21.702833 15712 interpreter_util.cc:518] Standalone Executor is Used.
PaddlePaddle works well on 1 CPU.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
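The same check can also be run as a script, which is handy in automated setups. The following is a minimal sketch: the helper name paddle_available is hypothetical, and the script simply skips the self-check when the paddle package cannot be found.

```python
import importlib.util


def paddle_available():
    """Return True if the paddle package can be found in this environment."""
    return importlib.util.find_spec("paddle") is not None


if paddle_available():
    import paddle

    print("PaddlePaddle version:", paddle.__version__)
    # Same self-check as in the interactive session above
    paddle.utils.run_check()
else:
    print("paddle is not installed in this interpreter; run the pip command above first")
```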
2. PaddleOCR
PaddleOCR is a text recognition development kit that aims to be a rich, leading, and practical OCR tool library. It has open-sourced ultra-lightweight Chinese and English OCR models based on PP-OCR, general-purpose Chinese and English OCR models, and multilingual OCR models for German, French, Japanese, Korean, and other languages. It also provides training methods for these models and several inference deployment options. In addition, the text style data synthesis tool Style-Text and the semi-automatic text image annotation tool PPOCRLabel are open source.
The basic PaddleOCR text recognition process is shown in the figure below.
2.1. Install PaddleOCR
If your enterprise has a clearly defined OCR application in a specific vertical, PaddleX, a one-stop, end-to-end development platform, is recommended to speed up the deployment of AI technology.
First, download the Shapely wheel that matches your Python version and platform (from https://www.lfd.uci.edu/~gohlke/pythonlibs/) and install it, then install paddleocr:
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple e:\software\python\Shapely-1.8.2-cp38-cp38-win_amd64.whl
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple paddleocr
General-purpose OCR text recognition, a first example:
from paddleocr import PaddleOCR, draw_ocr
# The languages PaddleOCR currently supports can be switched by changing the lang parameter,
# for example `ch`, `en`, `fr`, `german`, `korean`, `japan`
ocr = PaddleOCR(use_angle_cls=True, lang="ch") # need to run only once to download and load model into memory
img_path = './imgs/11.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)
# Display the result
from PIL import Image
result = result[0]
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='./fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
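The indexing in the example above implies that each recognized line has the shape [box, (text, score)], where box is a list of [x, y] corner points. As a minimal sketch under that assumption, the helper below drops low-confidence lines and orders the rest top to bottom, then left to right. The function name filter_and_sort and the 0.8 threshold are illustrative choices, not part of PaddleOCR, and the data is synthetic.

```python
def filter_and_sort(lines, min_score=0.8):
    """Return (text, score) pairs with score >= min_score in reading order.

    Each line is assumed to look like [box, (text, score)], where box is a
    list of [x, y] corner points. An empty or None input gives an empty list.
    """
    kept = [line for line in (lines or []) if line[1][1] >= min_score]
    # Sort by the top edge of the box (smallest y), then by the left edge (smallest x)
    kept.sort(key=lambda line: (min(p[1] for p in line[0]),
                                min(p[0] for p in line[0])))
    return [(line[1][0], line[1][1]) for line in kept]


# Synthetic data in the assumed shape; real input would be result[0] from ocr.ocr(...)
sample = [
    [[[10, 50], [90, 50], [90, 70], [10, 70]], ("second line", 0.95)],
    [[[10, 10], [90, 10], [90, 30], [10, 30]], ("first line", 0.99)],
    [[[10, 90], [90, 90], [90, 110], [10, 110]], ("noise", 0.41)],
]
ordered = filter_and_sort(sample)
for text, score in ordered:
    print(f"{score:.2f}  {text}")
```

Sorting by box corners is a simple heuristic that suits single-column images; multi-column pages would need a layout-aware ordering.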
My Python environment, for reference:
- Operating system: Windows 10 Professional, version 22H2
- Python 3.8.10
- Installed packages: the full list is given in the appendix.
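To compare another machine against this environment, the installed versions can be read with the standard library. The following is a sketch only: the REFERENCE dictionary copies a few versions from the package list in the appendix, and the helper names installed_version and compare_environment are hypothetical.

```python
from importlib import metadata

# A few versions copied from the author's package list; extend as needed.
REFERENCE = {
    "paddlepaddle": "2.5.1",
    "paddleocr": "2.7.0.2",
    "Shapely": "1.8.2",
    "opencv-python": "4.6.0.66",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def compare_environment(reference):
    """Map each package name to (expected, installed, matches)."""
    report = {}
    for name, expected in reference.items():
        found = installed_version(name)
        report[name] = (expected, found, found == expected)
    return report


for name, (expected, found, ok) in compare_environment(REFERENCE).items():
    status = "OK" if ok else "DIFFERS"
    shown = found or "not installed"
    print(f"{name:<16} expected {expected:<10} installed {shown:<14} {status}")
```

A differing version is not necessarily a problem; the comparison only shows where an environment departs from the one used here.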
2.2. PP-Structure Quick Start
PP-Structure is a document structure analysis toolkit based on PaddlePaddle. It combines layout analysis with table recognition, which helps developers quickly identify and extract table structures.
Table recognition example. The input image, shown below, is a web page table with a watermark:
Official sample code:
import os
import cv2
from paddleocr import PPStructure,draw_structure_result,save_structure_res
table_engine = PPStructure(show_log=True)
save_folder = 'output'
img_path = 'img/12.jpg'
img = cv2.imread(img_path)
result = table_engine(img)
save_structure_res(result, save_folder,os.path.basename(img_path).split('.')[0])
for line in result:
    line.pop('img')
    print(line)
from PIL import Image
font_path = r'C:\Windows\Fonts\simfang.ttf' # raw string keeps the backslashes literal; PaddleOCR also provides a font package
image = Image.open(img_path).convert('RGB')
im_show = draw_structure_result(image, result,font_path=font_path)
im_show = Image.fromarray(im_show)
im_show.save('result2.jpg')
When the models are not yet cached locally, running the sample downloads the table and layout models automatically:
download https://paddleocr.bj.bcebos.com/ppstructure/models/slanet/ch_ppstructure_mobile_v2.0_SLANet_infer.tar to C:\Users\xiaoyw/.paddleocr/whl\table\ch_ppstructure_mobile_v2.0_SLANet_infer\ch_ppstructure_mobile_v2.0_SLANet_infer.tar
100%| 10.3M/10.3M [00:01<00:00, 6.69MiB/s]
download https://paddleocr.bj.bcebos.com/ppstructure/models/layout/picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar to C:\Users\xiaoyw/.paddleocr/whl\layout\picodet_lcnet_x1_0_fgd_layout_cdla_infer\picodet_lcnet_x1_0_fgd_layout_cdla_infer.tar
100%| 10.1M/10.1M [00:00<00:00, 10.2MiB/s]
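For regions recognized as tables, the PP-Structure quickstart for release 2.7 describes the table content as an HTML string stored under res['html'] in the region's dictionary. Treat that key layout as an assumption and check it against the output printed by your version. Under that assumption, a minimal sketch using only the standard library turns such HTML into rows of cell text. TableRows and html_to_rows are hypothetical helper names, and the HTML below is synthetic.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of <td>/<th> cells, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_to_rows(html):
    """Parse a table HTML string into a list of rows of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Synthetic stand-in for region['res']['html'] of a region whose type is 'table' (assumed keys)
sample_html = ("<html><body><table>"
               "<tr><td>Name</td><td>Qty</td></tr>"
               "<tr><td>Paddle</td><td>2</td></tr>"
               "</table></body></html>")
rows = html_to_rows(sample_html)
print(rows)
```

The sketch ignores colspan and rowspan attributes, so merged cells appear once in the row where they start.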
References:
VipSoft. Baidu PaddlePaddle: simple use of PaddleHub OCR text recognition. Blog Park (cnblogs), 2023.05.
Autobot. What are the differences between the three frameworks PyTorch, TensorFlow, and PaddlePaddle? Zhihu, 2022.08.
PaddleOCR. PP-Structure quick start (release/2.7). https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/ppstructure/docs/quickstart.md
Appendix: installed packages
Package Version
------------------------- -----------
anyio 4.0.0
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.2.3
astor 0.8.1
asttokens 2.3.0
async-lru 2.0.4
attrdict 2.0.1
attrs 23.1.0
Babel 2.12.1
backcall 0.2.0
bce-python-sdk 0.8.90
beautifulsoup4 4.12.2
bleach 6.0.0
blinker 1.6.2
cachetools 5.3.1
certifi 2023.7.22
cffi 1.15.1
charset-normalizer 3.2.0
click 8.1.7
colorama 0.4.6
comm 0.1.4
contourpy 1.1.0
cssselect 1.2.0
cssutils 2.7.1
cycler 0.11.0
Cython 3.0.2
debugpy 1.6.7.post1
decorator 5.1.1
defusedxml 0.7.1
dnspython 2.4.2
et-xmlfile 1.1.0
exceptiongroup 1.1.3
executing 1.2.0
fastjsonschema 2.18.0
fire 0.5.0
flask 2.3.3
flask-babel 3.1.0
fonttools 4.42.1
fqdn 1.5.1
future 0.18.3
h11 0.14.0
httpcore 0.17.3
httpx 0.24.1
idna 3.4
imageio 2.31.3
imgaug 0.4.0
importlib-metadata 6.8.0
importlib-resources 6.0.1
ipykernel 6.25.1
ipython 8.12.2
ipython-genutils 0.2.0
ipywidgets 8.1.0
isoduration 20.11.0
itsdangerous 2.1.2
jedi 0.19.0
Jinja2 3.1.2
joblib 1.3.2
json5 0.9.14
jsonpointer 2.4
jsonschema 4.19.0
jsonschema-specifications 2023.7.1
kiwisolver 1.4.5
lazy-loader 0.3
lmdb 1.4.1
lxml 4.9.3
MarkupSafe 2.1.3
matplotlib 3.7.2
matplotlib-inline 0.1.6
mistune 3.0.1
nbclient 0.8.0
nbconvert 7.8.0
nbformat 5.9.2
nest-asyncio 1.5.7
networkx 3.1
notebook 7.0.3
notebook-shim 0.2.3
numpy 1.24.4
opencv-contrib-python 4.6.0.66
opencv-python 4.6.0.66
openpyxl 3.1.2
opt-einsum 3.3.0
overrides 7.4.0
packaging 23.1
paddle-bfloat 0.1.7
paddleocr 2.7.0.2
paddlepaddle 2.5.1
pandas 2.0.3
pandocfilters 1.5.0
parso 0.8.3
pdf2docx 0.5.6
pickleshare 0.7.5
Pillow 10.0.0
pip 21.1.1
pkgutil-resolve-name 1.3.10
platformdirs 3.10.0
premailer 3.10.0
prometheus-client 0.17.1
prompt-toolkit 3.0.39
protobuf 3.20.2
psutil 5.9.5
pure-eval 0.2.2
pyclipper 1.3.0.post4
pycparser 2.21
pycryptodome 3.18.0
Pygments 2.16.1
pymongo 4.5.0
PyMuPDF 1.20.2
pyparsing 3.0.9
python-dateutil 2.8.2
python-docx 0.8.11
python-json-logger 2.0.7
pytz 2023.3
PyWavelets 1.4.1
pywin32 306
pywinpty 2.0.11
PyYAML 6.0.1
pyzmq 25.1.1
qtconsole 5.4.4
QtPy 2.4.0
rapidfuzz 3.2.0
rarfile 4.0
referencing 0.30.2
requests 2.31.0
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rpds-py 0.10.0
scikit-image 0.21.0
scikit-learn 1.3.0
scipy 1.10.1
Send2Trash 1.8.2
setuptools 56.0.0
Shapely 1.8.2
six 1.16.0
sniffio 1.3.0
soupsieve 2.5
stack-data 0.6.2
termcolor 2.3.0
terminado 0.17.1
threadpoolctl 3.2.0
tifffile 2023.7.10
tinycss2 1.2.1
tomli 2.0.1
tornado 6.3.3
tqdm 4.66.1
traitlets 5.9.0
typing-extensions 4.7.1
tzdata 2023.3
uri-template 1.3.0
urllib3 2.0.4
visualdl 2.5.3
wcwidth 0.2.6
webcolors 1.13
webencodings 0.5.1
websocket-client 1.6.2
werkzeug 2.3.7
widgetsnbextension 4.0.8
zipp 3.16.2