OCR (optical character recognition) is one of the most common technologies in daily life, much like face recognition.
This post records my whole process of learning OCR.
1. Introduction
OCR consists of two stages: text detection, which locates the text regions in an image, and text recognition, which reads the characters in each region.
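To make the two stages concrete, here is a minimal Python sketch of how detection and recognition fit together. The names run_ocr, fake_detect and fake_recognize are hypothetical stand-ins that I made up for illustration; they are not PaddleOCR APIs, and the boxes are simplified to axis-aligned rectangles.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Simplified axis-aligned box: (x_min, y_min, x_max, y_max).
Box = Tuple[int, int, int, int]


@dataclass
class OcrLine:
    box: Box
    text: str


def run_ocr(image,
            detect: Callable[[object], List[Box]],
            recognize: Callable[[object, Box], str]) -> List[OcrLine]:
    """Stage 1 finds text boxes; stage 2 reads the text inside each box."""
    return [OcrLine(box, recognize(image, box)) for box in detect(image)]


# Hypothetical stand-ins so the sketch runs without any model.
def fake_detect(image):
    return [(0, 0, 50, 20), (0, 30, 80, 50)]


def fake_recognize(image, box):
    return f"text@{box[1]}"


lines = run_ocr(image=None, detect=fake_detect, recognize=fake_recognize)
for line in lines:
    print(line.box, line.text)
```

In a real system the two callables would be a detection model and a recognition model; the point here is only that recognition runs once per detected region.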
PaddleOCR: an OCR toolkit built on PaddlePaddle (飞桨). It includes an ultra-lightweight Chinese OCR system whose models total only 8.6M. A single model supports recognition of mixed Chinese, English, and digits, as well as vertical text and long text.
PaddleOCR is a Python library whose text recognition performance is not inferior to that of commercial products. I have also deployed it successfully on the RV1126, and plan to deploy it on NPU boards such as the RK3568 later.
2. Environment Creation
I use the AutoDL cloud platform and rent a 3060 GPU at 1.58 yuan/hour, which is quite cost-effective. Other platforms work too.
1. Environment setup
# Create the environment
conda create -n paddle python=3.8
# Activate the environment
conda activate paddle
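Before installing anything, it can help to confirm that the shell is really inside the new environment. The following is a small sketch, assuming the environment is named paddle and uses Python 3.8 as in the commands above; describe_environment is a helper name I made up.

```python
import os
import sys


def describe_environment():
    """Return the interpreter version and the active conda environment name, if any."""
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        # CONDA_DEFAULT_ENV is set by `conda activate`; it is absent outside conda.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


info = describe_environment()
print(info)
if info["python"] != "3.8" or info["conda_env"] != "paddle":
    print("Warning: not running inside the 'paddle' Python 3.8 environment")
```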
2. Download PaddleOCR
git clone https://github.com/PaddlePaddle/PaddleOCR.git
3. Install the dependencies
cd PaddleOCR
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
1) Installation error:
Building wheel for lanms-neo (pyproject.toml) ... error
error: subprocess-exited-with-error
Fix:
2) Error:
ERROR: Failed building wheel for Polygon3
Fix:
Open https://www.lfd.uci.edu/~gohlke/pythonlibs/ and download Polygon3-3.0.9.1-cp38-cp38-win_amd64.whl (a prebuilt wheel for CPython 3.8 on 64-bit Windows).
Install it:
pip install Polygon3-3.0.9.1-cp38-cp38-win_amd64.whl -i https://pypi.tuna.tsinghua.edu.cn/simple
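A prebuilt wheel only installs when its filename tags match the interpreter, which is why the cp38 and win_amd64 parts of the name matter. Below is a rough sketch that splits a wheel filename into its tags and does a simplified compatibility check. It assumes a filename without a build tag, and wheel_matches_interpreter is my own simplification, not the logic pip actually uses.

```python
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into (name, version, python_tag, abi_tag, platform_tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("unexpected wheel filename layout: " + filename)
    name, version, python_tag, abi_tag, platform_tag = parts
    return name, version, python_tag, abi_tag, platform_tag


def wheel_matches_interpreter(filename):
    """Rough check: same CPython version tag, and a Windows wheel only on Windows."""
    _, _, python_tag, _, platform_tag = parse_wheel_tags(filename)
    current_tag = "cp{}{}".format(*sys.version_info[:2])
    platform_ok = platform_tag.startswith("win") == (sys.platform == "win32")
    return python_tag == current_tag and platform_ok


wheel = "Polygon3-3.0.9.1-cp38-cp38-win_amd64.whl"
print(parse_wheel_tags(wheel))
print("looks installable here:", wheel_matches_interpreter(wheel))
```

If the check prints False, download the wheel that matches your own Python version and operating system instead.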
3) Error:
ERROR: Failed building wheel for lanms-neo
Fix:
4. Label samples
1) Install paddlepaddle:
See the PaddlePaddle "Get Started" page (PaddlePaddle is an open-source deep learning platform derived from industrial practice).
The CPU build is used here because this environment is only needed for labeling:
# Install paddle
pip install paddlepaddle==2.4.2 -i https://pypi.tuna.tsinghua.edu.cn/simple
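# Verify the installation
After installation, run python to enter the Python interpreter, type import paddle, and then type paddle.utils.run_check()
If "PaddlePaddle is installed successfully!" appears, the installation succeeded.
# Uninstall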
python -m pip uninstall paddlepaddle
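The verification step described above can also be put in a small script instead of typing it into the interpreter. This is a minimal sketch; check_paddle_install is a helper name I made up, and the only paddle call it relies on is paddle.utils.run_check() mentioned above.

```python
def check_paddle_install():
    """Return True if paddle imports and its built-in self-check runs without raising."""
    try:
        import paddle
    except ImportError:
        print("paddlepaddle is not installed in this environment")
        return False
    # Prints "PaddlePaddle is installed successfully!" when everything is in order.
    paddle.utils.run_check()
    return True


if __name__ == "__main__":
    check_paddle_install()
```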
After installing paddlepaddle, install and start the annotation tool.
2) Start the annotation tool
# Install the annotation tool
cd PaddleOCR/PPOCRLabel
python setup.py bdist_wheel
pip install .\dist\PPOCRLabel-2.1.3-py2.py3-none-any.whl -i https://pypi.tuna.tsinghua.edu.cn/simple
# Open PPOCRLabel
PPOCRLabel --lang ch
3) PPOCRLabel usage instructions
How to use PPOCRLabel is not covered here; please learn it on your own.
5. Test
PaddleOCR provides a set of test images. Download and unzip the archive.
Download URL:
https://paddleocr.bj.bcebos.com/dygraph_v2.1/ppocr_img.zip
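If you prefer to script the download and unzip step, a sketch using only the Python standard library could look like this. The function names extract_zip and download_and_extract are my own; the URL is the one given above.

```python
import urllib.request
import zipfile
from pathlib import Path

URL = "https://paddleocr.bj.bcebos.com/dygraph_v2.1/ppocr_img.zip"


def extract_zip(zip_path, dest_dir="."):
    """Extract zip_path into dest_dir and return the names of the archive members."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


def download_and_extract(url=URL, dest_dir="."):
    """Download the archive into dest_dir and unpack it there (needs network access)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    zip_path = dest / url.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(url, str(zip_path))
    return extract_zip(zip_path, dest)
```

Calling download_and_extract() from the working directory fetches ppocr_img.zip and unpacks it in place, so the image path used in the test command should then resolve.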
Run the test:
paddleocr --image_dir ./ppocr_img/imgs/11.jpg --use_angle_cls true --use_gpu false
It runs normally.
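The same test can be run from Python instead of the command line. The sketch below assumes a paddleocr 2.x release, where PaddleOCR(...) accepts use_angle_cls and use_gpu and ocr.ocr(path, cls=True) returns one list per image with entries shaped like [box_points, (text, score)]. Other versions may return a different layout, so treat collect_text as an illustration rather than a fixed format. Both function names are my own.

```python
def collect_text(result):
    """Flatten an OCR result into (text, score) pairs.

    Assumed layout (paddleocr 2.x style): one list per image, each entry
    shaped like [box_points, (text, score)]. A page may be None if no text was found.
    """
    pairs = []
    for page in result or []:
        for _box, (text, score) in page or []:
            pairs.append((text, float(score)))
    return pairs


def recognize_image(image_path):
    """Run detection, angle classification and recognition on one image (CPU)."""
    # Imported lazily so this file can be loaded without paddleocr installed.
    from paddleocr import PaddleOCR

    # Mirrors the CLI flags --use_angle_cls true --use_gpu false.
    ocr = PaddleOCR(use_angle_cls=True, use_gpu=False)
    return collect_text(ocr.ocr(image_path, cls=True))


if __name__ == "__main__":
    # Synthetic result in the assumed layout, so the helper can be tried without a model.
    sample = [[[[[0, 0], [9, 0], [9, 5], [0, 5]], ("hello", 0.98)]]]
    print(collect_text(sample))
```

With the test images unpacked, recognize_image("./ppocr_img/imgs/11.jpg") should give the recognized lines and their confidence scores under the assumptions above.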