Bill detection and recognition in Python with PaddleOCR, layoutparser, and TensorFlow

draft-detect

Operating environment:

Windows 10, 6 cores / 6 threads, 16 GB memory

Python 3.8 in a virtualenv virtual environment

1. Install dependencies

Create a virtual environment first, based on Python3.8, and then activate the virtual environment.

Install the dependencies listed in requirements.txt:

pip install -r requirements.txt

Install the specific layoutparser build used for layout analysis. Note: install only this build; other versions of the library cause problems.

pip install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl

2. Analysis process

1. Related services

api_server: the interface service. It accepts uploaded bill images from external callers and returns the recognized content.

slice_table: segments the bill, separating the table data from the ticket number and other data.

table_ceil: detects the table cells and parses the table into Excel; from there the table can be parsed into structured data.

ocr_dect: recognizes the text in the image and outputs it at the specified position. It is mainly used for the ticket number and the text inside the table.

focus_draft: the bill data aggregation service. It merges all incoming data into one structured bill record.

2. Identification process

1. The image is received by the web service.

2. slice_table performs layout analysis and extracts the table part and the ticket number part.

3. The ticket number part and the table part are recognized separately, and the results are merged at the end. I tried running the two recognitions asynchronously with multiple threads and with multiple processes, but because the work is computationally intensive, the total recognition time increased.

4. The aggregation service merges the data into one structured bill record.

5. The result is returned by the web interface.
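The steps above can be sketched as a single function. This is only an illustration of the data flow, not the project's code: recognize_bill and the four callables it takes are hypothetical stand-ins for slice_table, ocr_dect, table_ceil, and focus_draft, whose real signatures are not shown in this article. The regions are processed sequentially, in line with the observation in step 3.

```python
def recognize_bill(image, analyze_layout, read_text, parse_table, aggregate):
    """Run layout analysis, recognize each region, then aggregate.

    All four callables are hypothetical stand-ins for the project's services.
    """
    regions = analyze_layout(image)          # step 2: list of (kind, box)
    numbers, tables = [], []
    for kind, box in regions:                # step 3: one region at a time
        if kind == "ticket_number":
            numbers.append(read_text(image, box))
        elif kind == "table":
            tables.append(parse_table(image, box))
    return aggregate(numbers, tables)        # step 4: one structured record


# Fake services, used only to show how the pieces connect.
def fake_layout(image):
    return [("ticket_number", (0, 0, 100, 20)), ("table", (0, 30, 100, 200))]


def fake_read_text(image, box):
    return "No. 12345678"


def fake_parse_table(image, box):
    return [["item", "amount"], ["paper", "10.00"]]


def fake_aggregate(numbers, tables):
    return {"ticket_numbers": numbers, "tables": tables}


result = recognize_bill(None, fake_layout, fake_read_text,
                        fake_parse_table, fake_aggregate)
print(result)
```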

3. Related models

inference/ch_PP-OCRv2_det_infer PaddleOCR text detection model

inference/ch_PP-OCRv2_rec_infer PaddleOCR text recognition model

inference/ch_ppocr_mobile_v2.0_cls_infer PaddleOCR text direction classification model

models/table-line-fine-last.h5 TensorFlow-based table detection model

lp://TableBank/ppyolov2_r50vd_dcn_365e_tableBank_latex/config (referenced in slice_table.py) layoutparser-based layout analysis model
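As a rough sketch of how these models could be wired up, assuming the PaddleOCR 2.x Python API and the Paddle build of layoutparser installed above: the helper names (missing_model_paths, load_ocr_engine, load_layout_model), the threshold, and the label map are assumptions for illustration, and the project's own loading code may differ.

```python
from pathlib import Path

# Local model paths as listed above.
OCR_MODEL_DIRS = {
    "det_model_dir": "inference/ch_PP-OCRv2_det_infer",
    "rec_model_dir": "inference/ch_PP-OCRv2_rec_infer",
    "cls_model_dir": "inference/ch_ppocr_mobile_v2.0_cls_infer",
}
TABLE_LINE_MODEL = "models/table-line-fine-last.h5"
LAYOUT_CONFIG = "lp://TableBank/ppyolov2_r50vd_dcn_365e_tableBank_latex/config"


def missing_model_paths(root="."):
    """Return the local model paths that do not exist under root."""
    local = list(OCR_MODEL_DIRS.values()) + [TABLE_LINE_MODEL]
    return [p for p in local if not (Path(root) / p).exists()]


def load_ocr_engine():
    """Build a PaddleOCR engine from the local detection/recognition/cls models."""
    from paddleocr import PaddleOCR  # imported lazily; heavy dependency
    return PaddleOCR(use_angle_cls=True, lang="ch", **OCR_MODEL_DIRS)


def load_layout_model():
    """Load the TableBank layout model (threshold and label_map are assumed)."""
    import layoutparser as lp  # the Paddle build installed in section 1
    return lp.PaddleDetectionLayoutModel(
        config_path=LAYOUT_CONFIG,
        threshold=0.5,
        label_map={0: "Table"},
        enforce_cpu=True,
    )
```

Calling missing_model_paths() before starting the service is a cheap way to catch a missing download early.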

3. Start the service

python api_server.py

4. Test the service

Interface address: http://127.0.0.1:8080/table/predict

Request method: POST

Request parameters: passed as form-data

  • file: the ticket image to be recognized
  • isToExcel: whether to save the recognized table as Excel; 1 means yes, any other value means no. The save location is excel-result
  • isToCeil: whether to save the image of the recognized table cell boxes; 1 means yes, any other value means no. The save location is ceil-result
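A request with these parameters can be sent from Python using only the standard library. This is a minimal sketch: encode_multipart and predict are hypothetical helper names, and because the response format is not documented here, the body is returned as raw text.

```python
import mimetypes
import os
import urllib.request
import uuid

API_URL = "http://127.0.0.1:8080/table/predict"


def encode_multipart(fields, file_field, file_name, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(file_name)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{file_name}"\r\n'
        f'Content-Type: {file_type}\r\n\r\n'.encode("utf-8")
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def predict(image_path, to_excel=False, to_ceil=False, url=API_URL):
    """POST one bill image to the service and return the raw response text."""
    with open(image_path, "rb") as fh:
        image = fh.read()
    fields = {
        "isToExcel": "1" if to_excel else "0",
        "isToCeil": "1" if to_ceil else "0",
    }
    body, content_type = encode_multipart(
        fields, "file", os.path.basename(image_path), image
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With the service running, predict("bill.png", to_excel=True) uploads a local image (bill.png is a placeholder file name) and asks for the Excel output as well.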

Example request: (screenshot in the original post; image not available)

Incoming image: (image not available)

Recognized boxes: (image not available)

Recognized tables: (image not available)

5. How to annotate tables

Tables are annotated with the labelme tool, official address: https://github.com/wkentaro/labelme

1. Install labelme

pip install labelme==5.0.1

After installation, the labelme source code must be modified so that the tool can annotate PNG images.

The modification was shown in a screenshot in the original post (image not available).

2. Annotate with labelme

Start labelme from the command line: open a terminal, type labelme, and press Enter.

(Screenshot of the labelme window in the original post; image not available.)

After drawing an annotation, click Finish to save it as an annotation file that can be used for training.

3. How to train

Put all annotated files into ./train/dataset-line/0/

Then run the training script and wait for training to complete. The new model is saved as ./models/table-line-fine.h5

python train/train.py
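Before a long training run it can help to check the annotation files. The sketch below assumes the annotations are the JSON files that labelme writes, each with a "shapes" list; summarize_annotations and empty_annotations are hypothetical helpers, not part of the project.

```python
import json
from pathlib import Path


def summarize_annotations(dataset_dir="./train/dataset-line/0/"):
    """Map each labelme JSON file name to its number of annotated shapes."""
    summary = {}
    for path in sorted(Path(dataset_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
        summary[path.name] = len(data.get("shapes", []))
    return summary


def empty_annotations(dataset_dir="./train/dataset-line/0/"):
    """List annotation files that contain no shapes at all."""
    return [name for name, count in summarize_annotations(dataset_dir).items()
            if count == 0]
```

Files reported by empty_annotations() were probably saved without any annotation and are worth reopening in labelme.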

4. Use the new model to make predictions

Set tableModeLinePath in the configuration file config.py to the new model path.
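For example, the relevant line might look like the fragment below. It assumes tableModeLinePath is a plain module-level string in config.py; the rest of the file is not shown in this article.

```python
# config.py (fragment)
# Path to the TensorFlow table detection model loaded by the service.
# Assumption: the setting is a module-level string.
tableModeLinePath = "./models/table-line-fine.h5"
```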

Then start the service and test it:

python api_server.py

Origin blog.csdn.net/haeasringnar/article/details/124556486