draft-detect
Operating environment:
Windows 10, 6 cores / 6 threads, 16 GB RAM
Python 3.8 in a virtualenv virtual environment
1. Install dependencies
First create a virtual environment based on Python 3.8, then activate it.
Install the dependencies from requirements.txt:
pip install -r requirements.txt
Install this specific layoutparser wheel for layout analysis. Note: only this exact wheel works; other versions of the library cause problems:
pip install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
2. Analysis process
1. Related services
api_server: the interface service; accepts uploaded bill images, runs recognition, and returns the recognized content.
slice_table: classifies the bill layout and separates the table data from the ticket number and other data.
table_ceil: detects the table and parses it into Excel; at this point the table can also be parsed into structured data.
ocr_dect: recognizes the text in the image and outputs it at its position; mainly used for the ticket-number data and the data inside the table.
focus_draft: the bill-data aggregation service; merges all incoming pieces into a single structured bill.
2. Identification process
1. The web service receives the image.
2. slice_table performs layout analysis and crops out the table part and the ticket-number part.
3. The ticket-number part and the table part are recognized separately and then merged. Multi-threaded and multi-process asynchronous recognition were both tried, but because the workload is compute-bound they actually increased recognition time, so the parts are recognized sequentially.
4. The aggregation service merges the data into a single structured bill.
5. The web interface returns the result.
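The steps above can be sketched as a small pipeline. Everything below is illustrative: the stub functions only mirror the data flow described here, not the project's actual slice_table / ocr_dect / focus_draft APIs.

```python
# Illustrative pipeline sketch; function names and return shapes are
# stand-ins, not the real service interfaces.

def slice_table(image):
    """Layout analysis: split the bill into a table region and a ticket-number region."""
    return {"table": image, "ticket_no": image}

def ocr_recognize(region):
    """OCR a cropped region into text (stub)."""
    return f"text-of-{region}"

def focus_draft(ticket_text, table_text):
    """Aggregate the recognized parts into one structured bill."""
    return {"ticket_no": ticket_text, "table": table_text}

def recognize_bill(image):
    regions = slice_table(image)                  # step 2: layout analysis
    ticket = ocr_recognize(regions["ticket_no"])  # step 3: recognize each part
    table = ocr_recognize(regions["table"])       #   sequentially (parallel OCR
                                                  #   was slower: compute-bound)
    return focus_draft(ticket, table)             # step 4: aggregate

result = recognize_bill("bill.png")
```

Recognizing the two parts sequentially reflects the note in step 3: for compute-bound work on one machine, thread/process overhead outweighed any overlap.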
3. Related models
inference/ch_PP-OCRv2_det_infer: PaddleOCR text detection model
inference/ch_PP-OCRv2_rec_infer: PaddleOCR text recognition model
inference/ch_ppocr_mobile_v2.0_cls_infer: PaddleOCR text direction classification model
models/table-line-fine-last.h5: TensorFlow-based table-line detection model
lp://TableBank/ppyolov2_r50vd_dcn_365e_tableBank_latex/config (referenced in slice_table.py): layoutparser-based layout-analysis model
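For quick reference, the model roles above can be summarized as a mapping. The paths are copied from the list; the dictionary itself is only a summary aid, not part of the codebase.

```python
# Model inventory (paths from this README; this dict is illustrative only)
MODELS = {
    "inference/ch_PP-OCRv2_det_infer": "PaddleOCR text detection",
    "inference/ch_PP-OCRv2_rec_infer": "PaddleOCR text recognition",
    "inference/ch_ppocr_mobile_v2.0_cls_infer": "PaddleOCR text direction classification",
    "models/table-line-fine-last.h5": "TensorFlow table-line detection",
    "lp://TableBank/ppyolov2_r50vd_dcn_365e_tableBank_latex/config":
        "layoutparser layout-analysis model (used in slice_table.py)",
}
```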
3. Start the service
python api_server.py
4. Test the service
Interface address: http://127.0.0.1:8080/table/predict
Request method: POST
Request parameters: passed as form-data
- file: the bill image to be recognized
- isToExcel: whether to output the recognized table as an Excel file; 1 means yes, anything else no. Saved to excel-result.
- isToCeil: whether to output an image with the recognized table cells drawn on it; 1 means yes, anything else no. Saved to ceil-result.
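A request matching the parameters above can be built with the `requests` library. This is a sketch: the file name is a placeholder, and the service from section 3 must be running before the request is actually sent.

```python
import requests

# Build (but do not send) a multipart form-data request matching the
# interface above; "bill.png" and its bytes are placeholders.
req = requests.Request(
    "POST",
    "http://127.0.0.1:8080/table/predict",
    files={"file": ("bill.png", b"<image bytes>")},
    data={"isToExcel": "1", "isToCeil": "1"},
).prepare()

# To actually send it (api_server.py must be running):
# resp = requests.Session().send(req)
# print(resp.json())
```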
Example request:
Incoming image:
Recognized boxes:
Recognized tables:
5. How to annotate tables
Table annotation uses the labelme tool, official repository: https://github.com/wkentaro/labelme
1. Install labelme
pip install labelme==5.0.1
After installation, you need to modify the source code so that the tool can annotate PNG images.
Modify as follows:
2. Annotate with labelme
Start labelme from the command line: open a terminal, type labelme, and press Enter.
After drawing an annotation, click Finish to save it as a trainable annotation file.
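Each saved annotation is a labelme JSON file. Below is a minimal example of the fields involved; the field names follow labelme's JSON format, but the values are invented for illustration.

```python
import json

# Minimal labelme-style annotation; values are made up for illustration.
annotation = {
    "imagePath": "bill_001.png",
    "imageHeight": 600,
    "imageWidth": 800,
    "shapes": [
        {"label": "0", "shape_type": "line",
         "points": [[50, 120], [750, 120]]},  # one annotated table line
    ],
}

# Round-trip through JSON as labelme would store it on disk
loaded = json.loads(json.dumps(annotation))
labels = [s["label"] for s in loaded["shapes"]]
```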
3. How to train
Put all annotated files into ./train/dataset-line/0/
Then run the training script and wait for training to complete; the new model is saved as ./models/table-line-fine.h5:
python train/train.py
4. Use the new model for prediction
Point tableModeLinePath in the configuration file config.py to the new model path.
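For example, in config.py (only the tableModeLinePath name is stated in this README; treat the rest of the file's layout as unknown):

```python
# config.py (fragment): point the table-line model at the newly trained file
tableModeLinePath = "./models/table-line-fine.h5"
```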
Then start the service and test again:
python api_server.py