AIWIN Handwriting OCR Competition with PaddleOCR

I. AIWIN Handwriting OCR Competition with PaddleOCR

1. Competition background

ailab.aiwin.org.cn/competition…

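Daily banking operations involve recognizing and entering many kinds of vouchers, such as ID cards, checks, and account statements. Entry has traditionally been manual, which is slow and labor-intensive. In recent years OCR has been gradually replacing manual entry because it runs automatically and needs little human intervention. OCR still has problems in practice, however. When recognizing voucher fields, handwriting varies widely between writers, has no fixed length, carries little semantic context, and sits on noisy voucher backgrounds. As a result recognition accuracy is low and a large amount of manual correction is needed, which affects day-to-day bank data entry.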

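2. Task

The competition provides a dataset of handwritten image slices. The data comes from real business scenarios and has been sliced and desensitized. Teams apply recognition techniques to produce the corresponding recognition results, that is:

Input: handwritten image slice dataset

Output: the corresponding recognition results

The competition is split into two independent tasks, each with its own training set, test set, and modeling environment:

  • Task 1: the training and test sets are open for download. Modeling may be done offline, or online in the provided Notebook environment and Terminal container environment (no network access). Teams submit the recognition results.
  • Task 2: the training set cannot be downloaded. Teams must build the model online in the Terminal container environment (no network access) and submit the model. The system feeds it the test set (which is invisible to participants) and outputs the recognition results.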

3. Dataset overview

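| Item | Task 1 | Task 2 |
| --- | --- | --- |
| Training set (including validation; split it yourself) | 8,000 images covering 2 kinds of information: year and amount | 30,000 images covering 5 kinds of information: bank name, year, month, day, and amount |
| Test set | 2,000 images | A/B leaderboards: leaderboard A 5,000 images, leaderboard B 5,000 images |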

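The original handwritten images fall into three categories: bank name, date (year, month, day), and amount. Examples of each:

The image slices may contain a certain amount of interfering content, for example: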

II. Environment setup

PaddleOCR (github.com/paddlepaddl…) is an OCR toolkit that works out of the box and runs fast.

# Download the PaddleOCR code from Gitee (the GitHub repository also works)
!git clone https://gitee.com/paddlepaddle/PaddleOCR.git --depth=1
Cloning into 'PaddleOCR'...
remote: Enumerating objects: 1229, done.
remote: Counting objects: 100% (1229/1229), done.
remote: Compressing objects: 100% (1098/1098), done.
remote: Total 1229 (delta 202), reused 707 (delta 80), pack-reused 0
Receiving objects: 100% (1229/1229), 100.43 MiB | 5.50 MiB/s, done.
Resolving deltas: 100% (202/202), done.
Checking connectivity... done.
# Upgrade pip
!pip install -U pip 
# Install the dependencies
%cd ~/PaddleOCR
%pip install -r requirements.txt
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting pip
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/a4/6d/6463d49a933f547439d6b5b98b46af8742cc03ae83543e4d7688c2420f8b/pip-21.3.1-py3-none-any.whl (1.7MB)
Installing collected packages: pip
  Found existing installation: pip 19.2.3
    Uninstalling pip-19.2.3:
      Successfully uninstalled pip-19.2.3
Successfully installed pip-21.3.1
/home/aistudio/PaddleOCR
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting shapely
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ae/20/33ce377bd24d122a4d54e22ae2c445b9b1be8240edb50040b40add950cd9/Shapely-1.8.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.1 MB)
Collecting scikit-image
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/9a/44/8f8c7f9c9de7fde70587a656d7df7d056e6f05192a74491f7bc074a724d0/scikit_image-0.19.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (13.3 MB)
Collecting imgaug==0.4.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/66/b1/af3142c4a85cba6da9f4ebb5ff4e21e2616309552caca5e8acefe9840622/imgaug-0.4.0-py2.py3-none-any.whl (948 kB)
Collecting pyclipper
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c5/fa/2c294127e4f88967149a68ad5b3e43636e94e3721109572f8f17ab15b772/pyclipper-1.3.0.post2-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (603 kB)
Collecting lmdb
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/4d/cf/3230b1c9b0bec406abb85a9332ba5805bdd03a1d24025c6bbcfb8ed71539/lmdb-1.3.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (298 kB)
Requirement already satisfied: tqdm in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 6)) (4.36.1)
Requirement already satisfied: numpy in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 7)) (1.20.3)
Requirement already satisfied: visualdl in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from -r requirements.txt (line 8)) (2.2.0)
Collecting python-Levenshtein
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/2a/dc/97f2b63ef0fa1fd78dcb7195aca577804f6b2b51e712516cc0e902a9a201/python-Levenshtein-0.12.2.tar.gz (50 kB)
  Preparing metadata (setup.py) ... done
Collecting opencv-contrib-python==4.4.0.46
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/08/51/1e0a206dd5c70fea91084e6f43979dc13e8eb175760cc7a105083ec3eb68/opencv_contrib_python-4.4.0.46-cp37-cp37m-manylinux2014_x86_64.whl (55.7 MB)
  Created wheel for python-Levenshtein: filename=python_Levenshtein-0.12.2-cp37-cp37m-linux_x86_64.whl size=171682 sha256=210cbc3beb9de6384f8dfd64ea9c1f7e7d4bd57233af34afe6994b95ae346f01
  Stored in directory: /home/aistudio/.cache/pip/wheels/38/b9/a4/3729726160fb103833de468adb5ce019b58543ae41d0b0e446
Successfully built python-Levenshtein
Installing collected packages: tifffile, PyWavelets, shapely, scikit-image, lxml, cssutils, cssselect, python-Levenshtein, pyclipper, premailer, opencv-contrib-python, lmdb, imgaug
Successfully installed PyWavelets-1.2.0 cssselect-1.1.0 cssutils-2.3.0 imgaug-0.4.0 lmdb-1.3.0 lxml-4.7.1 opencv-contrib-python-4.4.0.46 premailer-3.10.0 pyclipper-1.3.0.post2 python-Levenshtein-0.12.2 scikit-image-0.19.1 shapely-1.8.0 tifffile-2021.11.2
Note: you may need to restart the kernel to use updated packages.

III. Data preparation

The main steps are:

  • Unzip the data
  • Format and split the det dataset
  • Format and split the rec dataset

# Unzip the dataset
%cd ~
!unzip -qoa 2021A_T1_Task1_数据集含训练集和测试集.zip
/home/aistudio
# Preview a sample from the dataset
from PIL import Image
img=Image.open("训练集/amount/images/8bb39447774eb21a01777a9efa890543.jpg")
img

output_6_0.png

1. Amount data processing

%cd ~
# Inspect the label file
!head 训练集/amount/gt.json
/home/aistudio
{
  "8bb39426774ee53f017770203bab0bc5.jpg": "肆佰肆拾贰元整",
  "8bb39447760a31c801762283f9dd63cb.jpg": "贰仟壹佰壹拾贰元整",
  "8bb1943d774eb211017784b7af783c23.jpg": "壹仟肆佰贰拾元整",
  "8bb194277657bb0501768d5379a4262b.jpg": "伍佰叁拾捌元叁角捌分",
  "8bb3942b7657bb83017674d349786868.jpg": "肆佰元整",
  "8bb1943d760a31b70176275a31832557.jpg": "壹万贰仟贰佰元整",
  "8bb19437760a2b5f017641f9743b41b4.jpg": "叁万捌仟肆佰伍拾捌元捌角捌分",
  "8bb1941c7657bb01017674b446cc2a2e.jpg": "贰仟肆佰捌拾陆元整",
  "8bb39441760a31b601764a13149e3008.jpg": "玖仟肆佰伍拾元整",
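import glob, codecs, json, os
import numpy as np

# Image paths and ground-truth labels for the amount subset
amount_jpgs = glob.glob('./训练集/amount/images/*.jpg')
with codecs.open('./训练集/amount/gt.json', encoding='utf-8') as f:
    lines = f.read()
# Strip a trailing comma before the closing brace, if any, so json.loads accepts the file
amount_gt = json.loads(lines.replace(',\n}', '}'))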
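
The `replace(',\n}', '}')` call above only matches one exact layout of the trailing comma. A slightly more tolerant loader could look like the sketch below. The helper name `load_gt` is hypothetical, and the regex assumes that no label text itself contains a comma directly followed by a closing brace, which seems safe for these amount and date labels.

```python
import json
import re


def load_gt(text):
    """Parse a gt.json-style string, tolerating a trailing comma before '}'.

    Hypothetical helper: it assumes no label value contains ',}' itself.
    """
    # Drop a comma that is followed only by whitespace and a closing brace.
    cleaned = re.sub(r",\s*}", "}", text)
    return json.loads(cleaned)


sample = '{\n  "a.jpg": "肆佰元整",\n  "b.jpg": "壹仟元整",\n}'
print(load_gt(sample))
```

Files without a trailing comma pass through unchanged, so the same helper could be reused for both the amount and the date labels.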
%cd ~/
# Split into train and eval sets (every 10th sample goes to validation)
# and write the label list files
with open("./训练集/amount/train_list.txt", 'w', encoding='utf-8') as f_train, \
        open("./训练集/amount/val_list.txt", 'w', encoding='utf-8') as f_val:
    for i, key in enumerate(amount_gt):
        line = key + '\t' + amount_gt[key] + '\n'
        if i % 10 == 0:
            f_val.write(line)
        else:
            f_train.write(line)
/home/aistudio
!head ./训练集/amount/train_list.txt
8bb39447760a31c801762283f9dd63cb.jpg    贰仟壹佰壹拾贰元整
8bb1943d774eb211017784b7af783c23.jpg    壹仟肆佰贰拾元整
8bb194277657bb0501768d5379a4262b.jpg    伍佰叁拾捌元叁角捌分
8bb3942b7657bb83017674d349786868.jpg    肆佰元整
8bb1943d760a31b70176275a31832557.jpg    壹万贰仟贰佰元整
8bb19437760a2b5f017641f9743b41b4.jpg    叁万捌仟肆佰伍拾捌元捌角捌分
8bb1941c7657bb01017674b446cc2a2e.jpg    贰仟肆佰捌拾陆元整
8bb39441760a31b601764a13149e3008.jpg    玖仟肆佰伍拾元整
8bb3943c774eb20601775cf697f0456b.jpg    壹万柒仟贰佰叁拾壹元整
8bb194207657bb1201765fd7645934b5.jpg    肆仟伍佰壹拾叁元贰角整
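
The same every-tenth-sample split can be written as a small function, which makes it easy to check the resulting sizes before writing any files. This is a minimal sketch: `split_labels` and the demo dictionary are made up for illustration, and the line format (file name, tab, label) follows the list files shown above.

```python
def split_labels(gt, every=10):
    """Split a {filename: label} dict into train and validation lines.

    Every `every`-th sample (index 0, 10, 20, ...) goes to validation.
    """
    train, val = [], []
    for i, (name, text) in enumerate(gt.items()):
        line = f"{name}\t{text}\n"
        (val if i % every == 0 else train).append(line)
    return train, val


# Made-up labels, only to show the resulting proportions.
demo = {f"img_{i}.jpg": "壹" for i in range(20)}
train_lines, val_lines = split_labels(demo)
print(len(train_lines), len(val_lines))
```

With 20 samples this yields 18 training lines and 2 validation lines, i.e. roughly a 9:1 split.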
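# Build the character dictionary from the amount labels.
# (The original cell called date_gt.update(amount_gt), but date_gt is not defined until the next section.)
s = ''.join(amount_gt.values())
# set() order is arbitrary, so generate this file once and reuse it for training and inference
char_list = list(set(s))

with open('./训练集/amount/vocabulary.txt', 'w', encoding='utf-8') as up:
    for x in char_list:
        up.write(x + '\n')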
!cat ./训练集/amount/vocabulary.txt
玖
万
拾
亿
整
壹
叁
正
柒
仟
陆
元
分
贰
佰
零
圆
角
捌
肆
伍
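
Because `set` iteration order is not stable across runs, regenerating the dictionary can reorder its lines, and a model trained against one ordering would then decode with the wrong indices. One way to avoid that is to sort the characters. The sketch below is not what the notebook does, and `build_vocab` is a hypothetical name.

```python
def build_vocab(labels):
    """Return the unique characters of all labels in a stable, sorted order."""
    return sorted(set("".join(labels)))


vocab = build_vocab(["肆佰元整", "壹仟元整"])
print(vocab)
```

Sorting by code point has no linguistic meaning here. It only guarantees that the same labels always produce the same dictionary file.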

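2. Date data processing

%cd ~
# Image paths and ground-truth labels for the date subset
date_jpgs = glob.glob('./训练集/date/images/*.jpg')

with codecs.open('./训练集/date/gt.json', encoding='utf-8') as f:
    lines = f.read()
# Strip a trailing comma before the closing brace, if any, so json.loads accepts the file
date_gt = json.loads(lines.replace(',\n}', '}'))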
/home/aistudio
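# Split into train and eval sets (every 10th sample goes to validation)
# and write the label list files
with open("./训练集/date/train_list.txt", 'w', encoding='utf-8') as f_train, \
        open("./训练集/date/val_list.txt", 'w', encoding='utf-8') as f_val:
    for i, key in enumerate(date_gt):
        line = key + '\t' + date_gt[key] + '\n'
        if i % 10 == 0:
            f_val.write(line)
        else:
            f_train.write(line)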
# Build the character dictionary from the date labels
s = ''.join(date_gt.values())
# set() order is arbitrary, so generate this file once and reuse it for training and inference
char_list = list(set(s))

with open('./训练集/date/vocabulary.txt', 'w', encoding='utf-8') as up:
    for x in char_list:
        up.write(x + '\n')
!cat ./训练集/date/vocabulary.txt
玖
叁
柒
拾
陆
捌
零
贰
肆
伍
壹
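
The date dictionary has only 11 characters, so it is worth checking that no label contains a character missing from it, for example when the dictionary was built from a different subset than the one being trained on. A minimal sketch, with `find_oov` as a hypothetical helper name:

```python
def find_oov(labels, vocab):
    """Return the set of characters that occur in labels but not in vocab."""
    known = set(vocab)
    return {ch for text in labels for ch in text if ch not in known}


date_vocab = list("玖叁柒拾陆捌零贰肆伍壹")
print(find_oov(["贰零贰壹", "壹拾贰"], date_vocab))   # nothing missing
print(find_oov(["贰零贰壹年"], date_vocab))            # '年' is not in the dictionary
```

An empty result means every label character can be encoded with the dictionary file.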

IV. Amount model training and evaluation

The configuration is based on PaddleOCR/configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml.

1. Configuration

Global:
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_chinese_common_v2.0
  save_epoch_step: 3
  # evaluation is run every 100 iterations after the 100th iteration
  eval_batch_step: [100, 100]
  cal_metric_during_train: True
  pretrained_model: ./ch_ppocr_server_v2.0_rec_pre/best_accuracy
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_words/ch/word_1.jpg
  # for data or label process
  character_dict_path: /home/aistudio/训练集/amount/vocabulary.txt
  max_text_length: 25
  infer_mode: False
  use_space_char: True
  save_res_path: ./output/rec/predicts_chinese_common_v2.0.txt


Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: 'L2'
    factor: 0.00004

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: ResNet
    layers: 34
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 256
  Head:
    name: CTCHead
    fc_decay: 0.00004

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/训练集/amount/images
    label_file_list: ["/home/aistudio/训练集/amount/train_list.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - RecAug: 
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: True
    batch_size_per_card: 256
    drop_last: True
    num_workers: 8

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/训练集/amount/images
    label_file_list: ["/home/aistudio/训练集/amount/val_list.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 256
    num_workers: 8
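
The config sets `max_text_length: 25`. As far as I understand PaddleOCR's label encoding, samples whose label is longer than this limit are skipped rather than truncated, but treat that as an assumption. Either way, a quick check of label lengths before training is cheap. `too_long` is a hypothetical helper and the sample labels are illustrative.

```python
def too_long(gt, max_text_length=25):
    """Return the entries of a {filename: label} dict whose label exceeds the limit."""
    return {name: text for name, text in gt.items() if len(text) > max_text_length}


labels = {
    "a.jpg": "叁万捌仟肆佰伍拾捌元捌角捌分",  # 14 characters, within the limit
    "b.jpg": "壹" * 30,                        # artificial over-long label
}
print(too_long(labels))
```

If the result is non-empty, either raise `max_text_length` in the config or inspect those labels for annotation errors.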

2. Download the pretrained model

%cd ~/PaddleOCR/
# server model
!wget https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar
!tar -xf ch_ppocr_server_v2.0_rec_pre.tar
/home/aistudio/PaddleOCR
--2022-01-09 19:54:27--  https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_server_v2.0_rec_pre.tar
Resolving paddleocr.bj.bcebos.com (paddleocr.bj.bcebos.com)... 182.61.200.229, 182.61.200.195, 2409:8c04:1001:1002:0:ff:b001:368a
Connecting to paddleocr.bj.bcebos.com (paddleocr.bj.bcebos.com)|182.61.200.229|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 490184704 (467M) [application/x-tar]
Saving to: ‘ch_ppocr_server_v2.0_rec_pre.tar’

ch_ppocr_server_v2. 100%[===================>] 467.48M  48.1MB/s    in 13s     

2022-01-09 19:54:40 (35.6 MB/s) - ‘ch_ppocr_server_v2.0_rec_pre.tar’ saved [490184704/490184704]

3. Amount model training

# Overwrite the config file
!cp -f ../rec_chinese_common_train_v2.0.yml ./configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml 
# server model
%cd ~/PaddleOCR/

!python tools/train.py -c ./configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml

Training log

[2022/01/09 20:21:27] root INFO: epoch: [35/500], iter: 480, lr: 0.000992, loss: 0.322123, acc: 0.972652, norm_edit_dis: 0.994506, reader_cost: 0.57496 s, batch_cost: 0.89715 s, samples: 1280, ips: 142.67360
[2022/01/09 20:21:43] root INFO: epoch: [35/500], iter: 489, lr: 0.000992, loss: 0.256427, acc: 0.974606, norm_edit_dis: 0.995029, reader_cost: 0.00015 s, batch_cost: 0.55269 s, samples: 2304, ips: 416.86967
[2022/01/09 20:21:47] root INFO: save model in ./output/rec_chinese_common_v2.0/latest
[2022/01/09 20:21:47] root INFO: Initialize indexs of datasets:['/home/aistudio/训练集/amount/train_list.txt']
[2022/01/09 20:21:55] root INFO: epoch: [36/500], iter: 490, lr: 0.000992, loss: 0.249966, acc: 0.974606, norm_edit_dis: 0.995029, reader_cost: 0.63610 s, batch_cost: 0.69958 s, samples: 256, ips: 36.59348
[2022/01/09 20:22:13] root INFO: epoch: [36/500], iter: 500, lr: 0.000991, loss: 0.246262, acc: 0.968746, norm_edit_dis: 0.995142, reader_cost: 0.00045 s, batch_cost: 0.61365 s, samples: 2560, ips: 417.17631
eval model:: 100%|████████████████████████████████| 2/2 [00:02<00:00,  1.04s/it]
[2022/01/09 20:22:15] root INFO: cur metric, acc: 0.9699975750060625, norm_edit_dis: 0.9909632884925447, fps: 456.3268190347603
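
To follow training progress without reading the raw log, the per-iteration lines can be parsed into numbers. The sketch below is based only on the line format visible in the log above. The pattern and the name `parse_train_line` are my own, and the format may differ in other PaddleOCR versions.

```python
import re

# Matches lines such as:
# "... epoch: [36/500], iter: 500, lr: 0.000991, loss: 0.246262, acc: 0.968746, ..."
LOG_PATTERN = re.compile(
    r"epoch: \[(\d+)/(\d+)\], iter: (\d+), .*?loss: ([\d.]+), acc: ([\d.]+)"
)


def parse_train_line(line):
    """Extract epoch, iteration, loss and acc from one training log line, or None."""
    m = LOG_PATTERN.search(line)
    if m is None:
        return None
    epoch, _total, iteration, loss, acc = m.groups()
    return {"epoch": int(epoch), "iter": int(iteration),
            "loss": float(loss), "acc": float(acc)}


line = ("[2022/01/09 20:22:13] root INFO: epoch: [36/500], iter: 500, "
        "lr: 0.000991, loss: 0.246262, acc: 0.968746, norm_edit_dis: 0.995142")
print(parse_train_line(line))
```

Lines that do not match, such as the `cur metric` evaluation line, return `None` and can simply be skipped.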

4. Amount model evaluation

#  server model
!python  -m paddle.distributed.launch tools/eval.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml \
    -o Global.checkpoints=./output/rec_chinese_common_v2.0/best_accuracy.pdparams
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  def convert_to_list(value, n, name, dtype=np.int):
-----------  Configuration Arguments -----------
gpus: None
heter_worker_num: None
heter_workers: 
http_port: None
ips: 127.0.0.1
log_dir: log
nproc_per_node: None
server_num: None
servers: 
training_script: tools/eval.py
training_script_args: ['-c', 'configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml', '-o', 'Global.checkpoints=./output/rec_chinese_common_v2.0/best_accuracy.pdparams']
worker_num: None
workers: 
------------------------------------------------
WARNING 2022-01-09 20:25:44,789 launch.py:316] Not found distinct arguments and compiled with cuda. Default use collective mode
launch train in GPU mode
INFO 2022-01-09 20:25:44,793 launch_utils.py:471] Local start 1 processes. First process distributed environment info (Only For Debug): 
    +=======================================================================================+
    |                        Distributed Envs                      Value                    |
    +---------------------------------------------------------------------------------------+
    |                       PADDLE_TRAINER_ID                        0                      |
    |                 PADDLE_CURRENT_ENDPOINT                 127.0.0.1:45561               |
    |                     PADDLE_TRAINERS_NUM                        1                      |
    |                PADDLE_TRAINER_ENDPOINTS                 127.0.0.1:45561               |
    |                     FLAGS_selected_gpus                        0                      |
    +=======================================================================================+

INFO 2022-01-09 20:25:44,793 launch_utils.py:475] details abouts PADDLE_TRAINER_ENDPOINTS can be found in log/endpoints.log, and detail running logs maybe found in log/workerlog.0
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  def convert_to_list(value, n, name, dtype=np.int):
[2022/01/09 20:25:46] root INFO: Architecture : 
[2022/01/09 20:25:46] root INFO:     Backbone : 
[2022/01/09 20:25:46] root INFO:         layers : 34
[2022/01/09 20:25:46] root INFO:         name : ResNet
[2022/01/09 20:25:46] root INFO:     Head : 
[2022/01/09 20:25:46] root INFO:         fc_decay : 4e-05
[2022/01/09 20:25:46] root INFO:         name : CTCHead
[2022/01/09 20:25:46] root INFO:     Neck : 
[2022/01/09 20:25:46] root INFO:         encoder_type : rnn
[2022/01/09 20:25:46] root INFO:         hidden_size : 256
[2022/01/09 20:25:46] root INFO:         name : SequenceEncoder
[2022/01/09 20:25:46] root INFO:     Transform : None
[2022/01/09 20:25:46] root INFO:     algorithm : CRNN
[2022/01/09 20:25:46] root INFO:     model_type : rec
[2022/01/09 20:25:46] root INFO: Eval : 
[2022/01/09 20:25:46] root INFO:     dataset : 
[2022/01/09 20:25:46] root INFO:         data_dir : /home/aistudio/训练集/amount/images
[2022/01/09 20:25:46] root INFO:         label_file_list : ['/home/aistudio/训练集/amount/val_list.txt']
[2022/01/09 20:25:46] root INFO:         name : SimpleDataSet
[2022/01/09 20:25:46] root INFO:         transforms : 
[2022/01/09 20:25:46] root INFO:             DecodeImage : 
[2022/01/09 20:25:46] root INFO:                 channel_first : False
[2022/01/09 20:25:46] root INFO:                 img_mode : BGR
[2022/01/09 20:25:46] root INFO:             CTCLabelEncode : None
[2022/01/09 20:25:46] root INFO:             RecResizeImg : 
[2022/01/09 20:25:46] root INFO:                 image_shape : [3, 32, 320]
[2022/01/09 20:25:46] root INFO:             KeepKeys : 
[2022/01/09 20:25:46] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/01/09 20:25:46] root INFO:     loader : 
[2022/01/09 20:25:46] root INFO:         batch_size_per_card : 256
[2022/01/09 20:25:46] root INFO:         drop_last : False
[2022/01/09 20:25:46] root INFO:         num_workers : 8
[2022/01/09 20:25:46] root INFO:         shuffle : False
[2022/01/09 20:25:46] root INFO: Global : 
[2022/01/09 20:25:46] root INFO:     cal_metric_during_train : True
[2022/01/09 20:25:46] root INFO:     character_dict_path : /home/aistudio/训练集/amount/vocabulary.txt
[2022/01/09 20:25:46] root INFO:     checkpoints : ./output/rec_chinese_common_v2.0/best_accuracy.pdparams
[2022/01/09 20:25:46] root INFO:     debug : False
[2022/01/09 20:25:46] root INFO:     distributed : False
[2022/01/09 20:25:46] root INFO:     epoch_num : 500
[2022/01/09 20:25:46] root INFO:     eval_batch_step : [100, 100]
[2022/01/09 20:25:46] root INFO:     infer_img : doc/imgs_words/ch/word_1.jpg
[2022/01/09 20:25:46] root INFO:     infer_mode : False
[2022/01/09 20:25:46] root INFO:     log_smooth_window : 20
[2022/01/09 20:25:46] root INFO:     max_text_length : 25
[2022/01/09 20:25:46] root INFO:     pretrained_model : ./ch_ppocr_server_v2.0_rec_pre/best_accuracy
[2022/01/09 20:25:46] root INFO:     print_batch_step : 10
[2022/01/09 20:25:46] root INFO:     save_epoch_step : 3
[2022/01/09 20:25:46] root INFO:     save_inference_dir : None
[2022/01/09 20:25:46] root INFO:     save_model_dir : ./output/rec_chinese_common_v2.0
[2022/01/09 20:25:46] root INFO:     save_res_path : ./output/rec/predicts_chinese_common_v2.0.txt
[2022/01/09 20:25:46] root INFO:     use_gpu : True
[2022/01/09 20:25:46] root INFO:     use_space_char : True
[2022/01/09 20:25:46] root INFO:     use_visualdl : False
[2022/01/09 20:25:46] root INFO: Loss : 
[2022/01/09 20:25:46] root INFO:     name : CTCLoss
[2022/01/09 20:25:46] root INFO: Metric : 
[2022/01/09 20:25:46] root INFO:     main_indicator : acc
[2022/01/09 20:25:46] root INFO:     name : RecMetric
[2022/01/09 20:25:46] root INFO: Optimizer : 
[2022/01/09 20:25:46] root INFO:     beta1 : 0.9
[2022/01/09 20:25:46] root INFO:     beta2 : 0.999
[2022/01/09 20:25:46] root INFO:     lr : 
[2022/01/09 20:25:46] root INFO:         learning_rate : 0.001
[2022/01/09 20:25:46] root INFO:         name : Cosine
[2022/01/09 20:25:46] root INFO:         warmup_epoch : 5
[2022/01/09 20:25:46] root INFO:     name : Adam
[2022/01/09 20:25:46] root INFO:     regularizer : 
[2022/01/09 20:25:46] root INFO:         factor : 4e-05
[2022/01/09 20:25:46] root INFO:         name : L2
[2022/01/09 20:25:46] root INFO: PostProcess : 
[2022/01/09 20:25:46] root INFO:     name : CTCLabelDecode
[2022/01/09 20:25:46] root INFO: Train : 
[2022/01/09 20:25:46] root INFO:     dataset : 
[2022/01/09 20:25:46] root INFO:         data_dir : /home/aistudio/训练集/amount/images
[2022/01/09 20:25:46] root INFO:         label_file_list : ['/home/aistudio/训练集/amount/train_list.txt']
[2022/01/09 20:25:46] root INFO:         name : SimpleDataSet
[2022/01/09 20:25:46] root INFO:         transforms : 
[2022/01/09 20:25:46] root INFO:             DecodeImage : 
[2022/01/09 20:25:46] root INFO:                 channel_first : False
[2022/01/09 20:25:46] root INFO:                 img_mode : BGR
[2022/01/09 20:25:46] root INFO:             RecAug : None
[2022/01/09 20:25:46] root INFO:             CTCLabelEncode : None
[2022/01/09 20:25:46] root INFO:             RecResizeImg : 
[2022/01/09 20:25:46] root INFO:                 image_shape : [3, 32, 320]
[2022/01/09 20:25:46] root INFO:             KeepKeys : 
[2022/01/09 20:25:46] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/01/09 20:25:46] root INFO:     loader : 
[2022/01/09 20:25:46] root INFO:         batch_size_per_card : 256
[2022/01/09 20:25:46] root INFO:         drop_last : True
[2022/01/09 20:25:46] root INFO:         num_workers : 8
[2022/01/09 20:25:46] root INFO:         shuffle : True
[2022/01/09 20:25:46] root INFO: profiler_options : None
[2022/01/09 20:25:46] root INFO: train with paddle 2.0.2 and device CUDAPlace(0)
[2022/01/09 20:25:46] root INFO: Initialize indexs of datasets:['/home/aistudio/训练集/amount/val_list.txt']
W0109 20:25:46.365166  6137 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0109 20:25:46.370265  6137 device_context.cc:372] device: 0, cuDNN Version: 7.6.
[2022/01/09 20:25:52] root INFO: resume from ./output/rec_chinese_common_v2.0/best_accuracy
[2022/01/09 20:25:52] root INFO: metric in ckpt ***************
[2022/01/09 20:25:52] root INFO: acc:0.9699975750060625
[2022/01/09 20:25:52] root INFO: norm_edit_dis:0.9909632884925447
[2022/01/09 20:25:52] root INFO: fps:456.3268190347603
[2022/01/09 20:25:52] root INFO: best_epoch:36
[2022/01/09 20:25:52] root INFO: start_epoch:37

eval model::   0%|          | 0/2 [00:00<?, ?it/s]
eval model::  50%|█████     | 1/2 [00:01<00:01,  1.72s/it]
eval model:: 100%|██████████| 2/2 [00:02<00:00,  1.29s/it]
eval model:: 100%|██████████| 2/2 [00:02<00:00,  1.11s/it]
[2022/01/09 20:25:54] root INFO: metric eval ***************
[2022/01/09 20:25:54] root INFO: acc:0.9699975750060625
[2022/01/09 20:25:54] root INFO: norm_edit_dis:0.9909632884925447
[2022/01/09 20:25:54] root INFO: fps:439.9205808964122
INFO 2022-01-09 20:25:56,825 launch.py:240] Local processes completed.
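
The evaluation reports two numbers, `acc` and `norm_edit_dis`. My reading is that `acc` is the fraction of images whose whole string is predicted exactly, and `norm_edit_dis` is one minus the average edit distance normalized by string length. The sketch below implements these definitions in plain Python. It is an illustration under that assumption, not PaddleOCR's `RecMetric` code, which may differ in details such as text normalization.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming, one row at a time)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def rec_metrics(preds, targets):
    """Exact-match accuracy and 1 - mean normalized edit distance."""
    if not targets or len(preds) != len(targets):
        raise ValueError("preds and targets must be non-empty and of equal length")
    n = len(targets)
    correct = sum(p == t for p, t in zip(preds, targets))
    norm = sum(edit_distance(p, t) / max(len(p), len(t), 1)
               for p, t in zip(preds, targets))
    return {"acc": correct / n, "norm_edit_dis": 1 - norm / n}


print(rec_metrics(["肆佰元整", "壹仟元"], ["肆佰元整", "壹仟元整"]))
```

This explains why `norm_edit_dis` (about 0.991 above) is higher than `acc` (about 0.970): a prediction with one wrong character counts as a full miss for `acc` but only as a small penalty for the edit-distance score.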

V. Date model training and evaluation

# Copy the date config file to the home directory
!cp ./configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0_date.yml ~/rec_chinese_common_train_v2.0_date.yml

1. Configuration

Global:
  use_gpu: true
  epoch_num: 500
  log_smooth_window: 20
  print_batch_step: 10
  save_model_dir: ./output/rec_chinese_common_v2.0_date
  save_epoch_step: 3
  # evaluation is run every 100 iterations after the 100th iteration
  eval_batch_step: [100, 100]
  cal_metric_during_train: True
  pretrained_model: ./ch_ppocr_server_v2.0_rec_pre/best_accuracy
  checkpoints:
  save_inference_dir:
  use_visualdl: False
  infer_img: doc/imgs_words/ch/word_1.jpg
  # for data or label process
  character_dict_path: /home/aistudio/训练集/date/vocabulary.txt
  max_text_length: 25
  infer_mode: False
  use_space_char: True
  save_res_path: ./output/rec/predicts_chinese_common_v2.0.txt


Optimizer:
  name: Adam
  beta1: 0.9
  beta2: 0.999
  lr:
    name: Cosine
    learning_rate: 0.001
    warmup_epoch: 5
  regularizer:
    name: 'L2'
    factor: 0.00004

Architecture:
  model_type: rec
  algorithm: CRNN
  Transform:
  Backbone:
    name: ResNet
    layers: 34
  Neck:
    name: SequenceEncoder
    encoder_type: rnn
    hidden_size: 256
  Head:
    name: CTCHead
    fc_decay: 0.00004

Loss:
  name: CTCLoss

PostProcess:
  name: CTCLabelDecode

Metric:
  name: RecMetric
  main_indicator: acc

Train:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/训练集/date/images
    label_file_list: ["/home/aistudio/训练集/date/train_list.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - RecAug: 
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: True
    batch_size_per_card: 256
    drop_last: True
    num_workers: 8

Eval:
  dataset:
    name: SimpleDataSet
    data_dir: /home/aistudio/训练集/date/images
    label_file_list: ["/home/aistudio/训练集/date/val_list.txt"]
    transforms:
      - DecodeImage: # load image
          img_mode: BGR
          channel_first: False
      - CTCLabelEncode: # Class handling label
      - RecResizeImg:
          image_shape: [3, 32, 320]
      - KeepKeys:
          keep_keys: ['image', 'label', 'length'] # dataloader will return list in this order
  loader:
    shuffle: False
    drop_last: False
    batch_size_per_card: 256
    num_workers: 8
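
Both configs use `image_shape: [3, 32, 320]`, so every slice ends up 32 pixels high and 320 pixels wide. My understanding is that the recognition resize keeps the aspect ratio, caps the width at 320, and pads the remainder on the right. That behaviour is an assumption here, not something the source confirms. Under that assumption, the sketch below shows how much of the 320-pixel canvas a given slice would occupy. `resized_width` is a hypothetical helper.

```python
import math


def resized_width(h, w, img_h=32, img_w=320):
    """Return (resized width, right padding) for an h x w image.

    Assumes an aspect-preserving resize to height img_h, capped at img_w.
    """
    ratio = w / float(h)
    new_w = min(img_w, int(math.ceil(img_h * ratio)))
    return new_w, img_w - new_w


print(resized_width(64, 400))    # a slice that fits with padding
print(resized_width(32, 1000))   # a very wide slice that gets squeezed to 320
```

Slices much wider than 10 times their height are compressed horizontally, which can hurt long amount strings. If many slices fall in that range, a wider `image_shape` may be worth trying.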

2. Date model training

# server model
%cd ~/PaddleOCR/

!python tools/train.py -c ./configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0_date.yml

Training log

[2022/01/09 21:15:41] root INFO: Initialize indexs of datasets:['/home/aistudio/训练集/date/train_list.txt']
[2022/01/09 21:15:56] root INFO: epoch: [69/500], iter: 410, lr: 0.000963, loss: 0.282951, acc: 0.978514, norm_edit_dis: 0.991211, reader_cost: 0.46780 s, batch_cost: 0.81243 s, samples: 1536, ips: 189.06251
[2022/01/09 21:16:06] root INFO: epoch: [69/500], iter: 413, lr: 0.000962, loss: 0.272608, acc: 0.979490, norm_edit_dis: 0.992269, reader_cost: 0.00150 s, batch_cost: 0.34571 s, samples: 1536, ips: 444.29798
[2022/01/09 21:16:10] root INFO: save model in ./output/rec_chinese_common_v2.0_date/latest
[2022/01/09 21:16:13] root INFO: save model in ./output/rec_chinese_common_v2.0_date/iter_epoch_69
[2022/01/09 21:16:13] root INFO: Initialize indexs of datasets:['/home/aistudio/训练集/date/train_list.txt']
[2022/01/09 21:16:39] root INFO: epoch: [70/500], iter: 419, lr: 0.000961, loss: 0.259662, acc: 0.983397, norm_edit_dis: 0.993701, reader_cost: 0.51926 s, batch_cost: 1.21329 s, samples: 3072, ips: 253.19623
[2022/01/09 21:16:43] root INFO: save model in ./output/rec_chinese_common_v2.0_date/latest
[2022/01/09 21:16:43] root INFO: Initialize indexs of datasets:['/home/aistudio/训练集/date/train_list.txt']
[2022/01/09 21:16:52] root INFO: epoch: [71/500], iter: 420, lr: 0.000961, loss: 0.259662, acc: 0.984373, norm_edit_dis: 0.994189, reader_cost: 0.55519 s, batch_cost: 0.67084 s, samples: 512, ips: 76.32220
[2022/01/09 21:17:09] root INFO: epoch: [71/500], iter: 425, lr: 0.000960, loss: 0.241655, acc: 0.985350, norm_edit_dis: 0.995117, reader_cost: 0.00232 s, batch_cost: 0.57741 s, samples: 2560, ips: 443.35555

3. Date model evaluation

#  server model
!python  -m paddle.distributed.launch tools/eval.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0_date.yml \
    -o Global.checkpoints=./output/rec_chinese_common_v2.0_date/best_accuracy.pdparams
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  def convert_to_list(value, n, name, dtype=np.int):
-----------  Configuration Arguments -----------
gpus: None
heter_worker_num: None
heter_workers: 
http_port: None
ips: 127.0.0.1
log_dir: log
nproc_per_node: None
server_num: None
servers: 
training_script: tools/eval.py
training_script_args: ['-c', 'configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0_date.yml', '-o', 'Global.checkpoints=./output/rec_chinese_common_v2.0_date/best_accuracy.pdparams']
worker_num: None
workers: 
------------------------------------------------
WARNING 2022-01-09 21:17:26,064 launch.py:316] Not found distinct arguments and compiled with cuda. Default use collective mode
launch train in GPU mode
INFO 2022-01-09 21:17:26,068 launch_utils.py:471] Local start 1 processes. First process distributed environment info (Only For Debug): 
    +=======================================================================================+
    |                        Distributed Envs                      Value                    |
    +---------------------------------------------------------------------------------------+
    |                       PADDLE_TRAINER_ID                        0                      |
    |                 PADDLE_CURRENT_ENDPOINT                 127.0.0.1:50745               |
    |                     PADDLE_TRAINERS_NUM                        1                      |
    |                PADDLE_TRAINER_ENDPOINTS                 127.0.0.1:50745               |
    |                     FLAGS_selected_gpus                        0                      |
    +=======================================================================================+

INFO 2022-01-09 21:17:26,068 launch_utils.py:475] details abouts PADDLE_TRAINER_ENDPOINTS can be found in log/endpoints.log, and detail running logs maybe found in log/workerlog.0
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:26: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  def convert_to_list(value, n, name, dtype=np.int):
[2022/01/09 21:17:27] root INFO: Architecture : 
[2022/01/09 21:17:27] root INFO:     Backbone : 
[2022/01/09 21:17:27] root INFO:         layers : 34
[2022/01/09 21:17:27] root INFO:         name : ResNet
[2022/01/09 21:17:27] root INFO:     Head : 
[2022/01/09 21:17:27] root INFO:         fc_decay : 4e-05
[2022/01/09 21:17:27] root INFO:         name : CTCHead
[2022/01/09 21:17:27] root INFO:     Neck : 
[2022/01/09 21:17:27] root INFO:         encoder_type : rnn
[2022/01/09 21:17:27] root INFO:         hidden_size : 256
[2022/01/09 21:17:27] root INFO:         name : SequenceEncoder
[2022/01/09 21:17:27] root INFO:     Transform : None
[2022/01/09 21:17:27] root INFO:     algorithm : CRNN
[2022/01/09 21:17:27] root INFO:     model_type : rec
[2022/01/09 21:17:27] root INFO: Eval : 
[2022/01/09 21:17:27] root INFO:     dataset : 
[2022/01/09 21:17:27] root INFO:         data_dir : /home/aistudio/训练集/date/images
[2022/01/09 21:17:27] root INFO:         label_file_list : ['/home/aistudio/训练集/date/val_list.txt']
[2022/01/09 21:17:27] root INFO:         name : SimpleDataSet
[2022/01/09 21:17:27] root INFO:         transforms : 
[2022/01/09 21:17:27] root INFO:             DecodeImage : 
[2022/01/09 21:17:27] root INFO:                 channel_first : False
[2022/01/09 21:17:27] root INFO:                 img_mode : BGR
[2022/01/09 21:17:27] root INFO:             CTCLabelEncode : None
[2022/01/09 21:17:27] root INFO:             RecResizeImg : 
[2022/01/09 21:17:27] root INFO:                 image_shape : [3, 32, 320]
[2022/01/09 21:17:27] root INFO:             KeepKeys : 
[2022/01/09 21:17:27] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/01/09 21:17:27] root INFO:     loader : 
[2022/01/09 21:17:27] root INFO:         batch_size_per_card : 256
[2022/01/09 21:17:27] root INFO:         drop_last : False
[2022/01/09 21:17:27] root INFO:         num_workers : 8
[2022/01/09 21:17:27] root INFO:         shuffle : False
[2022/01/09 21:17:27] root INFO: Global : 
[2022/01/09 21:17:27] root INFO:     cal_metric_during_train : True
[2022/01/09 21:17:27] root INFO:     character_dict_path : /home/aistudio/训练集/date/vocabulary.txt
[2022/01/09 21:17:27] root INFO:     checkpoints : ./output/rec_chinese_common_v2.0_date/best_accuracy.pdparams
[2022/01/09 21:17:27] root INFO:     debug : False
[2022/01/09 21:17:27] root INFO:     distributed : False
[2022/01/09 21:17:27] root INFO:     epoch_num : 500
[2022/01/09 21:17:27] root INFO:     eval_batch_step : [100, 100]
[2022/01/09 21:17:27] root INFO:     infer_img : doc/imgs_words/ch/word_1.jpg
[2022/01/09 21:17:27] root INFO:     infer_mode : False
[2022/01/09 21:17:27] root INFO:     log_smooth_window : 20
[2022/01/09 21:17:27] root INFO:     max_text_length : 25
[2022/01/09 21:17:27] root INFO:     pretrained_model : ./ch_ppocr_server_v2.0_rec_pre/best_accuracy
[2022/01/09 21:17:27] root INFO:     print_batch_step : 10
[2022/01/09 21:17:27] root INFO:     save_epoch_step : 3
[2022/01/09 21:17:27] root INFO:     save_inference_dir : None
[2022/01/09 21:17:27] root INFO:     save_model_dir : ./output/rec_chinese_common_v2.0_date
[2022/01/09 21:17:27] root INFO:     save_res_path : ./output/rec/predicts_chinese_common_v2.0.txt
[2022/01/09 21:17:27] root INFO:     use_gpu : True
[2022/01/09 21:17:27] root INFO:     use_space_char : True
[2022/01/09 21:17:27] root INFO:     use_visualdl : False
[2022/01/09 21:17:27] root INFO: Loss : 
[2022/01/09 21:17:27] root INFO:     name : CTCLoss
[2022/01/09 21:17:27] root INFO: Metric : 
[2022/01/09 21:17:27] root INFO:     main_indicator : acc
[2022/01/09 21:17:27] root INFO:     name : RecMetric
[2022/01/09 21:17:27] root INFO: Optimizer : 
[2022/01/09 21:17:27] root INFO:     beta1 : 0.9
[2022/01/09 21:17:27] root INFO:     beta2 : 0.999
[2022/01/09 21:17:27] root INFO:     lr : 
[2022/01/09 21:17:27] root INFO:         learning_rate : 0.001
[2022/01/09 21:17:27] root INFO:         name : Cosine
[2022/01/09 21:17:27] root INFO:         warmup_epoch : 5
[2022/01/09 21:17:27] root INFO:     name : Adam
[2022/01/09 21:17:27] root INFO:     regularizer : 
[2022/01/09 21:17:27] root INFO:         factor : 4e-05
[2022/01/09 21:17:27] root INFO:         name : L2
[2022/01/09 21:17:27] root INFO: PostProcess : 
[2022/01/09 21:17:27] root INFO:     name : CTCLabelDecode
[2022/01/09 21:17:27] root INFO: Train : 
[2022/01/09 21:17:27] root INFO:     dataset : 
[2022/01/09 21:17:27] root INFO:         data_dir : /home/aistudio/训练集/date/images
[2022/01/09 21:17:27] root INFO:         label_file_list : ['/home/aistudio/训练集/date/train_list.txt']
[2022/01/09 21:17:27] root INFO:         name : SimpleDataSet
[2022/01/09 21:17:27] root INFO:         transforms : 
[2022/01/09 21:17:27] root INFO:             DecodeImage : 
[2022/01/09 21:17:27] root INFO:                 channel_first : False
[2022/01/09 21:17:27] root INFO:                 img_mode : BGR
[2022/01/09 21:17:27] root INFO:             RecAug : None
[2022/01/09 21:17:27] root INFO:             CTCLabelEncode : None
[2022/01/09 21:17:27] root INFO:             RecResizeImg : 
[2022/01/09 21:17:27] root INFO:                 image_shape : [3, 32, 320]
[2022/01/09 21:17:27] root INFO:             KeepKeys : 
[2022/01/09 21:17:27] root INFO:                 keep_keys : ['image', 'label', 'length']
[2022/01/09 21:17:27] root INFO:     loader : 
[2022/01/09 21:17:27] root INFO:         batch_size_per_card : 512
[2022/01/09 21:17:27] root INFO:         drop_last : True
[2022/01/09 21:17:27] root INFO:         num_workers : 8
[2022/01/09 21:17:27] root INFO:         shuffle : True
[2022/01/09 21:17:27] root INFO: profiler_options : None
[2022/01/09 21:17:27] root INFO: train with paddle 2.0.2 and device CUDAPlace(0)
[2022/01/09 21:17:27] root INFO: Initialize indexs of datasets:['/home/aistudio/训练集/date/val_list.txt']
W0109 21:17:27.794884 15547 device_context.cc:362] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 10.1, Runtime API Version: 10.1
W0109 21:17:27.800077 15547 device_context.cc:372] device: 0, cuDNN Version: 7.6.
[2022/01/09 21:17:33] root INFO: resume from ./output/rec_chinese_common_v2.0_date/best_accuracy
[2022/01/09 21:17:33] root INFO: metric in ckpt ***************
[2022/01/09 21:17:33] root INFO: acc:0.991173555371896
[2022/01/09 21:17:33] root INFO: norm_edit_dis:0.9953431509515168
[2022/01/09 21:17:33] root INFO: fps:443.3070479891199
[2022/01/09 21:17:33] root INFO: best_epoch:67
[2022/01/09 21:17:33] root INFO: start_epoch:68

eval model::   0%|          | 0/2 [00:00<?, ?it/s]
eval model::  50%|█████     | 1/2 [00:01<00:01,  1.59s/it]
eval model:: 100%|██████████| 2/2 [00:01<00:00,  1.17s/it]
eval model:: 100%|██████████| 2/2 [00:01<00:00,  1.02it/s]
[2022/01/09 21:17:35] root INFO: metric eval ***************
[2022/01/09 21:17:35] root INFO: acc:0.991173555371896
[2022/01/09 21:17:35] root INFO: norm_edit_dis:0.9953431509515168
[2022/01/09 21:17:35] root INFO: fps:441.1924785284007
INFO 2022-01-09 21:17:38,107 launch.py:240] Local processes completed.
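
The evaluation log above reports two numbers, acc and norm_edit_dis. The sketch below shows how metrics of this kind are commonly computed, assuming acc is the fraction of exact string matches and norm_edit_dis is one minus the mean Levenshtein distance normalized by the longer string. The function names are illustrative, and PaddleOCR's own RecMetric may differ in details such as text normalization or smoothing.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def recognition_metrics(predictions, labels):
    """Return (exact-match accuracy, 1 - mean normalized edit distance)."""
    assert len(predictions) == len(labels) and labels
    correct = 0
    norm_dist_sum = 0.0
    for pred, label in zip(predictions, labels):
        if pred == label:
            correct += 1
        norm_dist_sum += edit_distance(pred, label) / max(len(pred), len(label), 1)
    n = len(labels)
    return correct / n, 1.0 - norm_dist_sum / n


# Toy data: one exact match, one prediction with a single wrong character.
preds = ["贰零贰壹", "贰零贰零"]
golds = ["贰零贰壹", "贰零贰壹"]
acc, norm_edit_dis = recognition_metrics(preds, golds)
print(acc, norm_edit_dis)  # prints: 0.5 0.875
```

A prediction with one wrong character out of four counts as a full miss for acc but only costs 0.25 in the edit-distance term, which is why norm_edit_dis in the log is slightly higher than acc.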

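VI. Result Prediction

Modify tools/infer_rec.py so that each line of the result file contains only the image file name and the recognized text: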

    with open(save_res_path, "w") as fout:
        for file in get_image_file_list(config['Global']['infer_img']):
            logger.info("infer_img: {}".format(file))
            with open(file, 'rb') as f:
                img = f.read()
                data = {'image': img}
            batch = transform(data, ops)
            if config['Architecture']['algorithm'] == "SRN":
                encoder_word_pos_list = np.expand_dims(batch[1], axis=0)
                gsrm_word_pos_list = np.expand_dims(batch[2], axis=0)
                gsrm_slf_attn_bias1_list = np.expand_dims(batch[3], axis=0)
                gsrm_slf_attn_bias2_list = np.expand_dims(batch[4], axis=0)

                others = [
                    paddle.to_tensor(encoder_word_pos_list),
                    paddle.to_tensor(gsrm_word_pos_list),
                    paddle.to_tensor(gsrm_slf_attn_bias1_list),
                    paddle.to_tensor(gsrm_slf_attn_bias2_list)
                ]
            if config['Architecture']['algorithm'] == "SAR":
                valid_ratio = np.expand_dims(batch[-1], axis=0)
                img_metas = [paddle.to_tensor(valid_ratio)]

            images = np.expand_dims(batch[0], axis=0)
            images = paddle.to_tensor(images)
            if config['Architecture']['algorithm'] == "SRN":
                preds = model(images, others)
            elif config['Architecture']['algorithm'] == "SAR":
                preds = model(images, img_metas)
            else:
                preds = model(images)
            post_result = post_process_class(preds)
            info = None
            if isinstance(post_result, dict):
                rec_info = dict()
                for key in post_result:
                    if len(post_result[key][0]) >= 2:
                        rec_info[key] = {
                            "label": post_result[key][0][0],
                            "score": float(post_result[key][0][1]),
                        }
                info = json.dumps(rec_info)
            else:
                if len(post_result[0]) >= 2:
                    info = post_result[0][0] + "\t" + str(post_result[0][1])

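            if info is not None:
                logger.info("\t result: {}".format(info))
                # Modified line: write "<file name>\t<recognized text>" (no score).
                # Note: post_result[0][0] assumes the list form returned for CTC models such as CRNN.
                fout.write(os.path.basename(file) + "\t" + post_result[0][0] + "\n")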
    logger.info("success!")

Overwrite the original script with the modified copy:

!cp ~/infer_rec.py ~/PaddleOCR/tools/infer_rec.py
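
The modified write call indexes post_result[0][0], which matches the list of (text, score) pairs handled by the else branch for the CRNN model used here. If post-processing returned a dict instead, that index would not work. The helper below is a minimal sketch of one way to build the output line for both shapes. The name format_result_line and the choice to take the first usable entry of a dict are assumptions for illustration, not part of PaddleOCR.

```python
def format_result_line(file_name, post_result):
    """Build one 'file_name<TAB>text' line, or return None if nothing usable was decoded."""
    if isinstance(post_result, dict):
        # Assumption: use the first entry that holds a (text, score) pair.
        for value in post_result.values():
            if value and len(value[0]) >= 2:
                return "{}\t{}\n".format(file_name, value[0][0])
        return None
    if post_result and len(post_result[0]) >= 2:
        return "{}\t{}\n".format(file_name, post_result[0][0])
    return None


# Example with the list form produced for a CTC-based recognizer.
line = format_result_line(
    "8bb1941c760a2c1d01762cc9ce6e6029.jpg",
    [("壹万叁仟元整", 0.9725123)],
)
print(repr(line))
```

Returning None for empty or malformed results lets the caller skip the write instead of raising inside the loop.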

1. Amount prediction

# Server model
!python tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0.yml \
    -o Global.infer_img="/home/aistudio/测试集/amount/images" \
    Global.checkpoints=./output/rec_chinese_common_v2.0/best_accuracy

Output log

[2022/01/09 21:33:25] root INFO: infer_img: /home/aistudio/测试集/amount/images/8bb1941c760a2c1d01762c63a2220571.jpg
[2022/01/09 21:33:25] root INFO:      result: 壹仟捌佰贰拾肆元贰角叁分    0.974654
[2022/01/09 21:33:25] root INFO: infer_img: /home/aistudio/测试集/amount/images/8bb1941c760a2c1d01762cc9ce6e6029.jpg
[2022/01/09 21:33:25] root INFO:      result: 壹万叁仟元整    0.9725123
[2022/01/09 21:33:25] root INFO: infer_img: /home/aistudio/测试集/amount/images/8bb1941c760a2c1d017640031a39469f.jpg
[2022/01/09 21:33:25] root INFO:      result: 玖万柒仟陆佰伍拾柒元伍角肆分    0.9835057
[2022/01/09 21:33:25] root INFO: infer_img: /home/aistudio/测试集/amount/images/8bb1941c760a2c1d0176415a9ec807fe.jpg
[2022/01/09 21:33:25] root INFO:      result: 伍万柒仟元整    0.9003706
[2022/01/09 21:33:25] root INFO: infer_img: /home/aistudio/测试集/amount/images/8bb1941c760a2c1d017645417b3b5c5e.jpg
[2022/01/09 21:33:25] root INFO:      result: 叁万叁仟陆佰肆拾伍元肆角玖分    0.9679618
[2022/01/09 21:33:25] root INFO: infer_img: /home/aistudio/测试集/amount/images/8bb1941c760a2c1d017646019cf7387d.jpg
[2022/01/09 21:33:25] root INFO:      result: 伍仟零伍拾玖元肆角整    0.98454416
[2022/01/09 21:33:25] root INFO: infer_img: /home/aistudio/测试集/amount/images/8bb1941c760a2c1d01764b77a53735a0.jpg
[2022/01/09 21:33:25] root INFO:      result: 柒佰肆拾壹元柒角叁分    0.9320029

Preview the first lines of the saved result file:

!head ./output/rec/predicts_chinese_common_v2.0.txt
8bb1941c760a2c1d017626c361da6c4d.jpg    壹万伍仟叁佰柒拾元正
8bb1941c760a2c1d01762b943a624421.jpg    壹拾捌万伍仟元整
8bb1941c760a2c1d01762c63a2220571.jpg    壹仟捌佰贰拾肆元贰角叁分
8bb1941c760a2c1d01762cc9ce6e6029.jpg    壹万叁仟元整
8bb1941c760a2c1d017640031a39469f.jpg    玖万柒仟陆佰伍拾柒元伍角肆分
8bb1941c760a2c1d0176415a9ec807fe.jpg    伍万柒仟元整
8bb1941c760a2c1d017645417b3b5c5e.jpg    叁万叁仟陆佰肆拾伍元肆角玖分
8bb1941c760a2c1d017646019cf7387d.jpg    伍仟零伍拾玖元肆角整
8bb1941c760a2c1d01764b77a53735a0.jpg    柒佰肆拾壹元柒角叁分
8bb1941c7657bb0101765edc72b01d52.jpg    壹万贰仟贰佰柒拾肆元整

2. Date prediction

# Server model
!python tools/infer_rec.py -c configs/rec/ch_ppocr_v2.0/rec_chinese_common_train_v2.0_date.yml \
    -o Global.infer_img="/home/aistudio/测试集/date/images" \
    Global.checkpoints=./output/rec_chinese_common_v2.0_date/best_accuracy

Output log

[2022/01/09 21:34:48] root INFO: infer_img: /home/aistudio/测试集/date/images/0_8bb1941c7a248693017a421e33087817.jpg
[2022/01/09 21:34:48] root INFO:      result: 贰零贰壹    0.94455576
[2022/01/09 21:34:48] root INFO: infer_img: /home/aistudio/测试集/date/images/0_8bb1941c7a248693017a5653c6d32277.jpg
[2022/01/09 21:34:48] root INFO:      result: 贰零贰壹    0.9286475
[2022/01/09 21:34:48] root INFO: infer_img: /home/aistudio/测试集/date/images/0_8bb1941c7a248693017a56bc0a886d10.jpg
[2022/01/09 21:34:48] root INFO:      result: 贰零贰壹    0.9824974
[2022/01/09 21:34:48] root INFO: infer_img: /home/aistudio/测试集/date/images/0_8bb1941c7a248693017a8955798e0c20.jpg
[2022/01/09 21:34:48] root INFO:      result: 贰零贰壹    0.9235918
[2022/01/09 21:34:48] root INFO: infer_img: /home/aistudio/测试集/date/images/0_8bb1941c7a248693017aa87d11e203b6.jpg
[2022/01/09 21:34:48] root INFO:      result: 贰零贰壹    0.97518396
[2022/01/09 21:34:48] root INFO: infer_img: /home/aistudio/测试集/date/images/0_8bb1941c7a248694017a31b07a3a1b2c.jpg
[2022/01/09 21:34:48] root INFO:      result: 贰零贰壹    0.9815266
[2022/01/09 21:34:48] root INFO: infer_img: /home/aistudio/测试集/date/images/0_8bb1941c7a248694017a3cfacc781d85.jpg
[2022/01/09 21:34:48] root INFO:      result: 贰零贰壹    0.9835255
[2022/01/09 21:34:48] root INFO: infer_img: /home/aistudio/测试集/date/images/0_8bb1941c7a248694017a987384ab04e8.jpg
[2022/01/09 21:34:48] root INFO:      result: 贰零贰壹    0.9853095

Preview the first lines of the saved result file:

!head ./output/rec/predicts_chinese_common_v2.0_date.txt
0_8bb1941c7a248693017a377ec36606b7.jpg    贰零贰壹
0_8bb1941c7a248693017a421e33087817.jpg    贰零贰壹
0_8bb1941c7a248693017a5653c6d32277.jpg    贰零贰壹
0_8bb1941c7a248693017a56bc0a886d10.jpg    贰零贰壹
0_8bb1941c7a248693017a8955798e0c20.jpg    贰零贰壹
0_8bb1941c7a248693017aa87d11e203b6.jpg    贰零贰壹
0_8bb1941c7a248694017a31b07a3a1b2c.jpg    贰零贰壹
0_8bb1941c7a248694017a3cfacc781d85.jpg    贰零贰壹
0_8bb1941c7a248694017a987384ab04e8.jpg    贰零贰壹
0_8bb1941c7a248694017a9ee81daf390b.jpg    贰零贰壹
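
Both inference commands in this section pass -o followed by dotted keys such as Global.infer_img and Global.checkpoints, which override entries of the YAML config at run time. The sketch below illustrates the idea on a plain nested dict. The function apply_overrides is hypothetical, it keeps every value as a string, and it creates missing keys, which may not match how PaddleOCR parses and validates these options.

```python
def apply_overrides(config, overrides):
    """Apply 'Section.key=value' overrides to a nested dict in place."""
    for item in overrides:
        dotted_key, value = item.split("=", 1)
        keys = dotted_key.split(".")
        node = config
        for key in keys[:-1]:
            # Walk down one level, creating it if needed (an assumption of this sketch).
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return config


# Defaults as they appear in the printed config, then the overrides from the date command.
config = {
    "Global": {
        "infer_img": "doc/imgs_words/ch/word_1.jpg",
        "checkpoints": None,
    }
}
apply_overrides(config, [
    "Global.infer_img=/home/aistudio/测试集/date/images",
    "Global.checkpoints=./output/rec_chinese_common_v2.0_date/best_accuracy",
])
print(config["Global"]["infer_img"])
print(config["Global"]["checkpoints"])
```

Overriding on the command line keeps the YAML file unchanged, so the same config can be reused for training, evaluation, and inference.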

Merge ./output/rec/predicts_chinese_common_v2.0.txt and ./output/rec/predicts_chinese_common_v2.0_date.txt, then submit the merged file to get a score. The competition is now closed.
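
The merge step can be sketched in a few lines of Python. The exact submission format is not stated here, so this assumes that plain concatenation of the two tab-separated files is what is needed. The function name merge_prediction_files and the duplicate check are illustrative additions, and the demo runs on two tiny stand-in files rather than the real outputs.

```python
import tempfile
from pathlib import Path


def merge_prediction_files(input_paths, output_path):
    """Concatenate 'file_name<TAB>text' files, skipping blank lines and rejecting duplicates."""
    seen = set()
    merged = []
    for path in input_paths:
        for raw in Path(path).read_text(encoding="utf-8").splitlines():
            if not raw.strip():
                continue
            file_name = raw.split("\t", 1)[0]
            if file_name in seen:
                raise ValueError("duplicate image name: " + file_name)
            seen.add(file_name)
            merged.append(raw)
    Path(output_path).write_text("\n".join(merged) + "\n", encoding="utf-8")
    return len(merged)


# Demo on stand-in files; in the notebook the inputs would be the two predicts_*.txt files.
with tempfile.TemporaryDirectory() as tmp:
    amount = Path(tmp) / "amount.txt"
    date = Path(tmp) / "date.txt"
    amount.write_text("a.jpg\t壹万叁仟元整\n", encoding="utf-8")
    date.write_text("0_b.jpg\t贰零贰壹\n", encoding="utf-8")
    merged_path = Path(tmp) / "merged.txt"
    count = merge_prediction_files([amount, date], merged_path)
    merged_text = merged_path.read_text(encoding="utf-8")
print(count)  # prints: 2
```

Raising on a repeated image name is a cheap way to catch the case where both inference runs were accidentally written to the same result file.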


Reposted from juejin.im/post/7051208679831371813