CTPN + CRNN bank card number recognition

Bank card number recognition using CTPN for text localization and CRNN for character recognition, served through a Flask web app.
Github Address

Since machine learning is not my field and I completed this project only for a school course,
this article covers just getting started and finishing the project. For the deeper principles, I recommend two Chinese blog posts:

[OCR Technical Series Part 5] A survey of natural scene text detection (CTPN, SegLink, EAST)
[OCR Technical Series Part 7] The end-to-end variable-length character recognition algorithm CRNN explained

Setting up the base environment

The hardware, driver, and dependency stack is as follows:
Ubuntu 18.04 + CUDA 8.0.61 + NVIDIA GeForce GTX 960M + NVIDIA driver 430.14 + TensorFlow-GPU + Python 3.6

NVIDIA 430.14 driver download
CUDA 8.0 download

After the installation completes, you can check the devices on your machine with the following commands

(screenshot: device check output)

Clone the source code and create a Python 3 virtual environment

git clone https://github.com/bay1/card-crnn-ctpn.git

python3 -m virtualenv venv

source venv/bin/activate # activate the virtual environment

pip install -r requirements.txt # install the project dependencies

Configuring warpctc-pytorch

The project uses warpctc-pytorch, which we need to install manually.
Note that these commands must be executed inside the Python virtual environment.

git clone https://github.com/SeanNaren/warp-ctc.git
cd warp-ctc
mkdir build; cd build
cmake ..
make

You may encounter the following error. It occurs because your gcc version is too high; it needs to be below 5.0.

/usr/local/cuda-8.0/include/host_config.h:119:2: error: #error -- unsupported GNU version! gcc versions later than 5 are not supported!
 #error -- unsupported GNU version! gcc versions later than 5 are not supported!
  ^~~~~

If there are multiple gcc versions on your system, you can execute the following commands to point the "gcc" command at a specific version.
For example, here I link it to another gcc version present on my system, gcc-4.9:

sudo rm /usr/bin/gcc
sudo ln -s /usr/bin/gcc-4.9  /usr/bin/gcc

(screenshot)

You may also encounter the following error

/usr/bin/ld: CMakeFiles/test_gpu.dir/tests/test_gpu_generated_test_gpu.cu.o: relocation R_X86_64_32S against `.bss' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: nonrepresentable section on output
collect2: error: ld returned 1 exit status
CMakeFiles/test_gpu.dir/build.make:98: recipe for target 'test_gpu' failed
make[2]: *** [test_gpu] Error 1
CMakeFiles/Makefile2:146: recipe for target 'CMakeFiles/test_gpu.dir/all' failed
make[1]: *** [CMakeFiles/test_gpu.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

Based on the error message, we can directly edit CMakeCache.txt in the build directory

CMAKE_CXX_FLAGS:STRING=-fPIC # line 39

(screenshot)

Then, following the warp-ctc instructions, execute the following commands

cd ../pytorch_binding
python setup.py install

At this point you may encounter the following error

src/binding.cpp:6:10: fatal error: torch/extension.h: No such file or directory
 #include <torch/extension.h>
          ^~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

Based on prior experience (https://github.com/SeanNaren/warp-ctc/issues/101),
we simply switch to an earlier version

git checkout ac045b6

CTPN

For text localization I directly reuse the text-detection-ctpn project and its trained model.
The trained model's results are actually not that good, but I had no data on hand for this task, and training it yourself still requires downloading that author's data.
So the simplest option is to use the trained model directly; the ckpt files can be downloaded from the author in two ways:

Google Drive
Baidu Yun

Put this folder into ctpn/, then execute the following commands

cd ctpn/utils/bbox
chmod +x make.sh
./make.sh

PS: If you want to train the CTPN model on your own data, you can run the train.py file in the ctpn folder

text-detection-ctpn only locates text regions in general,
while we specifically want to locate the bank card number, so some custom post-processing is still needed.
My idea is to compute the width and height of every detected box, then crop the card-number region based on the width-to-height ratio.
Of course, this assumes the card number is at least partially detected, but in practice the results are quite good.

def get_wh(box_coordinate):
    """
    Compute the width and height of a box.
    Box format: [xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax, score]
    """
    xmin = box_coordinate[0]
    xmax = box_coordinate[2]
    ymin = box_coordinate[1]
    ymax = box_coordinate[5]
    width = xmax - xmin
    height = ymax - ymin
    return width, height
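Building on get_wh, the aspect-ratio idea above can be sketched as follows. This is only an illustration: `pick_card_number_box` and the `min_ratio` threshold are hypothetical, not the project's actual code.

```python
def get_wh(box_coordinate):
    """Width and height of a box in the format
    [xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax, score]."""
    width = box_coordinate[2] - box_coordinate[0]
    height = box_coordinate[5] - box_coordinate[1]
    return width, height

def pick_card_number_box(boxes, min_ratio=5.0):
    """Hypothetical helper: among the detected boxes, return the one
    with the largest width-to-height ratio, provided it exceeds
    min_ratio -- a card number line is assumed to be a long, flat region."""
    best, best_ratio = None, min_ratio
    for box in boxes:
        w, h = get_wh(box)
        if h > 0 and w / h > best_ratio:
            best, best_ratio = box, w / h
    return best
```

Boxes that are roughly square (logos, hologram patches) are filtered out, and only the longest flat region survives.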

CRNN

At this point, return to the project root directory. First I do some simple preprocessing of the image data:
the original images in data/images are converted into the form needed to generate lmdb

python crnn/handle_images.py

Image paths and their ground-truth labels: crnn/to_lmdb/train.txt
New image path after processing: crnn/to_lmdb/train_images

(screenshot: train.txt)
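As a rough sketch, each line of the label file pairs an image path with its ground-truth number. The helper below is hypothetical; the actual format is whatever crnn/handle_images.py writes.

```python
def parse_label_line(line):
    """Split one 'path label' line of the label file.
    (Hypothetical helper; the real format is defined by
    crnn/handle_images.py.)"""
    path, label = line.strip().split(maxsplit=1)
    return path, label

# Illustrative sample line, assuming 'path label' layout
sample = "crnn/to_lmdb/train_images/0001.jpg 6222021234567890123\n"
path, label = parse_label_line(sample)
```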

Next we need to convert our own data into the lmdb files required for training.
Execute the following command

python crnn/to_lmdb/to_lmdb_py3.py # for Python 2: python crnn/to_lmdb/to_lmdb_py2.py

Generated lmdb file directory: crnn/to_lmdb/lmdb
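For orientation, crnn.pytorch-style datasets store each image and label under zero-padded keys plus a num-samples counter. The sketch below uses a plain dict in place of the lmdb write transaction; it is illustrative only, and to_lmdb_py3.py may differ in detail.

```python
def build_cache(samples):
    """Map (image_bytes, label) pairs onto crnn.pytorch-style lmdb keys.
    A plain dict stands in for the lmdb write transaction here."""
    cache = {}
    for i, (image_bytes, label) in enumerate(samples, start=1):
        cache['image-%09d' % i] = image_bytes   # raw encoded image
        cache['label-%09d' % i] = label         # ground-truth card number
    cache['num-samples'] = str(len(samples))
    return cache
```

At training time the loader reads `num-samples` first, then fetches each `image-…`/`label-…` pair by index.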

Now we can train on our own data

python crnn/train.py

Model save directory: crnn/expr

Custom parameters

The project has a variety of customizable data directories and training parameters.
If you want to modify these parameters or data paths, edit the following two files

  • ctpn/params.py
  • crnn/params.py

Detailed explanation of the CRNN training parameters

--random_sample      whether to use a random sampler for the dataset, action='store_true'
--keep_ratio         keep the aspect ratio when scaling images, action='store_true'
--adam               use the Adam optimizer, action='store_true'
--adadelta           use the Adadelta optimizer, action='store_true'
--saveInterval       save the model every N iterations
--valInterval        run validation every N iterations
--n_test_disp        number of samples displayed per validation
--displayInterval    display progress every N iterations
--experiment         model save directory
--alphabet           character set for recognition
--crnn               path to a pretrained model
--beta1            
--lr                 learning rate
--niter              number of training epochs
--nh                 size of the LSTM hidden state
--imgW               image width
--imgH               image height, default=32
--batchSize          batch size, default=64
--workers            number of worker processes, default=2
--trainroot          path to the training set
--valroot            path to the validation set
--cuda               use the GPU, action='store_true'
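To illustrate how a few of these options fit together, here is a minimal argparse sketch. It is not the project's actual crnn/params.py; only the option names and the defaults listed above are taken from it.

```python
import argparse

# Sketch of a handful of the CRNN training options (illustrative only)
parser = argparse.ArgumentParser(description='CRNN training options (sketch)')
parser.add_argument('--trainroot', help='path to the training set (lmdb)')
parser.add_argument('--valroot', help='path to the validation set (lmdb)')
parser.add_argument('--imgH', type=int, default=32, help='image height')
parser.add_argument('--batchSize', type=int, default=64, help='batch size')
parser.add_argument('--lr', type=float, help='learning rate')
parser.add_argument('--adam', action='store_true', help='use the Adam optimizer')
parser.add_argument('--cuda', action='store_true', help='train on the GPU')

# Example invocation: train from the generated lmdb with Adam
args = parser.parse_args(['--trainroot', 'crnn/to_lmdb/lmdb', '--adam'])
```

The `store_true` flags default to False, which is why options like `--cuda` must be passed explicitly to enable GPU training.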

Visualization

After CRNN training completes, the default model path loaded by the crnn test code is crnn/trained_models/crnn_Rec_done.pth,
so we need to rename our trained model and move it to that directory.
Then we can execute the following command in the project root

python run.py

Open the link in a browser: http://127.0.0.1:5000

(screenshot: index page)

(screenshot: upload page)

Results

This is the result after 60 training epochs locally;
the bank card number in the test image is recognized completely.

(screenshot: recognition result)

PS:
Reference: crnn.pytorch
Why is my accuracy always 0

Origin www.cnblogs.com/bay1/p/10994600.html