Building a CRNN model (on Windows with TensorFlow)

3.1.1. CRNN Introduction

CRNN first uses a CNN to extract a feature sequence from the image, then uses an RNN to make a prediction for each element of that sequence, and finally passes the predictions through a CTC translation layer to obtain the final result. In other words, the structure is CNN + RNN + CTC.
Git address: https://github.com/bgshih/crnn
Paper: http://arxiv.org/abs/1507.05717

3.1.2. CNN Introduction

The CNN part adopts the VGG architecture, and the paper makes some small adjustments to the VGG network.

3.1.3. RNN Introduction

The RNN processes the sequence output by the CNN, and every input has a corresponding output yt. To prevent vanishing gradients during training, the paper uses LSTM cells as the RNN units. The paper argues that, for sequence prediction, both the preceding and the following context help predict the current element, so it uses a bidirectional RNN. The LSTM unit and the bidirectional RNN structure are shown in the figure.

[Figure: LSTM unit and bidirectional RNN structure]
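
To make the idea of "context from both directions" concrete, here is a minimal pure-Python sketch. It is a deliberate simplification: it uses a single plain tanh unit instead of an LSTM cell, and the weights w_in and w_rec as well as the function names are made up for illustration, not taken from the paper or the code.

```python
import math


def rnn_pass(xs, w_in=0.5, w_rec=0.8):
    """Run a one-unit tanh RNN over xs; return the hidden state at each step.

    w_in and w_rec are arbitrary illustrative weights (assumption).
    """
    h = 0.0
    states = []
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def bidirectional_rnn(xs):
    """Pair a left-to-right pass with a right-to-left pass, step by step."""
    forward = rnn_pass(xs)
    # Process the reversed sequence, then flip the result back so that
    # backward[t] summarizes the inputs from position t to the end.
    backward = rnn_pass(xs[::-1])[::-1]
    return list(zip(forward, backward))


if __name__ == "__main__":
    for t, (fwd, bwd) in enumerate(bidirectional_rnn([1.0, 0.0, -1.0])):
        print(t, round(fwd, 4), round(bwd, 4))
```

At every position the output contains one value that has seen everything to the left and one that has seen everything to the right, which is the property a bidirectional RNN relies on.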

3.1.4. CTC translation layer

At test time, translation falls into two modes: one with a dictionary and one without a dictionary.

With a dictionary, a dictionary is provided for the test set. At test time, the output probability is computed for every entry in the dictionary, and the string with the highest probability is taken as the final prediction.
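
As a rough illustration of the dictionary mode, here is a minimal pure-Python sketch. It assumes the network output is a list of per-frame probability distributions with index 0 as the CTC blank. The function names, the toy alphabet, and the dictionary are hypothetical and are not taken from the CRNN code. The probability of each word is computed with the standard CTC forward algorithm.

```python
def ctc_sequence_probability(probs, labels, blank=0):
    """Probability that the CTC output collapses to `labels`.

    probs: list of frames, each a list of class probabilities.
    labels: list of class indices (without blanks).
    """
    # Extended label sequence: blank, l1, blank, l2, ..., blank
    ext = [blank]
    for lab in labels:
        ext += [lab, blank]
    size = len(ext)

    alpha = [0.0] * size
    alpha[0] = probs[0][blank]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)


def dictionary_decode(probs, dictionary, char_to_index, blank=0):
    """Return the dictionary word with the highest CTC probability."""
    best_word, best_p = None, -1.0
    for word in dictionary:
        labels = [char_to_index[c] for c in word]
        p = ctc_sequence_probability(probs, labels, blank)
        if p > best_p:
            best_word, best_p = word, p
    return best_word, best_p


if __name__ == "__main__":
    # Toy example: classes are [blank, 'a', 'b'], two frames.
    toy_probs = [[0.1, 0.6, 0.3], [0.1, 0.2, 0.7]]
    print(dictionary_decode(toy_probs, ["a", "b", "ab"], {"a": 1, "b": 2}))
```

Scoring every entry like this only scales to small dictionaries; it is meant to show the principle, not an efficient implementation.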

Without a dictionary, the test set gives no information about which strings it contains, and the output with the highest predicted probability is selected as the final predicted string.
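
The dictionary-free mode is commonly approximated by greedy (best-path) decoding: take the most probable class in every frame, merge consecutive repeats, and drop the blanks. The sketch below is pure Python and assumes index 0 is the blank; the function name and the toy alphabet are hypothetical, not part of the CRNN code.

```python
def ctc_greedy_decode(probs, index_to_char, blank=0):
    """Best-path CTC decoding: argmax per frame, collapse repeats, drop blanks."""
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in probs]

    chars = []
    prev = None
    for idx in best_path:
        # Emit a character only when the class changes and is not blank.
        if idx != prev and idx != blank:
            chars.append(index_to_char[idx])
        prev = idx
    return "".join(chars)


if __name__ == "__main__":
    # Toy example: classes are [blank, 'a', 'b'], six frames.
    toy_probs = [
        [0.1, 0.8, 0.1],  # a
        [0.2, 0.7, 0.1],  # a (repeat, merged)
        [0.9, 0.05, 0.05],  # blank
        [0.1, 0.8, 0.1],  # a (new 'a' because a blank came in between)
        [0.1, 0.1, 0.8],  # b
        [0.1, 0.2, 0.7],  # b (repeat, merged)
    ]
    print(ctc_greedy_decode(toy_probs, {1: "a", 2: "b"}))
```

The blank class is what allows a genuinely doubled character to survive the merge step.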

3.1.5. Debugging CRNN on TensorFlow

1. First, download the code from Git.
Git address: https://github.com/MaybeShewill-CV/CRNN_Tensorflow
2. Download the pre-trained model. If you train the model yourself you do not need to download it, but the training data is several GB.
The pretrained crnn model weights on Synth90k dataset can be found here

3. Once downloaded, the model can be used directly with the following command:
python tools/test_shadownet.py --image_path data/test_images/test_01.jpg --weights_path model/crnn_synth90k/shadownet.ckpt --char_dict_path data/char_dict/char_dict_en.json --ord_map_dict_path data/char_dict/ord_map_en.json
4. Some pitfalls
(1) Modify the .py files under tools and add the code shown in the figure. I run directly on Windows, so this is mainly needed to locate the relevant directories.
The os.getcwd() method returns the current working directory.
[Figure: code added to the .py files under tools]
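
The screenshot is not reproduced here, so the following is only a guess at what such code could look like, not the author's exact change. It assumes the script is launched from the repository root, so that os.getcwd() points at the directory containing the project's packages; the variable name repo_root is made up for this example.

```python
import os
import sys

# Assumption: the command is run from the repository root,
# so the current working directory is where the project packages live.
repo_root = os.getcwd()

# Put that directory at the front of the module search path so that
# imports of the project's own packages resolve on Windows as well.
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)

print("Using project root:", repo_root)
```

Such lines would go at the top of the script, before the imports of the project's own modules.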

(2) Windows has no fork method for creating processes; its default start method is spawn, whereas the default on Linux is fork. As a result, the following error is reported:
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Modify the data_provider/tf_io_pipline_fast_tools.py file and add "if __name__ == '__main__':", as follows:

if __name__ == '__main__':
    _SAMPLE_INFO_QUEUE = Manager().Queue()
    _SENTINEL = ("", [])
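
For background on why the guard matters: with the spawn start method, every child process imports the main module again, so any code that starts processes at import time gets re-run in the child. The standalone sketch below shows the usual pattern; it is independent of the CRNN_Tensorflow code, and the names square and run_pool are made up for the example.

```python
import multiprocessing as mp


def square(x):
    """Work function; must live at module level so child processes can import it."""
    return x * x


def run_pool(values):
    """Map square over values using two worker processes."""
    with mp.Pool(2) as pool:
        return pool.map(square, values)


# Without this guard, a spawned child would re-execute the pool creation
# while importing this module, which triggers the bootstrapping error.
if __name__ == "__main__":
    print("start method:", mp.get_start_method())
    print(run_pool([1, 2, 3]))
```

On Linux with fork this script runs even without the guard, which is why code written on Linux often fails only after it is moved to Windows.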

3.1.6. Running English OCR successfully

python tools/test_shadownet.py --image_path data/test_images/test_01.jpg --weights_path model/crnn_synth90k/shadownet.ckpt --char_dict_path data/char_dict/char_dict_en.json --ord_map_dict_path data/char_dict/ord_map_en.json

I found that only images similar to the training data are recognized correctly.

3.1.7. Running Chinese OCR successfully

1. Download the pre-trained model
I have uploaded a newly trained crnn model on chinese dataset which can be found here. Sorry for not knowing the owner of the dataset. But thanks for his great work. If someone knows it you’re welcome to let me know. The pretrained weights can be found here
2. Modify the configuration file config/global_config.py:

__C.ARCH.NUM_CLASSES = 5825 # cn dataset
#__C.ARCH.NUM_CLASSES = 37 # synth90k dataset

3. Run the demo
python tools/recongnize_chinese_pdf.py -c ./data/char_dict/char_dict_cn.json -o ./data/char_dict/ord_map_cn.json --weights_path model/crnn_chinese/shadownet.ckpt --image_path data/test_images/test_pdf.png --save_path data/test_images/pdf_recognize_result.txt


My recognition results were not as good as those shown in the Git repository. Preparing your own data and training the model yourself is the better approach.

Origin blog.csdn.net/zephyr_wang/article/details/104200666