Installing an OCR library for Python

(1) Installation process

I followed this blog post: https://blog.csdn.net/lanxianghua/article/details/100516187?depth_1-utm_source=distribute.pc_relevant.none-task&utm_source=distribute.pc_relevant.none-task

(2) Installing the Chinese character library

Recognizing Chinese requires installing the Chinese character library; see this blog post: https://www.cnblogs.com/jiyu-hlzy/p/12191463.html
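
A quick way to confirm the Chinese data is in place is to look for a chi_sim file in Tesseract's tessdata folder. The sketch below assumes the usual `<lang>.traineddata` file naming; the helper names and the example path in the comment are hypothetical and not taken from the referenced post.

```python
import os

def languages_from_filenames(filenames):
    """Map file names such as 'chi_sim.traineddata' to language codes."""
    suffix = ".traineddata"
    return sorted(
        name[: -len(suffix)]
        for name in filenames
        if name.endswith(suffix)
    )

def installed_languages(tessdata_dir):
    """List the language codes found in a tessdata directory."""
    return languages_from_filenames(os.listdir(tessdata_dir))

# Example (hypothetical path, adjust to your own install location):
# print("chi_sim" in installed_languages(r"C:\Program Files\Tesseract-OCR\tessdata"))
```

If chi_sim is missing from the list, a call with lang='chi_sim' will most likely fail to load the language.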

(3) Error after installation

After installation, running the program produced the following error:

This error means that tesseract.exe could not be found. I had already added its path in pytesseract.py, so the actual cause turned out to lie elsewhere.
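
Whatever the cause is in a given case, it helps to confirm the executable path before running OCR. Here is a minimal sketch, assuming the executable is named tesseract and that setting `pytesseract.pytesseract.tesseract_cmd` is acceptable in place of editing pytesseract.py. The function names and any candidate paths are hypothetical.

```python
import os
import shutil

def find_tesseract(candidates):
    """Return the first usable tesseract executable, or None."""
    # Prefer whatever is already reachable through PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to explicitly listed locations.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None

def configure_pytesseract(candidates):
    """Point pytesseract at the executable; raise if none is found."""
    import pytesseract  # imported lazily so the lookup above works without it

    exe = find_tesseract(candidates)
    if exe is None:
        raise FileNotFoundError("tesseract executable not found")
    pytesseract.pytesseract.tesseract_cmd = exe
    return exe
```

Calling `configure_pytesseract([...])` with your own install path at the start of the script fails early with a clear message if the executable is missing.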

(4) Test

Following other blog posts, I wrote a few simple lines of test code:

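    # encoding: utf-8

    import os

    import pytesseract
    from PIL import Image


    if __name__ == "__main__":
        # Show the working directory, since 'test.png' is resolved relative to it.
        print(os.getcwd())
        im_ch = Image.open('test.png')

        print('======== Recognizing Chinese ========')
        # 'chi_sim' selects Simplified Chinese; the code must not contain stray spaces.
        print(pytesseract.image_to_string(im_ch, lang='chi_sim'))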

The test image and the result are as follows:

      

As you can see, the text in the test image was not recognized cleanly, and the output contains many wrong characters.

(5) Improvement

I then read a few more blog posts, which said that Baidu AI can be used to improve accuracy, so I implemented it following this post: https://www.cnblogs.com/adam012019/p/11440353.html
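
For orientation, a call to Baidu's OCR service might look roughly like the sketch below. It assumes the baidu-aip package and its `AipOcr.basicGeneral` method; whether the referenced post uses exactly this call is an assumption. It also assumes the usual response shape, with a `words_result` list on success and `error_code` / `error_msg` on failure. The function names and credentials are placeholders.

```python
def extract_text(response):
    """Join the recognized lines from a Baidu OCR response dict."""
    # Assumed failure shape: {"error_code": ..., "error_msg": ...}
    if "error_code" in response:
        message = response.get("error_msg", response["error_code"])
        raise RuntimeError(f"OCR failed: {message}")
    # Assumed success shape: {"words_result": [{"words": "..."}, ...]}
    return "\n".join(item["words"] for item in response.get("words_result", []))

def recognize(image_path, app_id, api_key, secret_key):
    """Send one image to Baidu OCR and return the recognized text."""
    from aip import AipOcr  # provided by the baidu-aip package

    client = AipOcr(app_id, api_key, secret_key)
    with open(image_path, "rb") as f:
        return extract_text(client.basicGeneral(f.read()))
```

Keeping the response parsing separate from the network call makes it easy to check against a saved response without using up any quota.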

The results are as follows:

      

As you can see, the results are very good.

That's all for this post. I put this together on a whim: the online photo recognition services I found all charge a fee, so I thought, why not write my own, and looked up a few blog posts. The people sharing on the internet really are impressive; thanks again to the authors of the posts above.

Appendix:

Baidu Cloud OCR API overview: https://cloud.baidu.com/doc/OCR/s/Ek3h7xypm

You can sign up there; the free daily quota is enough for personal use.

Origin www.cnblogs.com/mrlayfolk/p/12617077.html