Several awesome OCR open source projects
OCR
OCR (optical character recognition) is the process in which an electronic device, such as a scanner or digital camera, examines characters printed on paper and then uses character recognition methods to translate their shapes into computer text.
Historical background
The concept of optical character recognition was first proposed by the scientist Tauschek in Germany in 1929, and the American scientist Handel later proposed a similar idea of using technology to recognize text. Casey and Nagy of IBM were the first to study the recognition of printed Chinese characters. In 1966 they published the first article on Chinese character recognition, using template matching to recognize 1,000 printed Chinese characters.
In the early 1970s, Japanese scholars began to study Chinese character recognition and carried out a great deal of work.
After 1986, OCR research in China made great progress: innovations were made in Chinese character modeling and recognition methods, fruitful results were achieved in system development and applications, and many organizations launched Chinese OCR products one after another.
Early OCR software structure
1. Image input and preprocessing
2. Binarization
3. Noise removal
4. Tilt correction
5. Layout analysis
6. Character segmentation
7. Character recognition
8. Layout recovery
9. Post-processing and proofreading
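Two of the steps above, binarization (step 2) and character cutting (step 6), can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the image is a list of rows of grayscale values from 0 to 255, text is dark on a light background, Otsu's method is used as one common way to choose the threshold, and characters are separated by at least one empty column. The function names are made up for this sketch and do not come from any of the projects below.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of grayscale values, 0 (black) to 255 (white)


def otsu_threshold(gray: Image) -> int:
    """Pick the threshold that maximizes between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(gray: Image) -> Image:
    """Step 2: dark pixels (ink) become 1, light pixels (paper) become 0."""
    t = otsu_threshold(gray)
    return [[1 if value <= t else 0 for value in row] for row in gray]


def cut_characters(binary: Image) -> List[Tuple[int, int]]:
    """Step 6: split a text line on empty columns.

    Returns (start, end) column ranges, with end exclusive.
    """
    width = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans
```

Real scans also need noise removal and tilt correction before cutting, because touching or skewed characters break the empty-column assumption.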
Introduction to several awesome OCR open source projects
First place: PaddleOCR
PaddleOCR aims to create a rich, leading, and practical OCR tool library that helps developers train better models and put them into production.
Open source address: https://github.com/PaddlePaddle/PaddleOCR
Features
Supports a variety of cutting-edge OCR algorithms and, on that basis, provides the industrial-grade featured models PP-OCR and PP-Structure, covering the whole process of data production, model training, compression, and inference deployment.
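For a first try from Python, something like the following sketch may be enough. It assumes the PaddleOCR 2.x Python package, where `PaddleOCR(use_angle_cls=True, lang="ch")` builds the pipeline and `ocr.ocr(path, cls=True)` returns, per image, a list of `[box, (text, score)]` entries; newer releases may expose a different interface, so check the project README for the version you install. `run_paddleocr` and `reading_order` are names invented for this example.

```python
from typing import Any, List, Tuple


def run_paddleocr(image_path: str) -> List[Any]:
    """Run detection and recognition on one image (assumes the PaddleOCR 2.x API)."""
    # Imported lazily so the helper below can be used without the package installed.
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(use_angle_cls=True, lang="ch")
    result = ocr.ocr(image_path, cls=True)
    # One entry per input image; the entry can be None when no text is found.
    return result[0] or []


def reading_order(lines: List[Any]) -> List[Tuple[str, float]]:
    """Sort [box, (text, score)] entries top-to-bottom, then left-to-right."""
    def position(line: Any) -> Tuple[float, float]:
        box = line[0]  # four [x, y] corner points
        top = min(point[1] for point in box)
        left = min(point[0] for point in box)
        return (top, left)

    return [(line[1][0], line[1][1]) for line in sorted(lines, key=position)]
```

A typical call would be `reading_order(run_paddleocr("receipt.jpg"))`, where `receipt.jpg` is a placeholder file name.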
Online website experience:
Ultra-lightweight PP-OCR mobile model demo: https://www.paddlepaddle.org.cn/hub/scene/ocr
Mobile demo experience:
Demo installation package download (based on EasyEdge and Paddle-Lite, supports iOS and Android): https://ai.baidu.com/easyedge/app/openSource?from=paddlelite
Text detection results (figure not included)
Text recognition results (figure not included)
Second place: EasyOCR
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
Open source address: https://github.com/JaidedAI/EasyOCR
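As a rough illustration of what "ready-to-use" means here, the sketch below reads an image with `easyocr.Reader(...).readtext(...)`, which returns a list of `(bounding_box, text, confidence)` tuples, and keeps only the confident strings. The helper names, the 0.5 cutoff, and the default language list are choices made for this example, not recommendations from the project.

```python
from typing import Any, List, Sequence, Tuple


def run_easyocr(image_path: str, languages: Sequence[str] = ("ch_sim", "en")) -> List[Any]:
    """Read text from one image with EasyOCR."""
    # Imported lazily so the filter below works without the package installed.
    import easyocr

    reader = easyocr.Reader(list(languages))  # downloads models on first use
    return reader.readtext(image_path)


def confident_text(results: Sequence[Tuple[Any, str, float]], min_conf: float = 0.5) -> List[str]:
    """Keep only the recognized strings whose confidence reaches min_conf."""
    return [text for _box, text, conf in results if conf >= min_conf]
```

For example, `confident_text(run_easyocr("sign.jpg"))` would return the recognized strings for a placeholder image `sign.jpg`.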
Recognition results (figure not included)
Demo address: https://www.jaided.ai/easyocr
Third place: chineseocr
This project uses YOLOv3 and CRNN to detect and recognize Chinese text in natural scenes.
The project provides a dataset:
OCR CTC training dataset (archive extraction password: chineseocr)
Baidu Netdisk link: https://pan.baidu.com/s/1UcUKUUELLwdM29zfbztzdw (extraction code: atwn)
Implemented features
Text direction detection for 0, 90, 180, and 270 degrees (supports dnn/tensorflow)
Text detection with darknet, opencv dnn, or keras; training with darknet or keras
Variable-length OCR training (English, and mixed Chinese and English), crnn/dense OCR recognition and training, plus new code for converting PyTorch models to Keras (tools/pytorch_to_keras.py)
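Once the direction classifier has reported one of the four angles, the image still has to be turned upright before recognition. The sketch below shows that correction on a plain list-of-rows image. It assumes the reported angle is the clockwise rotation of the page relative to upright, which is a convention chosen for this example; the project's own code may define the angle differently.

```python
from typing import List

Image = List[List[int]]


def rotate_cw(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rotate_ccw(img: Image) -> Image:
    """Rotate the image 90 degrees counterclockwise."""
    return [list(row) for row in zip(*img)][::-1]


def correct_orientation(img: Image, detected_angle: int) -> Image:
    """Undo a clockwise rotation of 0, 90, 180, or 270 degrees."""
    if detected_angle not in (0, 90, 180, 270):
        raise ValueError("angle must be one of 0, 90, 180, 270")
    # Copy first so the input is never modified, even for angle 0.
    upright = [list(row) for row in img]
    for _ in range(detected_angle // 90):
        upright = rotate_ccw(upright)
    return upright
```

Restricting the classifier to four angles keeps the correction lossless, since each case is a pure rearrangement of pixels with no interpolation.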
Further instructions:
https://github.com/chineseocr/chineseocr#readme
Fourth place: YCG09/chinese_ocr
End-to-end detection and recognition of variable-length Chinese text, based on TensorFlow and Keras
Text detection: CTPN
Text recognition: DenseNet + CTC
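CTC is what lets the recognizer output text of variable length: the network emits one class distribution per horizontal step, and decoding collapses repeated classes and drops a special blank class. Here is a minimal greedy (best-path) decoder in plain Python to show the idea. It assumes class 0 is the blank and class i maps to `charset[i - 1]`; it is an illustration of the technique, not the decoding code used by the project.

```python
from typing import List, Optional, Sequence


def ctc_greedy_decode(probs: Sequence[Sequence[float]], charset: str) -> str:
    """Best-path CTC decoding.

    probs: one row of class probabilities per time step; class 0 is the blank.
    charset: characters for classes 1..len(charset).
    """
    # Most likely class at each time step.
    best_path = [max(range(len(step)), key=step.__getitem__) for step in probs]

    decoded: List[str] = []
    previous: Optional[int] = None
    for index in best_path:
        # Emit a character only when it is not blank and not a repeat
        # of the class seen at the previous step.
        if index != 0 and index != previous:
            decoded.append(charset[index - 1])
        previous = index
    return "".join(decoded)
```

The blank between two identical classes is what allows doubled letters to survive the collapse, so a best path of `a a _ a b b _` decodes to `aab`.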
The project provides a dataset:
https://pan.baidu.com/s/1QkI7kjah8SPHwOQ40rS1Pw (password: lu7m)
About 3.64 million images in total, split into a training set and a validation set at a ratio of 99:1.
The data is randomly generated from a Chinese corpus (news + classical Chinese).
It covers 5,990 characters in total: Chinese characters, English letters, digits, and punctuation.
Each sample is fixed at 10 characters, cut at random from sentences in the corpus.
The image resolution is uniformly 280x32.
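The labeling side of such a dataset is easy to reproduce on a small scale. The sketch below cuts fixed-length labels from corpus sentences and holds out 1% of them for validation. It only covers the text labels and the 99:1 split described above; rendering each label into a 280x32 image is left out, and the function names, the seed, and the way the split is taken are assumptions of this example rather than the project's script.

```python
import random
from typing import List, Sequence, Tuple


def sample_labels(sentences: Sequence[str], count: int, length: int = 10, seed: int = 0) -> List[str]:
    """Cut `count` random substrings of exactly `length` characters from the corpus."""
    rng = random.Random(seed)
    usable = [s for s in sentences if len(s) >= length]
    if not usable:
        raise ValueError("no sentence is long enough for the requested label length")
    labels = []
    for _ in range(count):
        sentence = rng.choice(usable)
        start = rng.randint(0, len(sentence) - length)  # inclusive bounds
        labels.append(sentence[start:start + length])
    return labels


def split_train_val(items: Sequence[str], val_ratio: float = 0.01) -> Tuple[List[str], List[str]]:
    """Hold out val_ratio of the items (at least one) for validation."""
    n_val = max(1, round(len(items) * val_ratio))
    return list(items[n_val:]), list(items[:n_val])
```

Because the labels are already drawn at random, taking the first 1% as the validation set is enough for this sketch; a real pipeline might shuffle again after rendering.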
Other open source projects
https://github.com/eragonruan/text-detection-ctpn
https://github.com/senlinuc/caffe_ocr
https://github.com/chineseocr/chinese-ocr
https://github.com/xiaomaxiao/keras_ocr
https://github.com/alisen39/TrWebOCR
https://github.com/da03/Attention-OCR
https://github.com/JinpengLI/deep_ocr