Only 8.6M in size! Baidu's open-source ultra-lightweight Chinese and English OCR model tops the GitHub trending list

Optical character recognition (OCR) is the process in which an electronic device (such as a scanner or digital camera) examines characters printed on paper, determines their shapes by detecting patterns of dark and bright pixels, and then uses character recognition methods to translate those shapes into computer text.
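To make that definition concrete, here is a toy sketch of the "dark and bright patterns" idea: binarize a tiny grayscale image with a threshold, then pick the closest stored template. Everything in it (the 3x3 `TEMPLATES`, the `binarize` and `recognize` helpers, the threshold of 128) is a made-up illustration, not how PaddleOCR works; modern OCR models use neural networks for both detection and recognition.

```python
# Toy illustration only: the templates, helper names, and threshold below
# are assumptions for this sketch, not part of PaddleOCR.

# 3x3 glyph templates: 1 = dark (ink), 0 = bright (paper).
TEMPLATES = {
    "I": ((0, 1, 0),
          (0, 1, 0),
          (0, 1, 0)),
    "L": ((1, 0, 0),
          (1, 0, 0),
          (1, 1, 1)),
    "T": ((1, 1, 1),
          (0, 1, 0),
          (0, 1, 0)),
}


def binarize(gray, threshold=128):
    """Map grayscale pixels (0-255) to 1 for dark and 0 for bright."""
    return tuple(
        tuple(1 if px < threshold else 0 for px in row) for row in gray
    )


def recognize(gray):
    """Return the template character that differs in the fewest pixels."""
    bits = binarize(gray)

    def distance(template):
        # Count the pixel positions where the scan and the template disagree.
        return sum(
            b != t
            for bits_row, tmpl_row in zip(bits, template)
            for b, t in zip(bits_row, tmpl_row)
        )

    return min(TEMPLATES, key=lambda ch: distance(TEMPLATES[ch]))


# A "scanned" 3x3 patch: low values are dark ink, high values are bright paper.
scanned_L = [
    [30, 220, 240],
    [25, 210, 235],
    [40, 35, 50],
]
print(recognize(scanned_L))  # prints "L"
```

Because the match is by nearest template rather than exact equality, a patch with a stray dark pixel can still resolve to the right character, which hints at why recognition can be mostly right yet occasionally wrong.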

A few months ago, Yuan Mei shared a popular Chinese OCR project with you: chineseocr_lite. Only in the past two days did Yuan Mei learn that Baidu has also open sourced an ultra-lightweight Chinese OCR whose total model size is only 8.6M. Like chineseocr_lite, it is a truly ultra-lightweight, top-tier OCR.

PaddleOCR aims to build a rich, leading, and practical OCR tool library that helps users train better models, and it supports iOS and Android. With such a complete feature set, it is no wonder it dominates the GitHub trending list.

PaddleOCR has the following characteristics:

  • Ultra-lightweight Chinese OCR model, with a total model size of only 8.6M
  • Practical general-purpose Chinese OCR model
  • Multiple inference deployment options, including server-side and on-device deployment
  • Multiple text detection training algorithms: EAST, DB
  • Multiple text recognition training algorithms: Rosetta, CRNN, STAR-Net, RARE
  • Runs on Linux, Windows, macOS, and other systems

With that said, let's look at the results, starting with the general Chinese OCR model.

Next, the ultra-lightweight Chinese OCR model. Whether the text is horizontal or vertical, it handles both without trouble, and its recognition accuracy is quite high.

Of course, it would be an exaggeration to say it makes zero mistakes. In one of the test images, for example, a single character was recognized incorrectly.

Chinese OCR results with space recognition

General model

Beyond its rich feature set, the documentation and tutorials are also very thorough.

If anything is unclear, you can go straight to the documentation. Has Baidu's open-source tool won you over yet?

Finally, here is the GitHub address: https://github.com/PaddlePaddle/PaddleOCR

Origin blog.csdn.net/GYHYCX/article/details/108798195