Incredible! Alibaba Security Turing Lab Sets a New World-Best Result in the ICDAR2017 MLT Competition

Recently, the ATL Cangjie OCR algorithm from Alibaba Security Turing Lab set a new world-best score in ICDAR2017 MLT (Competition on Multi-lingual Scene Text Detection), a competition on multilingual text detection in natural scenes, ranking first with an Hmean of 73.52%. (Contest results page: )
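For readers unfamiliar with the metric, Hmean is the harmonic mean of detection precision and recall. The short Python sketch below shows how such a score is computed. The function names and the sample counts are hypothetical illustrations, not the competition's evaluation code or Turing Lab's actual figures.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def detection_scores(true_positives: int, num_detections: int, num_ground_truth: int):
    """Return (precision, recall, hmean) from raw match counts.

    true_positives: detected boxes matched to a ground-truth text region
    num_detections: all boxes the detector produced
    num_ground_truth: all annotated text regions
    """
    precision = true_positives / num_detections if num_detections else 0.0
    recall = true_positives / num_ground_truth if num_ground_truth else 0.0
    return precision, recall, hmean(precision, recall)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    p, r, h = detection_scores(true_positives=6, num_detections=8, num_ground_truth=10)
    print(f"precision={p:.2%} recall={r:.2%} hmean={h:.2%}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a detector cannot reach a high Hmean by trading away recall for precision or the reverse.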

ICDAR (International Conference on Document Analysis and Recognition) is often called the Oscars of the OCR field, and its competitions are among the most authoritative in OCR worldwide.

OCR technology is now widely used in many fields. From digitizing business cards, invoices, bank cards, and other documents, to indexing outdoor storefronts and recognizing road signs, to understanding text in images and videos for content security, OCR is playing an increasingly important role.

According to researchers at Alibaba Security Turing Lab, text detection and recognition face many technical challenges, such as the influence of lighting in natural scenes, occlusion by objects, variation in text size, proportion, and angle, and blurry text in images and videos. The ICDAR2017 MLT competition covers 9 languages: Chinese, Japanese, Korean, Latin-script languages (English, French, German, Italian), Arabic, and Bengali. Text colors vary widely, and the images contain many kinds of real-scene noise, including lighting, occlusion, tilt, stacked text, mosaicked text, and perspective changes, all of which place greater demands on the adaptability of an OCR algorithm.

To overcome these problems, researchers at Alibaba Security Turing Lab designed deep learning-based network models and algorithms.

According to reports, for the text detection model, researchers at Turing Lab used deep convolutional neural networks to obtain deeper image features, so that the model can handle a variety of environments and text of various sizes, proportions, and angles. In addition, because the framework adopts an innovative detection strategy, detection is much faster than with traditional RCNN-based schemes. For the text recognition model, the researchers explored new approaches built on mainstream solutions to obtain a more effective recognition model, improving recognition efficiency while maintaining recognition accuracy.

According to researchers at Alibaba Security Turing Lab, ATL Cangjie OCR provides general-purpose OCR text detection and recognition as online synchronous and asynchronous services, as well as an offline ODPS service, offering strong technical support for understanding image and text content and for content security. The model already fully supports multiple business scenarios in the Alibaba ecosystem, such as commodity content security, business security, platform governance, evaluation, interaction, and authentication. It is also offered to third-party customers through the Alibaba Cloud Shield Content Security (Green Net) product.

At present, the ATL Cangjie OCR service handles hundreds of millions of calls per day on average and provides customers with stable, reliable technical support. (Author: Hua Meng)

