China's first OCR white paper is released, and OCR based on deep learning has become mainstream

Recognizing text by scanning it is a feature that has appeared in many applications in recent years. For example, when entering a bank card number, you can scan the card with the phone camera and the software extracts the card information. The technology behind this is optical character recognition (OCR).
OCR, short for Optical Character Recognition, refers to using a machine to convert handwritten or printed text in an image into a format that a computer can process directly. As an important branch of computer vision, OCR is typically used for information entry through image text recognition. Because text and symbols carry rich semantic information, extracting text with OCR and then analyzing it also helps machines understand images better.
On September 28, at the 2020 AIIA Artificial Intelligence Developers Conference, held under the guidance of the Ministry of Industry and Information Technology, the Beijing Municipal People's Government, and the International Telecommunication Union (ITU-T), the organizers officially released China's first white paper on intelligent text recognition (OCR) capability evaluation and application.
The white paper reviews the current domestic OCR industry in detail along several dimensions, including development background, technological evolution, state of the industry, technical standardization, and development trends, with the aim of accelerating the industrial adoption of OCR technology and supporting its sustainable development.
It is understood that the white paper was jointly drafted by relevant departments of the China Academy of Information and Communications Technology, the China Artificial Intelligence Industry Development Alliance, and Tencent. With the help of artificial intelligence, the steady improvement in OCR performance in recent years has provided solid support for the more complex OCR application scenarios created by industrial digitalization. At the same time, more diverse service carriers, covering mobile phones, electronic products, and cloud services, have further accelerated the popularization of OCR and carried it into more areas of production and daily life.
In April 2020, the National Development and Reform Commission explicitly included artificial intelligence infrastructure in the scope of "new infrastructure". As one of the most down-to-earth applications of artificial intelligence and one of its more commercially mature fields, the OCR industry will undoubtedly see new development opportunities against this background, and the related technologies will go through a new round of change.
The report points out three major directions for the future development of OCR technology: integrated end-to-end OCR models, OCR that is both high-performance and efficient, and intelligent OCR that moves from perception to cognition. In detail, building an integrated end-to-end network that trains text detection and recognition together will become one of the important trends. An end-to-end design not only reduces repeated computation but also improves feature quality, which in turn improves task performance.
At the same time, many OCR applications need to run on resource-constrained mobile devices. Current mobile OCR algorithms mostly trade some accuracy for running speed, so lightweight OCR models designed for mobile devices that balance accuracy and efficiency will be an important direction for future development. As for intelligent OCR from perception to cognition, OCR usually starts from computer vision, and in the future it will be integrated with natural language processing, knowledge graphs, and other broader fields. Improving OCR performance through deep mining of semantics and knowledge is an important trend.
In addition, introducing new learning paradigms such as reinforcement learning and meta-learning into OCR, so that machines learn on their own how to recognize text, will also become a research hotspot.
OCR already has deep and mature applications in many industries. In the future, as traditional industries go through digital transformation, the scope and scenarios of OCR will expand further and the market will keep growing. Authoritative organizations predict that the global OCR market will reach 13.381 billion US dollars in 2025. Limited by the level of technology in the early days, OCR vendors usually started from specific applications, such as license plate recognition systems, and built a series of special-purpose devices. In recent years, more and more terminal devices and applications have embedded OCR technology, and a complete industry chain has gradually formed, from infrastructure and basic capabilities to terminal applications. A series of specialized OCR capabilities, such as card and bill recognition, have also emerged and serve various industries in combination.
Figure | OCR Industry Ecological Map
It is not difficult to see that OCR technology is gradually "sinking" into a basic capability that provides underlying technical support for different upper-level business applications. Technology giants and cloud computing vendors are accelerating their deployment of OCR: while meeting their own internal business needs, they continue to open up advanced OCR capabilities to the outside world, and OCR has become a standard offering among technology giants.
At the level of concrete applications, text recognition in standard scenarios, such as card recognition and bill recognition, is already relatively mature, and the use of handwritten text recognition in education, logistics, and other industries is also expanding. OCR technology and applications in complex dynamic scenes have become a hot research direction in the past two years. For example, in scenarios such as autonomous driving and robotics, OCR is used to recognize characters in the field of view. In the white paper released this time, Tencent Cloud also presented a number of typical deployment cases in the field of OCR.
It is worth mentioning that, in order to lower the threshold for applying OCR and to avoid offerings of uneven quality, the white paper also published OCR evaluation standards and specifications for the first time. In April 2020, the China Artificial Intelligence Industry Development Alliance formulated the "Intelligent Grading Technical Requirements and Evaluation Methods for OCR Services", which stipulates the technical requirements and evaluation methods for OCR services in terms of functions, performance, and safety. In July, OCR service requirements and evaluation methods were successfully established as a project in the SG16 group of the International Telecommunication Union's ITU-T, marking that OCR evaluation methods in the era of deep learning are gradually being accepted by international standards organizations.
At present, under the guidance of the China Artificial Intelligence Industry Development Alliance, the Tianjian automated evaluation platform for OCR service engines, developed by Tencent Cloud, provides technical testing services for OCR technology suppliers and also publishes test results for OCR technologies and products, giving the demand side an objective and fair basis for selection.

Daohan Tianqiong CiGril Robot API
Users of the Daohan Tianqiong CiGril Cognitive Intelligent Robot API need to follow these steps to obtain the basic information:
1. Register an account on the platform.
2. Log in to the platform, open the back-end management page, create an application, and then view the application's details.
3. On the application details page, find the appid, the appkey (secret key), and other information, then write the interface code to access the robot application.
Start access
Request address: http://www.weilaitec.com/cigirlrobot.cgr
Request method: POST
Request parameters:
Parameter  Type    Default  Description
userid     String  none     Account registered on the platform
appid      String  none     ID of the application created on the platform
key        String  none     Secret key generated for the platform application
msg        String  ""       Message content
ip         String  ""       Client IP; must be unique. If no IP is available, a QQ account, WeChat account, mobile phone MAC address, etc. can be used instead
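
As a rough illustration of how the five parameters listed above fit together, here is a minimal Java sketch that assembles a request URL. The class name RobotRequestUrl and the methods build and encode are hypothetical and not part of the platform. The sketch also assumes that values should be percent-encoded as UTF-8, which the description above does not state, and the key shown in main is a placeholder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: assembles the request URL from the five documented parameters.
class RobotRequestUrl {

    // Endpoint copied from the "request address" line of this article.
    static final String ENDPOINT = "http://www.weilaitec.com/cigirlrobot.cgr";

    /** Joins the five parameters into a query string, encoding each value. */
    static String build(String userid, String appid, String key, String msg, String ip) {
        // LinkedHashMap keeps insertion order, so the URL is reproducible.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("key", key);
        params.put("msg", msg);
        params.put("ip", ip);
        params.put("userid", userid);
        params.put("appid", appid);

        StringBuilder sb = new StringBuilder(ENDPOINT).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(e.getKey()).append('=').append(encode(e.getValue()));
            first = false;
        }
        return sb.toString();
    }

    /** Form-style encoding (a space becomes '+'); assumed, not confirmed by the platform. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // YOUR_APP_KEY is a placeholder for the secret key shown on the application page.
        System.out.println(build("jackli", "52454214552", "YOUR_APP_KEY", "Hello", "119.25.36.48"));
    }
}
```

Encoding the values matters mostly for msg, which may contain spaces, '&', or non-ASCII text that would otherwise break the query string.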

Example request URL: http://www.weilaitec.com/cigirlrobot.cgr?key=UTNJK34THXK010T566ZI39VES50BLRBE8R66H5R3FOAO84J3BV&msg=Hello&ip=119.25.36.48&userid=jackli&appid=52454214552

Note: Parameter names must be lowercase and spelled correctly, none of the five parameters may be omitted, and no parameter value may be an empty string; otherwise the request will fail. The userid, appid, and key values become available only after you register on the platform and create an application; they are shown on the application details page. The userid is the account registered on the platform.
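
The rules in the note can be checked on the client before a request is sent. The following is a minimal sketch under that reading of the note. The class RobotParamCheck and its problems method are hypothetical names, and treating any name outside the five as an error is an assumption made here to catch typos and wrong capitalization, not documented platform behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical client-side check for the rules stated in the note.
class RobotParamCheck {

    // The five lowercase parameter names from the request parameter list.
    static final List<String> REQUIRED = Arrays.asList("userid", "appid", "key", "msg", "ip");

    /** Returns a list of problems; an empty list means the parameters pass the checks. */
    static List<String> problems(Map<String, String> params) {
        List<String> out = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = params.get(name);
            if (value == null) {
                out.add("missing parameter: " + name);
            } else if (value.isEmpty()) {
                out.add("empty value for: " + name);
            }
        }
        // Assumption: names outside the five are most likely misspelled or wrongly cased.
        for (String name : params.keySet()) {
            if (!REQUIRED.contains(name)) {
                out.add("unexpected parameter name: " + name);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("userid", "jackli");
        params.put("AppID", "52454214552"); // wrong case on purpose
        params.put("key", "YOUR_APP_KEY");  // placeholder secret key
        params.put("msg", "");              // empty on purpose
        params.put("ip", "119.25.36.48");
        for (String problem : problems(params)) {
            System.out.println(problem);
        }
    }
}
```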
Sample code (Java):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class apitest {

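    /**
     * Sends the request and returns the response body.
     *
     * @param urlStr full request URL, including the query string
     * @return the response body, or an empty string on failure
     */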
    private static String opUrl(String urlStr)
    {        
        URL url = null;
        HttpURLConnection conn = null;
        InputStream is = null;
        ByteArrayOutputStream baos = null;
        try
        {
            url = new URL(urlStr);
            conn = (HttpURLConnection) url.openConnection();
            conn.setReadTimeout(5 * 10000);
            conn.setConnectTimeout(5 * 10000);
            conn.setRequestMethod("POST");
            if (conn.getResponseCode() == 200)
            {
                is = conn.getInputStream();
                baos = new ByteArrayOutputStream();
                int len = -1;
                byte[] buf = new byte[128];

                while ((len = is.read(buf)) != -1)
                {
                    baos.write(buf, 0, len);
                }
                baos.flush();
                String result = baos.toString();
                return result;
            } else
            {
                throw new Exception("Server connection error!");
            }

        } catch (Exception e)
        {
            e.printStackTrace();
        } finally
        {
            try
            {
                if (is != null)
                    is.close();
            } catch (IOException e)
            {
                e.printStackTrace();
            }

            try
            {
                if (baos != null)
                    baos.close();
            } catch (IOException e)
            {
                e.printStackTrace();
            }
            // conn stays null if new URL(...) or openConnection() failed
            if (conn != null)
                conn.disconnect();
        }
        return "";
    }
    
    
    public static void main(String args [] ){        
            // The msg parameter carries the message content sent to the robot.
            System.out.println(opUrl("http://www.weilaitec.com/cigirlrobot.cgr?key=UTNJK34THXK010T566ZI39VES50BLRBE8R66H5R3FOAO84J3BV&msg=Hello&ip=119.25.36.48&userid=jackli&appid=52454214552"));
            
    }
}

Origin blog.51cto.com/14864650/2539992