[Share] Huawei Cloud character recognition service: key technologies and what to watch for when shipping a product (OCR Series II)

Abstract: This is the second article in the OCR text recognition series. It starts with a brief look at Huawei Cloud's character recognition service, then focuses on the product's key technologies and key capabilities, the path we took to optimize it, and the problems and pitfalls to watch for when building a product. Many of these points apply to any artificial intelligence or data-driven product.

Besides building the product, the Huawei Cloud OCR team also takes part in competitions and writes papers to raise the technology's profile. One example is the ICDAR 2019 SROIE receipt recognition competition, which we entered as a joint effort with collaborators from Central China. On the text recognition track we ranked first in the world with a score of 96.43, roughly two points ahead of the second- through fifth-place entries, and we also filed a number of patent applications. For this innovation we received a 2019 Leading Technology Achievement Award (New Product Award) at an industry fair.

image.png

This is the full picture of Huawei Cloud's character recognition services. There are five major categories: general, receipts, cards and certificates, industry, and custom. The general category covers general text, general tables, web screenshots, and similar inputs. The receipts category covers all kinds of tickets and invoices, such as VAT invoices, train tickets, and taxi receipts. The cards and certificates category covers documents such as ID cards, driver's licenses, vehicle licenses, and passports. The industry category is industry-specific: electronic and paper waybills for logistics, gas meters for daily-life scenarios, medical test sheets for healthcare, and so on. The custom category is built to order for a particular customer, for example overseas ID cards. Together these services help enterprises raise productivity and reduce operating costs.

image.png

The Huawei OCR processing flow integrates a variety of image processing techniques and offers high accuracy, robustness, and adaptability. Recognition accuracy is high, and the service handles complex scenarios such as misaligned lines, stamps, and overlapping text. It supports many document types and adapts to different image qualities. The full pipeline consists of image preprocessing, table extraction (with further table handling where needed), text localization, character correction where required, character recognition, and text post-processing. What is finally returned to the client is structured JSON data.
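The staged flow described above can be sketched as a chain of functions that ends in a JSON payload. This is only an illustration under stated assumptions: every stage body is a stub, and the function names and JSON field names (`regions`, `tables`, `stages`) are hypothetical, not the actual Huawei Cloud OCR API or response format.

```python
import json

# Each stage takes and returns a plain dict. All stage bodies are stubs:
# a real system would run image processing or a model at each step.

def preprocess(doc):
    doc["trace"].append("preprocess")        # e.g. deskew, denoise
    return doc

def extract_tables(doc):
    doc["trace"].append("extract_tables")
    doc["tables"] = []                       # placeholder: no tables found
    return doc

def locate_text(doc):
    doc["trace"].append("locate_text")
    doc["regions"] = [{"box": [10, 10, 120, 30]}]   # placeholder box
    return doc

def recognize_text(doc):
    doc["trace"].append("recognize_text")
    for region in doc["regions"]:
        region["text"] = " PLACEHOLDER "     # placeholder recognition result
    return doc

def postprocess(doc):
    doc["trace"].append("postprocess")
    for region in doc["regions"]:
        region["text"] = region["text"].strip()
    return doc

STAGES = [preprocess, extract_tables, locate_text, recognize_text, postprocess]

def run_pipeline(image_bytes):
    """Run all stages in order and return a structured JSON string."""
    doc = {"size_bytes": len(image_bytes), "trace": [], "regions": []}
    for stage in STAGES:
        doc = stage(doc)
    return json.dumps(
        {"regions": doc["regions"], "tables": doc["tables"], "stages": doc["trace"]}
    )
```

Keeping each stage as a separate function makes it easy to skip or swap a step, for example table extraction for inputs that contain no tables.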

So how are these capabilities achieved? It starts at the hardware level: we worked on low-level optimization for Huawei's Ascend chips, for example fusing multiple LSTM operators and rewriting operators. For deep learning it is best to optimize from the hardware up. Many small companies cannot do that, of course, and have to optimize at the algorithm level instead. We also use a range of image preprocessing steps to speed up training. For example, we group text of different lengths and read data through multiple queues served by multiple threads, which raises data loading speed. Combined with gradient accumulation, training on about a million image slices completes in 10 hours.
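The gradient remark above most likely refers to gradient accumulation: gradients from several small batches are summed before a single weight update, which emulates a large batch without holding it in memory at once. Below is a minimal sketch on a toy linear model, assuming plain SGD with mean squared error. The function names are illustrative and are not taken from Huawei's training code.

```python
def grad_sum(w, b, batch):
    """Sum of squared-error gradients over a batch of (x, y) pairs."""
    gw = gb = 0.0
    for x, y in batch:
        err = (w * x + b) - y
        gw += 2.0 * err * x
        gb += 2.0 * err
    return gw, gb

def full_batch_step(w, b, data, lr):
    """One SGD step using the whole dataset as a single batch."""
    gw, gb = grad_sum(w, b, data)
    n = len(data)
    return w - lr * gw / n, b - lr * gb / n

def accumulated_step(w, b, data, lr, micro_batch_size):
    """Same update, computed by accumulating gradients over micro-batches."""
    acc_w = acc_b = 0.0
    for start in range(0, len(data), micro_batch_size):
        micro = data[start:start + micro_batch_size]
        gw, gb = grad_sum(w, b, micro)
        acc_w += gw          # accumulate, do not update the weights yet
        acc_b += gb
    n = len(data)
    return w - lr * acc_w / n, b - lr * acc_b / n
```

Because the weights are updated only once per accumulation cycle, both functions produce the same step; the accumulated version simply needs memory for one micro-batch at a time.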

image.png

When building a product you run into all kinds of scenarios, such as the seal detection shown in the figure below. This is a real scenario from one of our customers in Guangdong, who needed the text inside seals recognized. Besides the oval seal in the first image, there are also round, square, and triangular seals. We tried a variety of curved-text detection algorithms, but none met the customer's needs. We then made some character-based optimizations and reached an accuracy of about 96%, which basically met the requirement. So whether you are writing a paper or building a product, you must keep optimizing the algorithm for the target scenario. Many methods perform well in their original papers yet often fail in practice on scenes the paper did not cover. An OCR engineer who does not optimize algorithms might as well go home and sell sweet potatoes.
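The article does not say what the character-based optimization consisted of. One common ingredient of character-level approaches to curved text is restoring reading order after individual characters have been detected. The sketch below assumes a round seal whose text runs clockwise from the lower left, over the top, to the lower right, and assumes a detector has already produced a centre point per character. The function names and the dict layout are hypothetical.

```python
import math

def clockwise_angle_from_bottom(center, point):
    """Angle in [0, 2*pi), measured clockwise on screen from 6 o'clock.

    Image coordinates are assumed: x grows right, y grows down.
    """
    cx, cy = center
    dx, dy = point[0] - cx, point[1] - cy
    return math.atan2(-dx, dy) % (2 * math.pi)

def order_seal_characters(center, chars):
    """Sort detected characters into reading order around the seal centre.

    `chars` is a list of dicts with keys "char", "x", "y" (character centre).
    """
    ordered = sorted(
        chars,
        key=lambda c: clockwise_angle_from_bottom(center, (c["x"], c["y"])),
    )
    return "".join(c["char"] for c in ordered)
```

Oval, square, and triangular seals would need a different ordering rule, for example following the seal's fitted contour rather than a single centre point.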

image.png

In the deep learning era, data is key, and the demand for it is enormous. But data is always limited, and labeling is both expensive and slow, so synthetic data has basically become a required option. On the far left is SynthText, which I believe many teams use. It first uses segmentation to extract geometry and region boundaries, then estimates depth, and then finds smooth regions in which to render text.

We also use traditional tools such as OpenCV and Pillow to synthesize whole images. We developed our own rich library of text augmentation operators, including many that open-source projects do not offer. In addition, we convert some image slices with GANs. Of course, what a GAN produces is sometimes baffling. That is deep learning for you: much of it cannot be explained, and until the results come out you never know what you are going to get.

image.png

Automated learning is the icing on the cake. For example, we use Population Based Augmentation (PBA), which can quickly and efficiently learn a data augmentation strategy for training a neural network. In some scenarios where training ordinarily takes three days, using PBA-generated augmentation brought it down to about half a day in practice.

It also improved accuracy on some models to a certain extent, by about three points in our case. We also run searches with Huawei's self-developed NAS (neural architecture search) on the ModelArts platform to find the best model automatically. In addition, one of our overseas research institutes works on model pruning and optimization.
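The population-based idea behind PBA can be sketched as an exploit-and-explore loop: keep a population of augmentation policies, score each one, and let the weakest copy and perturb the strongest. The sketch below is heavily simplified. In real PBA each member is a model in training and what gets learned is a schedule over epochs. Here `toy_score` is a made-up stand-in for validation accuracy, and the policy fields `prob` and `magnitude` are illustrative.

```python
import random

def toy_score(policy):
    """Stand-in for validation accuracy; peaks at prob=0.5, magnitude=0.3."""
    return 1.0 - abs(policy["prob"] - 0.5) - abs(policy["magnitude"] - 0.3)

def clip01(value):
    return max(0.0, min(1.0, value))

def pba_search(score_fn, population_size=8, rounds=20, seed=0):
    """Return (best_policy, best_score_per_round)."""
    rng = random.Random(seed)
    population = [
        {"prob": rng.random(), "magnitude": rng.random()}
        for _ in range(population_size)
    ]
    quarter = max(1, population_size // 4)
    history = []
    for _ in range(rounds):
        ranked = sorted(population, key=score_fn, reverse=True)
        history.append(score_fn(ranked[0]))
        top = ranked[:quarter]
        survivors = ranked[:-quarter]        # drop the bottom quarter
        children = []
        for _ in range(quarter):
            parent = rng.choice(top)         # exploit: copy a top policy
            child = {                        # explore: perturb the copy
                key: clip01(val + rng.uniform(-0.1, 0.1))
                for key, val in parent.items()
            }
            children.append(child)
        population = survivors + children
    return max(population, key=score_fn), history
```

Because the top policies always survive a round unchanged, the best score in this sketch never decreases from one round to the next.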

image.png

Automation is the future. Many products today have to be customized, and there is a saying in the AI community that there is only as much intelligence as there is manual labor behind it. In other words, a lot of this work is labor-intensive. To iterate on our products quickly, we built a customization platform. After you upload a small number of original images, the platform quickly augments the data from them, trains a model, and produces a deployable API. We also share our models through the model marketplace.

Customers can fine-tune our models further to build their own unique models, or use a model to offer services of their own.

image.png

Speed and accuracy both have to be taken into account. Here is one of our real scenarios: recognizing text in video. If we detect frame by frame and then link each frame with the frames before and after it, accuracy improves greatly. Take this video as an example. In some frames the text is missed or can barely be recognized, but by using the neighboring frames we can repair part of the wrong results. There is a problem, though. Most video runs at about 25 frames per second, so recognizing every frame would greatly increase our costs. We therefore sample frames and apply other optimizations to improve speed. A product that ignores cost and speed often makes no sense.
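The two ideas above, sampling frames to cut cost and using neighboring frames to repair errors, can be sketched as follows. This is a minimal illustration that assumes per-frame recognition results are already available as strings, with an empty string meaning the text was missed. The majority vote is one simple way to use temporal context and is not necessarily the method used in the product.

```python
from collections import Counter

def sample_frames(num_frames, stride):
    """Indices of the frames to run OCR on, one every `stride` frames."""
    return list(range(0, num_frames, stride))

def smooth_with_neighbors(per_frame_text, window=2):
    """Replace each result with the majority vote over nearby frames.

    Empty strings (missed detections) do not vote. Ties are resolved in
    favor of the reading that appears first inside the window.
    """
    smoothed = []
    for i in range(len(per_frame_text)):
        lo = max(0, i - window)
        hi = min(len(per_frame_text), i + window + 1)
        votes = Counter(t for t in per_frame_text[lo:hi] if t)
        smoothed.append(votes.most_common(1)[0][0] if votes else "")
    return smoothed
```

For a one-second clip at 25 frames per second, `sample_frames(25, 5)` selects five frames, so the recognizer runs on a fifth of the input before smoothing is applied to the sampled results.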

image.png

Treat data with awe. Data matters in the deep learning era, but do not be greedy for it, and obtain it in a proper way. In overseas markets we strictly abide by local laws and privacy regulations such as the EU GDPR. Images are processed in memory, and once the result has been returned the image is destroyed immediately. This is a lesson drawn from Huawei's 30 years of experience: if we neglect it, the result is often significant economic loss and damage to our reputation.

Even more frightening, such an incident can easily turn into a political event, with an impact and losses for our company and our team that are hard to measure. China has also paid more and more attention to privacy protection in recent years, and relevant legislation has recently appeared. My view here may be shaped by my earlier work in security. When we provided Huawei Cloud PaaS products and services to France Telecom and Deutsche Telekom, any privacy issue could have far-reaching consequences. 5G is another example: Huawei keeps having to prove to the world that we are secure and that we have great respect for customer privacy.

image.png

The solution has to follow the demand. This is a lesson from our own experience. At the very beginning our service was offered only as an API that customers called in the cloud. Later we met many customers in finance, insurance, and healthcare. Although they believed Huawei would abide by data protection law, they still would not let data leave their own systems. So we had to consider other service modes, such as edge-side and device-side deployment.

In addition to cloud servers, we offer edge-side deployment based on Atlas servers and similar hardware. Customers can run inference at the edge without their data leaving their systems. If a customer's performance requirements are not that high, they can also use device-side equipment such as a HiLens box or a smart camera. What we now do is collaborative cloud-edge-device deployment. Adjusting the business model to customer needs is also one of the keys to a product's success.

image.png

The last point is that a product is often more than just a product. In the first half of this year, the Huawei Cloud OCR team helped organize a multi-scene Chinese calligraphy recognition competition on the theme of cultural heritage, held as part of the Digital China competition. The Digital China competition is guided by China's Ministry of Industry and the Fujian Provincial People's Government, and its theme is "software enabling the digital economy, innovation driving Digital China". It is part of Digital China, an initiative that *** has promoted in recent years to build China into a digital society.

We took part in the whole contest: setting the topic, providing data, answering questions, and reviewing on site. Our original intention was both to publicize Huawei's products and to see whether we could pick up some new techniques. After the contest, however, classical studies departments at many universities, as well as museums, approached us hoping to work together on ancient text recognition and help solve their problems. Many people are reluctant to take on this kind of work, and not many people can read scripts such as Xiaozhuan (small seal script).

Of course, we also ran into many difficulties. For instance, in much ancient calligraphy the writer added or left out a few strokes in order to observe naming taboos. I bring this up specifically because, much of the time, we build products to make money. But products can often be used for better things too, such as solving social problems or preserving cultural heritage. Perhaps this is just our sentiment as technologists. We are very pleased with and proud of this work, which is why I mention it separately.

image.png

Author: BlackMoon

Origin www.cnblogs.com/huaweicloud/p/12525887.html