Hehe Information at CCIG 2022: document image quality enhancement is an important research direction for advanced OCR

 

Recently, the 2022 China Image and Graphics Conference (CCIG 2022) concluded successfully in Chengdu. The conference was guided by the China Association for Science and Technology, sponsored by the Chinese Society of Image and Graphics, organized by Sichuan University, and co-organized by the University of Electronic Science and Technology of China. Academicians including Pan Yunhe, Zheng Nanning, Gao Wen, Dai Qionghai, Wang Yaonan, and Qiao Hong, more than 100 well-known domestic scholars, and technical experts from companies such as Baidu, Huawei, OPPO, and Hehe Information discussed academic research and technological innovation trends in image and graphics and explored new directions for the industry. More than 1,500 people attended.

The Chinese Society of Image and Graphics was established in 1990. It is a national first-level society approved by the Ministry of Civil Affairs. Its members are Chinese experts, scholars, and technical workers engaged in basic theory and applied research on images and graphics, software and hardware development, application promotion, and related technologies.

The conference was conducted through keynote reports and themed and featured forums. In one of the keynote reports, Pan Yunhe, academician of the Chinese Academy of Engineering and professor at Zhejiang University, introduced work related to visual knowledge and visual intelligence and explained the important role of visual intelligence in the development of artificial intelligence.

The conference also set up academic forums covering OCR, image understanding, computer vision, human-computer interaction, brain-like vision, AR/VR, 3D vision, pattern recognition, etc. Dr. Ding Kai and Dr. Guo Fengjun, Director of Image Algorithm Research and Development of Hehe Information, were invited to participate in forums such as "OCR Frontier Technology and Industrial Application" to share the frontier progress in the field of OCR and the large-scale application of technology.

Dr. Ding Kai believes that although OCR technology has gone through a century of development, problems remain to be solved, such as serious degradation of document image quality, difficulties in text detection and layout analysis, low recognition rates for unrestricted text, and weak structured intelligent understanding. On the road to advanced OCR technology, enhancing document image quality is an important research direction. It requires overcoming interference that is common in modern text image processing, such as page bending, shadow occlusion, moiré, and image blur. By introducing AI (artificial intelligence) technology, Hehe Information's intelligent text recognition and image processing technology can help various application fields simplify downstream document processing tasks and improve the efficiency and accuracy of text recognition.

Taking bending correction as an example, Dr. Ding Kai introduced the principles, advantages, and disadvantages of methods based on text line fitting and coordinate transformation and on text line optimization correction. He also discussed the advantages and disadvantages of the method based on displacement field network learning adopted by Hehe Information, whose system architecture can effectively correct many kinds of curved document images. In addition, in the field of education, Hehe Information's "handwriting erasing" technology integrates content segmentation, a handwriting separation network, and document quality enhancement to process complex scenes accurately and to "one-click erase" handwriting from homework and test papers.

Demonstration of the "handwriting erasing" function: a test paper covered with handwriting (left) and the same test paper after the handwriting is erased (right)
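The displacement field idea mentioned for bending correction can be illustrated with a minimal sketch. It is not Hehe Information's implementation: the network that would predict the field is omitted, the field here is written by hand, and the names `bilinear_sample` and `apply_displacement_field` are hypothetical. The sketch assumes a backward mapping, in which the field stores for each output pixel the offset `(dx, dy)` to the location in the curved input image where that pixel should be read.

```python
from typing import List, Tuple

Image = List[List[float]]                # grayscale image as rows of pixel values
Field = List[List[Tuple[float, float]]]  # per-pixel (dx, dy) offsets, assumed backward map


def bilinear_sample(img: Image, x: float, y: float) -> float:
    """Read img at a fractional (x, y) position, clamping to the image border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def apply_displacement_field(img: Image, field: Field) -> Image:
    """Build the corrected image: output pixel (x, y) is read from (x + dx, y + dy)."""
    h, w = len(img), len(img[0])
    out: Image = []
    for y in range(h):
        row = []
        for x in range(w):
            dx, dy = field[y][x]
            row.append(bilinear_sample(img, x + dx, y + dy))
        out.append(row)
    return out


# Toy example: a flat page with one horizontal "text line" in row 2.
H, W = 6, 4
flat: Image = [[1.0 if y == 2 else 0.0 for _ in range(W)] for y in range(H)]

# Simulate page bending: each column is shifted down by a different amount.
shifts = [0, 1, 2, 1]
warped: Image = [
    [flat[y - shifts[x]][x] if y - shifts[x] >= 0 else 0.0 for x in range(W)]
    for y in range(H)
]

# Hand-written field standing in for a network prediction (an assumption).
field: Field = [[(0.0, float(shifts[x])) for x in range(W)] for _ in range(H)]

corrected = apply_displacement_field(warped, field)
print(corrected == flat)
```

In this toy case the field exactly undoes the column shifts, so the text line returns to a single row. In a learned system the field would come from a model and would generally contain fractional offsets, which is why the sampling step interpolates.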

 

On the other hand, establishing a document digitization process is key to accelerating the digital transformation of enterprises, and it is also a pain point when putting the technology into practice. To better address complicated document layouts, a lack of training samples, and slow, inefficient model customization and tuning across different businesses, Hehe Information has launched the TextIn Studio intelligent text recognition training platform. The platform integrates application modules for underlying resources, data, model training, integrated deployment, and service management. It addresses these problems in a targeted way while establishing a closed loop between business processes, so that models can be trained and deployed automatically. TextIn Studio has produced a large number of document digitization models for different scenarios, involving nearly a hundred kinds of services such as document image preprocessing, text recognition and understanding, and document format conversion, and covering a broad range of document types related to the work and life of enterprises and individuals.

It is reported that the "handwriting erasing" function has been integrated into the Huawei PixLab V1 color inkjet multifunction printer. Related technologies have also won championships in more than ten competitions such as ICPR and ICFHR, and have been published at top international conferences such as CVPR, AAAI, ACL, and ACM MM.

The participating team of Hehe Information won the championship in the finals of the 3rd CSIG Image and Graphics Challenge

The conference also held the award ceremony of the 3rd CSIG Image and Graphics Challenge Finals. The CSIG Image and Graphics Challenge aims to promote the development and application of China's image and graphics technology and related industries, solve technical problems faced by enterprises, and help enterprises recruit more outstanding talent. The competition attracted hundreds of participating teams from universities, scientific research institutions, and enterprises. Relying on its algorithmic strengths in understanding key visual information and its practical experience with multilingual bill recognition, the team composed of Hehe Information, universities, and business ecosystem partners won first place in the "Chinese and English Shopping Receipt Information Understanding Track". After the final round of on-site competition, the team went on to win the overall championship of the CSIG Image and Graphics Technology Challenge.

Origin blog.csdn.net/INTSIG/article/details/126503985