AV Evening Talk #22: AI image coding is on its way; Khronos and multimedia

Last week I spoke with Liu Dong of the University of Science and Technology of China about the current state of AI coding. AI coding falls into two categories: hybrid coding, which combines AI with traditional codecs, and end-to-end AI coding. The former is already widely deployed, and the major vendors have their own products, which mainly use AI to optimize the pre-processing and post-processing stages. The latter can be further divided into image coding and video coding.

International standards for end-to-end AI image coding, including JPEG-AI and IEEE 1857.11, will be finalized soon. Liu Dong said that at a standards meeting, one company demonstrated end-to-end AI image encoding and decoding running on a mobile phone, which is one piece of evidence that the approach is feasible. He also mentioned an app for trying out end-to-end AI image coding, Picture Album, which is currently available only on iOS. I gave Picture Album a quick try: it compressed original images of 1-2 MB to a little over 100 KB. The subjective difference is small, and the experience is good. Liu Dong believes end-to-end AI image coding will become popular soon, mainly because images, unlike video, do not need to be streamed, so the latency and compute requirements are modest and the technology is easy to deploy on mobile devices. He predicts it will first be adopted in closed-loop scenarios such as security monitoring, medical imaging, and remote sensing, as well as in products from large companies that control the entire pipeline.
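To put the size reduction mentioned above in perspective, here is a minimal back-of-the-envelope sketch. It assumes binary megabytes (1 MB = 1024 KB) and treats "a little over 100 KB" as roughly 100 KB; both are assumptions for illustration, not measurements from the app, and the function name `compression_ratio` is hypothetical.

```python
def compression_ratio(original_kb: float, compressed_kb: float) -> float:
    """Return how many times smaller the compressed file is than the original."""
    if original_kb <= 0 or compressed_kb <= 0:
        raise ValueError("sizes must be positive")
    return original_kb / compressed_kb


KB_PER_MB = 1024             # assumption: binary megabytes
ASSUMED_COMPRESSED_KB = 100  # assumption: output size rounded down to 100 KB

for original_mb in (1, 2):
    ratio = compression_ratio(original_mb * KB_PER_MB, ASSUMED_COMPRESSED_KB)
    print(f"{original_mb} MB -> ~{ASSUMED_COMPRESSED_KB} KB: about {ratio:.1f}x smaller")
```

Under these assumptions the reduction is on the order of 10x to 20x. Because the real outputs were somewhat larger than 100 KB, the actual ratios would be a bit lower.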

On mobile devices, end-to-end AI image coding is practical only with hardware acceleration, and Apple, Google, Huawei, and others all have their own implementations. Although the hardware, software, and standards are in place, many challenges remain. First, Liu Dong believes the biggest problem is how to balance computing power against coding efficiency. Second, interoperability is difficult because neither the AI acceleration chips nor the software ecosystems built on top of them are unified. Even if the standards are drafted for maximum compatibility, some degree of decoding consistency across devices has to be sacrificed. Third, end-to-end AI coding may contain security vulnerabilities that expose it to malicious attacks. All of these problems will have to be solved in engineering practice.

Looking ahead, as the models continue to be optimized and gradually stabilize, there will be an opening for dedicated AI codec chips, and purpose-built ASICs can be expected.

Regarding the quality assessment of end-to-end AI image coding, Liu Dong believes that this is a direction worth investing in.

Finally, Liu Dong talked about end-to-end AI video coding, which is being researched by the MPAI organization (led by Leonardo Chiariglione, the former chairman of MPEG). Liu Dong thinks there is still a long way to go, but that it may be adopted sooner in specific scenarios such as video conferencing, security monitoring, and desktop sharing.

Highlights of the conversation can be replayed through the "Live Replay" tab on the LiveVideoStack video account.

At 7 pm on August 11, we will host Fu Shixiong, head of the Greater China region at Khronos Group, to talk about Khronos Group and multimedia, as well as its business and plans in China.

Origin blog.csdn.net/vn9PLgZvnPs1522s82g/article/details/132241832