AV1: Provides an open, free video codec tool for the Internet



Moving from academic research into industry, Zoe Liu has worked on algorithms and audio and video technology throughout, and currently develops the AV1 codec on Google's codec team. In this interview, Zoe discusses the criteria for evaluating codecs and the latest progress on AV1. This article is part of the "Next Generation Encoders" series of interviews. You are welcome to recommend yourself or other engineers for the series; please email [email protected].


Text / Ant


LiveVideoStack: Please briefly introduce yourself, your current main line of work, and the technologies or fields that interest you. You have worked on multimedia research and development for more than 10 years: was that a coincidence, or a matter of interest?


Zoe Liu: I am currently a software engineer at Google, working mainly on the design and implementation of video coding and video communication algorithms. I previously did theoretical algorithm research at Bell Labs, Nokia Research Center, and HP Labs, then moved to industry, where I took part in the design, development, and launch of several video calling products: Apple's FaceTime, TangoMe Video Calls, and Hangouts Video Calls for Google Glass. At Google I now work on AV1, the next-generation open source, royalty-free video coding standard. I have worked on video coding and video communication ever since school. It was both coincidence and interest.


LiveVideoStack: What makes a good codec? Video quality, bit rate, algorithm complexity, robustness to data loss or errors, and so on?


Zoe Liu: The fundamental driver of video codec development is the continuous improvement of compression efficiency: achieving the lowest possible bit rate at a given video quality, or the best possible video quality at a given bit rate. Video quality has traditionally been measured by the peak signal-to-noise ratio (PSNR), although in many cases this metric does not agree with subjective evaluation by the human eye. Video quality evaluation is itself a very active area of research.
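As a rough illustration of the PSNR metric mentioned above, here is a minimal sketch in Python. It assumes 8-bit samples (peak value 255) passed as flat lists; the function name `psnr` and its parameters are chosen for this example and do not come from any particular codec library.

```python
import math

def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized
    sequences of pixel values (assumed 8-bit by default)."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error between the original and the decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10 * math.log10(max_value ** 2 / mse)

original = [100, 100, 100, 100]
decoded = [101, 99, 101, 99]   # every sample is off by one
print(round(psnr(original, decoded), 2))
```

Because PSNR depends only on the mean squared error, two distortions with the same error energy receive the same score even when they look very different to a viewer, which is one reason it can disagree with subjective evaluation.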


Different application scenarios call for different strategies for evaluating video codecs. In video broadcasting, live video, and similar fields, decoder efficiency and performance are key. In scenarios such as video calls and video conferencing, encoder efficiency and performance are just as critical.


Today's video bitstreams embed key frames periodically. A key frame uses intra-frame prediction and is coded independently of other frames, so it can serve as a synchronization frame that effectively recovers from and corrects errors, but it usually consumes a large share of the bit rate. Besides key frames, another effective fault tolerance strategy combines the ACK/NACK of the data link layer with the long-term reference frames of video coding. When a network error occurs (for example, packet loss caused by congestion), a reference frame that has been confirmed as successfully transmitted can be used for inter-frame prediction to generate a synchronization frame, which is significantly more efficient to code than a key frame. However, sending and receiving ACK/NACK depends on network state parameters such as the round-trip delay. Other error-tolerant transmission schemes, such as forward error correction (FEC) coding, are very effective when the packet loss probability stays below a certain limit, but their error correction performance drops sharply when packet loss is severe. A codec's coding efficiency and its fault tolerance are often at odds. Gains in coding efficiency come mostly from predictive and context-based coding tools, which are very sensitive to network errors. Any codec that strengthens its fault tolerance therefore does so at some cost in coding efficiency.
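The FEC behavior described above can be sketched with the simplest possible scheme, a single XOR parity packet per group. This is an illustrative assumption, not the scheme used by any product named in this interview; the helper names `make_parity` and `recover` are hypothetical.

```python
from functools import reduce

def xor_bytes(a, b):
    """Byte-wise XOR of two equally sized packets."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(packets):
    """Build one parity packet as the XOR of all data packets in a group."""
    return reduce(xor_bytes, packets)

def recover(received, parity):
    """Repair a group in which lost packets are marked as None.

    One parity packet can rebuild at most one lost packet;
    returns None when the loss exceeds that limit.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if not missing:
        return list(received)
    if len(missing) > 1:
        return None  # loss too severe for this code
    present = [p for p in received if p is not None]
    repaired = list(received)
    # XOR of the parity with every surviving packet yields the lost one.
    repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired

group = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(group)
print(recover([b"abcd", None, b"ijkl"], parity))  # one loss: repaired
print(recover([None, None, b"ijkl"], parity))     # two losses: not recoverable
```

The parity packet adds one third to the bit rate of this group yet protects against only a single loss, which shows in miniature both the cost of redundancy and the sharp drop in protection once losses exceed what the code was designed for.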


LiveVideoStack: You presented the AOM alliance and the AV1 codec at LiveVideoStackCon. Could you recap that talk here?


Zoe Liu: Let me first introduce our video coding team at Google. Our team is part of the Chrome Media division, whose mission is to provide open and free multimedia compression technology for multimedia applications on the Internet. For video, our products are mainly VP8, VP9, and AV1, of which AV1 is currently being developed jointly with partners in the Alliance for Open Media (AOM). Beyond video, our products include the still image compression standard WebP, the audio codec Opus, and Draco, encoding software developed specifically for 3D graphics data.


Video applications have diversified explosively in recent years. Video application providers now come from very different backgrounds, and their cost and requirement considerations when choosing compression software have diversified as well. This is why the compression industry needs product diversity beyond a single international standard, so that users can make their own choices.


Google has always held to one principle: every technology that lays the foundation for Internet applications should be open and free. The Chrome browser and the Android system are examples of this principle. Advanced open source, free video codec technology gives video-related fields the greatest possible room to develop. In an era of fierce Internet competition, it gives small content owners and businesses in particular a more equal opportunity to compete with large companies, which in turn promotes a richer and more diverse Internet market.


VP9, released in 2013, achieved a 50% reduction in bit rate compared with H.264. Beyond the basic 8-bit 4:2:0 format, it also supports higher pixel precision and multiple color sampling formats. Billions of end devices now support VP9: browsers such as Chrome, Firefox, Edge, and Opera all support it, and on mobile phones Android 4.4 and later support it as well. VP9 is also very widely supported on home entertainment devices such as TVs, game consoles, and TV sticks.


VP9's first customer was Google's video-sharing site YouTube. Since 2013, VP9 has not only cut bandwidth costs drastically but also created more business opportunities for YouTube. In its first year of use, 2.5 billion hours of VP9-compressed video were streamed on YouTube. VP9 videos on YouTube are now viewed more than 2 billion times per day on average. VP9 compression reduces the playback start delay (the time to the first frame) by an average of 15%, and greatly improves buffering efficiency at the same time. In mature online video markets, VP9 has increased YouTube's market share by 25%, and in immature markets by 100%. In immature markets constrained by bandwidth in particular, the number of YouTube HD videos played increased by as much as 25% after VP9 became the dominant codec.


In 2015, Google helped establish the Alliance for Open Media (AOM), which is committed to developing a new generation of open source, royalty-free media formats and the corresponding codec technologies. AOM's members now include more than 33 technology giants, such as Adobe, Amazon, AMD, Broadcom, Cisco, Facebook, Google, Hulu, IBM, Intel, Microsoft, Mozilla, Netflix, and NVIDIA.


LiveVideoStack: What advice do you have for new graduates, or for engineers moving over from other R&D fields, who want to learn codec and multimedia development? Can you recommend books or materials for studying codecs and multimedia development systematically?


Zoe Liu: The basic framework of today's mainstream codec technology is inter-frame prediction with motion vectors + two-dimensional transform + entropy coding. The development of artificial intelligence will of course update or even overturn this framework. For an overview of codec technology, there are good summary articles on platforms such as Weibo, WeChat, and Zhihu in China. To learn the individual modules and technical details in depth, it is best to have some background in image processing, signal processing, and information theory. I recommend reading the overview papers on H.264, HEVC, and VP9 in IEEE journals. Many open source video encoders are also available to download and try out, which gives a more intuitive feel for encoding and decoding.
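To make the prediction + transform framework above more concrete, here is a toy sketch in Python. It is a simplified illustration under stated assumptions, not how AV1 or any real codec is implemented: the block size `N = 4`, the quantization step `qstep`, and the function names are all chosen for this example, the prediction block is assumed to have been fetched already by motion compensation, and the entropy coding stage is omitted.

```python
import math

N = 4  # block size (assumption; real codecs use several block sizes)

def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of an N x N block."""
    def alpha(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def encode_block(current, prediction, qstep=8):
    """Predict, transform, and quantize one block.

    Returns the quantized coefficient levels that an entropy coder
    (not shown here) would then compress.
    """
    # Step 1: the residual is what inter-frame prediction failed to explain.
    residual = [[c - p for c, p in zip(cur_row, pred_row)]
                for cur_row, pred_row in zip(current, prediction)]
    # Step 2: the 2D transform concentrates the residual energy.
    coeffs = dct_2d(residual)
    # Step 3: quantization discards precision (the lossy step).
    return [[round(c / qstep) for c in row] for row in coeffs]

prediction = [[50] * N for _ in range(N)]
current = [[58] * N for _ in range(N)]   # same block, slightly brighter
levels = encode_block(current, prediction)
nonzero = sum(1 for row in levels for v in row if v != 0)
print(levels[0][0], nonzero)
```

In this example the 16 residual samples collapse into a single nonzero coefficient, which is the kind of compact representation that entropy coding then exploits.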


About the interviewee


Zoe Liu is a software engineer at Google, working mainly on the design and implementation of video coding and video communication algorithms. She previously did theoretical algorithm research at Bell Labs, Nokia Research Center, and HP Labs, then moved to industry, where she took part in the design and launch of several video calling products: Apple's FaceTime, TangoMe Video Calls, and Hangouts Video Calls for Google Glass. At Google she now works on AV1, the next-generation open source, royalty-free video coding standard.



Zoe Liu, Software Engineer, Google


This article is the fifth in the "Next Generation Encoders" series of interviews, in which engineers from industry and academia in the video codec field discuss the evolution and application of codecs. You are welcome to recommend yourself or other engineers for the series; please email [email protected].


