Interview with Tencent technical expert Zhang Xianguo: a video coding veteran of more than ten years who still holds technology in awe


Interviewee | Zhang Xianguo
Planning and editing | Aris Wang



Meeting Zhang Xianguo again, he was just as I remembered him from our previous interviews: once the conversation turns to topics in the technical world, he can talk on and on.

Recently, the release of the Vision Pro has ignited a new era of spatial computing. As technical director of the Shannon Lab in Tencent's Cloud Architecture Platform Department (hereinafter "Shannon Lab"), Zhang Xianguo shared with us the lab's latest progress and plans in video codecs and spatial media processing.

"Shannon Lab is considered to be relatively early in the layout (in the industry) of spatial video 8KHDR, MV-HEVC and other encoding capabilities in spatial media processing," Zhang Xianguo said . For example, the 8K high-speed high-definition real-time encoding capability. Long before the release of Vision Pro, Shannon Labs had already built a high-compression 8K, HDR, 422 format, and 130mbps real-time transcoding system for the large-screen field of radio and television. . In related project bidding, relying on this system, Tencent Cloud is the only company that can meet the objective quality indicators of transcoding.

Likewise, Shannon Lab was exploring MV-HEVC in its glasses-free 3D system long before Apple publicly announced hardware support for MV-HEVC. Experiments show that for Internet applications with large keyframe intervals, MV-HEVC can save a further 20% of 3D video transmission bandwidth.

Zhang Xianguo has worked in the video codec industry for more than ten years. What attracts him most, he says, is first that the field relentlessly pursues perfection and has a very complete evaluation system, so even a small optimization is visible; and second that it is closely tied to everyday life, so doing video codecs well brings users a tangible improvement in experience.

For Zhang Xianguo, video codecs have always been a field in which he can fully pursue both technology and the value of his work.

Over his years at Tencent, Zhang Xianguo's view of technology has kept evolving. At first he believed that if you develop technology at all, it must be industry-leading.

But now, as technology R&D has become deeply integrated with the business units, he has come to realize that the right direction for technology development is by no means simply to benchmark against the "top of the industry".

" It is one latitude for technology to win the first place in the competition, and another latitude for technology to help business solve problems ," Zhang Xianguo said. Whether technology can find the real pain points of the business - this is the most important issue that Zhang Xianguo and the entire team are currently most concerned about and consider the most important.

As Zhang Xianguo put it: "Don't tie the business down; solve its pain points."

The following is our recent conversation with Zhang Xianguo, edited and abridged at the interviewee's request:


01

Shannon Lab's deployment progress and current status

LiveVideoStack: We are very interested in the latest news from Shannon Lab, both the team's progress and Tencent's self-developed codecs, such as the recent progress of Tencent V265. Can you tell us about it, and about the work of the teams behind it?

Zhang Xianguo: Since 2017, Shannon Lab has successively led the development of V265 server-side and terminal encoding and the TXAV1 codec, and developed the Canghai HEVC chip together with sibling teams, supporting the rapid growth of Tencent Cloud MPS video-on-demand, live streaming, and RTC services. Recently, Shannon Lab has also made progress on the five technical challenges mentioned below:

In terms of ultra-high definition, Tencent V265/TXAV1 now supports real-time 8K@10bit 4:2:2 transcoding at 130 Mbps on a single 1U server; while meeting the low-latency demands of real-time live streaming, it still delivers a compression rate roughly 10% better than x265-medium. For ultra-realistic 3D applications, Tencent V265 was the first to support MV-HEVC encoding that the Vision Pro can hardware-decode, saving more than 20% bitrate compared with encoding the two views independently;

In terms of ultra-low latency, we have achieved high-quality zero-latency transcoding by optimizing the rate control of the Canghai chip, improving the experience of cloud gaming and other latency-sensitive scenarios;

In terms of ultra-real-time interaction, Shannon Lab has developed high-performance, high-compression terminal encoding software that meets the real-time requirements of various terminal RTC applications while saving more than 25% bitrate;

In terms of ultra-high compression, besides continuously improving the compression of V265/TXAV1 and Canghai, we have developed a private format, TVC, for long-term storage of video and images, hoping to cut massive video storage costs through lossless or shallow compression. It can also serve other business scenarios that demand high compression and can accept software decoding.

LiveVideoStack: Where will you and your team focus most of your energy in the first half of this year? On deploying the encoders into production, for example?

Zhang Xianguo: The focus splits into two directions: deploying what we have into the business, and planning new technology.

The team's first focus this year is indeed encoder deployment, including pushing the new generation of TXAV1 video and image codecs into businesses inside and outside the company, and getting V265 encoding into the various SDK cloud-sales scenarios to meet the live-streaming, VOD, and RTC requirements of those businesses. The latter is relatively straightforward, because the H.265 ecosystem is very mature: as long as the requirements are clear, the team's combined work will be delivered on time. The TXAV1 rollout takes more investment.

LiveVideoStack: Then tell us about the rollout of TXAV1 and the new technologies you have been paying attention to recently.

Zhang Xianguo: The first issue is compression. Compared with the H.265 encoding users are upgrading from, TXAV1 must deliver clearly visible bitrate savings in every scenario, especially live streaming, where its capability now surpasses V265 across the board. In real live-streaming scenes, whose speed requirements go well beyond the 30 fps used in the MSU competition, TXAV1 still saves about 10% bitrate compared with V265.

The second is the codec ecosystem. We are pleased to see the share of devices with AV1 decoding growing rapidly: most MTK chips shipped after 2021, Qualcomm's 2023 flagship chips, and Samsung flagship chips from 2022 onward already support AV1 hardware decoding, and software decoding support for AV1 images and video in the Android and browser ecosystems has also rolled out broadly.

The more immediate reality, though, is that Android's native software decoding capability is not yet complete, and iOS can currently play AV1 video only through software decoding. That requires us to invest more in codec and decoder co-optimization and provide a software decoding library that is faster and consumes less CPU: the higher the playback coverage TXAV1 achieves online, the more motivated customers are to upgrade.

After long-term effort, TXAV1 decoding now performs on par with H.265 software decoding, and through co-optimization with the player apps it supports smooth playback on various playback cores, including ExoPlayer. In the past six months, more than a dozen businesses have adopted our self-developed AV1 video and AVIF image codecs through Tencent Cloud MPS, and we are still pushing further.
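To make the coverage point concrete, here is a minimal sketch of the kind of client-side codec-selection policy such a rollout implies: prefer hardware AV1 decoding where the SoC supports it, fall back to software AV1 only on devices fast enough for it, and otherwise stay on H.265. The capability fields, threshold, and stream names are hypothetical placeholders, not Tencent's SDK API.

```python
# Hypothetical sketch of a client-side codec selection policy: prefer hardware
# AV1 decode, fall back to software AV1 only on devices fast enough for it,
# otherwise stay on H.265/H.264. Field names are illustrative, not a real API.

from dataclasses import dataclass

@dataclass
class DeviceCaps:
    has_hw_av1: bool      # e.g. recent MTK / Qualcomm / Samsung flagship SoCs
    has_hw_hevc: bool
    cpu_score: int        # abstract software-decode capability score (made up)

def pick_stream(caps: DeviceCaps, resolution_height: int) -> str:
    """Return which encoded stream the player should request."""
    if caps.has_hw_av1:
        return "av1-hw"
    # Software AV1 decode is viable only if the CPU budget covers the resolution.
    if caps.cpu_score >= 2 * resolution_height:   # made-up threshold
        return "av1-sw"
    return "hevc" if caps.has_hw_hevc else "h264"

if __name__ == "__main__":
    print(pick_stream(DeviceCaps(False, True, 2500), 1080))   # -> av1-sw
    print(pick_stream(DeviceCaps(False, True, 1200), 1080))   # -> hevc
```

In practice the "CPU budget" check is exactly where a faster, lower-consumption software decoder pays off: it widens the set of devices that can take the AV1 path and thus raises overall coverage.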

In addition, the new industry trend represented by the Vision Pro, the combination of spatial media processing and video coding, is a direction we are gradually exploring as well.


LiveVideoStack: So far, what technical difficulties have you and your team overcome, and which are you still working on?

Zhang Xianguo: From a codec perspective, we have been pursuing five technical goals: ultra-high definition, ultra-realism, ultra-low latency, ultra-real-time interaction, and ultra-high compression.

Ultra-high-definition services such as broadcast TV and VR require our self-developed server encoders to deliver 8K, HDR, high-bitrate, real-time low-latency encoding with high compression on a single (non-distributed) server, compressing the source in real time into a low-loss feed suitable for Internet distribution. During 4K/8K VR live streaming, the encoder must keep the picture high-fidelity even within the 10-40 Mbps range the network can actually carry.
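For a sense of scale, here is a back-of-the-envelope calculation of the 8K figures quoted earlier; the 50 fps frame rate is an assumption for illustration, as the interview does not state one.

```python
# Rough arithmetic for an 8K, 10-bit, 4:2:2 stream compressed to 130 Mbps.
# The 50 fps frame rate is an assumption; the interview does not state it.

width, height, fps = 7680, 4320, 50
bits_per_sample = 10
samples_per_pixel = 2          # 4:2:2: one luma + half of each chroma per pixel

raw_bps = width * height * fps * bits_per_sample * samples_per_pixel
target_bps = 130e6

print(f"raw source:        {raw_bps / 1e9:.1f} Gbit/s")   # ~33.2 Gbit/s
print(f"compressed target: {target_bps / 1e6:.0f} Mbit/s")
print(f"compression ratio: {raw_bps / target_bps:.0f}:1")  # ~255:1
print(f"bits per pixel:    {target_bps / (width * height * fps):.3f}")
```

Under these assumptions, hitting 130 Mbps means compressing the raw signal by roughly 250:1 in real time on one server, which is what makes the low-latency, non-distributed requirement demanding.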

Realistic applications such as 3D video require the self-developed encoders to support realism-oriented extension formats such as MV-HEVC, 3D-HEVC, and 360-degree video, with solid, high-compression encoding capabilities and a sound transcoding pipeline.

Ultra-low-latency services such as cloud gaming and RTC require self-developed encoding chips with low-latency, high-concurrency, high-compression rate control, and require the self-developed RTC terminal encoder to offer higher-definition real-time encoding, support for higher-compression formats, and advanced rate control.

High-traffic long- and short-video services and high-compression VOD applications require combined processing-plus-encoding solutions with high-compression perceptual coding, encoding chips that are low-cost and high-compression, advance research into commercial encoder capabilities, and the use of self-developed codecs in closed-loop businesses to reach even higher compression.

These technologies need sufficient reserves in place before the new applications scale up, which is a great challenge for the coding team. Our focus is on turning these scenarios into commercially viable capabilities.

LiveVideoStack: Against the current industry backdrop, what is the codec team's development roadmap and thinking?

Zhang Xianguo: We will keep driving the team forward from three angles.

First, long-term persistence, group collaboration, and a solid foundation: we work together to consolidate the coding fundamentals of each standard, balance multiple optimization goals through group collaboration across research directions, and feed the experience gained from industry competitions and business hardening back into those fundamentals.

Second, customization and optimization close to the business: codecs have moved beyond being judged only by objective results on large test sets, into a stage where video processing capability is evaluated in multiple vertical domains with both subjective and objective measures. The coding team therefore needs joint processing-and-coding optimization, content-adaptive encoding (CAE) built around the encoder, and the ability to optimize for subjective metrics, with dedicated subjective and objective tuning for typical verticals such as screen sharing, live commerce, games, and short news videos.

Third, joining forces with other technologies to build products with a strong reputation: to create a competitive edge and make the most of codec capability, the end-to-end system, security services, transmission acceleration, and image-quality enhancement must be considered together. Take 3D video services as an example: the full 3D realistic content pipeline includes image-quality enhancement, disparity generation, HDR color correction, live and VOD transcoding optimization, 3D encoding rate control and standard support, and decoding and rendering, and these interdependent links cannot be optimized in isolation.

As industry competition intensifies and new audio and video opportunities such as spatial computing emerge, Tencent Cloud will integrate its media processing capabilities, including Shannon Lab's codecs, into one-stop, high-performance solutions such as Tencent EdgeOne's edge security and acceleration services to empower the whole industry. In overseas markets, where selling an end-to-end solution is difficult, the codec team will rely on the component licensing of the Tencent Cloud MPS SDK to grow revenue and escape involution: the larger the market, the more room there is to optimize the technology and attract talent.

02

The "unintelligent" captain also wants to lead the crew to see hope

LiveVideoStack: I believe the teams inside Tencent's business groups have set new goals this year to keep Tencent Cloud's self-developed codecs at the front of the industry. If you break those goals down, what are they?

Zhang Xianguo: The five optimization goals above may sound a bit grand. In practice they break down into: continued optimization of V265/TXAV1 encoding for public cloud VOD, privatized SDKs and similar deployments; deep optimization of V265/TXAV1 across live-streaming scenarios; large-scale deployment of the Canghai chip in cloud gaming, live streaming and other fields; technology reserves for extended scenarios such as 8K and 3D; reducing the codec complexity of the next-generation private format; and extending terminal encoding to higher definition and to new formats such as AV1.

LiveVideoStack: At this point, how much homework does Shannon Lab still have to do on video coding optimization? Have you reached the optimal, ideal compression level?

Zhang Xianguo: In fact, we are still far from it. First, for the existing standards' server and terminal encoding software, the compression of V265/TXAV1 keeps improving. Take TXAV1 as an example: although it has pulled ahead of V265 in the live-streaming business, there is still room to optimize; and on terminal software encoding we still need long-term work on AV1, to take full advantage of the standard's strengths in compression, screen-content coding, and variable-resolution prediction.

Second, for the next generation of encoding chip products, we are still in co-development: the new products will further strengthen HEVC capability while adding support for the newer AV1 and VVC standards and various VPU capabilities, to deliver higher compression and richer customized multimedia transcoding services.

Take TVC, our newly disclosed private encoding format, as an example. A private format's application scope is relatively limited, so its total decoding complexity must stay within a controllable range. TVC therefore avoids decoder-side AI tools with high computational cost and pays close attention to software decoding complexity throughout its iteration. TVC absorbs the team's experience from six years of V265 and TXAV1 development, a thorough survey of the latest MPEG and AOM work, and what we have learned about keeping decoder cost low in newer directions such as intelligent coding and shallow compression. Its tools are designed around software decoding complexity rather than hardware decoding complexity, to balance compression against software encode and decode cost. Although the format's theoretical complexity is currently kept within twice that of the AV1 standard, software decoding still needs time to optimize, the compression ratio still has plenty of room to improve, and there is a long way to go before formal commercial use.

LiveVideoStack: In pursuing the five goals of ultra-high definition, ultra-realistic 3D, ultra-low latency, ultra-real-time interaction, and ultra-high compression, has Shannon Lab distilled a research path or methodology from practice?

Zhang Xianguo: Different teams pursue these five goals along different routes. Since Shannon Lab sits behind the major business teams, our technical route takes a longer view: prepare in advance, lay a solid foundation, and then let the business temper and refine it.

So for each generation of standards we generally start from the goal of ultra-high compression, optimizing compression for offline encoding and consolidating the multithreaded design, assembly, data-structure access, rate control, pre-analysis processing, and cost-effective fast algorithms underneath it. Then, building on the offline encoder, we iterate on fast algorithms, adapt the architecture for real time, optimize parallelism, and add tools, so that the same codebase supports real-time encoding and extends to ultra-high-definition and ultra-realistic real-time scenarios such as 8K, HDR, screen-content compression, and 3D coding.

Also building on the offline encoder, we construct the algorithm prototype for the encoding chip, including defining the chip specification and pipeline architecture, redesigning the fast algorithms for hardware, and implementing hardware-oriented pre-analysis and rate control, to strike a balance between compression and chip capability.

Building on real-time encoding, we choose an appropriate moment to start terminal encoding. Because terminal encoding has far tighter complexity budgets than server-side encoding, we create a new code repository, strip out functions the terminal does not need, and restructure the data layout to support the terminal's new requirements; then, through small iterative steps, we achieve a speed-up of several times and finally reach high device coverage.

After going through this process, full support for a new standard across offline, real-time, chip, and terminal encoding can be completed in roughly three years. But support is only one side of it: to lead the industry, the speed presets, rate control, and data structures of these encoders must be improved continuously so that compression, speed, and capability keep advancing.

LiveVideoStack: Where do your team's research ideas come from? What drives them?

Zhang Xianguo: The team's R&D splits into exploration of new fields and optimization for the business. For new fields, academic literature has always been an important source of inspiration.

For business optimization, experimental analysis and discussion within the team are our main sources of ideas. By instrumenting the code with extensive test and analysis logic, we keep generating new ideas from the experimental data and then separate the false leads from the real ones during implementation. Business requirements and industry summits also give us plenty of inspiration, which is why we take part in the MSU competition and LiveVideoStack conferences every year. Through these channels we catch glimpses of the bigger picture, spot new technical directions in time, and turn them into working breakthroughs.

Since Shannon Lab was founded, the team has been remarkably stable, mainly because everyone shares the same core motivation and sense of collective honor: to provide industry-leading video codec services. With that as the driving force, everyone can pull together to keep making each of our codec services better.


Shannon Lab Team Members

LiveVideoStack: The release of the Vision Pro marks the official arrival of the spatial computing era, and I imagine Apple's influence has prompted some thinking. What spatial media processing capabilities do you think Shannon Lab has accumulated so far?

Zhang Xianguo: For spatial-media processing capabilities such as spatial video, 8K HDR, and MV-HEVC encoding, I believe our team started laying the groundwork relatively early.

Take 8K ultra-fast HD real-time encoding. Before the Vision Pro was released, we had already built a software-based, high-compression real-time transcoding system for the broadcast large-screen market, handling 8K, HDR, 4:2:2 content at 130 Mbps. The system runs on a single 1U server rather than a distributed multi-server setup, which keeps both the transcoding latency and the deployment cost low. In the related project bidding, Tencent Cloud was again the only vendor whose system met the objective transcoding quality requirements. With the arrival of spatial computing, the system will find wider use, and we will adapt its 8K ultra-fast HD capabilities to the specific requirements of spatial computing to serve more businesses.

Another example is MV-HEVC support. The MV-HEVC support in the Vision Pro is in fact hardware decoding: in principle, the chip firmware only needs to support reference-frame replacement to decode an MV-HEVC bitstream, which reflects the ingenuity of the MV-HEVC standard's design.
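A conceptual sketch of that "reference frame replacement" idea: in MV-HEVC the dependent (second-eye) view is decoded with the same block-level tools as ordinary HEVC, and inter-view prediction appears to the decoder simply as one more reconstructed picture inserted into its reference list. The code below only illustrates that structure, with toy stand-ins; it is not HEVC-conformant decoding logic.

```python
# Conceptual illustration of MV-HEVC's layered design: the dependent view is
# decoded by an ordinary single-view decoder whose reference picture list has
# the base view's reconstruction inserted ("reference frame replacement").
# Structural sketch with toy stand-ins, not a standard-conformant decoder.

def decode_single_view(access_unit, ref_list):
    """Toy stand-in for a normal HEVC picture decode."""
    return {"poc": access_unit["poc"],
            "view": access_unit["view"],
            "refs_used": [r["view"] for r in ref_list]}

def decode_stereo_access_unit(base_au, dep_au, base_dpb, dep_dpb):
    # 1) The base (left-eye) view decodes exactly like a normal HEVC picture.
    base_pic = decode_single_view(base_au, ref_list=base_dpb)

    # 2) Insert the base view's reconstruction into the dependent view's
    #    reference list; to the core decoding loop, inter-view prediction then
    #    looks like ordinary inter prediction from one more reference picture.
    dep_refs = [base_pic] + dep_dpb

    # 3) The dependent (right-eye) view decodes with the unmodified core tools.
    dep_pic = decode_single_view(dep_au, ref_list=dep_refs)
    return base_pic, dep_pic

if __name__ == "__main__":
    left, right = decode_stereo_access_unit({"poc": 8, "view": "L"},
                                            {"poc": 8, "view": "R"}, [], [])
    print(right["refs_used"])   # ['L'] -> right eye predicted from left eye
```

Because no block-level tools change, existing HEVC silicon can be taught to decode stereo content largely through this reference-list trick, which is why firmware-level support is enough.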

Shannon Lab was exploring software MV-HEVC encoding and decoding in glasses-free 3D systems long before Apple publicly supported MV-HEVC hardware decoding, with the aim of improving the glasses-free 3D conferencing experience. With spatial computing arriving, we promptly added MV-HEVC support to our V265-based server-side transcoding system to serve 3D video on-demand and live streaming. Experiments show that for Internet applications with large keyframe intervals, MV-HEVC saves a further 20% of 3D video transmission bandwidth. Next, we will refine these core capabilities from angles such as rate control and rate-distortion optimization.

Shannon Lab is also working on spatial video generation and processing, such as restoring and enhancing the quality of 8K source video and correcting the geometry of stereoscopic 3D video. High-quality processing of spatial video not only benefits the final picture but is also closely tied to the final compression efficiency, and we still need to keep learning to shore up these weaker areas.



03

Evolution and Upgrade

LiveVideoStack: How do you see the application scenarios of video coding technology evolving, shallow compression for example?

Zhang Xianguo: Broadly speaking, we believe video coding applications will keep moving toward higher definition, greater realism, lower latency, more real-time interaction, and lower bitrates, but the path will have its twists. For example, since last year's push for cost reduction and efficiency, many businesses have lowered their video transmission resolution; that involves both low-resolution, high-perceived-quality optimization and each business's own cost considerations.

But the root cause is that existing mobile video applications have no obvious need for 8K or 3D. Video applications have gone through several stages, from broadcast TV and Internet TV to long-form video apps and then short video, and each evolution has been accompanied by changes in terminal devices and communication technology. If spatial computing devices take off, I believe new application scenarios will appear and place higher demands on video coding along those five directions.

Second, shallow compression is really a form of specialization. While most standards chase higher compression and better quality in every bitrate segment, shallow compression proposes treating the high-fidelity quality range as its own compression problem. Studying optimizations for specific verticals, resolutions, and quality ranges is therefore also an important direction for video coding.

LiveVideoStack: From image coding to video coding, how do you see the future of end-to-end AI codecs?

Zhang Xianguo: Shannon Lab has been doing end-to-end coding research since 2020. With GPU support, as early as 2021 we built an ultra-low-bitrate, "non-faithful" AI face video codec prototype that supports 720p at roughly one tenth of the bitrate of a conventional codec.
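As one illustration of how such an ultra-low-bitrate face codec can work in principle, below is a sketch of the keypoint-driven formulation widely published in the literature: transmit a single compressed reference frame plus a handful of facial keypoints per frame, and let a generative model re-synthesize the video at the receiver. This is a generic sketch with assumed numbers, not a description of Shannon Lab's actual prototype.

```python
# Sketch of a keypoint-driven "generative" face video codec, the family of
# approaches widely published for ultra-low-bitrate talking-head video.
# Illustrative only; not Shannon Lab's design. Numbers are assumptions.

KEYPOINTS_PER_FRAME = 10          # assumed
BYTES_PER_KEYPOINT = 4            # quantized (x, y) pair, assumed
FPS = 25

def sender(frames, extract_keypoints, encode_keyframe):
    """Send one compressed reference frame, then only keypoints per frame."""
    payload = {"reference": encode_keyframe(frames[0]), "motion": []}
    for frame in frames[1:]:
        payload["motion"].append(extract_keypoints(frame))  # tiny per-frame cost
    return payload

def receiver(payload, decode_keyframe, generator):
    """Re-synthesize every frame by animating the decoded reference frame."""
    ref = decode_keyframe(payload["reference"])
    return [ref] + [generator(ref, keypoints) for keypoints in payload["motion"]]

if __name__ == "__main__":
    motion_bps = KEYPOINTS_PER_FRAME * BYTES_PER_KEYPOINT * 8 * FPS
    print(f"per-frame motion stream: {motion_bps / 1000:.0f} kbit/s "
          "(vs. hundreds of kbit/s for a conventional 720p call)")
```

The trade-off is the "non-faithful" part: the receiver reconstructs a plausible face rather than a pixel-accurate one, which is why such codecs suit constrained scenarios rather than general video.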

Recently, Shannon Lab has also developed multi-camera glasses-free 3D conference live encoding to give users an ultra-realistic meeting experience. Another example is our private codec TVC, an end-to-end design that applies low-complexity AI capabilities to meet the company's internal, closed-loop need to cut video and image storage costs. These are some application directions of end-to-end AI coding, and new products will undoubtedly open up many more possibilities.


But we should be clear-headed about one basic premise of end-to-end AI codecs: the compute available at the endpoints. Building AI-based end-to-end video and image coding requires careful scenario evaluation. For systems with limited terminal decoding power and without a closed-loop ecosystem, traditional non-end-to-end coding still has the advantages of low compute cost and broad device compatibility, and it remains the direction we should invest in most.

LiveVideoStack: Is a transformative video coding framework possible?

Zhang Xianguo: Nothing is impossible. I think a transformative video coding framework will first break through in a specific field, say conferencing scenarios built on special devices, or pulse-camera video coding for high-frame-rate surveillance.

We have to recognize that video scenarios are complex and varied. The existing coding framework grew out of the 1.0 era of broadcast video built on satellite signals, set-top boxes and televisions; it also fit the 2.0 era of desktop-based, long-form VOD over low-bandwidth networks with limited terminals, and the 3.0 era of live streaming and short video on mobile devices over 4G and faster networks.

In the coming 4.0 era, however, new devices and network capabilities and new ways of capturing and displaying video, such as pulse cameras, VR headsets, and point-cloud video, may give birth to new coding frameworks.

LiveVideoStack: As artificial intelligence matures, how will audio and video technology develop?

Zhang Xianguo: Clearly, the development of artificial intelligence will lift audio and video technology across every dimension, from industry applications to optimization techniques. It will significantly increase video traffic and spawn more audio and video applications: better AIGC will generate more self-published content, and stronger large models will advance video-related applications such as autonomous driving, machine vision, and human-computer interaction.

I would not presume to predict the whole audio and video field, but at least in codecs, more video traffic and more video applications will certainly demand higher codec and processing efficiency. Moreover, the information being encoded will no longer be limited to video textures: depth, spatial, and feature information will be coded together with textures, pursuing higher compression alongside a more realistic experience.

The number of videos to compress will grow, and each video will carry more and more information, which will certainly bring new opportunities to those of us in the field.

04

Remain in awe of technology

LiveVideoStack: How many years have you worked in video coding R&D? Are you satisfied with where you are now?

Zhang Xianguo: Seventeen years, counting from when I started graduate school in 2007. I would not say I am satisfied, only gratified that I have never slacked off in all these years and have kept at this field. Today Shannon Lab has dozens of like-minded colleagues continuously cultivating and polishing video codecs; while achieving some results of our own, we can also work with industry peers to bring China's commercial encoders up to a world-leading level. In a life of only a few decades, to have not wasted one's youth and to have benefited the industry through one's work is also a way of living up to the cultivation of one's schools, teachers, and seniors.

LiveVideoStack: In your time at Tencent, what part of the work has given you the greatest sense of fulfillment?

Zhang Xianguo: If I had to name just one thing, it is that we are not only pursuing industry-leading technology but also social value. The beneficiaries are not just Tencent and Tencent Cloud's customers, but also the consumers who genuinely save money and get a better experience when using Tencent's services. Video codecs are like the water and electricity of audio and video technology: more advanced codecs reduce people's everyday Internet costs, improve smoothness, picture quality, and every other part of the subjective experience, and save companies' operating costs. Working with the industry to improve this "water and electricity" infrastructure of China's audio and video ecosystem is our team's greatest sense of accomplishment.


LiveVideoStack: What are your criteria for recruiting? And since it is graduation season again, what advice would you share with students preparing to work in audio and video technology?

Zhang Xianguo: Intelligence, learning ability, and a sense of responsibility are the three things we value most in campus recruits. Codec technology is constantly renewing itself; what you learn at school is only the foundation, and what we really look at is whether you can keep pace with those ahead of you and have the potential to produce new ideas. For experienced hires, engineering ability, breadth, and technical depth are my key considerations.

I would not presume to say what it takes to work across all of audio and video, but to build a leading coding team you need to recruit people and, within two or three years, develop them to excel in one or more of the following: data structures and framework optimization; assembly and data-flow optimization; parallelism and quality-loss control; fast algorithms for mode decision; rate control and pre-processing algorithms; understanding and implementing standard tools; non-standard compression improvements; codec chip architecture design; codec chip C-model implementation; decoder design and optimization; terminal programming and development; building business and transcoding systems; quality evaluation and subjective tuning; and flexible use of machine learning tools.

LiveVideoStack: Given another chance, would you still choose this major and this field of work?

Zhang Xianguo: A major and an industry are different things. People usually choose an industry for themselves, whether computing, biopharmaceuticals, battery technology or something else, but when it comes to the specific specialty, the specialty often chooses you. That I ended up in video codecs was itself a coincidence. As an undergraduate in the Department of Computer Science and Technology at Peking University, the course I was most interested in and did best at was digital logic, which led me to the computer architecture laboratory, one of the early domestic teams working on self-developed CPU chips. Since the multimedia coprocessor is an important part of a CPU chip, by a twist of fate my first project was to learn the principles of video encoding and decoding and do some MPEG-4 encoding and assembly-instruction optimization.

When I realized that this kind of back-end, detail-oriented optimization work actually suited me, I resolved to build a solid foundation and joined the national engineering laboratory for digital video coding technology established at Peking University by Academician Gao Wen, among its first Ph.D. students. So when it comes to choosing a specialty, I believe opportunities are distributed to everyone fairly equally; the key is whether you have the judgment to recognize which direction suits you best, and then the sense of responsibility, learning ability, and hard work to seize those opportunities.

As our conversation drew to a close, Zhang Xianguo spoke of how struck he has been by the recent "technology explosion". In fields outside his own, he has suddenly seen technologies such as large models and brain-computer interfaces explode into broad market value; in the fields he knows well, he has also seen competitors advancing very quickly. This confirms that in the Internet industry, new optimization techniques have a short shelf life, and only by continuing to invest and accumulate new technologies can one stay ahead.

He ended with this sentence:

"Technology is sometimes a breakthrough in an instant, so we must always be in awe of technology and maintain sufficient awe and attention to competitors."

Hi, LVS has recently been rethinking the original-content column of its official account. Going forward, we hope to use this column to uncover new changes and pain points in the multimedia ecosystem, explore how to promote the industry's continued prosperity, and produce more original in-depth reports.

If you are curious about the multimedia world, or are working at the forefront of the industry and willing to share your insights and thinking, and especially if you have story leads or would like to be covered, please contact the author. WeChat: XinWell0709
Email: [email protected]



Click " Read the original text " 

Jump to the official website of LiveVideoStackCon 2023 Shenzhen Station for more information
