In the future Spring Festival Gala, you may not need a live host

https://mp.weixin.qq.com/s/NXGSLylqaItkcCmfUBbIuA

By 超神经

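On January 28, AI virtual hosts took the stage at the 2019 Internet Spring Festival Gala. Sa Beining, one of the hosts who appeared alongside his own AI double, announced on the spot that he was getting ready to "retire". Has the moment when AI replaces human hosts really arrived?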

At this year's Internet Spring Festival Gala, four virtual hosts appeared for the first time, modeled on Sa Beining, Zhu Xun, Gao Bo, and Long Yang. Each of the four hosts shared the stage with their own virtual double, which added a lot of fun to the show.

The fast-talking Xiaosa almost runs out of lines in front of AI Xiaosa

As the video shows, the virtual AI hosts closely resemble their real-life counterparts. They hold their own against the humans when trading quips, they use matching body language, and each virtual host has its own distinctive expressions.

What is a virtual host

According to reports, the virtual hosts of this Internet Spring Festival Gala were built by ObEN, which uses 3D image reconstruction and electronic voice simulation to create a personalized artificial intelligence avatar called PAI (Personal AI).

Beyond the virtual hosts of this Internet Spring Festival Gala, ObEN has also worked with celebrities and entertainment companies. SM Corporation, Korea's largest entertainment company, is one of ObEN's earliest angel investors.

In June 2017, ObEN and South Korea's SM Entertainment established AI Star in Hong Kong, described as the world's first artificial intelligence star copyright company and called Magic Star in Chinese, to create virtual idols. The Chinese female idol group SNH48 also announced a collaboration with ObEN to create its own exclusive artificial intelligence avatars.


The core technology of this product is mainly in three aspects:

First is building the visual image. Starting from photos of the host's body, and using 3D scanners together with 3D modeling tools such as 3DS MAX and MAYA, an AI model learns the relationship between the color distribution of a photo and the depth of the underlying structure, and from that it reconstructs the face and body shape.

The second is voice synthesis. The AI voice technology they use does not require a large voice library. Only a dozen recorded sentences are needed, and a voice model can be built from them through methods such as feature parameter extraction and transfer learning.
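Why a dozen recordings can be enough is easier to see in a toy sketch of transfer learning. The article does not describe ObEN's actual pipeline, so everything below is an assumption made for illustration: the "voice model" is just a straight line, the source data stands in for a large generic speech corpus, and the twelve target samples stand in for the dozen recorded sentences. All function names and numbers are hypothetical.

```python
import random

def fit_line(data, w=0.0, b=0.0, lr=0.05, steps=200):
    """Fit y ~ w*x + b by gradient descent on mean squared error."""
    n = len(data)
    for _ in range(steps):
        grad_w = 2 * sum((w * x + b - y) * x for x, y in data) / n
        grad_b = 2 * sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mse(data, w, b):
    """Mean squared error of the line on a dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def make_data(rng, n, slope, offset):
    """Noisy samples from y = slope*x + offset, with x in [-1, 1]."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        data.append((x, slope * x + offset + rng.gauss(0, 0.05)))
    return data

rng = random.Random(0)
# Hypothetical stand-ins: a large generic corpus and one new speaker.
source = make_data(rng, 500, slope=2.0, offset=1.0)
target_train = make_data(rng, 12, slope=2.2, offset=1.3)   # "a dozen sentences"
target_test = make_data(rng, 200, slope=2.2, offset=1.3)

# Step 1: pre-train on the large source dataset.
w_src, b_src = fit_line(source, steps=200)

# Step 2: transfer. Start from the pre-trained parameters and
# fine-tune for only a few steps on the small target dataset.
w_ft, b_ft = fit_line(target_train, w=w_src, b=b_src, steps=20)

# Baseline: the same small training budget, but starting from zero.
w_sc, b_sc = fit_line(target_train, steps=20)

err_finetuned = mse(target_test, w_ft, b_ft)
err_scratch = mse(target_test, w_sc, b_sc)
print(f"fine-tuned: {err_finetuned:.4f}  from scratch: {err_scratch:.4f}")
```

Because the pre-trained parameters already sit close to what the new speaker needs, a short fine-tuning run gets much closer than the same short run from scratch. A real voice model is far larger, but the division of labor is the same.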

Finally, the virtual host has to be made as realistic as possible. This requires not only that the visuals match the voice, but also that each host be given a personality. Using sensors and motion tracking equipment, combined with AI and motion capture training, the virtual host can imitate the speech, facial expressions, gestures, body movements and on-stage interactions of the real person it is modeled on.

Although this is the first time an AI has crossed over into hosting a gala, AI virtual anchors have already made many appearances elsewhere.

The first Chinese AI news anchor

At the Fifth World Internet Conference, held in November 2018, Xinhua News Agency showed off one of its new reporters: the first AI-synthesized virtual news anchor. Qiu Hao, the human model for the AI, said: "The image is my image, and the voice is my voice, but I have never said the words that were broadcast..."

In the video, the anchor introduces himself in his own accent, and his face and lips move in step with the voice. There are pleasant surprises, but also some disappointments: compared with a live anchor, viewers can tell at a glance that it is machine-generated. It also still depends on humans to write the news scripts it reads.

The technical support for this anchor comes from Sogou's "clone technology." The key components behind it are speech synthesis and image generation.


Speech synthesis can use a small amount of audio data to let the model learn how the target person speaks, capturing timbre, rhythm, emotion and other aspects, and finally turn input text into audio in that person's voice.

Image generation relies on learning and modeling in face recognition, 3D face reconstruction, expression modeling and related areas, so that the output video matches the output audio.

Setting those complaints aside, the biggest selling point of this technology is on-screen cloning: you could watch the same person deliver three different broadcasts at the same time on a TV, a tablet and a mobile phone.

Japan builds not only Hatsune Miku, but also anchors

Reports from Japan came even earlier: AI anchors have been built there as well.

In April last year, the NHK TV program "NEWS CHECK 11" introduced a cute cartoon AI anchor, "News anchor Yomiko".

This anchor is rendered with CG. Its speech system learns from a large number of recordings, which are split into phonemes. It then learns to recognize and read text, and finally reads the news aloud.

Physical robots have also worked as news anchors.

The android "Erica", developed by Osaka University and Kyoto University, also served as a news anchor for Japan's NNN TV network in April 2018.

"Erica" is designed as a beautiful 23-year-old woman with a conventional female face. Her voice is synthesized from recordings of voice actors and sounds very natural.


In addition, it has an advanced dialogue system. When talking with people, it gathers information through microphones and sensors, perceives the other party's voice and movements, then turns toward them and carries on a smooth conversation.

Nineteen points on its body, including the eyes, mouth and neck, are moved by air pressure, allowing it to show a variety of expressions and perform some simple, lifelike movements.

When will the host be replaced?

Back to the Internet Spring Festival Gala: the four playful virtual hosts drew a lot of attention, and the audience's reaction shows that they were well liked.

So will they put human hosts out of work? Perhaps neither Xiaosa nor Little Xiaosa would agree.

"Xiaosa bids farewell to the stage"

AI news broadcasts still have room for improvement in their accent and their mismatched facial expressions. Japan's lively news robots have not replaced local news staff on any large scale either. At most, they remain assistants and novelties.

Seen this way, technological progress has indeed brought us novel and delightful experiences, but in an era like this, the hype may be running ahead of the reality.

Perhaps the day of replacement will come eventually, but it is certainly not today. We can expect that by the time it comes, humans will have worked out how to get along with AI. At that year's Spring Festival, we may see AI not only hosting the Gala but also performing in it.

As for us, we can just open our mouths and wait for AI to feed us.

超神经 Encyclopedia

Transfer Learning

Transfer learning is a method of using existing knowledge to learn new knowledge.

In transfer learning, the existing knowledge is called the source domain, and the new knowledge to be learned is called the target domain.

The purpose of transfer learning is to extract knowledge and experience from one or more source tasks and apply them to a target domain.

Basic methods of transfer learning

1) Sample transfer
Finds data in the source domain that is similar to the target domain, and adjusts the weight of that data so that the reweighted data matches the data of the target domain.

2) Model transfer
Assumes that the source domain and the target domain share model parameters, which means that a model trained on a large amount of source-domain data is applied to the target domain for prediction.

3) Relationship transfer
Assumes that if two domains are similar, they share certain similarity relationships, so the logical network of relationships in the source domain can be applied to the target domain.
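The first method in the list above, reweighting source data toward the target domain, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reference implementation: the data is synthetic, the model is a straight line, and the helper names (`weighted_line_fit`, `target_similarity`) and the kernel bandwidth are hypothetical choices. Because the source inputs here are evenly spread, weighting each source sample by its similarity to the target inputs plays the role of the usual importance weight.

```python
import math

def weighted_line_fit(xs, ys, weights):
    """Closed-form weighted least squares for y ~ slope*x + intercept."""
    sw = sum(weights)
    sx = sum(w * x for w, x in zip(weights, xs))
    sy = sum(w * y for w, y in zip(weights, ys))
    sxx = sum(w * x * x for w, x in zip(weights, xs))
    sxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    intercept = (sy - slope * sx) / sw
    return slope, intercept

def target_similarity(x, target_xs, bandwidth=0.3):
    """Average Gaussian kernel between x and the unlabeled target inputs."""
    total = sum(math.exp(-((x - t) ** 2) / (2 * bandwidth ** 2)) for t in target_xs)
    return total / len(target_xs)

def mse(xs, ys, slope, intercept):
    """Mean squared error of the line on a dataset."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Source domain: labeled data spread over [0, 4], with y = x^2.
source_xs = [i * 0.1 for i in range(41)]
source_ys = [x * x for x in source_xs]

# Target domain: inputs clustered around x = 3.
# The target labels are used only to evaluate the result.
target_xs = [2.8, 2.9, 3.0, 3.1, 3.2]
target_ys = [x * x for x in target_xs]

# Plain fit: every source sample counts equally.
plain = weighted_line_fit(source_xs, source_ys, [1.0] * len(source_xs))

# Sample transfer: source samples that look like target data count more.
weights = [target_similarity(x, target_xs) for x in source_xs]
reweighted = weighted_line_fit(source_xs, source_ys, weights)

err_plain = mse(target_xs, target_ys, *plain)
err_weighted = mse(target_xs, target_ys, *reweighted)
print(f"plain: {err_plain:.4f}  reweighted: {err_weighted:.4f}")
```

The reweighted fit gives up accuracy on the parts of the source domain that the target never visits, and in exchange it fits the region around the target inputs more closely.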



Origin blog.51cto.com/14929242/2535613