Microsoft upgrades its multi-emotion intelligent voice technology: the Chinese Xiaoxiao voice now offers 14 styles

Microsoft recently released its latest upgrade to the Chinese Xiaoxiao voice, adding 10 new styles. The upgraded voice model now has 14 styles, the most in the industry. The new styles include a voice suited to chat scenarios, as well as calm (Calm), cheerful (Cheerful), sad (Sad), angry (Angry), fearful (Fearful), disgruntled (Disgruntled), serious (Serious), affectionate (Affectionate), and gentle (Gentle). The upgraded multi-emotion technology makes for a much richer listening experience, especially with long texts, where it can greatly reduce auditory fatigue and make listening more comfortable.

Xiaoxiao Voice adds 10 emotional styles

Before the upgrade, Microsoft's Chinese Xiaoxiao voice had 4 styles: newscast, customer service, assistant, and lyrical. After the upgrade, it supports 14. Users can switch freely between emotions and scenarios, such as multi-emotion audiobooks, news, customer service, assistant, and chat, which lets the voice meet the varied customization needs of customers in different fields.
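As a rough illustration of what switching styles can look like in practice, here is a minimal sketch that builds an SSML document in which each sentence is spoken in a different style. It assumes the Azure Speech service's `mstts:express-as` SSML element and the voice name `zh-CN-XiaoxiaoNeural`; neither is named in this article, and the lowercase style identifiers below are likewise assumptions based on the style names listed above. The helper `build_ssml` is hypothetical and only produces the markup; sending it to a speech service is left out.

```python
from xml.sax.saxutils import escape, quoteattr

# Assumed style identifiers (not given in the article in this form).
ORIGINAL_STYLES = ["newscast", "customerservice", "assistant", "lyrical"]
NEW_STYLES = [
    "chat", "calm", "cheerful", "sad", "angry",
    "fearful", "disgruntled", "serious", "affectionate", "gentle",
]
ALL_STYLES = ORIGINAL_STYLES + NEW_STYLES

# Assumed voice name for the Chinese Xiaoxiao voice.
VOICE = "zh-CN-XiaoxiaoNeural"


def build_ssml(segments, voice=VOICE):
    """Build an SSML string from (style, text) pairs.

    Each pair becomes one <mstts:express-as> element, so the style can
    change from sentence to sentence within a single request.
    """
    parts = []
    for style, text in segments:
        if style not in ALL_STYLES:
            raise ValueError(f"unknown style: {style!r}")
        parts.append(
            f"<mstts:express-as style={quoteattr(style)}>"
            f"{escape(text)}"
            "</mstts:express-as>"
        )
    body = "".join(parts)
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        'xml:lang="zh-CN">'
        f"<voice name={quoteattr(voice)}>{body}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    # An audiobook-like passage that moves from a cheerful to a sad tone.
    print(build_ssml([
        ("cheerful", "今天天气真好！"),
        ("sad", "可惜我们明天就要分别了。"),
    ]))
```

Keeping the style as a per-segment value, and not a per-request setting, is what would allow a long text such as an audiobook chapter to change emotion as the content changes.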

According to reports, the speech synthesis technology released by Microsoft applies expressive voice transfer: using only a small amount of expressive corpus data, it trains a source model capable of generating high-quality, highly natural expressive speech. By exploring the expressive characteristics of human speech in depth, the source model obtains speech emotion representations that are highly stable and adaptable. This greatly enriches the expressiveness and controllability of synthesized speech and gives it human-like emotions such as joy and sorrow, making up for the lack of a "human touch" in traditional AI speech synthesis.

Ideally, multi-emotion technology would offer dozens or even hundreds of rich, nuanced emotional expressions for each voice, handle different scenarios, and automatically adapt the emotion to the content being expressed.

The release of Microsoft's multi-emotion intelligent voice technology marks a new trend in the development of speech synthesis. It may become a standard feature of intelligent voice applications and bring new breakthroughs in user experience.

Origin blog.51cto.com/14308903/2550976