2021 Speech Synthesis Paper Statistics (January-February)

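These paper statistics are updated in the first week of each month and mainly track developments in speech synthesis. (Many papers are only released after their conference, but this does not affect the counts. Omissions are inevitable in the counting process, so the results are for reference only. Readers with suggestions can message me directly, and I will keep revising the statistics. Statistics for previous years are available at http://yqli.tech/page/tts_paper.html.) If you repost this, please credit the source. You are welcome to follow the WeChat public account: 低调奋进.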


Speech synthesis paper counts (unit: papers)

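| Category | Scope | Jan | Feb |
|---|---|---|---|
| Frontend | Polyphone disambiguation, prosody, g2p, etc. | 1 | 0 |
| Acoustic model | Linguistic-to-acoustic feature mapping, attention work, and dual learning | 1 | 7 |
| Vocoder | Waveform generation | 1 | 3 |
| Personalization | Low-resource data, noisy-data applications, etc. | 1 | 1 |
| Multilingual | Multilingual multi-speaker models | 0 | 0 |
| Singing synthesis | Singing and music synthesis | 0 | 1 |
| Emotion | Style and emotion | 2 | 2 |
| Multimodal | Talking head, etc. | 2 | 1 |
| Voice conversion | GAN-based and feature-disentanglement approaches | 4 | 2 |
| Other | EEG-based synthesis, data, MOS evaluation, and applications of speech synthesis | 1 | 1 |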


Paper list:

January

| No. | Title | Type |
|---|---|---|
| 1 | Supervised and Unsupervised Approaches for Controlling Narrow Lexical Focus in Sequence-to-Sequence Speech Synthesis | am |
| 2 | Polyphone Disambiguition in Mandarin Chinese with Semi-Supervised Learning | frontend |
| 3 | Generating coherent spontaneous speech and gesture from text | multimodality |
| 4 | Creating Song From Lip and Tongue Videos With a Convolutional Vocoder | multimodality |
| 5 | On Interfacing the Brain with Quantum Computers: An Approach to Listen to the Logic of the Mind | other |
| 6 | Whispered and Lombard Neural Speech Synthesis | expression |
| 7 | Expressive Neural Voice Cloning | expression/personalization |
| 8 | High-Quality Vocoding Design with Signal Processing for Speech Synthesis and Voice Conversion | vc |
| 9 | EmoCat: Language-agnostic Emotional Voice Conversion | vc |
| 10 | Hierarchical disentangled representation learning for singing voice conversion | vc |
| 11 | Adversarially learning disentangled speech representations for robust multi-factor voice conversion | vc |
| 12 | Improved parallel WaveGAN vocoder with perceptually weighted spectrogram loss | vocoder |

February

| No. | Title | Type |
|---|---|---|
| 1 | Triple M: A Practical Neural Text-to-speech System With Multi-guidance Attention And Multi-band Multi-time Lpcnet | am |
| 2 | Mixture Density Network for Phone-Level Prosody Modelling in Speech Synthesis | am |
| 3 | VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention | am |
| 4 | Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input | am |
| 5 | Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech | am |
| 6 | LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search | am |
| 7 | Data-Efficient Training Strategies for Neural TTS Systems | am |
| 8 | Model architectures to extrapolate emotional expressions in DNN-based text-to-speech | expression |
| 9 | Model architectures to extrapolate emotional expressions in DNN-based text-to-speech | expression |
| 10 | SPEAK WITH YOUR HANDS Using Continuous Hand Gestures to control Articulatory Speech Synthesizer | multimodality |
| 11 | MBNet: MOS Prediction for Synthesized Speech with Mean-Bias Network | other |
| 12 | Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning | personalization |
| 13 | Anyone GAN Sing | sing |
| 14 | Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram | vc |
| 15 | Investigating Deep Neural Structures and their Interpretability in the Domain of Voice Conversion | vc |
| 16 | Universal Neural Vocoding with Parallel WaveNet | vocoder |
| 17 | LVCNet: Efficient Condition-Dependent Modeling Network for Waveform Generation | vocoder |
| 18 | High-Quality Vocoding Design with Signal Processing for Speech Synthesis and Voice Conversion | vocoder |

Reposted from blog.csdn.net/liyongqiang2420/article/details/114940548