Ow! Tiger Tiger Wine's first topic: Xiaolong bilingual TTS

0. Description

This is a roughly half-year project whose goal is to produce a stable Xiaolong bilingual TTS.

  • The result should not be limited to Xiaolong. It should carry over to other voices, or at least yield a set of sufficient conditions under which a stable, high-quality bilingual TTS can be obtained.
  • Spend the first three weeks investigating feasibility. The standard part: read many papers, find out how other companies did it, what the company has already accumulated, and what colleagues have done. The personal part: design a small experiment to verify the idea yourself, verify it with similar data in the lab, and ask classmates and friends whether they have done something similar. If others have done it, there is a high probability that it can be done here too.
  • When reporting, show more demos; technical choices and details can be played down. The biggest taboo is to work for six months without exposing the shortcomings early and end up with nothing to show.
  • When defining the project, state the problem and the target result, but do not build a specific approach into it. For example, the title should be "bilingual TTS", not "one-hot bilingual TTS" or "timbre-transfer bilingual TTS".
  • Technical work generally goes from simple to complex, because simple methods let you observe many phenomena in detail and build up experience.

1. Framework selection

  • Standardize on DurIAN: this makes it easier to coordinate with colleagues and avoids duplicating data-processing work.
  • In theory there is little difference between DurIAN and Tacotron. If Tacotron has something worth borrowing, it can be used to build an optimized version of DurIAN.
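
To make the comparison concrete, here is a minimal sketch that assumes the framework named above is a duration-informed model in the style of DurIAN. In such models, phoneme-level states are expanded to frame level according to per-phoneme durations, whereas Tacotron learns the alignment through attention. The function name, the phoneme list, and the duration values below are hypothetical and only illustrate the expansion step; they are not taken from either system's code.

```python
from typing import List, Sequence


def expand_by_duration(states: Sequence[str], durations: Sequence[int]) -> List[str]:
    """Repeat each phoneme-level state durations[i] times to get frame-level states.

    Hypothetical helper: it stands in for the alignment step of a
    duration-informed model, where durations come from a duration predictor.
    """
    if len(states) != len(durations):
        raise ValueError("states and durations must have the same length")
    if any(d < 0 for d in durations):
        raise ValueError("durations must be non-negative")
    frames: List[str] = []
    for state, d in zip(states, durations):
        # One copy of the phoneme-level state per output frame.
        frames.extend([state] * d)
    return frames


# Made-up example values (in frames), not real model output.
phonemes = ["n", "i", "h", "ao"]
durations = [2, 3, 1, 4]
frames = expand_by_duration(phonemes, durations)
print(frames)
```

Because the alignment is fixed by the durations, every phoneme is covered exactly as many frames as requested, which is one reason duration-based models tend to avoid skipped or repeated words. The rest of the pipeline (encoder, decoder, vocoder) can stay close to a Tacotron-style design.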

Origin blog.csdn.net/u013625492/article/details/114878053