The most comprehensive Chinese text normalization code for TTS: English processing, number processing, prosody prediction, Pinyin


Overview

  • English letters and words: common English words are mapped to their standard forms for reading, e.g. google → Google, baidu → Baidu, etc.
  • Number and date processing: for example, 2020年2月18日 → 二零二零年二月十八日. More examples:
Landline (固话): 0595-23865596或23880880。 → 零五九五二三八六五五九六或二三八八零八八零。
Mobile (手机): +86 19859213959或15659451527。 → 八六一九八五九二一三九五九或一五六五九四五一五二七。
Fraction (分数): 32477/76391。 → 七万六千三百九十一分之三万两千四百七十七。
Percentage (百分数): 80.03%。 → 百分之八十点零三。
ID number (编号): 31520181154418。 → 三一五二零一八一一五四四一八。
Plain number (纯数): 2983.07克或12345.60米。 → 二九八三.零七克或一二三四五.六十米。
Date (日期): 1999年2月20日或09年3月15号。 → 一九九九年二月二十日或零九年三月十五号。
Money (金钱): 12块5,34.5元,20.1万 → 十二块五,三十四点五元,二十点一万
Special (特殊): O2O或B2C。 → O2O或B2C。
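The digit-by-digit readings above (phone numbers, ID numbers) come down to a simple character map. A minimal sketch, not the full normalizer, which also handles units, dates, fractions, and money:

```python
# Digit-by-digit Chinese reading, as used for phone numbers and ID numbers.
DIGIT_READING = {'0': '零', '1': '一', '2': '二', '3': '三', '4': '四',
                 '5': '五', '6': '六', '7': '七', '8': '八', '9': '九'}

def read_digits(s: str) -> str:
    """Read each Arabic digit individually; other characters pass through."""
    return ''.join(DIGIT_READING.get(ch, ch) for ch in s)

print(read_digits('31520181154418'))  # 三一五二零一八一一五四四一八
```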
  • Prosody prediction: needed for parametric synthesis approaches; it cannot simply be skipped. In the labeled samples below, #0 and #1 mark the prosodic-break label after each word:
0 霍思燕#1露背#0秀乳沟#1性感#0惹火
1 也#0可以#1给#0本委员#0反映哟
2 摇篮牌#1钙维健#1婴儿#1配方#0奶粉
3 鄂豫#1鲁皖苏#1局地#0大暴雨
4 以下#1为#0薛蛮子#1观点#0摘要
  • Pinyin
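The #0/#1 markers in the labeled sentences above follow a common Chinese TTS corpus convention for prosodic-break labels between words. A small parser sketch, assuming that convention (the function name is ours, not the author's):

```python
import re

def parse_prosody(line: str):
    """Split a labeled sentence into (chunk, break_level) pairs.

    Break markers look like '#0' or '#1'; the final chunk carries no
    marker, so its level is None.
    """
    return [(m.group(1), int(m.group(2)) if m.group(2) else None)
            for m in re.finditer(r'([^#]+)(?:#(\d))?', line)]

print(parse_prosody('摇篮牌#1钙维健#1婴儿#1配方#0奶粉'))
# [('摇篮牌', 1), ('钙维健', 1), ('婴儿', 1), ('配方', 0), ('奶粉', None)]
```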

Core source code

Number-conversion core source code

#!/usr/bin/python
# -*- coding:utf-8 -*-
'''
@Author  :  Errol 
@Describe:  
@Evn     :  
@Date    :   - 
'''
CN_NUM = {
    '〇': 0, '一': 1, '二': 2, '三': 3, '四': 4, '五': 5, '六': 6, '七': 7, '八': 8, '九': 9, '零': 0,
    '壹': 1, '贰': 2, '叁': 3, '肆': 4, '伍': 5, '陆': 6, '柒': 7, '捌': 8, '玖': 9, '貮': 2, '两': 2,
}

CN_UNIT = {
    '十': 10,
    '拾': 10,
    '百': 100,
    '佰': 100,
    '千': 1000,
    '仟': 1000,
    '万': 10000,
    '萬': 10000,
    '亿': 100000000,
    '億': 100000000,
    '兆': 1000000000000,  # treated as a simple multiplier, not a 万/亿-style section break
}


def chinese_to_arabic(cn: str) -> int:
    unit = 0  # multiplier contributed by the most recent unit character
    ldig = []  # digits and 万/亿 section markers, collected right to left
    for cndig in reversed(cn):
        if cndig in CN_UNIT:
            unit = CN_UNIT.get(cndig)
            if unit == 10000 or unit == 100000000:
                ldig.append(unit)
                unit = 1
        else:
            dig = CN_NUM.get(cndig)
            if unit:
                dig *= unit
                unit = 0
            ldig.append(dig)
    if unit == 10:
        ldig.append(10)
    val, tmp = 0, 0
    for x in reversed(ldig):
        if x == 10000 or x == 100000000:
            val += tmp * x
            tmp = 0
        else:
            tmp += x
    val += tmp
    return val


# TODO: make a full unittest


def test():
    cases = {
        '八': 8,
        '十一': 11,
        '一百二十三': 123,
        '一千二百零三': 1203,
        '一万零一': 10001,
        '十万零三千六百零九': 103609,
        '一百二十三万四千五百六十七': 1234567,
        '一千貮百二十三万四千五百六十七': 12234567,
        '一亿一千一百二十三万四千五百六十七': 111234567,
        '一百零二亿五千零一万零一千零三十八': 10250011038,
        '壹万零捌佰玖拾叁': 10893,
    }
    for cn, expected in cases.items():
        x = chinese_to_arabic(cn)
        print(cn, x)
        assert x == expected


if __name__ == '__main__':
    test()
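The code above converts Chinese numerals to Arabic numbers; TTS normalization mostly needs the opposite direction. A minimal sketch for integers up to 9999 (a hypothetical helper, not part of the original code; a full normalizer would also handle 万/亿 sections and decimals):

```python
CN_DIGITS = '零一二三四五六七八九'
CN_UNITS = ['', '十', '百', '千']

def arabic_to_chinese(n: int) -> str:
    """Read an integer in 0..9999 as Chinese, e.g. 1203 -> 一千二百零三."""
    if n == 0:
        return '零'
    parts, pending_zero = [], False
    for pos in range(3, -1, -1):
        d = (n // 10 ** pos) % 10
        if d == 0:
            pending_zero = bool(parts)  # emit 零 only between nonzero digits
        else:
            if pending_zero:
                parts.append('零')
                pending_zero = False
            parts.append(CN_DIGITS[d] + CN_UNITS[pos])
    s = ''.join(parts)
    return s[1:] if s.startswith('一十') else s  # 一十一 -> 十一

print(arabic_to_chinese(1203))  # 一千二百零三
print(arabic_to_chinese(11))    # 十一
```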

Prosody core source code

# logits
logits_iph = tf.matmul(h_iph, w_iph) + b_iph  # shape of logits:[batch_size*max_time, 3]
logits_normal_iph = tf.reshape(  # logits reshaped to [batch_size, max_time_steps, 3]
   tensor=logits_iph,
   shape=(-1, self.max_sentence_size, 3),
   name="logits_normal_iph"
)
logits_iph_masked = tf.boolean_mask(  # [seq_len1+seq_len2+..., 3]
   tensor=logits_normal_iph,
   mask=self.mask,
   name="logits_iph_masked"
)

# prediction
pred_iph = tf.cast(tf.argmax(logits_iph, 1), tf.int32, name="pred_iph")  # pred_iph:[batch_size*max_time,]
pred_normal_iph = tf.reshape(  # pred reshaped to [batch_size, max_time]
   tensor=pred_iph,
   shape=(-1, self.max_sentence_size),
   name="pred_normal_iph"
)
pred_iph_masked = tf.boolean_mask(  # [seq_len1+seq_len2+...,]
   tensor=pred_normal_iph,
   mask=self.mask,
   name="pred_iph_masked"
)
pred_normal_one_hot_iph = tf.one_hot(  # one-hot the pred_normal:[batch_size, max_time,class_num]
   indices=pred_normal_iph,
   depth=self.class_num,
   name="pred_normal_one_hot_iph"
)

# loss
self.loss_iph = tf.losses.softmax_cross_entropy(
   onehot_labels=y_p_iph_masked,
   logits=logits_iph_masked
)
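The tf.boolean_mask calls above drop padded timesteps so that the loss covers only real tokens. A plain-Python illustration of what the mask does, with toy shapes rather than the actual graph:

```python
# Two sentences padded to max_time = 4; mask marks the real tokens.
preds = [[1, 0, 2, 0],
         [0, 1, 0, 0]]
mask = [[True, True, True, False],
        [True, True, False, False]]

# Equivalent of tf.boolean_mask: concatenate only the unpadded predictions,
# giving a flat [seq_len1 + seq_len2, ] vector.
masked = [p for row_p, row_m in zip(preds, mask)
          for p, keep in zip(row_p, row_m) if keep]
print(masked)  # [1, 0, 2, 0, 1]
```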

Prosody prediction

Reference: https://github.com/quadrismegistus/prosodic
Reference: https://github.com/Helsinki-NLP/prosody

Both models give fairly good results, but the algorithm still needs some changes. Comments and discussion are welcome.


Origin blog.csdn.net/weixin_32393347/article/details/104367230