0. Description
ForcePPG: a PPG (phonetic posteriorgram) based on forced alignment
- The ASR model trained on the Aishell-1 + LibriSpeech forced-alignment results is not fully trained. To keep results comparable with earlier experiments, the checkpoint was not swapped for an early-stopped one.
- F0 is not used in the ASR model. For the structure used, see: [1] https://blog.csdn.net/u013625492/article/details/109670529 [2] https://blog.csdn.net/u013625492/article/details/109206085 [3] https://blog.csdn.net/u013625492/article/details/109201157
- The quality may not be the best, but the output is usable as a PPG.
1. Extraction process
1.1. Wav
- DataBaker
- LJSpeech
Plain raw WAV files are all that is needed.
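Because the extraction config in 1.2.1 assumes 16 kHz audio (the input directory is named wavs_16000), it can help to check each WAV file before extraction. The following is a minimal sketch using Python's standard wave module. The helper name check_wav and its returned fields are assumptions for illustration and are not taken from the original scripts.

```python
import wave


def check_wav(path, expected_rate=16000):
    """Return (ok, info) for a PCM WAV file.

    expected_rate=16000 mirrors 'sample_rate' in the config of section 1.2.1;
    the helper itself is hypothetical and not part of the extractor repo.
    """
    with wave.open(path, "rb") as wf:
        info = {
            "sample_rate": wf.getframerate(),
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": wf.getnframes() / wf.getframerate(),
        }
    return info["sample_rate"] == expected_rate, info
```

Files that fail the check would need to be resampled to 16 kHz before they are passed to the extractor.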
1.2. Extract files
CN-ASR is covered first, then EN-ASR. Each extractor writes to its own folder, and the two outputs are merged into a B-PPG only at the point where they are finally used.
You can refer to this: https://github.com/ruclion/ppgs_extractor_10ms_sch_lh_aishell1/blob/master/extract_ppg_generate_DataBaker_ForcePPG.py
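As a rough illustration of the merge step, the sketch below concatenates per-frame CN and EN posterior vectors along the feature axis. It assumes that B-PPG is a frame-wise concatenation and that both extractors use the same frame shift. The function name merge_ppg, the truncation to the shorter sequence, and the toy dimensions are hypothetical; the original scripts may merge differently.

```python
def merge_ppg(cn_ppg, en_ppg):
    """Concatenate two frame-aligned PPG sequences frame by frame.

    cn_ppg, en_ppg: lists of per-frame posterior vectors (lists of floats).
    Assumption: a small length mismatch between the two sequences is
    resolved by truncating to the shorter one.
    """
    n_frames = min(len(cn_ppg), len(en_ppg))
    return [list(cn_ppg[t]) + list(en_ppg[t]) for t in range(n_frames)]


# Toy example: 3 CN frames of dimension 2, 4 EN frames of dimension 3.
cn = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
en = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.3, 0.3, 0.4]]
merged = merge_ppg(cn, en)  # 3 frames, each of dimension 2 + 3 = 5
```

Keeping the two outputs in separate folders and merging late means either extractor can be re-run without touching the other's files.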
1.2.1. CN ASR
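import os
from datetime import datetime

# Number of hyperparameters: 16
hparams = {
    'sample_rate': 16000,
    'preemphasis': 0.97,
    'n_fft': 400,
    'hop_length': 160,  # 10 ms frame shift at 16 kHz
    'win_length': 400,  # 25 ms window at 16 kHz
    'num_mels': 80,
    'n_mfcc': 13,
    'window': 'hann',
    'fmin': 30.,
    'fmax': 7600.,
    'ref_db': 20,
    'min_db': -80.0,
    'griffin_lim_power': 1.5,
    'griffin_lim_iterations': 60,
    'silence_db': -28.0,
    'center': True,
}
# audio_hparams is defined outside this excerpt; the two dicts must match exactly
assert hparams == audio_hparams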
MFCC_DIM = 39
PPG_DIM = 218
# Inputs
meta_path = '*.txt'
wav_dir = '*/wavs_16000'
# Output 1: feature directories
ppg_dir = './LJSpeech-1.1-Mandarin-PPG/ppg_generate_10ms_by_audio_hjk2'
mfcc_dir = './LJSpeech-1.1-Mandarin-PPG/mfcc_10ms_by_audio_hjk2'
mel_dir = './LJSpeech-1.1-Mandarin-PPG/mel_10ms_by_audio_hjk2'
spec_dir = './LJSpeech-1.1-Mandarin-PPG/spec_10ms_by_audio_hjk2'
rec_wav_dir = './LJSpeech-1.1-Mandarin-PPG/rec_wavs_16000'
os.makedirs(ppg_dir, exist_ok=True)
os.makedirs(mfcc_dir, exist_ok=True)
os.makedirs(mel_dir, exist_ok=True)
os.makedirs(spec_dir, exist_ok=True)
os.makedirs(rec_wav_dir, exist_ok=True)
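# Output 2: timestamped metadata file for the "good" utterances
STARTED_DATESTRING = "{0:%Y-%m-%dT%H-%M-%S}".format(datetime.now())
good_meta_path = './LJSpeech-1.1-Mandarin-PPG/meta_good_' + STARTED_DATESTRING + '_v3.txt'
f_good_meta = open(good_meta_path, 'w')  # close this handle once extraction finishes
# ASR network checkpoint used to compute PPGs (NN -> PPG)
ckpt_path = './aishell1_ckpt_model_dir/aishell1ASR.ckpt-128000'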
- The LJSpeech script is located at /ceph/home/hujk17/ppgs_extractor_10ms_sch_lh_aishell1/extract_ppg_generate_LJSpeech_ForcePPG.py
- The DataBaker script is located at /ceph/home/hujk17/ppgs_extractor_10ms_sch_lh_aishell1/extract_ppg_generate_DataBaker_ForcePPG.py
- The generated mel spectrograms and PPGs are written to the corresponding output directories.
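The "10ms" in the output directory names follows directly from the config: hop_length / sample_rate = 160 / 16000 s. The sketch below works through that arithmetic. The frame-count formula assumes a librosa-style STFT convention, and reading MFCC_DIM = 39 as 13 static coefficients plus delta and delta-delta features is the common interpretation rather than something the excerpt confirms. The helper names are hypothetical.

```python
# Subset of the hyperparameters from the config above.
hp = {"sample_rate": 16000, "n_fft": 400, "hop_length": 160,
      "win_length": 400, "n_mfcc": 13, "center": True}


def frame_shift_ms(hp):
    """Time between consecutive frames, in milliseconds."""
    return 1000.0 * hp["hop_length"] / hp["sample_rate"]


def window_ms(hp):
    """Analysis window length, in milliseconds."""
    return 1000.0 * hp["win_length"] / hp["sample_rate"]


def num_frames(n_samples, hp):
    """Frame count under a librosa-style STFT convention (assumption)."""
    if hp["center"]:
        # Signal is padded by n_fft // 2 on both sides before framing.
        return 1 + n_samples // hp["hop_length"]
    return 1 + (n_samples - hp["n_fft"]) // hp["hop_length"]


shift = frame_shift_ms(hp)          # 10.0 ms
window = window_ms(hp)              # 25.0 ms
frames_1s = num_frames(16000, hp)   # 101 frames for one second of audio
mfcc_dim = hp["n_mfcc"] * 3         # 39, assuming static + delta + delta-delta
```

Under these assumptions, each utterance yields one PPG_DIM-dimensional posterior vector per 10 ms frame.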
1.2.2. EN ASR
Omitted.
1.3. PPG folder
2. Standardized documentation
The lab's third assignment comes with standardized documentation. The repository is: https://github.com/thuhcsi/dpss-exp3-VC-PPG
Assignment document link: https://drive.google.com/file/d/1C1Md176LKIkiO9s3VNssQ0hJzvWmZ0gZ/view?usp=sharing
[No need to read this one; I compiled it myself and it is a bit messy] Background on PPG: https://drive.google.com/file/d/1BUYsOtiaPzvee1Hrs77X71SjWWi-Zy3A/view?usp=sharing
Thanks to Lu Hui, Changhe, Wang Jie, the teachers, and my classmates. Your documents are really concise and of a high standard.