Speech Open-Source Project Picks: VCTUBE, a Library for Automatic Speech Data Annotation

Other 2021-03-25 21:52:31

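Note: This series shares outstanding open-source speech projects to help the open-source community grow. The content mainly reflects my personal views; if you spot any errors, corrections are welcome. If you repost this article, please credit the source. You are welcome to follow the WeChat official account: 低调奋进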

VCTUBE : A Library for Automatic Speech Data Annotation

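This post covers an automatic speech annotation library whose main purpose is to automatically generate <audio, text> training data for speech synthesis systems. The accompanying paper was published at Interspeech 2020. Paper link: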

https://www.isca-speech.org/archive/Interspeech_2020/pdfs/4004.pdf

Usage instructions for the Python library:

https://dsail-skku.github.io/VCTUBE.github.io/

1 Project overview

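Speech synthesis systems usually need a large amount of <audio, text> training data. VCTUBE, the Python library proposed in the paper, automatically downloads videos and captions from YouTube and then turns them into training data for speech synthesis. The detailed system architecture is shown in Figure 1. The system consists of three modules: 1) an audio download module; 2) a caption download module; 3) an audio splitting module.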
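To make the third module more concrete, here is a minimal sketch of how audio could be cut into <audio, text> pairs using caption timings. This is my own illustration, not VCTUBE's actual implementation: the `Caption` type, the function names, and the assumption that each caption carries a start time and a duration in seconds are all hypothetical.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Caption(NamedTuple):
    """Hypothetical caption entry: start and duration in seconds, plus text."""
    start: float
    duration: float
    text: str


def caption_to_sample_range(caption: Caption, sample_rate: int) -> Tuple[int, int]:
    """Convert a caption's time span into a [begin, end) range of sample indices."""
    begin = int(round(caption.start * sample_rate))
    end = int(round((caption.start + caption.duration) * sample_rate))
    return begin, end


def split_by_captions(
    samples: Sequence[int], captions: List[Caption], sample_rate: int
) -> List[Tuple[Sequence[int], str]]:
    """Pair each non-empty caption with the slice of audio it covers."""
    pairs = []
    for caption in captions:
        begin, end = caption_to_sample_range(caption, sample_rate)
        end = min(end, len(samples))  # clip captions that run past the audio
        text = caption.text.strip()
        if begin < end and text:  # skip empty spans and blank captions
            pairs.append((samples[begin:end], text))
    return pairs


if __name__ == "__main__":
    sample_rate = 4              # toy rate so the numbers stay readable
    samples = list(range(20))    # 5 seconds of fake audio
    captions = [
        Caption(0.0, 1.5, "hello there"),
        Caption(2.0, 1.0, "general"),
        Caption(4.5, 2.0, " "),  # blank caption, will be skipped
    ]
    for clip, text in split_by_captions(samples, captions, sample_rate):
        print(len(clip), text)
```

A real pipeline would read the samples from an audio file and write each clip back to disk, but the index arithmetic stays the same.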

2 Usage

How To Use VCTUBE?

Requirements for VCTUBE

Python >= 3.6 (currently required)
FFmpeg

First, install the VCTUBE library with pip:

pip3 install vctube
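Before running anything, it can help to confirm that both requirements are met. The snippet below is a small sketch using only the Python standard library; the helper name `check_requirements` is my own and is not part of VCTUBE. It assumes FFmpeg is expected to be reachable as `ffmpeg` on the PATH.

```python
import shutil
import sys


def check_requirements(min_python=(3, 6)):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    # VCTUBE states that it currently requires Python >= 3.6.
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append("Python >= " + wanted + " is required")
    # Assumption: FFmpeg should be callable as "ffmpeg" from the PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg executable not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_requirements()
    if issues:
        for issue in issues:
            print("Missing requirement:", issue)
    else:
        print("All requirements satisfied.")
```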

Commands for VCTUBE

from vctube import VCtube

playlist_name = ""
playlist_url = ""
lang = ""  # e.g. ko, en, fr, de ...

vc = VCtube(playlist_name, playlist_url, lang)
vc.download_audio()     # download audio from YouTube
vc.download_captions()  # download captions from YouTube
vc.audio_split()        # split the audio according to the captions

VCTUBE Example

Settings for VCTUBE

from vctube import VCtube

playlist_url = "https://www.youtube.com/watch?v=fj5BcN6Blks"
playlist_name = "TEST"
lang = "en"  # e.g. ko, en, fr, de ...

vc = VCtube(playlist_name, playlist_url, lang)

Result of this process

vc.download_audio()
vc.download_captions()
vc.audio_split()

3 Experiments

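The paper reports experiments in English and Korean: data is collected automatically from YouTube and then used to train and test Tacotron. Table 2 gives the WER of the synthesized audio and Figure 2 shows the alignments, which together indicate that the automatically generated training data is usable.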
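For readers unfamiliar with WER (word error rate), it is commonly defined as (S + D + I) / N, where S, D, and I are the numbers of substituted, deleted, and inserted words needed to turn the reference into the hypothesis, and N is the number of words in the reference. The sketch below computes this standard definition with a word-level edit distance. It is only an illustration of the metric, not the evaluation script used in the paper, and the function name is my own.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words.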

Reposted from blog.csdn.net/liyongqiang2420/article/details/112790144
