Investigating Deep Neural Structures and their Interpretability in the Domain of Voice Conversion

Posted: 2021-12-14 18:16:18

Author: Samuel J. Broughton
Affiliation: National University of Singapore; University of Sheffield, UK
Venue: Interspeech 2021

abstract

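The paper takes a detailed look at how a GAN-based voice conversion model works:
(1) For a given GAN architecture, the features learned by the repeating layers are very similar to those produced by their randomly initialized parameters.
(2) When a network trained on one dataset is transferred to another dataset, the parameters of the repeating layers also remain highly consistent.
Conclusion: for a good representation, the number of repeating layers matters more.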

Analyzing representations with SVCCA

Singular Vector Canonical Correlation Analysis (SVCCA) compares how similar two high-dimensional representations are. It is a model-insight method proposed by Google.
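As a rough illustration of the idea, here is a minimal NumPy sketch of an SVCCA-style similarity score. It is not the paper's implementation: the function names, the `var_kept=0.99` threshold, and the toy data are assumptions made for this example. Each activation matrix (samples x neurons) is centered and reduced with an SVD, then the canonical correlations between the two reduced subspaces are averaged.

```python
import numpy as np

def svd_reduce(acts, var_kept=0.99):
    """Center (samples x neurons) activations and keep the top singular
    directions that explain `var_kept` of the variance (assumed threshold)."""
    centered = acts - acts.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    energy = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(energy, var_kept)) + 1, len(s))
    return u[:, :k] * s[:k]

def svcca_similarity(acts_a, acts_b, var_kept=0.99):
    """Mean canonical correlation between the SVD-reduced activations."""
    a = svd_reduce(acts_a, var_kept)
    b = svd_reduce(acts_b, var_kept)
    # Orthonormal bases of the two subspaces; the singular values of
    # Qa^T Qb are the canonical correlations.
    qa, _ = np.linalg.qr(a)
    qb, _ = np.linalg.qr(b)
    rho = np.linalg.svd(qa.T @ qb, compute_uv=False)
    return float(np.clip(rho, 0.0, 1.0).mean())

# Toy check on synthetic data (not data from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 10))
rotation, _ = np.linalg.qr(rng.normal(size=(10, 10)))
sim_rotated = svcca_similarity(x, x @ rotation)                   # same subspace
sim_unrelated = svcca_similarity(x, rng.normal(size=(200, 10)))   # independent
print(sim_rotated, sim_unrelated)
```

A rotated copy of the same activations should score close to 1, while unrelated activations score much lower. This invariance to linear transformations of the neurons is what makes the measure usable for comparing layers across training runs.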

experiments

Research questions

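How similar are the latent representations of the trained model to those of the randomly initialized model?

For each layer, compute the CCA correlation between its output at initialization and its output at different training steps.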

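Conclusion: the features learned by the repeating layers (R1 to R9) are very similar; D1 stays very close to its initial state, and the result is the same when the GLU is removed.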
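The comparison against the initial state could be organized along the following lines. This is only a sketch under stated assumptions: the `snapshots` dictionary layout, the helper names, the layer name `other`, and the synthetic activations are all hypothetical, and plain mean CCA is used here without the SVD step.

```python
import numpy as np

def mean_cca(x, y):
    """Mean canonical correlation between two (samples x features) matrices.
    Assumes more samples than features and full column rank."""
    qx, _ = np.linalg.qr(x - x.mean(axis=0))
    qy, _ = np.linalg.qr(y - y.mean(axis=0))
    rho = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return float(np.clip(rho, 0.0, 1.0).mean())

def similarity_to_init(snapshots):
    """snapshots: {step: {layer_name: activations}}, with step 0 holding the
    activations of the randomly initialized model (hypothetical layout).
    Returns {layer_name: {step: mean CCA against the step-0 activations}}."""
    init = snapshots[0]
    return {
        layer: {
            step: mean_cca(init[layer], acts[layer])
            for step, acts in sorted(snapshots.items())
            if step != 0
        }
        for layer in init
    }

# Synthetic stand-in for recorded activations (not data from the paper):
# "R1" barely moves during training, "other" is replaced by unrelated features.
rng = np.random.default_rng(0)
r1_init = rng.normal(size=(256, 8))
snapshots = {
    0: {"R1": r1_init, "other": rng.normal(size=(256, 8))},
    1000: {
        "R1": r1_init + 0.01 * rng.normal(size=(256, 8)),
        "other": rng.normal(size=(256, 8)),
    },
}
table = similarity_to_init(snapshots)
print(table)
```

A layer whose score stays near 1 across steps has changed little from its initialization, which is the pattern the notes report for the repeating layers and D1.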

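How similar are the latent representations before and after model adaptation?

Starting from the pre-trained model, transfer learning is carried out on a different dataset. The results are similar to those for the first question.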

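How similar are the features learned by networks in which different repeating layers are frozen?

Experimental settings: (1) freeze R2/3/4; (2) freeze R5/6; (3) freeze R7/8
Conclusion: across these settings, the intermediate-layer outputs are similar and so are the final results.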

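How does the quality of the output features differ between networks with different numbers of repeating layers (1D CNN)?

Comparison: networks with 3/6/9/12/15 repeating layers
Conclusions: (1) the more layers, the more pronounced the timbre conversion;
(2) a more pronounced timbre conversion is sometimes accompanied by lower intelligibility;
(3) as the number of layers grows, noise also becomes more noticeable.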

Reposted from blog.csdn.net/qq_40168949/article/details/120176136
