[Translation] Review: How to do Speech Recognition with Deep Learning

Category: Other | Posted: 2019-03-22 10:14:41

Original article: https://medium.com/@ageitgey/machine-learning-is-fun-part-6-how-to-do-speech-recognition-with-deep-learning-28293c162f7a

How to do Speech Recognition with Deep Learning

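Andrew Ng has said that the gap between annoyingly unreliable speech recognition and incredibly useful speech recognition is only about 4 percentage points of accuracy (roughly the step from 95% to 99%), and that deep learning is what makes closing that gap possible.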

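Machine learning is not always a black box: we feed speech recordings into a neural network and get plain-text output. The process is shown in the figure below: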

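The problem is that everyone speaks differently. Given the same word, 'Hello', some people say it very quickly and others very slowly. Building a reliable recognition model therefore takes a few tricks.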

1. Turning Sounds into Bits

We can record a sound wave and then represent it digitally as an array of numbers.

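The above is the end result. Sound, however, is originally captured as a wave. For example, the figure below is a sound clip of someone saying 'Hello'.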

The 'Hello' clip is fairly complex, so let's look at a simpler sound clip first:

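A sound wave is one-dimensional: at any instant it has a single value, the height of the wave. Once we add time as a second axis, we can draw it as a two-dimensional picture: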

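This is called "sampling". By taking a reading of the wave's height thousands of times per second, we can record it accurately as a sequence of numbers. The figure below shows the first 100 samples of "Hello".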
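As a rough illustration of what sampling means in practice, here is a minimal sketch that takes readings of a pure sine tone at a fixed rate and stores each reading as a 16-bit integer. The 440 Hz tone, the 16,000 samples-per-second rate, the 16-bit range, and the function name `sample_tone` are all assumptions chosen for the example; none of them come from the original article.

```python
import math

def sample_tone(freq_hz, sample_rate_hz, n_samples, amplitude=32767):
    """Take n_samples readings of a pure sine tone as 16-bit integers.

    Hypothetical helper for illustration only.
    """
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz  # time of the n-th reading, in seconds
        height = amplitude * math.sin(2 * math.pi * freq_hz * t)
        samples.append(int(round(height)))  # store the wave height as an integer
    return samples

# Assumed values: a 440 Hz tone read 16,000 times per second.
first_100 = sample_tone(freq_hz=440.0, sample_rate_hz=16000, n_samples=100)
print(first_100[:10])
```

Each number in `first_100` is the height of the wave at one instant, which is the same kind of data as the first 100 samples of "Hello" described above.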

This raises another question: is the sampled data really equivalent to the original sound?

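In theory, yes. By the Nyquist sampling theorem, as long as we sample at least twice as fast as the highest frequency we want to capture, the original wave can be reconstructed exactly from the samples. Many people assume that a higher sampling rate, with more densely packed data points, always gives better quality. That is a misconception: once the rate is high enough for the frequencies of interest, extra samples add no new information.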
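The flip side of this rule can be checked numerically. The sketch below is an assumption-laden toy, not part of the original article: it samples a 3,000 Hz cosine and a 1,000 Hz cosine at 4,000 samples per second (less than twice 3,000 Hz) and at 8,000 samples per second (more than twice 3,000 Hz). The frequencies, rates, and helper names are made up for the illustration.

```python
import math

def sample_cosine(freq_hz, sample_rate_hz, n_samples):
    """Readings of a unit-amplitude cosine tone (hypothetical helper)."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def same_samples(a, b, tol=1e-9):
    """True if two sample sequences agree to within floating-point noise."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))

TOO_SLOW_HZ = 4000     # below 2 * 3000 Hz: the rule is violated
FAST_ENOUGH_HZ = 8000  # above 2 * 3000 Hz: the rule is satisfied

# Sampled too slowly, the 3 kHz tone yields exactly the same numbers as a 1 kHz tone.
aliased = same_samples(sample_cosine(3000, TOO_SLOW_HZ, 32),
                       sample_cosine(1000, TOO_SLOW_HZ, 32))

# Sampled fast enough, the two tones produce different numbers.
distinct = not same_samples(sample_cosine(3000, FAST_ENOUGH_HZ, 32),
                            sample_cosine(1000, FAST_ENOUGH_HZ, 32))

print(aliased, distinct)
```

When the rate is too low, the samples cannot tell the two tones apart, so the original sound is not recoverable. Once the rate exceeds twice the highest frequency present, that ambiguity disappears.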

2. Preprocessing the Sound Data

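Once we have the data, we need to preprocess it, and this step raises many problems. For example, sound clips are not all clean, standard samples. Real-world conditions are complex and variable: the speaker may be talking in a noisy environment, with heavy slurring between words and a strong accent, all of which makes speech recognition harder.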

(To be continued)

Reposted from www.cnblogs.com/bbcfive/p/10576422.html
