Voiceprint recognition
Voiceprint recognition (speaker recognition) is concerned with "who is speaking" and addresses biometric identification; speech recognition is concerned with "what is being said" and addresses recognition of the spoken content.
1. Theory
1.1 Basics of Voiceprint Recognition
- Audio features (2): time-domain waveform, frequency spectrum, and spectrogram (time-frequency plot)
- The principles behind "sound" (1): sound waves, sound capture, sound storage
- The principles behind "sound" (2): sampling, quantization, and coding
- Overview of voiceprint recognition (2): principles and process of voiceprint recognition
- Overview of voiceprint recognition (3): voiceprint recognition systems
- Summary of voiceprint recognition technology (1): voiceprint modeling techniques
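
As a small illustration of the sampling, quantization, and coding steps named in the list above, here is a minimal sketch in plain Python. The function names (`sample_sine`, `quantize`, `encode_pcm`) and the 440 Hz / 16 kHz / 16-bit values are assumptions chosen for illustration; they do not come from the linked articles.

```python
import math
import struct


def sample_sine(freq_hz, sample_rate, duration_s):
    """Sampling: evaluate a continuous sine wave at discrete time steps."""
    n = round(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def quantize(samples, bits=16):
    """Quantization: map floats in [-1, 1] to signed integers of the given width."""
    max_int = 2 ** (bits - 1) - 1
    return [max(-max_int - 1, min(max_int, round(s * max_int))) for s in samples]


def encode_pcm(ints):
    """Coding: pack 16-bit integers as little-endian PCM bytes."""
    return struct.pack("<%dh" % len(ints), *ints)


samples = sample_sine(440, 16000, 0.01)   # 10 ms of a 440 Hz tone at 16 kHz
pcm_ints = quantize(samples)              # 16-bit integer amplitudes
pcm_bytes = encode_pcm(pcm_ints)          # 2 bytes per sample
```

Each sample takes two bytes here, so 10 ms of audio at 16 kHz yields 160 samples and 320 bytes of raw PCM.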
1.2 Voiceprint recognition algorithm
- Voiceprint recognition: the principle of x-vector feature extraction
- Kaldi speaker recognition: x-vector-based PLDA adaptation
1.3 Introduction to Voiceprint Recognition Data
Introduction to common data sets for voiceprint recognition
2. Resources
2.1 Data (Chinese/English)
(1) Chinese data set
SLR33
SLR85
SLR82
AISHELL-2
SLR18
(2) English data set
VoxCeleb2 (available for download; see the note on the decompression method)
Accompanying paper: VoxCeleb2: Deep Speaker Recognition
2.2 Tools
Kaldi
- Kaldi voiceprint recognition: overview
Python + Kaldi
- PyTorch-Kaldi speech recognition toolkit
TensorFlow
- TensorFlow-based Deep Speaker
PyTorch
- PyTorch-based Deep Speaker
Keras
2.3 Resource summary
- A detailed explanation of the principles, evolution, and application choices of audio codecs: a very comprehensive series on audio topics
- Speech Recognition (8): Voiceprint Recognition, Geography
- A beginner's exploration of voiceprint recognition (speaker recognition)
- Speech recognition resource roundup (2019-05-10)
- iamxiaoyubei/Voice-Tech-Study
- Kaldi / speech recognition (ASR) / voiceprint recognition (SRE) resource summary: collects many comprehensive speech recognition and voiceprint recognition resources
Voiceprint recognition application
- Voiceprint recognition in practice: Dr. Li's Zhihu column, well written
- Leon Jin's voiceprint/ASR/diarization/Kaldi column: his answers on Zhihu are worth reading and may offer new insights
Voiceprint recognition learning path
- A curated learning path for speaker recognition / voiceprint recognition, starting from zero. If you have time and want to study systematically, follow this path: GMM-UBM -> JFA -> i-vector/PLDA -> DNN embeddings -> E2E
2.4 Voiceprint recognition experts
- Wang Yun
- Wang Quan: a very strong resource, with introductions to many theories and tools, the projects maintained by Wang Quan, and his book "Voiceprint Technology: From Core Algorithms to Engineering Practice"
3. Practice: from theory to code
Data
- Speech corpus analysis and audio quality evaluation methods
Metrics
- Basic metrics: FAR (false acceptance rate), FRR (false rejection rate)
- Face recognition model evaluation metrics: a complete review
- How to obtain the best threshold for voiceprint recognition (by computing the EER)?
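
To make the metrics above concrete, here is a minimal sketch of computing FAR and FRR at a given threshold and of finding the threshold where the two rates meet (the equal error rate). The score lists are made-up illustration data and the function names are hypothetical; a real evaluation would use trial scores from an actual system and usually a finer threshold sweep or interpolation.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when a trial is accepted if its score >= threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)  # impostors accepted
    frr = sum(s < threshold for s in genuine) / len(genuine)     # genuine rejected
    return far, frr


def equal_error_rate(genuine, impostor):
    """Try every observed score as a threshold and return (eer, threshold)
    at the point where |FAR - FRR| is smallest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up similarity scores: same-speaker trials and different-speaker trials.
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor_scores = [0.7, 0.5, 0.3, 0.2, 0.1]

eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

Raising the threshold lowers FAR and raises FRR, so the EER threshold is only one possible operating point; an application that must rarely accept impostors would pick a higher threshold.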
Testing
- Kaldi project test (1): extract features and compute similarity scores
Code
- Papers with Code: Speaker Verification (papers with accompanying code)
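
The "compute similarity scores" step can be sketched with cosine similarity between two speaker embeddings. This is a simplified stand-in for illustration only: Kaldi's x-vector recipes typically score with PLDA rather than plain cosine similarity, and the embedding values and the `cosine_similarity` helper below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 4-dimensional embeddings; real x-vectors have hundreds of dimensions.
enrolled = [0.2, 0.1, 0.9, 0.3]
test_same = [0.25, 0.05, 0.85, 0.35]
test_other = [0.9, 0.4, 0.1, 0.0]

score_same = cosine_similarity(enrolled, test_same)
score_other = cosine_similarity(enrolled, test_other)
```

A verification decision then compares the score against a threshold chosen on held-out trials.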
Voiceprint recognition project
- Python + TensorFlow: overview of voiceprint recognition plus a simple model implemented in TensorFlow (https://github.com/RDShi/voiceprint)
- Python + Keras: Chinese and English voiceprint recognition based on Keras
- Accompanying GitHub repository: https://github.com/jcfszxc/Project
reference: