A Study of the WebRTC VAD Algorithm

Summary:

    The previous article analyzed many shortcomings of the unimrcp VAD algorithm. Is there a better algorithm to replace it? There are two candidates: 1. GMM (Gaussian mixture model) 2. DNN (deep neural network).

    The well-known WebRTC VAD uses a GMM to perform voice activity detection. This article focuses on the WebRTC VAD algorithm; a later article will analyze how DNNs are applied to VAD. The following sections describe how WebRTC's detection works.

 

Principle:

    First, let's look at the frequency ranges of the human voice and of musical instruments. The figure below shows the audio spectrum.

  

                                     (Figure taken from the Internet.)

    The audio spectrum is divided into six sub-bands: 80 Hz ~ 250 Hz, 250 Hz ~ 500 Hz, 500 Hz ~ 1 kHz, 1 kHz ~ 2 kHz, 2 kHz ~ 3 kHz, and 3 kHz ~ 4 kHz. A feature is calculated for each sub-band.
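
    To make the band layout concrete, here is a minimal sketch in C that stores the six sub-band edges listed above and looks up which band a given frequency falls into. The names SubBand, kSubBands and sub_band_index are hypothetical and are not taken from the WebRTC source. Treating the lower edge as inclusive and the upper edge as exclusive is also an assumption made only for this illustration.

```c
#include <stddef.h>

/* One sub-band, described by its edges in Hz.
 * Hypothetical type, not part of the WebRTC source. */
typedef struct {
    int low_hz;   /* lower edge, treated as inclusive (assumption) */
    int high_hz;  /* upper edge, treated as exclusive (assumption) */
} SubBand;

/* The six sub-bands listed in the text. */
static const SubBand kSubBands[] = {
    {80, 250},    {250, 500},   {500, 1000},
    {1000, 2000}, {2000, 3000}, {3000, 4000},
};

#define NUM_SUB_BANDS (sizeof(kSubBands) / sizeof(kSubBands[0]))

/* Returns the index (0..5) of the sub-band containing freq_hz,
 * or -1 if the frequency lies outside 80 Hz ~ 4 kHz. */
int sub_band_index(int freq_hz) {
    size_t i;
    for (i = 0; i < NUM_SUB_BANDS; i++) {
        if (freq_hz >= kSubBands[i].low_hz && freq_hz < kSubBands[i].high_hz) {
            return (int) i;
        }
    }
    return -1;
}
```

    For example, sub_band_index(1500) falls in the 1 kHz ~ 2 kHz band (index 3), while 50 Hz and 4000 Hz lie outside all six bands.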

 

Steps:

    Step 1: downsampling

      WebRTC supports 8 kHz, 16 kHz, 32 kHz, and 48 kHz audio, but it first downsamples 16 kHz, 32 kHz, and 48 kHz input to 8 kHz and only then processes it.

   

        int16_t speech_nb[240];  // Holds up to 30 ms of audio at 8 kHz.
        const size_t kFrameLen10ms = (size_t) (fs / 100);  // Samples per 10 ms at the input rate.
        const size_t kFrameLen10ms8khz = 80;               // Samples per 10 ms at 8 kHz.
        size_t num_10ms_frames = frame_length / kFrameLen10ms;
        size_t i;
        // Resample one 10 ms chunk at a time, stepping through the input frame.
        for (i = 0; i < num_10ms_frames; i++) {
            resampleData(&audio_frame[i * kFrameLen10ms], fs, kFrameLen10ms,
                         &speech_nb[i * kFrameLen10ms8khz], 8000);
        }
        size_t new_frame_length = frame_length * 8000 / fs;
        // Do VAD on an 8 kHz signal
        vad = WebRtcVad_CalcVad8khz(self, speech_nb, new_frame_length);
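
    The frame-length arithmetic in this step can be checked in isolation. The sketch below, in C, reproduces only the two calculations used above: the number of 10 ms chunks in the input frame and the frame length after conversion to 8 kHz. The helper names num_10ms_chunks and frame_length_at_8khz are hypothetical and introduced here for illustration; no resampling is performed.

```c
#include <stddef.h>

/* Number of 10 ms chunks in a frame of frame_length samples
 * sampled at fs Hz (hypothetical helper, for illustration only). */
size_t num_10ms_chunks(int fs, size_t frame_length) {
    const size_t chunk_len = (size_t) (fs / 100);  /* samples per 10 ms */
    return frame_length / chunk_len;
}

/* Length in samples of the same frame once it has been
 * downsampled from fs Hz to 8 kHz (hypothetical helper). */
size_t frame_length_at_8khz(int fs, size_t frame_length) {
    return frame_length * 8000 / (size_t) fs;
}
```

    A 30 ms frame at 16 kHz has 480 samples: it is split into 3 chunks of 160 samples, and after downsampling it occupies 3 * 80 = 240 samples, which exactly fills speech_nb[240]. A 30 ms frame at 48 kHz (1440 samples) also maps to 240 samples at 8 kHz.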

Origin www.cnblogs.com/damizhou/p/11318668.html