[Speech coding] ADPCM encoding and decoding in MATLAB [including Matlab source code, issue 553]

1. Introduction

1 ADPCM encoding principle
Encoding steps:

(1) Calculate the difference diff between the input PCM sample and the predicted PCM sample (on the first pass, the predicted value is the previous PCM sample).
(2) Calculate delta with the differential quantizer: look up step from index (index is 0 for the first encoding), then derive delta from diff and step. delta is the encoded data.
(3) Calculate vpdiff with the inverse quantizer, from the delta just computed and step.
(4) Calculate the new prediction: valpred = previous valpred + vpdiff.
(5) Pass valpred through the predictor (normalized) to obtain the predicted PCM value for the current input sample. It is used in the next iteration.
(6) Adjust the quantization step: use delta to look up the index adjustment and compute the new index. It is used in the next iteration.
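The six encoding steps can be sketched in C for a single sample. This is a minimal sketch rather than the article's complete code: it assumes the standard IMA/DVI ADPCM step and index tables and 4-bit codes (one sign bit plus a 3-bit delta), it assumes that "normalized" means limiting valpred to the 16-bit range, and the names AdpcmState and adpcm_encode_sample are hypothetical. The inverse quantizer transcribes vpdiff = (delta+0.5)*step/4 directly into integer arithmetic; reference implementations often use a shift-and-add form whose rounding can differ slightly.

```c
/* Standard IMA/DVI ADPCM tables (assumed; the article does not list them). */
static const int index_table[8] = { -1, -1, -1, -1, 2, 4, 6, 8 };
static const int step_table[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

/* Hypothetical state carried from one sample to the next. */
typedef struct {
    int valpred; /* predicted PCM value from the previous iteration */
    int index;   /* position in step_table, 0 for the first encoding */
} AdpcmState;

/* Encodes one 16-bit PCM sample into a 4-bit code and updates the state. */
unsigned char adpcm_encode_sample(AdpcmState *s, short sample)
{
    int step = step_table[s->index];

    /* Step 1: difference between the input and the prediction. */
    int diff = sample - s->valpred;
    int sign = diff < 0 ? 8 : 0;
    if (sign) diff = -diff;

    /* Step 2: differential quantizer, delta = diff*4/step, limited to 3 bits. */
    int delta = diff * 4 / step;
    if (delta > 7) delta = 7;

    /* Step 3: inverse quantizer, vpdiff = (delta + 0.5)*step/4 in integers. */
    int vpdiff = (2 * delta + 1) * step / 8;

    /* Steps 4 and 5: update the prediction and keep it in the 16-bit range
       (assumed meaning of "normalized"). */
    s->valpred += sign ? -vpdiff : vpdiff;
    if (s->valpred > 32767) s->valpred = 32767;
    if (s->valpred < -32768) s->valpred = -32768;

    /* Step 6: adjust the step index for the next sample. */
    s->index += index_table[delta];
    if (s->index < 0) s->index = 0;
    if (s->index > 88) s->index = 88;

    return (unsigned char)(sign | delta);
}
```

Starting from valpred = 0 and index = 0, encoding the sample 100 returns the code 7 and leaves valpred = 13 and index = 8.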
2 ADPCM decoding principle
Decoding steps (decoding is in fact steps 3 to 6 of encoding):

(1) Calculate vpdiff with the inverse quantizer: look up step from the stored index, then calculate vpdiff from the stored delta and step.
(2) Calculate the new prediction: valpred = previous valpred + vpdiff.
(3) Pass valpred through the predictor (normalized) to obtain the predicted PCM value for the current sample, which is used in the next iteration. This predicted PCM value is the decoded data.
(4) Adjust the quantization step: use delta to look up the index adjustment and compute the new index, which is used in the next iteration.
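The four decoding steps can be sketched in C in the same way. As before, this is only an illustration under assumptions: the standard IMA/DVI ADPCM tables, 4-bit codes with the sign in bit 3, clamping to the 16-bit range as the meaning of "normalized", and the hypothetical names AdpcmState and adpcm_decode_sample. The inverse quantizer uses the integer form of vpdiff = (delta+0.5)*step/4, and a matching encoder must use the same form.

```c
/* Standard IMA/DVI ADPCM tables (assumed; the article does not list them). */
static const int index_table[8] = { -1, -1, -1, -1, 2, 4, 6, 8 };
static const int step_table[89] = {
    7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
};

/* Hypothetical state carried from one sample to the next. */
typedef struct {
    int valpred; /* predicted PCM value from the previous iteration */
    int index;   /* position in step_table */
} AdpcmState;

/* Decodes one 4-bit code into a 16-bit PCM sample and updates the state. */
short adpcm_decode_sample(AdpcmState *s, unsigned char code)
{
    int step = step_table[s->index];
    int delta = code & 7;

    /* Step 1: inverse quantizer, vpdiff = (delta + 0.5)*step/4 in integers. */
    int vpdiff = (2 * delta + 1) * step / 8;

    /* Steps 2 and 3: update the prediction and keep it in the 16-bit range. */
    if (code & 8)
        s->valpred -= vpdiff;
    else
        s->valpred += vpdiff;
    if (s->valpred > 32767) s->valpred = 32767;
    if (s->valpred < -32768) s->valpred = -32768;

    /* Step 4: adjust the step index for the next code. */
    s->index += index_table[delta];
    if (s->index < 0) s->index = 0;
    if (s->index > 88) s->index = 88;

    /* The prediction itself is the decoded sample. */
    return (short)s->valpred;
}
```

Starting from valpred = 0 and index = 0, decoding the codes 7 and then 15 yields the samples 13 and -17.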
Note
The encoding and decoding principles show that the encoder already performs the decoding during its first pass: the predicted PCM value it computes is exactly what the decoder outputs.
The output of an encode/decode round trip is already quantized. From the formulas delta = diff*4/step and vpdiff = (delta+0.5)*step/4, and the fact that all operations are integer operations, it can be derived that re-encoding the predicted PCM data (the result of encoding and then decoding the original PCM data) gives the same data as the first encoding. The loss therefore happens once, in the first encoding. After that, no matter how many times the data is decoded and re-encoded, the result is the same and the sound quality does not degrade further. In other words, after the first encoding, further encode/decode cycles are lossless.
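The two integer formulas in the note can be checked numerically. The sketch below is an illustration under stated assumptions: quantize, dequantize and requantize_is_stable are hypothetical helper names, delta is limited to 3 bits, diff is non-negative, and step is held fixed. It therefore checks only the per-sample quantizer, not a whole stream with its changing index.

```c
/* delta = diff*4/step for a non-negative diff, limited to 3 bits. */
int quantize(int diff, int step)
{
    int delta = diff * 4 / step;
    return delta > 7 ? 7 : delta;
}

/* vpdiff = (delta + 0.5)*step/4, written with integers only. */
int dequantize(int delta, int step)
{
    return (2 * delta + 1) * step / 8;
}

/* Returns 1 if, for this step, quantizing each reconstructed vpdiff again
   gives back the delta it came from, and 0 otherwise. */
int requantize_is_stable(int step)
{
    for (int delta = 0; delta <= 7; delta++) {
        if (quantize(dequantize(delta, step), step) != delta)
            return 0;
    }
    return 1;
}
```

For example, with diff = 20 and step = 16 the quantizer gives delta = 5, the inverse quantizer gives vpdiff = 22, and quantizing 22 with the same step gives delta = 5 again.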

3 ADPCM data storage format
This part describes how ADPCM data is stored, which is a matter of detail. Many implementations produce noise when decoding because they get these details wrong, so read it carefully.

3.1 Introduction to the ADPCM data block
ADPCM data is stored block by block. Each block consists of a block header and data. The block header is a structure; for mono it is defined as follows:

typedef struct
{
    short sample0;    // first sample value in the block (uncompressed)
    BYTE  index;      // last index of the previous block; index = 0 for the first block
    BYTE  reserved;   // not used yet
} MonoBlockHeader;

For stereo, the block header contains two MonoBlockHeaders and is defined as follows:

typedef struct
{
    MonoBlockHeader leftbher;
    MonoBlockHeader rightbher;
} StereoBlockHeader;

When decompressing, the left and right channels are processed separately, so there must be two MonoBlockHeaders.
With the block header information, the compressed data in a block can be extracted without knowing any data from earlier blocks. ADPCM decoding therefore depends only on the current block, and any block can be decoded on its own.
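As a sketch of how a decoder can start from any block, the C below reads the header bytes into the structures defined earlier. The assumptions are: fields are stored in the declared order and in little-endian byte order, as in WAV files; fixed-width types stand in for short and BYTE; and read_mono_header and read_stereo_header are hypothetical helper names.

```c
#include <stdint.h>

typedef struct {
    int16_t sample0;  /* first sample value in the block (uncompressed) */
    uint8_t index;    /* step index to start decoding this block with */
    uint8_t reserved; /* not used */
} MonoBlockHeader;

typedef struct {
    MonoBlockHeader leftbher;
    MonoBlockHeader rightbher;
} StereoBlockHeader;

/* Reads one 4-byte header from buf (little-endian assumed). */
MonoBlockHeader read_mono_header(const uint8_t *buf)
{
    MonoBlockHeader h;
    int v = buf[0] | (buf[1] << 8);   /* 0..65535 */
    if (v >= 32768) v -= 65536;       /* reinterpret as signed 16-bit */
    h.sample0 = (int16_t)v;
    h.index = buf[2];
    h.reserved = buf[3];
    return h;
}

/* Reads the 8-byte stereo header: left channel first, then right. */
StereoBlockHeader read_stereo_header(const uint8_t *buf)
{
    StereoBlockHeader h;
    h.leftbher = read_mono_header(buf);
    h.rightbher = read_mono_header(buf + 4);
    return h;
}
```

A decoder would then take sample0 as the first output sample and as the initial valpred, and index as the initial step index, separately for each channel.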
The block size is fixed but can be chosen freely. The number of samples nsamples contained in each block is calculated as follows:

#define BLKSIZE 1024
block = BLKSIZE * channels;     // Audition
//block = BLKSIZE;              // ffmpeg
nsamples = (block - 4 * channels) * 8 / (4 * channels) + 1;

For example, Audition uses the formula above: a mono block is 1024 bytes and holds 2041 samples, and a stereo block is 2048 bytes and also holds 2041 samples.
ffmpeg uses block = 1024 bytes for both mono and stereo, so the formula gives 2041 samples for mono and 1017 samples for stereo.
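The block-size formula can be wrapped in a small C function to reproduce the figures quoted for Audition and ffmpeg. The function name samples_per_block is hypothetical; the arithmetic is the formula from the text: each channel spends 4 bytes on its header, the remaining bytes hold two 4-bit codes each, and the uncompressed sample0 in the header accounts for the extra sample.

```c
/* Number of samples (per channel) in one ADPCM block.
   block_bytes is the total block size in bytes for all channels. */
int samples_per_block(int block_bytes, int channels)
{
    return (block_bytes - 4 * channels) * 8 / (4 * channels) + 1;
}
```

With these definitions, samples_per_block(1024, 1) and samples_per_block(2048, 2) both give 2041, while samples_per_block(1024, 2) gives 1017.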

3.2 Single-channel pcm format:
3.3 Dual-channel pcm format:
3.4 Implementation of encoding and decoding codes
Pay special attention to how the two channels are handled and to what happens when the remaining data does not fill a whole block.
The code includes encoding and decoding test cases that encode and then decode. Discussion and feedback are welcome.
Complete code download address (this article only covers the details of ADPCM encoding and decoding; for wav files to be encoded and decoded correctly, you need to download the complete code. The complete code implements encoding and decoding for 0x0011 /* Intel's DVI ADPCM */, including the handling of mono and stereo and of final data that does not fill a whole block):

2. Source code

clear all;
clc
close all;
[x,fs]= audioread('C6_3_y.wav');                % wavread has been removed from newer MATLAB releases

sign_bit=2;                                     % 2-bit ADPCM algorithm
ss=adpcm_encoder(x,sign_bit);
yy=adpcm_decoder(ss,sign_bit)';

nq=sum((x-yy).*(x-yy))/length(x);               % noise power
sq=mean(yy.^2);                                 % power of the decoded signal
snr=(sq/nq);                                    % signal-to-noise ratio (linear, not in dB)
t=(1:length(x))/fs;
subplot(211)
plot(t,x/max(abs(x)))
axis tight
title('(a) Speech before encoding')
xlabel('Time/s')
ylabel('Amplitude')
subplot(212)
plot(t,yy/max(abs(yy)))
axis tight
% ADPCM decoder function
function y=adpcm_decoder(code,sign_bit)
len=length(code);
y = zeros(1,len);
ss2 = zeros(1,len);
ss2(1) = 1;

currentIndex =1;
index = [-1 4];
startval = 1;
endval = 127;
base = exp( log(2)/8 );
% approximate step sizes
const = startval/base;
numSteps = round( log(endval/const) / log(base) );
n = 1:numSteps;
base = exp( log(endval/startval) / (numSteps-1) );
const = startval/base;
table2 = round( const*base.^n );
for n = 2:len
% compute the quantized difference
    neg = code(n) >= sign_bit;
    if (neg)
        temp = code(n) - sign_bit;
    else
        temp = code(n);
    end
    temp2 = (temp+.5)*ss2(n-1);
    if (neg)
        temp2 = -temp2;
    end

3. Running results

[Figure: normalized speech waveforms before encoding (top panel) and after decoding (bottom panel)]

4. Remarks

For the complete code or custom code writing, add QQ 1564658423.

Origin blog.csdn.net/TIQCmatlab/article/details/114977184