MFCC code with first- and second-order differences (MATLAB code)

clc;
close all;
clear all;
[x , fs] = audioread('C:\Users\Administrator\Desktop\waves_yesno\0_0_0_1_0_0_0_1.wav');
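% Mel filter bank (melbankm is from the VOICEBOX toolbox):
% 24 filters, 256-point FFT, covering 0 to 0.5 * fs
bank = melbankm(24, 256 , fs , 0 , 0.5 , 'm');
bank_1 = full(bank);                 % convert the sparse matrix to a full one
bank_2 = bank_1 / max(bank_1(:));    % normalize the peak gain to 1
% DCT matrix: 12 cepstral coefficients from 24 log filter-bank outputs
for k = 1 : 12
    n = 0 : 23;
    dctcoef(k , :) = cos((2 * n + 1) * k * pi / (2 * 24));
end
% Cepstral lifter, normalized to a maximum of 1
w = 1 + 6 * sin(pi * (1 : 12) ./ 12);
w2one = w / max(w);
x2double = double(x);
% Pre-emphasis filter: y(n) = x(n) - 0.98 * x(n-1)
x2emph = filter([1 -0.98] , 1 , x2double);
% Split into frames of 256 samples with a hop of 80 samples (enframe is from VOICEBOX)
x2enfra = enframe(x2emph , 256 , 80);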
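m = zeros(size(x2enfra , 1) , 12);    % one row of 12 coefficients per frame
for i = 1 : size(x2enfra , 1)
    y = x2enfra(i , :);               % current frame as a row vector
    s = y' .* hamming(256);           % hamming(256) returns a 256-by-1 column, so transpose y first
    t = abs(fft(s));                  % magnitude spectrum
    t2pow = t .^ 2;                   % power spectrum
    c1 = dctcoef * log(bank_2 * t2pow(1 : 129));   % DCT of the log mel-band energies
    c2 = c1 .* w2one';                % apply the normalized lifter
    m(i , :) = c2';                   % store the coefficients in row i
end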
% First-order difference over a 5-frame window; the first and last two rows stay zero
dtm2zeros = zeros(size(m));
for i = 3 : size(m , 1) - 2
    dtm2zeros(i , :) = -2 * m(i - 2 , :) - m(i - 1 , :) + m(i + 1 , :) + 2 * m(i + 2 , :);
end
dtm2first = dtm2zeros / 3;


% Second-order difference: the same window applied to the first-order result
dtmm2zeros = zeros(size(dtm2first));
for i = 3 : size(dtm2first , 1) - 2
    dtmm2zeros(i , :) = -2 * dtm2first(i - 2 , :) - dtm2first(i - 1 , :) + dtm2first(i + 1 , :) + 2 * dtm2first(i + 2 , :);
end
dtmm2first = dtmm2zeros / 3;


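% Concatenate static, first-order and second-order features: 36 columns per frame
ccc = [m dtm2first dtmm2first];
% Discard the first and last two frames, whose differences were left at zero
ccc2first = ccc(3 : size(m , 1) - 2, :);
% subplot(211)
ccc_1 = ccc(: , 1);    % first coefficient across all frames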
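
The difference loops in the script can also be written without an explicit loop. The following is only a minimal sketch on a made-up 6-by-2 matrix (the values are invented for illustration, not real MFCCs); it keeps the script's 5-frame window and its scaling by 3.

```matlab
% Toy feature matrix: 6 frames x 2 coefficients (made-up values)
m = [1 10; 2 20; 4 40; 7 70; 11 110; 16 160];
N = size(m, 1);

% Rows that have two neighbours on each side
idx = 3 : N - 2;

% Same weights as the loop in the script: [-2 -1 0 1 2], scaled by 1/3
d = zeros(size(m));
d(idx, :) = (-2 * m(idx - 2, :) - m(idx - 1, :) ...
             + m(idx + 1, :) + 2 * m(idx + 2, :)) / 3;

disp(d);   % rows 1, 2, 5 and 6 remain zero
```

Indexing with the vector idx processes all valid frames at once, which usually runs faster than a per-frame loop on long recordings.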

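We all know that MFCCs represent the characteristics of speech well, but they capture only static features. Dynamic features are usually extracted with first- and second-order differences. What do these differences actually represent?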

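The first-order difference is the difference between two consecutive terms of a discrete function. Given X(k), Y(k) = X(k+1) - X(k) is its first-order difference. Physically, it describes how the current speech frame relates to its neighbouring frame, that is, the link between two adjacent frames.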
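
As a small illustration of this definition, MATLAB's built-in diff computes X(k+1) - X(k) directly. The vector X below is a made-up sequence standing in for one coefficient tracked over five frames.

```matlab
% Hypothetical values of one coefficient over five consecutive frames
X = [1 4 9 16 25];

% First-order difference: Y(k) = X(k+1) - X(k)
Y = diff(X);

disp(Y);   % prints 3 5 7 9
```

The result has one element fewer than X, because the last frame has no successor to subtract from.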

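Building on the first-order difference, Z(k) = Y(k+1) - Y(k) = X(k+2) - 2*X(k+1) + X(k) is the second-order difference of the function. It describes the relationship between two consecutive first-order differences; in terms of frames, it captures the dynamic relationship among three adjacent frames.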
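
The identity can be checked numerically. This sketch uses a made-up sequence X and compares MATLAB's diff(X, 2) with the expanded three-term expression.

```matlab
% Hypothetical values of one coefficient over five consecutive frames
X = [1 4 9 16 25];

% Second-order difference as the difference of first-order differences
Z = diff(X, 2);

% Expanded form: Z(k) = X(k+2) - 2*X(k+1) + X(k)
Zmanual = X(3:end) - 2 * X(2:end-1) + X(1:end-2);

disp(Z);                      % prints 2 2 2
disp(isequal(Z, Zmanual));    % prints 1
```

Each output value depends on three consecutive inputs, which matches the statement that the second-order difference spans three adjacent frames.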

References: https://blog.csdn.net/w_manhong/article/details/79497688

https://blog.csdn.net/u011567017/article/details/44622363

Reposted from blog.csdn.net/xwei1226/article/details/80112577