DTW algorithm - MATLAB implementation

Overview

DTW (dynamic time warping) is a method for measuring the similarity of two time series. It has been widely used for matching spoken words in audio. Its main purpose is to judge similarity when the two sequences have different durations.
Example 1
In the figure above, the two sequences on the left have equal duration, so the Euclidean distance can be computed point by point. The two sequences on the right have unequal duration. After DTW, their points are no longer matched one to one: a point in one sequence may correspond to several points in the other.
Example 2
Another example is the left picture above, where we want the similarity between the blue sequence and the red sequence. The two sequences are visibly shifted relative to each other, so matching them point by point at the same index is clearly unreasonable. To obtain the matching shown in the left picture, the DTW method is needed.
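
To make the problem concrete, here is a minimal sketch of the one-to-one comparison for two sine waves that differ only by a phase shift. The series are the same ones used in the main script later in this post; the variable name lockstep is only illustrative.

```matlab
% Two sine waves with identical shape, shifted by a phase of pi/6
t = 1:50;
x = sin(t*2*pi/50) + 2;
y = sin(t*2*pi/50 + pi/6) + 2;

% One-to-one comparison: point i of x is compared only with point i of y
lockstep = sum((x - y).^2);

% The shapes are the same, yet the one-to-one distance is clearly nonzero
fprintf('one-to-one squared distance: %.4f\n', lockstep);
```

The distance is far from zero even though the two curves have exactly the same shape, which is the situation DTW is meant to handle.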

Algorithm principle and steps

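① Calculate the distance between every pair of feature points, taking one point from each sequence. For sequences of length n and m this gives an n*m matrix, the distance matrix. (The text uses the Euclidean distance; the implementation below uses the squared difference of the two values.)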
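② Calculate the cumulative distance to get the DP matrix. The first row and the first column are running sums of the distance matrix, and every other cell is computed as
DP(i,j) = distance(i,j) + min(DP(i-1,j), DP(i,j-1), DP(i-1,j-1))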
The calculated values are placed in the DP matrix. For a more intuitive understanding, picture the two sequences laid along the two axes of the matrix, so that cell (i,j) pairs point i of one sequence with point j of the other.
The calculation has a fixed direction: each cell DP(i,j) depends on DP(i-1,j), DP(i,j-1) and DP(i-1,j-1), so the matrix must be filled outward from DP(1,1). Many blog posts explain this very clearly, so it is not repeated here. The simplest possible case is two sequences A and B with two feature values each, for which the DP matrix is 2*2 and is solved cell by cell in that order.
③ After the entire DP matrix has been calculated, the value in its final corner, DP(n,m), is the cumulative distance between the two sequences. In the figures this is the upper right corner, but its position depends on how the matrix is drawn.

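④ Backtrack from the final corner DP(n,m) (upper right in the figures) to the starting corner DP(1,1) (lower left), moving at each step to the neighboring cell with the smallest cumulative distance. This gives the path with the shortest cumulative distance, and the path gives the correspondence between the points of the two sequences.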
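
The four steps can be traced on a tiny example. The following is a minimal sketch, not the original figure's example: the sequence values and the variable names (dtw_distance, warp_path) are chosen only for illustration, and the squared difference is assumed as the point distance, as in the implementation later in this post.

```matlab
% Two short example sequences (values chosen only for illustration)
x = [1 3 4];
y = [1 2 4 5];
n = length(x);
m = length(y);

% Step 1: distance matrix, using the squared difference as the point distance
distance = zeros(n, m);
for i = 1:n
    for j = 1:m
        distance(i,j) = (x(i) - y(j))^2;
    end
end

% Step 2: cumulative distance (DP) matrix
DP = zeros(n, m);
DP(1,1) = distance(1,1);
for i = 2:n
    DP(i,1) = distance(i,1) + DP(i-1,1);
end
for j = 2:m
    DP(1,j) = distance(1,j) + DP(1,j-1);
end
for i = 2:n
    for j = 2:m
        DP(i,j) = distance(i,j) + min([DP(i-1,j), DP(i,j-1), DP(i-1,j-1)]);
    end
end

% Step 3: the final cell holds the cumulative distance of the best alignment
dtw_distance = DP(n, m);

% Step 4: backtrack from (n,m) to (1,1), always moving to the cheapest neighbor
i = n;
j = m;
warp_path = [i j];
while i > 1 || j > 1
    if i == 1
        j = j - 1;
    elseif j == 1
        i = i - 1;
    else
        % Candidate order: diagonal, up, left (ties go to the first one)
        [~, k] = min([DP(i-1,j-1), DP(i-1,j), DP(i,j-1)]);
        if k == 1
            i = i - 1;
            j = j - 1;
        elseif k == 2
            i = i - 1;
        else
            j = j - 1;
        end
    end
    warp_path = [i j; warp_path]; % prepend so the path runs from (1,1) to (n,m)
end

disp(DP);
disp(warp_path);
```

For these inputs the DP matrix is [0 1 10 26; 4 1 2 6; 13 5 1 2], the cumulative distance is 2, and the path is (1,1), (2,2), (3,3), (3,4). The last point of x is matched to the last two points of y, which is exactly the kind of one-to-many correspondence DTW allows.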

Implementation of the algorithm

I implemented the algorithm in MATLAB simply because MATLAB makes it easy to inspect matrices, draw plots, and check that the algorithm is correct. The code does not call any ready-made MATLAB function for DTW, so the same approach can also be implemented in C/C++ and is easy to port.

First, write two functions.
One is GetMin(), which returns the minimum of three values and is used when calculating the DP matrix.

function m = GetMin(a,b,c)
% Return the smallest of the three values a, b and c.
% The output is named m so that it does not shadow the built-in min.
m = a;
if b < m
    m = b;
end
if c < m
    m = c;
end
end

The other is GetMinIndex(). It is used during backtracking, after the DP matrix has been calculated, so that the feature point matching can be displayed: given the three neighboring cells, it returns the indices (in the two sequences) of the neighbor with the smallest cumulative distance.

function [index_i,index_j] = GetMinIndex(a,b,c,i,j)
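% a is the upper-left (diagonal) neighbor, b is the neighbor directly above, c is the neighbor directly to the left
% i is the current x coordinate, j is the current y coordinate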
if(a <= b && a <= c)
    index_i = i-1;
    index_j = j-1;
elseif(b <= a && b <= c)
    index_i = i-1;
    index_j = j;
elseif(c <= b && c <= a)
    index_i = i;
    index_j = j-1;
end
end

Next is the main script.

% Generate two time series that are clearly shifted relative to each other
x = zeros(1,50);
for i = 1:50
    x(i) = sin(i*2*pi/50)+2;
end
y = zeros(1,50);
for i = 1:50
    y(i) = sin(i*2*pi/50 + pi/6)+2;
end
x_len = length(x);
y_len = length(y);
plot(1:x_len,x);hold on
plot(1:y_len,y);hold on
% Compute the distance matrix between every pair of feature points of the two series
distance = zeros(x_len,y_len);
for i = 1:x_len
    for j=1:y_len
        distance(i,j) = (x(i)-y(j)).^2;
    end
end
% Compute the cumulative distance (DP) matrix of the two series
DP = zeros(x_len,y_len);
DP(1,1) = distance(1,1);
for i=2:x_len
    DP(i,1) = distance(i,1)+DP(i-1,1);
end
for j=2:y_len
    DP(1,j) = distance(1,j)+DP(1,j-1);
end
for i=2:x_len
    for j=2:y_len
        DP(i,j) = distance(i,j) + GetMin(DP(i-1,j),DP(i,j-1),DP(i-1,j-1));
    end
end
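% Backtrack to find the matching between the feature points
i = x_len;
j = y_len;
while(~((i == 1)&&(j==1)))
    plot([i,j],[x(i),y(j)],'b');hold on % draw a line between each pair of matched points
    if(i==1)
        index_i = 1;
        index_j = j-1;
    elseif(j==1)
        index_i = i-1;
        index_j = 1;
    else
        [index_i,index_j] = GetMinIndex(DP(i-1,j-1),DP(i-1,j),DP(i,j-1),i,j);
    end
    i = index_i;
    j = index_j;
end
plot([1,1],[x(1),y(1)],'b'); % the path always ends at the pair (1,1), which the loop does not draw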
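
One way to sanity-check the result is to compare the DTW cumulative distance with the one-to-one distance. For two series of equal length, the diagonal of the matrix is itself a valid warping path, so the DTW distance can never be larger than the one-to-one distance. The sketch below repeats the relevant part of the main script so that it runs on its own; the built-in min replaces GetMin here, and the names dtw_distance and lockstep are illustrative.

```matlab
% Same two series as in the main script
t = 1:50;
x = sin(t*2*pi/50) + 2;
y = sin(t*2*pi/50 + pi/6) + 2;
n = length(x);
m = length(y);

% Distance matrix (squared difference)
distance = zeros(n, m);
for i = 1:n
    for j = 1:m
        distance(i,j) = (x(i) - y(j))^2;
    end
end

% Cumulative distance (DP) matrix
DP = zeros(n, m);
DP(1,1) = distance(1,1);
for i = 2:n
    DP(i,1) = distance(i,1) + DP(i-1,1);
end
for j = 2:m
    DP(1,j) = distance(1,j) + DP(1,j-1);
end
for i = 2:n
    for j = 2:m
        DP(i,j) = distance(i,j) + min([DP(i-1,j), DP(i,j-1), DP(i-1,j-1)]);
    end
end

dtw_distance = DP(n, m);          % cost of the best warping path
lockstep = sum(diag(distance));   % cost of the diagonal (one-to-one) path

fprintf('DTW distance:        %.4f\n', dtw_distance);
fprintf('one-to-one distance: %.4f\n', lockstep);
```

If the DTW distance ever comes out larger than the one-to-one distance for equal-length inputs, the DP matrix has been filled incorrectly.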

In the resulting plot, the matching lines connect points at corresponding phases of the two curves rather than points with the same index, so the translation between the two series is taken into account.

Summary

The computational complexity of the DTW algorithm is relatively high: filling the DP matrix takes time proportional to the product of the two sequence lengths. If the amount of data is large, a MATLAB implementation of DTW is very time-consuming, because loops in MATLAB are slow. It is therefore worth considering calling a C or C++ implementation of DTW from MATLAB, which greatly reduces the calculation time. For example, for one data set I worked with, the MATLAB function took tens of hours, while calling the C function finished in tens of seconds. I have a C function already configured for this, and it is very efficient. If the amount of data is small, however, the MATLAB function is fast enough.
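
Before moving to C, one cheap improvement is to remove the double loop that builds the distance matrix. The sketch below assumes a MATLAB release with implicit expansion (R2016b or later) and checks the vectorized result against the loop version; the names distance_loop, distance_vec and max_diff are illustrative. The DP recurrence itself depends on previously computed cells, so it still needs a loop.

```matlab
% Same two series as in the main script
t = 1:50;
x = sin(t*2*pi/50) + 2;
y = sin(t*2*pi/50 + pi/6) + 2;

% Loop version, as in the main script
distance_loop = zeros(length(x), length(y));
for i = 1:length(x)
    for j = 1:length(y)
        distance_loop(i,j) = (x(i) - y(j))^2;
    end
end

% Vectorized version: a column vector minus a row vector expands
% to an n-by-m matrix of all pairwise differences
distance_vec = (x(:) - y(:).').^2;

% Largest absolute difference between the two results
max_diff = max(abs(distance_loop(:) - distance_vec(:)));
fprintf('max difference between loop and vectorized: %g\n', max_diff);
```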


Origin blog.csdn.net/weixin_50514171/article/details/124953866