STFT window function design requirements and methods (COLA)

In speech processing, the short-time Fourier transform (STFT) places certain requirements on the window function. This article briefly explains them.

1. Background description

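Speech is commonly processed frame by frame: the signal is split into overlapping frames, each frame is windowed and transformed, processed in the frequency domain, transformed back, windowed again, and the frames are then added together where they overlap.

STFT framing truncates the signal. To reduce the effect of this truncation as much as possible, apply a suitable window to each frame.

This method is called overlap-add, or OLA.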

In this scheme the signal is windowed twice (the analysis window and the synthesis window are generally the same). For the signal to be rebuilt correctly after windowing, the window function must meet a special requirement, known as the Constant Overlap-Add (COLA) constraint.
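For reference, the constraint can be written out as below. The symbols are notation introduced here and not taken from the original post: w_a and w_s are the analysis and synthesis windows, R is the hop size, and C is a nonzero constant.

```latex
\sum_{m=-\infty}^{\infty} w_a(n - mR)\, w_s(n - mR) = C \qquad \text{for all } n
```

When both windows are the square root of the same window w(n), the product inside the sum is simply w(n - mR), so the condition reduces to the shifted copies of w(n) adding up to a constant.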

Only certain windows, combined with certain hop sizes (or, equivalently, overlaps), satisfy this constraint. For the Hann-type windows used in this article, it also follows that:

An overlap of at least 1/2 is required to achieve perfect reconstruction
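The claim about overlap can be checked numerically. The following is a minimal sketch, not the author's code: it uses only base MATLAB (the Hann window is written out by formula rather than calling hann), and the window length of 120 and the three overlaps are values assumed for illustration.

```matlab
% Sketch: sum shifted copies of a window and measure how far the sum is from constant.
N = 120;                                   % window length (assumed for illustration)
n = (0:N-1)';
w = 0.5*(1 - cos(2*pi*n/N));               % periodic Hann window

overlaps  = [30 60 90];                    % 1/4, 1/2 and 3/4 overlap
deviation = zeros(size(overlaps));
for k = 1:numel(overlaps)
    hop   = N - overlaps(k);
    total = zeros(N + 20*hop, 1);          % room for 21 shifted copies
    for m = 0:20
        idx = m*hop + (1:N);
        total(idx) = total(idx) + w;       % overlap-add the m-th copy
    end
    steady = total(N+1:end-N);             % drop the ramp-up and ramp-down edges
    deviation(k) = max(steady) - min(steady);
end
disp(deviation)
```

With these values the deviation is large for 1/4 overlap and at round-off level for 1/2 and 3/4 overlap, which is what the constraint predicts for a Hann window.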

2. Examples in Matlab

MATLAB provides the function iscola, which checks whether a window meets the condition for a given overlap. As an example, take a window of length 120 with a hop of half the window length:

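window = sqrt(hann(120,'periodic')); % square root, because each frame passes through the same window twice (analysis and synthesis)
noverlap = 60;
% 'wola' checks the product of the analysis and synthesis windows, i.e. window.^2
[tf,m,maxDeviation] = iscola(window,noverlap,'wola')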

Let's plot the windows to see whether the requirement is met.

A simple derivation proves it. The periodic Hann window of length N is

w(n) = 0.5*(1-cos(2*pi*n/N)), n = 0, ..., N-1

Note that the analysis window (and likewise the synthesis window) is the square root of w(n), so after the two windowing steps the window that actually multiplies the signal is w(n) again.
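The effect of the square root can be illustrated with a few lines of base MATLAB. This is only a sketch under assumed values (N = 120, 1/2 overlap); the variable names are made up for the example.

```matlab
% Sketch: root-Hann applied twice equals Hann, and only the squared window overlap-adds to 1.
N = 120;  hop = N/2;
n = (0:N-1)';
w     = 0.5*(1 - cos(2*pi*n/N));           % periodic Hann, w(n)
wroot = sqrt(w);                           % analysis = synthesis window

errTwice = max(abs(wroot.*wroot - w));     % windowing twice gives back w(n)

% Steady-state overlap sums over one hop: the current window's first half
% plus the previous window's second half.
sumSquared = wroot(1:hop).^2 + wroot(hop+1:N).^2;   % window applied twice
sumPlain   = wroot(1:hop)    + wroot(hop+1:N);      % root-Hann applied only once

devSquared = max(abs(sumSquared - 1));      % distance of the squared sum from 1
devPlain   = max(sumPlain) - min(sumPlain); % spread of the single-application sum
```

Here devSquared sits at round-off level while devPlain does not, which is why the root-Hann window has to be checked as a weighted (analysis times synthesis) overlap-add and not as a plain one.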

On the segment from 0 to N/2 (the other segments are analogous), two windows overlap, and their sum is

0.5*(1-cos(2*pi*n/N)) + 0.5*(1-cos(2*pi*(n+N/2)/N))

(the first term is the current window, the blue part in the figure; the second is the previous window, the red part, shifted onto 0 to N/2)

= 0.5 - 0.5*cos(2*pi*n/N) + 0.5 - 0.5*cos(2*pi*n/N+pi)

= 1 - 0.5*cos(2*pi*n/N) + 0.5*cos(2*pi*n/N)    (since cos(x+pi) = -cos(x))

= 1

For 1/2 overlap (the most common case):

On the analysis side, shift the new data into the end of the buffer, then multiply the whole buffer by the window function. On the synthesis side, add the first half of the windowed frame to the half saved from the previous frame and output the sum; keep the second half (the orange part in the figure) and hold it for the next output.
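The buffer handling described above can be sketched as follows. This is an illustrative outline and not the author's implementation: the frequency-domain step is left out, the test signal is arbitrary, and the names inbuf and outbuf are made up for the example.

```matlab
% Sketch of frame-by-frame processing with 1/2 overlap (identity processing, no FFT).
N = 120;  hop = N/2;
n = (0:N-1)';
win = sqrt(0.5*(1 - cos(2*pi*n/N)));       % root-Hann, used for analysis and synthesis

x = sin(2*pi*0.01*(0:20*hop-1)');          % test signal, 20 hops long (arbitrary)
y = zeros(size(x));

inbuf  = zeros(N,1);                       % analysis buffer
outbuf = zeros(hop,1);                     % second half kept from the previous frame
for k = 1:numel(x)/hop
    newData = x((k-1)*hop + (1:hop));
    inbuf   = [inbuf(hop+1:end); newData]; % shift the new data in at the end
    frame   = inbuf .* win;                % analysis window
    % ... frequency-domain processing would go here ...
    frame   = frame .* win;                % synthesis window
    y((k-1)*hop + (1:hop)) = frame(1:hop) + outbuf;  % output first half + saved half
    outbuf  = frame(hop+1:end);            % keep the second half for the next output
end

% With no processing, the output is the input delayed by one hop.
err = max(abs(y(hop+1:end) - x(1:end-hop)));
```

Since nothing is modified between the two windows, err stays at round-off level, confirming that the scheme reconstructs the input with a delay of one hop.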

3. Window design of general length

If each frame brings only 160 new input samples, but I want to use a 256-point FFT to speed up the computation, how should the window be designed?

(Of course, you could also zero-pad each frame to 256, but then how do you get back 160 samples after the IFFT?)

In this case the overlap is 256 - 160 = 96. Design a window of length 2*overlap = 192, split it in half, and fill the middle with ones (256 - 192 = 64 of them)

The specific code is as follows:

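fftsize = 256;                 % FFT (frame) length
hopsize = 160;                 % new samples per frame
noverlap = fftsize - hopsize;  % 96 overlapping samples
nwin = 2*noverlap;             % length of the root-Hann window to be split

win = sqrt(hann(nwin,'periodic'));
half = nwin/2;
% rising half, flat top of ones, falling half
win1 = [win(1:half); ones(fftsize-nwin,1); win(half+1:end)];
plot(win1);

% 'wola' because win1 is applied twice (analysis and synthesis)
tf = iscola(win1,noverlap,'wola');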

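On the analysis side, keep a 256-sample buffer: for each frame, shift in 160 new samples at the tail and retain the last 96 samples of the previous frame at the head.

On the synthesis side, keep a 96-sample output buffer that holds the tail of the previous windowed frame. For each frame, add it to the first 96 samples of the newly windowed frame before output, then save the new last 96 samples for the next frame.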
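Putting the two buffers together, a frame loop might look like the sketch below. It is an assumption-laden illustration and not the author's code: the window is built by formula so that it runs in base MATLAB, the spectrum X is passed through unchanged, and the test signal is arbitrary.

```matlab
% Sketch: 160 new samples per frame, 256-point FFT, flat-top root-Hann window.
fftsize = 256;  hopsize = 160;  noverlap = fftsize - hopsize;   % 96
n    = (0:2*noverlap-1)';
root = sqrt(0.5*(1 - cos(2*pi*n/(2*noverlap))));                % root-Hann of length 192
win1 = [root(1:noverlap); ones(fftsize-2*noverlap,1); root(noverlap+1:end)];

x = sin(2*pi*0.01*(0:10*hopsize-1)');      % test signal, 10 frames long (arbitrary)
y = zeros(size(x));

inbuf  = zeros(fftsize,1);                 % 256-sample analysis buffer
outbuf = zeros(noverlap,1);                % 96-sample output buffer
for k = 1:numel(x)/hopsize
    seg   = (k-1)*hopsize + (1:hopsize);
    inbuf = [inbuf(hopsize+1:end); x(seg)];          % keep 96 old samples, append 160 new
    X     = fft(inbuf .* win1);                      % analysis window + FFT
    % ... modify X here; left unchanged in this sketch ...
    frame = real(ifft(X)) .* win1;                   % IFFT + synthesis window
    frame(1:noverlap) = frame(1:noverlap) + outbuf;  % add the saved tail
    y(seg) = frame(1:hopsize);                       % output 160 samples
    outbuf = frame(hopsize+1:end);                   % save the last 96 for the next frame
end

% With X unchanged, the output is the input delayed by 96 samples.
err = max(abs(y(noverlap+1:end) - x(1:end-noverlap)));
```

Each iteration consumes 160 samples and emits 160 samples, so the 256-point FFT is used without changing the frame rate; the price is a fixed delay of 96 samples.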


Origin blog.csdn.net/book_bbyuan/article/details/127805444