Echo Cancellation (AEC) Principle, Algorithm and Practice - Kalman Filter

What is a Kalman filter?

Kalman filtering is an algorithm that uses a linear state equation of a system, together with the system's observed input and output data, to produce an optimal estimate of the system state. Because the observations are corrupted by system noise and interference, this optimal estimation can also be viewed as a filtering process.
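
For reference, the linear state-space model that this description relies on is usually written as below. The symbols (A, H, w_k, v_k, Q, R) are the conventional textbook names and are an assumption here, since the article does not define them. The control input is left out for simplicity.

```latex
\begin{aligned}
x_k &= A\,x_{k-1} + w_k, & w_k &\sim \mathcal{N}(0, Q) \\
z_k &= H\,x_k + v_k,     & v_k &\sim \mathcal{N}(0, R)
\end{aligned}
```

Here x_k is the hidden state, z_k is the observation, and w_k and v_k are the process noise and measurement noise.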

A Kalman filter can estimate the state of a dynamic system from a series of noisy measurements, provided the variance of the measurement noise is known.
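
As a minimal sketch of this idea, the scalar filter below estimates a constant value from noisy readings whose noise variance `r` is known. The function name `scalar_kalman`, its defaults, and the test signal are illustrative assumptions, not part of the original article.

```python
import numpy as np

def scalar_kalman(z, r, q=0.0, x0=0.0, p0=1.0):
    """Estimate a (nearly) constant scalar from noisy measurements z.

    r: known measurement-noise variance, q: process-noise variance,
    x0, p0: initial estimate and its variance (illustrative defaults).
    """
    x, p = x0, p0
    estimates = np.empty(len(z))
    for k, z_k in enumerate(z):
        p = p + q                 # predict: the state is modelled as constant
        g = p / (p + r)           # Kalman gain
        x = x + g * (z_k - x)     # correct the estimate with the residual
        p = (1.0 - g) * p         # variance of the corrected estimate
        estimates[k] = x
    return estimates

# Hypothetical test signal: a constant observed through Gaussian noise.
rng = np.random.default_rng(0)
true_value = 1.25
noise_std = 0.1
z = true_value + rng.normal(0.0, noise_std, size=500)
est = scalar_kalman(z, r=noise_std ** 2)
```

Only `x` and `p` are carried from one step to the next, so the estimate is refined without storing earlier measurements.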

You can use a Kalman filter in any dynamic system that contains uncertain information to make an educated prediction of the system's next state. Even in the presence of various disturbances, the filter can usually recover a good estimate of what actually happened.
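
To illustrate the "predict the next step" idea, here is a small sketch that tracks position and velocity from noisy position readings and then predicts the next position. The constant-velocity model, the helper name `kalman_step`, and all numeric values are assumptions chosen for the example.

```python
import numpy as np

def kalman_step(x, P, z, A, H, Q, R):
    """One predict/update cycle of a linear Kalman filter."""
    x_pred = A @ x                          # predicted state
    P_pred = A @ P @ A.T + Q                # predicted covariance
    S = H @ P_pred @ H.T + R                # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)     # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)   # corrected state
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

dt = 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])   # state = [position, velocity]
H = np.array([[1.0, 0.0]])              # only the position is measured
Q = 1e-6 * np.eye(2)                    # small process noise (assumed)
R = np.array([[0.5 ** 2]])              # measurement-noise variance (assumed known)

rng = np.random.default_rng(1)
true_velocity = 0.8
x = np.zeros(2)
P = 10.0 * np.eye(2)                    # large initial uncertainty
for k in range(1, 101):
    z = np.array([true_velocity * k + rng.normal(0.0, 0.5)])
    x, P = kalman_step(x, P, z, A, H, Q, R)

next_position = (A @ x)[0]              # prediction for step 101
```

The velocity is never measured directly; the filter infers it from the position readings, and that inferred velocity is what makes the next-step prediction possible.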

Kalman filtering is well suited to continuously changing systems. It needs little memory (apart from the previous state, no other historical data has to be kept) and it is fast, which makes it a good fit for real-time problems and embedded systems.
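
Applied to echo cancellation, the state is the vector of echo-path filter coefficients. The recursion carried out by the `kalman` function in the listing below can be written as follows. The notation is chosen here to mirror the variable names in that code (`u`, `w`, `P`, `Q`, `R`, `K`). The random-walk state model in the first line is the usual assumption behind this form, not something the article states.

```latex
\begin{aligned}
\mathbf{w}(n) &= \mathbf{w}(n-1) + \mathbf{q}(n), \qquad
d(n) = \mathbf{u}^{T}(n)\,\mathbf{w}(n) + v(n) \\
e(n) &= d(n) - \mathbf{u}^{T}(n)\,\hat{\mathbf{w}}(n-1) \\
\mathbf{P}(n \mid n-1) &= \mathbf{P}(n-1) + \mathbf{Q} \\
\mathbf{K}(n) &= \frac{\mathbf{P}(n \mid n-1)\,\mathbf{u}(n)}
                     {\mathbf{u}^{T}(n)\,\mathbf{P}(n \mid n-1)\,\mathbf{u}(n) + R(n)} \\
\hat{\mathbf{w}}(n) &= \hat{\mathbf{w}}(n-1) + \mathbf{K}(n)\,e(n) \\
\mathbf{P}(n) &= \left(\mathbf{I} - \mathbf{K}(n)\,\mathbf{u}^{T}(n)\right)\mathbf{P}(n \mid n-1)
\end{aligned}
```

In the code, R(n) is set to the instantaneous error power e(n)^2 plus a small constant. Only the coefficient estimate and P are carried between samples; note that P is an N x N matrix, so storage and per-sample cost grow with the square of the filter length N.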

The code is as follows:

import numpy as np
import librosa
import soundfile as sf
import pyroomacoustics as pra


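def kalman(x, d, N=64, sgm2v=1e-4):
    """Kalman-filter echo canceller.

    x: reference (far-end) signal, d: microphone signal,
    N: filter length in taps, sgm2v: process-noise variance.
    Returns the error signal e, i.e. the mic signal with the estimated echo removed.
    """
    nIters = min(len(x), len(d)) - N
    u = np.zeros(N)          # last N reference samples, newest first
    w = np.zeros(N)          # estimated echo-path coefficients (the state)
    Q = np.eye(N) * sgm2v    # process-noise covariance
    P = np.eye(N) * sgm2v    # state-error covariance
    I = np.eye(N)
    e = np.zeros(nIters)
    for n in range(nIters):
        u[1:] = u[:-1]       # shift the delay line
        u[0] = x[n]
        e_n = d[n] - np.dot(u, w)            # a priori error
        R = e_n**2 + 1e-10                   # measurement-noise variance, taken from the instantaneous error
        Pn = P + Q                           # predicted covariance
        r = np.dot(Pn, u)
        K = r / (np.dot(u, r) + R + 1e-10)   # Kalman gain
        w = w + K * e_n                      # coefficient update
        P = np.dot(I - np.outer(K, u), Pn)   # covariance update
        e[n] = e_n

    return e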
  
  
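# x: original reference signal
# v: ideal mic signal
# sr: sample rate in Hz (the default matches the 8 kHz used in the main block)
# Generates the simulated mic signal and reference signal
def creat_sim_sound(x, v, sr=8000):
    rt60_tgt = 0.08
    room_dim = [2, 2, 2]

    e_absorption, max_order = pra.inverse_sabine(rt60_tgt, room_dim)
    room = pra.ShoeBox(room_dim, fs=sr, materials=pra.Material(e_absorption), max_order=max_order)
    room.add_source([1.5, 1.5, 1.5])
    room.add_microphone([0.1, 0.5, 0.1])
    room.compute_rir()
    rir = room.rir[0][0]
    rir = rir[np.argmax(rir):]  # drop everything before the largest tap
    # x passes through the room reflections to give y
    y = np.convolve(x, rir)
    scale = np.sqrt(np.mean(x**2)) / np.sqrt(np.mean(y**2))
    # y is the reflected sound that arrives at the microphone
    y = y * scale

    L = max(len(y), len(v))
    y = np.pad(y, [0, L - len(y)])
    v = np.pad(v, [L - len(v), 0])  # note: v is padded at the front, y and x at the end
    x = np.pad(x, [0, L - len(x)])
    d = v + y
    return x, d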

if __name__ == "__main__":
    x_org, sr  = librosa.load('female.wav',sr=8000)
    v_org, sr  = librosa.load('male.wav',sr=8000)

    x,d = creat_sim_sound(x_org,v_org)

    e = kalman(x, d, N=256)
    e = np.clip(e,-1,1)
    
    sf.write('x.wav', x, sr, subtype='PCM_16')
    sf.write('d.wav', d, sr, subtype='PCM_16')
    sf.write('kalman.wav', e, sr, subtype='PCM_16')
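
If the WAV files or pyroomacoustics are not at hand, the filter can still be checked on synthetic data. The sketch below restates the same recursion in a compact function (named `kalman_aec` here so that it runs on its own) and feeds it white noise passed through a made-up 16-tap echo path with no near-end speech. The echo path, the signal lengths, and the use of ERLE (echo return loss enhancement, the ratio of echo power to residual power in dB) as the metric are assumptions for this check, not part of the original article.

```python
import numpy as np

def kalman_aec(x, d, N=16, sgm2v=1e-4):
    """Same recursion as the kalman() listing, restated to be self-contained."""
    n_iters = min(len(x), len(d)) - N
    u = np.zeros(N)
    w = np.zeros(N)
    Q = np.eye(N) * sgm2v
    P = np.eye(N) * sgm2v
    I = np.eye(N)
    e = np.zeros(n_iters)
    for n in range(n_iters):
        u[1:] = u[:-1]
        u[0] = x[n]
        e_n = d[n] - np.dot(u, w)
        R = e_n**2 + 1e-10
        Pn = P + Q
        r = np.dot(Pn, u)
        K = r / (np.dot(u, r) + R + 1e-10)
        w = w + K * e_n
        P = np.dot(I - np.outer(K, u), Pn)
        e[n] = e_n
    return e, w

rng = np.random.default_rng(0)
taps = 16
# Hypothetical echo path: random taps with an exponential decay.
h = rng.normal(size=taps) * np.exp(-0.3 * np.arange(taps))
x = rng.normal(size=8000)               # far-end reference (white noise)
d = np.convolve(x, h)[:len(x)]          # mic signal containing echo only

e, w_hat = kalman_aec(x, d, N=taps)

# ERLE over the last 2000 samples, after the filter has had time to adapt.
d_tail = d[:len(e)][-2000:]
e_tail = e[-2000:]
erle_db = 10 * np.log10(np.mean(d_tail**2) / (np.mean(e_tail**2) + 1e-20))
print(f"ERLE over the last 2000 samples: {erle_db:.1f} dB")
```

Because there is no near-end speech or noise in this test, the residual should shrink steadily as the coefficient estimate `w_hat` approaches `h`. With real recordings the achievable ERLE will be lower.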

