Determining the optimal time delay of a time series with the mutual information method

1 Code implementation

I recently needed to reconstruct the phase space of a time series. Drawing on ChatGPT and related papers, I implemented a program that determines the optimal time delay of a time series using the mutual information method. The code is as follows:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

N_ft = 1000

def delay_time(data, max_delay=10):
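    # 1. Compute the "self-information" and "joint-information" terms
    #    (as written, both are Pearson correlation coefficients)
    ts = pd.Series(data)
    delays = range(1, max_delay+1)
    # ics = [ts.autocorr(d) for d in delays]   # self-information

    # lag-1 autocorrelation of the series shifted by d
    ics = [ts.shift(d).autocorr() for d in delays]
    jcs = []
    for d in delays:
        jcs.append(ts.corr(ts.shift(d)))     # joint information (lag-d autocorrelation)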

    # 2. Compute the mutual information
    print(ics)
    print(jcs)
    mis = []
    for jc, ic in zip(jcs, ics):
        print(jc, ic)
        mi = -0.5*np.log(1-jc**2)+0.5*np.log(1-ic**2) # mutual information
        print(mi)
        mis.append(mi)

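    # 3. Find the first local minimum and return the corresponding delay
    diffs = np.diff(mis)
    print(diffs)
    rising = np.where(diffs > 0)[0]   # indices where the curve starts to rise
    if len(rising) == 0:
        raise ValueError('No local minimum found; try a larger max_delay.')
    i = rising[0]
    delay = delays[i]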

    # Plot the mutual information curve
    plt.plot(delays[0:], mis, 'bo-')
    plt.xlabel('Delay(τ)')
    plt.ylabel('Mutual Information(I(τ))')
    plt.grid(axis='x')
    plt.grid(axis='y')
    plt.axvline(x=delay, color='r', linestyle='--')
    plt.show()

    return delay

f1 = 25
f2 = 30
t = np.arange(N_ft) * 0.001   # N_ft samples at a 1 kHz sampling rate
AEall = np.sin(t * 2 * np.pi * f1) + np.sin(t * 2 * np.pi * f2)  # change the test signal here

delay = delay_time(AEall, max_delay=30)
print('Delay time:', delay)

The result is shown in the figure below:
[Figure: mutual information versus delay]
Following the thesis "Research and Application of Chaotic Time Series Prediction" [1], the optimal time delay is taken at the first local minimum of the mutual information curve. For this series, the optimal time delay is 9.
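
As a cross-check on the first-local-minimum criterion, the mutual information can also be estimated directly from the data. For jointly Gaussian variables with correlation coefficient ρ, the mutual information has the closed form I = -0.5 ln(1 - ρ²), which the first term of the formula in the program resembles. A common alternative that does not assume Gaussianity estimates I(τ) from a 2-D histogram of the pairs (x(t), x(t+τ)). The sketch below is one such estimator, not the method used in the program above; the function names mutual_information and first_local_minimum_delay and the default of 16 bins are assumptions made for illustration.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of the mutual information I(X;Y) in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()             # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def first_local_minimum_delay(data, max_delay=30, bins=16):
    """Return (delay, mi_values); delay is None if no strict local minimum exists."""
    data = np.asarray(data, dtype=float)
    # mis[k] is the mutual information between x(t) and x(t + k + 1)
    mis = [mutual_information(data[:-d], data[d:], bins)
           for d in range(1, max_delay + 1)]
    for i in range(1, len(mis) - 1):
        if mis[i - 1] > mis[i] < mis[i + 1]:
            return i + 1, mis             # list index i corresponds to delay i + 1
    return None, mis

if __name__ == '__main__':
    t = np.arange(1000) * 0.001
    x = np.sin(2 * np.pi * 25 * t) + np.sin(2 * np.pi * 30 * t)
    delay, mis = first_local_minimum_delay(x, max_delay=30)
    print('Delay time:', delay)
```

The histogram estimate depends on the number of bins and on the series length, so the delay it returns may differ from the value obtained with the correlation-based formula.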

2 Notes

The formulas used in the program were provided by ChatGPT, and their correctness still needs to be confirmed. If you run into any problems while using the code, feel free to discuss them! After the optimal delay has been determined, the optimal embedding dimension must also be determined; see the blog post [2]. The implementation of the GP (Grassberger-Procaccia) algorithm from that post is used here with slight modifications. The code is as follows:

import numpy as np
import matplotlib.pyplot as plt

N_ft = 1000

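# GP algorithm for the correlation dimension (time-frequency domain feature)
def GP(imf, tau):
    if len(imf) != N_ft:
        print('Please provide data of the specified length!')  # to be changed, e.g. show a dialog box
        return
    elif not isinstance(imf, np.ndarray):
        print('Wrong data format!')
        return
    else:
        m_max = 10  # maximum embedding dimension
        ss = 50  # number of steps for r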
        fig = plt.figure(1)
        for m in range(1, m_max + 1):
            i_num = N_ft - (m - 1) * tau
            kj_m = np.zeros((i_num, m))  # reconstructed m-dimensional phase space
            for i in range(i_num):
                for j in range(m):
                    kj_m[i][j] = imf[i + j * tau]
            dist_min, dist_max = np.linalg.norm(kj_m[0] - kj_m[1]), np.linalg.norm(kj_m[0] - kj_m[1])
            Dist_m = np.zeros((i_num, i_num))  # pairwise distances between vectors
            for i in range(i_num):
                for k in range(i_num):
                    D = np.linalg.norm(kj_m[i] - kj_m[k])
                    if (D > dist_max):
                        dist_max = D
                    elif (D > 0 and D < dist_min):
                        dist_min = D
                    Dist_m[i][k] = D
            dr = (dist_max - dist_min) / (ss - 1)  # spacing between r values
            r_m = []
            Cr_m = []
            for r_index in range(ss):
                r = dist_min + r_index * dr
                r_m.append(r)
                Temp = np.heaviside(r - Dist_m, 1)
                for i in range(i_num):
                    Temp[i][i] = 0
                Cr_m.append(np.sum(Temp))
            r_m = np.log(np.array((r_m)))
            print(r_m)
            Cr_m = np.log(np.array((Cr_m)) / (i_num * (i_num - 1)))
            print(Cr_m)
            plt.plot(r_m, Cr_m, label = str(m))
        plt.xlabel('ln(r)', fontsize=18)
        plt.ylabel('ln(C)', fontsize=18)
        plt.xticks(fontsize=16)
        plt.yticks(fontsize=16)
        plt.legend(fontsize=15)
        plt.show()

if __name__=='__main__':
    # Test the correlation dimension routine
    f1 = 25
    f2 = 30
    t = np.arange(N_ft) * 0.001   # N_ft samples at a 1 kHz sampling rate

    AEall = np.sin(t * 2 * np.pi * f1) + np.sin(t * 2 * np.pi * f2)  # change the test signal here
    GP(AEall, 1)

The output of the code is as follows:
[Figure: ln(C) versus ln(r) curves for m = 1 to 10]
Similarly, following the description in the thesis [1] of how the GP algorithm determines the optimal embedding dimension, the slope of the linear part of the curve generally increases as m increases and then levels off. Once the slope no longer increases, it is the saturated correlation dimension, and the corresponding m is the optimal embedding dimension. This completes the determination of the optimal time delay and embedding dimension; the phase space of the time series can then be reconstructed from these two parameters.
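
The two steps described here, reconstructing the phase space from a delay and an embedding dimension and then reading off the slope of ln(C) against ln(r), can be sketched compactly with NumPy. This is a minimal illustration rather than the implementation above: the names delay_embed and correlation_slope are hypothetical, and fitting the slope over the 5th to 50th percentile of the pairwise distances is an assumed stand-in for picking the linear part of the curve by eye.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Return the (N - (m-1)*tau, m) matrix of delay vectors [x[i], x[i+tau], ...]."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    if n <= 0:
        raise ValueError('series too short for this m and tau')
    return np.column_stack([x[j * tau: j * tau + n] for j in range(m)])

def correlation_slope(x, m, tau, n_r=20):
    """Least-squares slope of ln C(r) against ln r for embedding dimension m."""
    vecs = delay_embed(x, m, tau)
    n = len(vecs)
    # Accumulate squared Euclidean distances one coordinate at a time
    d2 = np.zeros((n, n))
    for j in range(m):
        col = vecs[:, j]
        d2 += (col[:, None] - col[None, :]) ** 2
    pair = np.sqrt(d2[np.triu_indices(n, k=1)])   # each pair counted once
    pair = pair[pair > 0]                          # drop coincident points
    # Assumption: treat the 5th-50th percentile range as the linear region
    r = np.geomspace(np.percentile(pair, 5), np.percentile(pair, 50), n_r)
    c = np.array([(pair <= ri).mean() for ri in r])   # correlation sum C(r)
    slope, _ = np.polyfit(np.log(r), np.log(c), 1)
    return slope

if __name__ == '__main__':
    t = np.arange(1000) * 0.001
    x = np.sin(2 * np.pi * 25 * t) + np.sin(2 * np.pi * 30 * t)
    for m in range(1, 6):
        print(m, correlation_slope(x, m, tau=9))
```

Printing the slope for increasing m and looking for the value of m at which it stops growing mirrors the saturation criterion described above.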

3 References

[1] Gao Junjie. Research and application of chaotic time series forecasting [D]. Shanghai Jiao Tong University, 2013.
[2] Python implementation of phase space reconstruction and correlation dimension: https://blog.csdn.net/Lwwwwwwwl/article/details/111410179.

Origin blog.csdn.net/qq_38606680/article/details/130654825