EEG data analysis of healthy people and epileptic patients based on the DFA method, with code

introduction

       The DFA (detrended fluctuation analysis) method was proposed by C.-K. Peng et al. to study the long-range correlation of fluctuations in time series. It is mainly used to distinguish fluctuations generated by the complex system itself from fluctuations imposed on the system by external and environmental stimuli. Changes produced by external stimuli are assumed to cause only local effects, whereas changes generated by the dynamics of the system itself are assumed to exhibit long-range dependence. DFA is also a measure of scale-invariant behavior, because it estimates trends at all scales that exhibit fractal properties. The DFA calculation removes the local trends responsible for the spurious reduction of correlations caused by non-stationarity, and quantifies the long-range fractal correlations that characterize the intrinsic nature of the system.

       DFA gives clinicians a way to study the long-range correlations of physiological signals generated by the internal mechanisms of the system itself, excluding stimulus signals from the outside world that are unrelated to the system. The method uses the entire sequence for the computation and is scale-free, so it can provide useful information for distinguishing physiological signals.

       Theoretically, the scaling exponent of physiological signals ranges from 0.5 (an uncorrelated random sequence) to 1.5 (a random walk), and for healthy physiological signals it approaches 1. A scaling exponent greater than 1 indicates a loss of long-range correlations and pathological changes in the organism itself. The technique was first applied to detect long-range correlations in DNA sequences and was later widely used in the analysis of physiological time series.
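For reference, the scaling relation that defines this exponent (a standard statement from the DFA literature, added here for completeness rather than taken from the original text) is

F(n) \propto n^{\alpha}, \qquad \alpha \approx 0.5 \ (\text{uncorrelated noise}), \quad \alpha \approx 1 \ (1/f \ \text{noise}), \quad \alpha \approx 1.5 \ (\text{random walk})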

       The DFA method has been used to help estimate the onset of epilepsy and support its diagnosis; both elderly subjects and epilepsy patients show a loss of fractal scaling. In cardiac studies, analysis of 2-hour ECG records showed that DFA can provide information that traditional time-domain and frequency-domain analyses cannot. In the analysis of 24-hour post-seizure epilepsy data, compared with other analysis methods, a decrease of the scaling exponent significantly predicted the onset of abnormal brain activity. The DFA method is clearly superior to spectral analysis for analyzing changes in the brain activity of sleep apnea patients. In a study predicting mortality after epileptic seizures, the decline of the short-time-scale exponent was a good predictive parameter. In summary, changes of the DFA scaling exponent of brain fluctuation signals can provide auxiliary diagnostic and predictive information for the occurrence and diagnosis of epilepsy.

       In this article, I use the DFA method to analyze the self-similarity characteristics of the EEG in different physiological and pathological states and draw some comparative conclusions, which may provide useful guidance for clinical diagnosis.

principle

       The electroencephalogram (EEG), as a graphic record of the brain's physiological electrical activity, reflects the electrical changes during the processes of excitation, conduction and recovery in the brain, as shown in Figure 1. Therefore, using the EEG to detect and analyze the physiological electrical activity of the brain has always been one of the most important methods for the detection and diagnosis of brain function in clinical medical practice.

Figure 1 EEG of normal and epileptic patients

       To cope with the strong non-stationarity of physiological time series, an improved root-mean-square analysis of random walks, detrended fluctuation analysis (DFA), was proposed to analyze the self-similarity characteristics of biomedical data. Its advantage over traditional methods (such as spectral analysis and Hurst analysis) is that it can detect the intrinsic self-similarity of a seemingly non-stationary time series while avoiding the spurious detection of apparent self-similarity caused by extrinsic trends. The DFA algorithm is therefore better suited to non-stationary time series with slowly varying trends.

The specific algorithm is as follows:

       For an EEG time series of total length N, as shown in Figure 1, a cumulative sum is first computed:

y(k) = \sum_{i=1}^{k} (B_i - B_{ave})

where B_i is the i-th data point and B_ave is the mean value of the analyzed EEG time series. This summation step maps the original time series onto a self-similar process.
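As a minimal sketch of this summation step (using NumPy and random numbers in place of real EEG samples, since no data file is needed to illustrate it):

import numpy as np

# Placeholder for one EEG channel; in the real analysis B holds 10,000 recorded samples.
B = np.random.randn(10000)

# y(k) = sum over i <= k of (B_i - B_ave): the integrated, mean-subtracted profile.
y = np.cumsum(B - B.mean())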

       Next, measure the vertical characteristic scale of the summed time series. To do this, the summed series is divided into many non-overlapping segments of equal length n. A least-squares line is fitted to each segment of length n to represent the local trend within that segment (Fig. 2). The ordinate of this fitted line is denoted y_n(k).

Figure 2 Local detrending diagram

(The vertical lines mark segments of length 10000; the straight line segments represent the local trend of the summed time series, obtained by linear least-squares fitting within each segment)

       Then, detrend the summed time series by subtracting the local trend y_n(k) from y(k) within each segment. For a given segment length n, the characteristic size of the fluctuations of this summed and detrended time series is calculated as:

F(n) = \sqrt{ \frac{1}{N} \sum_{k=1}^{N} [y(k) - y_n(k)]^2 }

       Repeating this calculation over all time scales (segment lengths) yields the relationship between F(n) and the segment length n. The slope of the line relating \log F(n) to \log n determines the scaling exponent (self-similarity parameter).
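Putting the steps together, a compact illustrative implementation might look like the sketch below. It is written only to mirror the description above; the function name dfa_exponent and the synthetic test signals are my own choices, not part of the original scripts, and the segment lengths match the ones used in the experiment. Like the author's process() function further down, it keeps only the full segments at each scale.

import numpy as np

def dfa_exponent(B, scales=(4, 8, 20, 50, 100, 200, 500, 1000)):
    """Estimate the DFA scaling exponent of the 1-D sequence B."""
    y = np.cumsum(np.asarray(B, dtype=float) - np.mean(B))    # profile y(k)
    F = []
    for n in scales:
        n_seg = len(y) // n                                    # number of full segments
        sq = []
        for s in range(n_seg):
            k = np.arange(s * n, (s + 1) * n)
            trend = np.polyval(np.polyfit(k, y[k], deg=1), k)  # local trend y_n(k)
            sq.append(np.mean((y[k] - trend) ** 2))
        F.append(np.sqrt(np.mean(sq)))                         # F(n)
    alpha = np.polyfit(np.log10(scales), np.log10(F), deg=1)[0]
    return alpha, F

# Sanity check on synthetic signals: white noise should give alpha near 0.5,
# and an integrated random walk should give alpha near 1.5.
print(dfa_exponent(np.random.randn(10000))[0])
print(dfa_exponent(np.cumsum(np.random.randn(10000)))[0])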

methods

       The DFA method was applied to analyze two typical classes of physiological time series. We collected EEG data from 5 healthy people and 5 epilepsy patients, took 10,000 points from each recording, and then computed and analyzed the corresponding time series.

results

       To further explore the relationship between the DFA fluctuation and the scale, eight different scales were chosen for this experiment: 4, 8, 20, 50, 100, 200, 500 and 1000. To keep the experiment general and to avoid the influence of the physiological response caused by the patients' initial unfamiliarity with the recording, we randomly selected a segment from the middle of each EEG signal of the normal subjects and the epileptic patients and loaded it into Python for the experiment. To study the EEG signals of normal people and epileptic patients, we chose the signal collected by the C3 lead as the main research object.

       The scale-by-scale DFA fits obtained in the experiment are shown in Figure 3.

Figure 3 Comparison of the DFA detrending at eight different scales with the original EEG


       From the DFA data analysis, the slope of log F(n) versus log n determines the value of the scaling exponent; fitting it by the least-squares method gives this exponent and the image shown in Figure 4.

Figure 4 Log-log plot of F(n) versus segment length n; the slope of the fitted line gives the scaling exponent

       To distinguish the EEG signals of normal people from those of epilepsy patients at the algorithm level, we applied the DFA algorithm. The scatter plot shows that the F(n) values of the epilepsy patients' EEG are larger than those of normal people and are more scattered in their distribution, as shown in Figure 5.

Figure 5 Scatter diagram of the distribution of F(n) values in normal people and epileptic patients

       Figure 6 shows clearly that for epilepsy patients the DFA index, that is, the scaling exponent (self-similarity parameter) given by the slope, has a maximum higher than that of normal people and a minimum lower than that of normal people, and the distribution of the self-similarity parameters is very scattered, with a large span of values. Therefore, the DFA algorithm can clearly distinguish epileptic patients from healthy subjects, which can give medical workers some auxiliary support in clinical diagnosis and reduce the rate of misdiagnosis.

Discussion and conclusion

       We used the DFA method to study the changes of the self-similarity characteristics of the EEG under different physiological and pathological states. The results show that DFA is a good method for indicating the evolution of epilepsy.

       Earlier medical-statistical studies of electrocardiograms found that the mean DFA value first decreases and then increases as health deteriorates: the fluctuation range of the DFA value is relatively stable within a certain range in healthy people, it is larger in patients with coronary heart disease, and it is largest in patients with myocardial infarction. Our results suggest that, likewise for the brain, changes in the mean value and in the fluctuation range of the DFA index can reveal the state of brain health; in particular, the change in the fluctuation range is a more sensitive parameter for the early detection of epilepsy and has clinical diagnostic significance.



codes

The file-reading function open_file.py

import random
import codecs

def open_file(file_path):
    """Read one EEG text file and return 10,000 consecutive C3-lead samples,
    starting at a randomly chosen position within the first 5,000 samples."""
    f = codecs.open(file_path)
    line = f.readline()
    data1 = []
    real_data1 = []
    while line:
        a = line.split()
        b = a[4:5]                      # 5th whitespace-separated column (header 'C3')
        b1 = ','.join(map(str, b))
        data1.append(b1)
        line = f.readline()
    f.close()
    data1.remove('C3')                  # drop the column header
    real_data = [float(v) for v in data1]
    # Pick a random starting index in [0, 5000) and keep the next 10,000 points.
    r = round(random.uniform(0, 1), 2) * 5000
    for i in range(int(r), int(r) + 10000):
        real_data1.append(real_data[i])
    return real_data1
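A quick, hypothetical usage check (the path is one of the files loaded later in the author's scripts; replace it with your own data location and column layout):

# Hypothetical usage of open_file; adjust the path to your own data.
samples = open_file('E:/DFA/DFA_DATA/正常脑电图/20121030084615.txt')
print(len(samples))   # expected: 10000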

Data preprocessing function split_list.py for splitting a list into groups


def list_of_groups(init_list, children_list_len):
    """Split init_list into consecutive sub-lists of length children_list_len;
    any leftover elements are kept as a final, shorter sub-list."""
    list_of_groups = zip(*(iter(init_list),) * children_list_len)
    end_list = [list(i) for i in list_of_groups]
    count = len(init_list) % children_list_len
    if count != 0:
        end_list.append(init_list[-count:])
    return end_list
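A small hypothetical check of what list_of_groups returns (not part of the original scripts):

from split_list import list_of_groups

print(list_of_groups([1, 2, 3, 4, 5, 6, 7], 3))   # [[1, 2, 3], [4, 5, 6], [7]]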

Data processing function data_process.py

import split_list
import math
import numpy as np

def process(variable_name, group_count):
    """One DFA pass over variable_name with segment length group_count.
    Returns the sample indices, the fitted local trend, the profile y(k) and F(n)."""
    # Profile: cumulative sum of the mean-subtracted series (y(k) in the text above).
    ave_value = []
    variable_sum = 0
    for i in range(0, len(variable_name)):
        variable_sum += variable_name[i]
    ave = variable_sum / len(variable_name)
    for i in range(0, len(variable_name)):
        ave_value.append(variable_name[i] - ave)
    yi = np.cumsum(ave_value)  # yi is y(k) in the text

    # Split the sample indices and the profile into segments of length group_count.
    temp = []
    for i in range(0, len(variable_name)):
        temp.append(i)
    x_data1 = split_list.list_of_groups(temp, group_count)
    y_data1 = split_list.list_of_groups(yi, group_count)  # segments of y(k)

    # Fit a least-squares line in each full segment; temp1 holds the local trend y_n(k).
    temp1 = []
    for i in range(0, int(len(variable_name)/group_count)):
        poly = np.polyfit(x_data1[i], y_data1[i], deg=1)
        temp2 = np.polyval(poly, x_data1[i])
        for j in range(0, group_count):
            temp1.append(temp2[j])

    y_data2 = split_list.list_of_groups(temp1, group_count)  # local trend y_n(k), segmented

    # Detrend: subtract the local trend from the profile inside every segment.
    yk_minus = []
    t = []
    a = 0
    for i in range(0, int(len(variable_name)/group_count)):
        yk_minus.append(list(map(lambda x: x[0] - x[1], zip(y_data1[i], y_data2[i]))))
        for j in range(0, group_count):
            t.append(yk_minus[i][j])

    # F(n): root mean square of the detrended profile.
    for i in range(0, len(t)):
        a += (t[i])**2
    fn = math.sqrt(a / len(t))
    return temp, temp1, yi, fn
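A hypothetical usage of process() on synthetic data, just to show the four return values (random numbers stand in for an EEG channel; not part of the original scripts):

import numpy as np
from data_process import process

x, trend, profile, fn = process(list(np.random.randn(10000)), 100)
print(fn)   # the fluctuation F(n) at segment length n = 100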

DFA_signal_figure.py

from open_file import open_file
from matplotlib import pyplot as plt
plt.rcParams['font.sans-serif']=['SimHei']
plt.rcParams['axes.unicode_minus']=False
from data_process import *   # provides process(); np, math and split_list are re-exported too
import math
import numpy as np
fn = []
epilepsy_1 = open_file('E:/DFA/DFA_DATA/癫痫异常脑电图/20121030105258.txt')
epilepsy_2 = open_file('E:/DFA/DFA_DATA/癫痫异常脑电图/20121031103719.txt')
epilepsy_3 = open_file('E:/DFA/DFA_DATA/癫痫异常脑电图/20121031111404.txt')
epilepsy_4 = open_file('E:/DFA/DFA_DATA/癫痫异常脑电图/20121031113934.txt')
epilepsy_5 = open_file('E:/DFA/DFA_DATA/癫痫异常脑电图/20121031163308.txt')
normal_1 = open_file('E:/DFA/DFA_DATA/正常脑电图/20121030084615.txt')
normal_2 = open_file('E:/DFA/DFA_DATA/正常脑电图/20121030091121.txt')
normal_3 = open_file('E:/DFA/DFA_DATA/正常脑电图/20121030103952.txt')
normal_4 = open_file('E:/DFA/DFA_DATA/正常脑电图/20121030110908.txt')
normal_5 = open_file('E:/DFA/DFA_DATA/正常脑电图/20121030151450.txt')

plt.plot(epilepsy_1, label='abnormal_1 plot')
plt.plot(epilepsy_2, label='abnormal_2 plot')
plt.plot(normal_1, label='normal plot')
# generate a legend box
plt.legend(bbox_to_anchor=(0., 1.02, 1., .102), loc=0,ncol=3, mode="expand", borderaxespad=0.)
plt.show()



# Run DFA on the first epileptic recording at each segment length n, plotting the
# original signal, the profile y(k) and the fitted local trend for every scale.
number = [4, 8, 20, 50, 100, 200, 500, 1000]
for n in number:
    temp_1, fitted_value_1, yk_1, fn_1 = process(epilepsy_1, n)
    plt.plot(temp_1, epilepsy_1, 'g', temp_1, yk_1, 'b', temp_1, fitted_value_1, 'r--')
    plt.title('Detrended epileptic EEG signal I (n=%d)' % n)
    plt.show()
    fn.append(fn_1)

# Fit the slope of log10 F(n) versus log10 n: this slope is the DFA scaling exponent.
number_log = []
fn_log = []
for i in range(0, len(number)):
    number_log.append(math.log(number[i], 10))
    fn_log.append(math.log(fn[i], 10))

poly = np.polyfit(number_log, fn_log, deg=1)

plt.plot(number_log, fn_log, 'o')
plt.plot(number_log, np.polyval(poly, number_log))
plt.show()

# Repeat the per-scale DFA for the first normal recording.
fn_normal_1 = []
for n in number:
    temp_1, fitted_value_1, yk_1, fn_1 = process(normal_1, n)
    fn_normal_1.append(fn_1)
#
# fn_normal_log = []
# for i in range(0,len(number)):
#     fn_normal_log.append(math.log(fn_normal_1[i],10))
# poly = np.polyfit(number_log, fn_log, deg = 1)
# plt.plot(number_log, fn_normal_log,'o')
# plt.plot(number_log, np.polyval(poly,number_log))
#

DFA_scatter_figure.py

from matplotlib import pyplot as plt
plt.rcParams['font.sans-serif']=['SimHei']
plt.rcParams['axes.unicode_minus']=False
from DFA_signal_figure import *
import math

# F(n) for every subject at every scale, stored flat in subject-major order:
# index 8*i + j holds subject i at scale member_count[j].
FN_abnormal = []
FN_normal = []
member_count = [4, 8, 20, 50, 100, 200, 500, 1000]
group_name_abnormal = [epilepsy_1, epilepsy_2, epilepsy_3, epilepsy_4, epilepsy_5]
group_name_normal = [normal_1, normal_2, normal_3, normal_4, normal_5]
for i in range(0, len(group_name_abnormal)):
    for j in range(0, len(member_count)):
        temp_1, fitted_value_1, yk_1, fn_1 = process(group_name_abnormal[i], member_count[j])
        FN_abnormal.append(fn_1)
for i in range(0, len(group_name_normal)):
    for j in range(0, len(member_count)):
        temp_1, fitted_value_1, yk_1, fn_1 = process(group_name_normal[i], member_count[j])
        FN_normal.append(fn_1)
# Scatter plot of log10 F(n) for every subject at every scale: epileptic recordings
# are drawn as red crosses and normal recordings as blue circles.
for j, n in enumerate(member_count):
    x = [math.log(n, 10)] * len(group_name_abnormal)
    log_fn_abnormal = [math.log(FN_abnormal[8 * i + j], 10) for i in range(len(group_name_abnormal))]
    log_fn_normal = [math.log(FN_normal[8 * i + j], 10) for i in range(len(group_name_normal))]
    plt.scatter(x, log_fn_abnormal, marker='x', color='red', s=20)
    plt.scatter(x, log_fn_normal, marker='o', color='blue', s=20)
plt.title('Scatter plot of EEG F(n) values for normal and epileptic subjects')
plt.show()

# plt.plot(epilepsy_1, label='abnormal_1 plot')
# plt.plot(epilepsy_2, label='abnormal_2 plot')
# plt.plot(normal_1, label='normal plot')
# # generate a legend box
# plt.legend(bbox_to_anchor=(0., 1.02, 1., .102), loc=0,ncol=3, mode="expand", borderaxespad=0.)
# # annotate an important value
# plt.annotate("Important value", (55, 20), xycoords='data',xytext=(5, 38),arrowprops=dict(arrowstyle='->'))
# plt.show()

boxplot.py


import numpy as np
from DFA_scatter_figure import FN_normal,FN_abnormal
import pandas as pd
import matplotlib.pyplot as plt
import math
logFN_normal = []
logFN_abnormal = []
for i in range(0, len(FN_normal)):
    logFN_normal.append(math.log(FN_normal[i], 10))
for i in range(0, len(FN_abnormal)):
    logFN_abnormal.append(math.log(FN_abnormal[i], 10))

# Per-point ratio log10 F(n) / log10 n, which the box plot below treats as the DFA
# index; the scale n cycles through the eight segment lengths used above (index i % 8).
scales = [4, 8, 20, 50, 100, 200, 500, 1000]
normal = []
abnormal = []
for i in range(0, len(FN_abnormal)):
    n = scales[i % 8]
    normal.append(float(logFN_normal[i]) / math.log(n, 10))
    abnormal.append(float(logFN_abnormal[i]) / math.log(n, 10))

print(normal)
print(abnormal)
data = {
    'normal': normal,
    'abnormal': abnormal,
}

df = pd.DataFrame(data)
df.plot.box(title="Distribution of the DFA index for normal and abnormal (epileptic) EEG")
plt.grid(linestyle="--", alpha=0.3)
plt.show()
