Effectively Delaying Dementia: Yonsei University Finds Gradient Boosting Machine Model Can Accurately Predict BPSD Subsyndrome

Contents at a glance: As the population continues to age, dementia has become a public health problem. At present, drugs can only relieve its symptoms, and no effective cure has been found, which makes prevention particularly urgent. Against this backdrop, researchers at Yonsei University developed and validated several machine learning models for predicting the behavioral and psychological symptoms of dementia (BPSD). Experimental results show that machine learning can effectively predict BPSD sub-syndromes.

Key words: dementia, BPSD, gradient boosting machine

This article was first published on the HyperAI super neural WeChat public platform~

Currently, more than 55 million people worldwide live with dementia (Alzheimer's disease is the most common type), with nearly 10 million new cases each year, and this number is expected to double by 2050 as the population continues to age. Dementia is a brain disorder that causes a slow decline in memory, thinking and reasoning. It mainly affects the elderly and is one of the main causes of loss of self-care ability among them. It ranks seventh among the leading causes of death worldwide (by total number of deaths); the top three are ischemic heart disease, stroke and chronic obstructive pulmonary disease.

Typically, in addition to cognitive impairment, dementia patients exhibit a range of behavioral and psychological symptoms of dementia (BPSD) such as agitation, aggression, apathy, and depression. These symptoms are among the most complex and challenging issues in dementia care: they not only contribute to the patient's inability to live independently, but also place a considerable burden on caregivers.

Recently, Eunhee Cho and colleagues at Yonsei University in Korea developed and validated several machine learning models for predicting BPSD. The study was published in the journal "Scientific Reports" under the title "Machine learning-based predictive models for the occurrence of behavioral and psychological symptoms of dementia: model development and validation".

   The research results have been published in "Scientific Reports"

Paper address:

Machine learning-based predictive models for the occurrence of behavioral and psychological symptoms of dementia: model development and validation | Scientific Reports

data set

Data were collected in three rounds. A total of 187 dementia patients were used for model training and 35 additional patients for external validation. The second round was a repeated measurement of the participants from the first round, while the third round recruited new participants. In the study, the data from the first and second rounds served as the training set, and the data from the third round served as the test set.

To collect comprehensive characteristic information on the participants, the researchers first surveyed their health data (age, gender, marital status, etc.), then used actigraphs (wearable activity recorders) to monitor nighttime sleep and activity levels, and finally used a symptom diary in which caregivers recorded the triggers they perceived for symptoms (hunger/thirst, urination/defecation, pain, insomnia, noise, etc.) and the 12 BPSDs that occurred in the patient each day. These symptoms were further grouped into 7 sub-syndromes. Table 1 below summarizes the data recorded by the actigraphs and the symptom diaries.

Table 1: Statistics for Actigraphy and Symptom Diaries

SD: standard deviation

TST: total sleep time

WASO: wake after sleep onset

NoA: number of awakenings

MAL: waking hours

METs: metabolic equivalents

MVPA: moderate to vigorous physical activity

BPSD: Behavioral and Psychological Symptoms of Dementia

Other causes: other BPSD triggers perceived by caregivers (therapy, nightmares, etc.)

However, some actigraphy data were missing because participants did not comply or wore the device improperly. Participants with missing data accounted for 36% of the total, with an average of 0.9 days of data missing per person. The researchers therefore handled the missing data with multiple imputation by chained equations (MICE).
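As a rough illustration of the imputation step, the sketch below runs a chained-equations imputation on synthetic actigraphy-like data. It uses scikit-learn's IterativeImputer as a stand-in, since the article does not say which software or settings the researchers used. The three columns, the 10% missing rate and the choice of five imputed datasets are all illustrative assumptions.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the import below)
from sklearn.impute import IterativeImputer

# Synthetic stand-in data (NOT the study's data): three actigraphy-like columns.
rng = np.random.default_rng(0)
n = 200
tst = rng.normal(420, 60, n)                      # total sleep time, minutes
waso = 120 - 0.15 * tst + rng.normal(0, 10, n)    # wake after sleep onset, minutes
mets = rng.normal(1.3, 0.2, n)                    # metabolic equivalents
X = np.column_stack([tst, waso, mets])

# Knock out about 10% of the entries at random to mimic missing recordings.
mask = rng.random(X.shape) < 0.10
X_missing = X.copy()
X_missing[mask] = np.nan


def impute_once(data, seed):
    """One chained-equations pass: each column is regressed on the others in turn.

    sample_posterior=True draws imputed values from the predictive
    distribution, so different seeds give different plausible datasets.
    """
    imputer = IterativeImputer(sample_posterior=True, max_iter=10, random_state=seed)
    return imputer.fit_transform(data)


# Multiple imputation: several completed datasets instead of a single one.
imputed_sets = [impute_once(X_missing, seed) for seed in range(5)]

for i, completed in enumerate(imputed_sets):
    print(f"imputation {i}: mean imputed TST = {completed[mask[:, 0], 0].mean():.1f} min")
```

In a full multiple-imputation workflow, a model would be fitted on each completed dataset and the results pooled. The sketch stops at producing the completed datasets.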

experiment procedure

The researchers trained four models to determine the best model for predicting each subsyndrome. Based on the findings, these models can be applied to the clinical monitoring and prediction of BPSD subsyndromes. Clinicians can then intervene on potential BPSD influencing factors to deliver patient-centered dementia care. The machine learning algorithms can also be embedded in smartphone apps to further increase their value.

model performance 

The researchers evaluated four machine learning algorithms (logistic regression, random forest, gradient boosting machine, and support vector machine) and selected the best model for predicting each BPSD subsyndrome. Logistic regression, the most common and mature of the four, served as the baseline against which the performance gains of the other machine learning models were judged.
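A comparison of this kind can be sketched in a few lines with scikit-learn. The example below is only an illustration under stated assumptions: the data are synthetic (generated with make_classification), the hyperparameters are library defaults, and the fold setup is a plain stratified five-fold split. None of this is taken from the paper, and a real analysis with repeated measurements per patient would likely need to keep each patient's records in the same fold.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary task standing in for "did this sub-syndrome occur?" (assumption).
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=5, random_state=0
)

# The four model families named in the article, with default settings.
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "GBM": GradientBoostingClassifier(random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(random_state=0)),
}

# Five-fold cross-validation, scored by area under the ROC curve.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
mean_auc = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, auc in mean_auc.items():
    print(f"{name}: mean AUC = {auc:.3f}")

best = max(mean_auc, key=mean_auc.get)
print("best model on this synthetic task:", best)
```

Repeating the loop once per sub-syndrome label would give one row of AUC values per sub-syndrome, which is the layout of the tables reported in the study.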

Based on the training set and five-fold cross-validation, the performance of the different models in predicting BPSD sub-syndromes is as follows:

Table 2: Prediction performance of different models for BPSD subsyndromes based on the training set

AUC: Area under the ROC curve

LR: logistic regression model

RF: Random Forest Model

GBM: Gradient Boosting Machine Model

SVM: Support Vector Machine Model

ROC curve: The ROC (receiver operating characteristic) curve is a graphical tool that depicts the performance of a classifier.

AUC value: The AUC (area under the curve) value is the area under the ROC curve and measures the performance of the classifier. The closer the AUC value is to 1, the better the classifier performs.
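To make the AUC value concrete, the sketch below computes it directly from its rank interpretation: the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The function name roc_auc and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = symptom occurred, 0 = it did not; scores are model outputs.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 3 of the 4 positive/negative pairs are ordered correctly -> 0.75
```

A value of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, which is why values closer to 1 indicate a better classifier.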

Table 2 shows that the gradient boosting machine model has the highest AUC values for hyperactivity (0.706), affective symptoms (0.747) and eating disorders (0.816); the support vector machine model has the highest AUC value for psychotic symptoms (0.706); the random forest model has the highest AUC value for sleep and nocturnal behavior (0.942); and the logistic regression model has the highest AUC values for abnormal motor behavior (0.822) and euphoria/elation (0.696).

model validation 

The researchers validated the models externally on the dataset from the third round of data collection. Based on this test set, the performance of the different models in predicting BPSD sub-syndromes is as follows:

Table 3: Prediction performance of different models for BPSD subsyndrome based on test dataset

AUC: Area under the ROC curve

LR: logistic regression model

RF: Random Forest Model

GBM: Gradient Boosting Machine Model

SVM: Support Vector Machine Model

Table 3 shows that the machine learning models perform better than the logistic regression model. Specifically, for most of the sub-syndromes, the random forest and gradient boosting machine models outperform the logistic regression and support vector machine models; [...] and eating disorders (0.888) than other predictive models; the gradient boosting machine model has the highest AUC value for psychotic symptoms (0.801); and the support vector machine model has the highest AUC value for sleep and nocturnal behavior (0.929).

Combining the two tables, the researchers found that the gradient boosting machine model had the highest average AUC value across the 7 sub-syndromes, that is, the best overall performance. The researchers also cautioned that, because the test dataset is small, the prediction performance results should be extrapolated with care, and they suggested repeating the experiments with larger samples in the future to obtain more reliable results.

Domestic achievements: predicting the onset of dementia ten years in advance

Progress in dementia prediction is not limited to work abroad; China has also achieved remarkable results. Last September, the clinical research team of Yu Jintai, Chief Physician of the Department of Neurology at Huashan Hospital Affiliated to Fudan University, together with Professor Feng Jianfeng of the Institute of Brain-Inspired Intelligence Science and Technology of Fudan University and the algorithm team of young researcher Cheng Wei, developed the UKB-DRP dementia prediction model.

The model can predict whether an individual will develop dementia within the next five years, ten years or even longer, and can screen for people in the early stages of the disease. It covers all-cause dementia and its major subtypes (such as Alzheimer's disease). The research results have been published in "eClinicalMedicine", a journal in "The Lancet" family.

Paper address:

https://www.thelancet.com/journals/eclinm/article/PIIS2589-5370(22)00395-9/fulltext

This result also demonstrates China's innovation strength and research level in the field of dementia prediction. In the future, as more institutions and research teams join in and more comprehensive and diverse data accumulate, we can expect to see more cooperation and progress at home and abroad. With the power of artificial intelligence and big data analysis, researchers can make greater contributions to the prevention, treatment and management of dementia, and bring more hope and well-being to patients and their families.

