2023 Huashu Cup Mathematical Modeling Problem C: Complete Paper, Including the Code for Each Question

Table of contents

 

Summary

2.1 Problem analysis of problem 1

2.2 Problem analysis of problem 2

2.3 Problem analysis of problem 3

The finished paper is here


 

Summary

For Question 1, we used data from 390 infants aged 3 to 12 months and their mothers to explore how the mothers' physical and psychological indicators affect infant behavioral characteristics and sleep quality. We first performed a descriptive statistical analysis to understand the data. Then, after One-Hot encoding the infant behavior characteristics and sleep patterns, we used a random forest model to evaluate the relationship between the mothers' physical indicators (such as age, marital status, and education level) and psychological indicators (such as CBTS, EPDS, and HADS) and infant sleep quality. The results showed that the mother's mental health status was significantly related to the baby's sleep quality; in particular, the mother's depression and anxiety symptoms were negatively correlated with it. The mother's education level and the way the baby fell asleep were also related to the baby's sleep quality. These findings underscore the importance of maternal mental health to infant development and can inform future intervention strategies.
Question 2 covers the full pipeline, from data preprocessing and feature engineering to model building and solving. The goal is to predict the target variable from the given features. In the data loading and inspection phase, we loaded the data and carried out basic checks and an overview. The data preprocessing phase included encoding categorical variables, handling missing values, and normalizing the data. The feature engineering phase scaled the numerical features to meet the needs of the models. In the model building and solving phase, we tried basic models (logistic regression, support vector machine, k-nearest neighbors, and gradient boosting trees) and ensemble models (random forest, AdaBoost, and XGBoost), and tuned the hyperparameters of the random forest model with grid search. Among the models tried, logistic regression performed best, with an accuracy of about 64.10%. We also used a variety of plots and visualizations to gain insight into the data distribution and model performance.
Question 3 discusses how to use mathematical modeling and optimization to estimate the minimum treatment cost needed to change an infant's behavioral characteristics from ambivalent to moderate or quiet. First, we extracted the relevant features from the provided dataset. We then used linear interpolation to model the relationship between treatment cost and scores such as CBTS, EPDS, and HADS. Next, we constructed a linear programming model that minimizes treatment cost while ensuring that a target score, such as the average score for moderate or quiet infants, is reached.
In Question 4, the baby's sleep quality is graded on four levels: excellent, good, medium, and poor. The assessment is based on three key factors: total nightly sleep duration, number of awakenings, and the way of falling asleep. Total sleep duration is measured in minutes, and points are assigned by time interval. The number of awakenings is scored to reflect the continuity of sleep. The way of falling asleep, which has five categories (such as the sleeping method and the touching method), is scored according to how comfortable and effective it is. Combining these factors, we calculated a total score for each sample and divided sleep quality into four grades based on that score. For the processed data, we checked data integrity, handled missing values, and applied ordinal encoding. We then used XGBoost for feature selection and removed uninformative features. A stacking ensemble classifier was subsequently built to predict the sleep quality grade, with random forest, support vector machine, and gradient boosting machine as base classifiers and the best parameter combination found by grid search. The model reached an accuracy of 91.54% on the test set. We also plotted the ROC curve and confusion matrix of each model and used the trained model to make predictions for new feature values.
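The scoring rule described for Question 4 can be sketched roughly as follows. The summary does not give the actual time intervals, point values, or grade cut-offs, so every threshold below, the 1 to 5 coding of the way of falling asleep, and the function names are hypothetical placeholders chosen only to illustrate the idea.

```python
# Minimal sketch of a points-based sleep quality grade.
# ALL thresholds and point values are HYPOTHETICAL, not taken from the paper.

def duration_points(minutes):
    """Points for total nightly sleep duration, given in minutes."""
    if minutes >= 600:
        return 3
    if minutes >= 540:
        return 2
    if minutes >= 480:
        return 1
    return 0

def awakening_points(count):
    """Points for the number of awakenings (fewer awakenings, more points)."""
    if count == 0:
        return 3
    if count == 1:
        return 2
    if count <= 3:
        return 1
    return 0

# Five ways of falling asleep, assumed to be coded 1..5 (hypothetical points).
METHOD_POINTS = {1: 3, 2: 2, 3: 2, 4: 1, 5: 0}

def sleep_grade(minutes, awakenings, method):
    """Combine the three factors into a total score and map it to a grade."""
    total = (duration_points(minutes)
             + awakening_points(awakenings)
             + METHOD_POINTS[method])
    if total >= 8:
        return "excellent"
    if total >= 6:
        return "good"
    if total >= 3:
        return "medium"
    return "poor"

print(sleep_grade(620, 0, 1))
print(sleep_grade(500, 2, 4))
```

The grades produced this way would then serve as the labels that the stacking classifier is trained to predict.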

2.1 Problem analysis of problem 1


In this study, we focus on how the mother's physical and psychological indicators affect the infant's behavioral characteristics and sleep quality. By analyzing data from 390 infants aged 3 to 12 months and their mothers, we aim to reveal the potential relationship between the mother's age, marital status, education level, gestational duration, mode of delivery, and mental health status on the one hand and the infant's sleep quality and behavioral characteristics on the other.
The problem analysis must consider the following aspects:
(1) Data understanding: We need to understand the structure and content of the data, including the meaning of each indicator and the possible associations between them, in order to determine the direction and method of analysis.
(2) Feature processing: Since the data includes both numerical and categorical features, we need to consider how to process them, for example by using One-Hot encoding to convert categorical features.
(3) Model selection: Choosing an appropriate model is key. Here the random forest model was chosen because it can handle complex non-linear relationships and provides an assessment of feature importance.
(4) Interpretation of results: Interpreting the model results to understand how the mother's physical and psychological state affects the infant's behavior and sleep can help provide targeted intervention or support.
(5) Visualization: Drawing descriptive statistical charts and model result plots makes the analysis more intuitive and easier to understand.
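The feature processing and model selection steps above might look roughly like the sketch below. The competition data is not reproduced here, so the table is synthetic and the column names (`mother_age`, `fall_asleep_method`, and so on), score ranges, and the binary target are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 200

# SYNTHETIC stand-in for the real dataset; column names are hypothetical.
df = pd.DataFrame({
    "mother_age": rng.integers(20, 45, n),
    "education": rng.integers(1, 6, n),
    "CBTS": rng.integers(0, 21, n),
    "EPDS": rng.integers(0, 31, n),
    "HADS": rng.integers(0, 43, n),
    "fall_asleep_method": rng.choice(["A", "B", "C"], n),
})

# Synthetic binary target standing in for a sleep quality label.
noise = rng.normal(0, 5, n)
y = ((df["EPDS"] + df["HADS"] + noise) > 35).astype(int)

# One-Hot encode the categorical column; numeric columns pass through.
X = pd.get_dummies(df, columns=["fall_asleep_method"], dtype=int)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X, y)

# Impurity-based feature importances, largest first.
importance = pd.Series(model.feature_importances_, index=X.columns)
importance = importance.sort_values(ascending=False)
print(importance)
```

Impurity-based importances only rank how much each feature contributes to the splits; they do not indicate the direction of an effect, so a claim such as a negative correlation needs a separate check.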


2.2 Problem analysis of problem 2


In this problem, our goal is to predict the infant's behavioral characteristics. The available features include the mother's age, marital status, education, gestational age (weeks), mode of delivery, and several mental health indicators (CBTS, EPDS, HADS). Our task is to build a model on these features that accurately predicts the infant's behavior pattern.
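A minimal sketch of how several candidate classifiers could be compared and how a random forest could be tuned with grid search is given below. It uses scikit-learn with synthetic three-class data in place of the real features; the candidate set, the parameter grid, and the 5-fold cross-validation are assumptions for illustration, not the settings used in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# SYNTHETIC three-class data standing in for the behavior-pattern labels.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           n_classes=3, random_state=0)

# Scale-sensitive models are wrapped in a pipeline with a scaler.
candidates = {
    "logistic_regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "random_forest": RandomForestClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
cv_scores = {name: cross_val_score(model, X, y, cv=5).mean()
             for name, model in candidates.items()}
for name, score in cv_scores.items():
    print(f"{name}: {score:.3f}")

# Grid search over a small, hypothetical random forest parameter grid.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Putting the scaler inside the pipeline keeps the scaling parameters from being fitted on the validation folds during cross-validation.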


2.3 Problem analysis of problem 3


In this problem, the challenge is to estimate the minimum cost of treatment required to change an infant's behavioral characteristics from ambivalent to moderate or quiet. The problem involves many complex factors and potential relationships, and requires careful analysis and precise modeling. The main aspects of the problem analysis are as follows:
(1) Data understanding: The dataset covers 390 infants aged 3 to 12 months and their mothers, and includes physical indicators, psychological indicators, and infant sleep quality indicators. Understanding these variables and their possible interactions is key to building effective models.
(2) Feature engineering: Selecting appropriate features to describe the mother's mental health status and the infant's behavioral characteristics is an important step. This may involve feature selection, feature transformation, and possibly feature interaction.
(3) Model selection: Due to the complexity of the problem, it may be necessary to use advanced machine learning techniques, such as gradient boosting machines, support vector machines, and deep neural networks, to capture complex non-linear relationships.
(4) Cost estimation: The relationship between treatment cost and disease severity may be non-linear. Linear interpolation can be used to estimate treatment costs within a given score range.
(5) Optimization problem: The ultimate goal is to find a solution that minimizes the cost of treatment while ensuring that a specific target score is achieved. This is an optimization problem that may involve linear programming or other optimization techniques.
(6) Practical considerations: When constructing the mathematical model, practical medical and mental health factors also need to be considered to ensure that the solution is feasible and practical.
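The cost estimation and optimization steps above can be illustrated with a small sketch. The anchor points in `COST_TABLE` are hypothetical and not taken from the problem data. The sketch also assumes that lowering a score from s to t costs the difference of the interpolated costs at s and t, and that each scale only has to be brought down to its own target. Under those assumptions the cost is separable and non-decreasing in the score, so the minimum is reached by lowering each score exactly to its target and no solver is needed. A real linear program with coupled constraints would need a solver instead.

```python
from bisect import bisect_right

def interpolate(x, xs, ys):
    """Piecewise-linear interpolation at x; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])

# HYPOTHETICAL anchor points: scale -> (scores, treatment costs).
COST_TABLE = {
    "CBTS": ([0, 10], [0.0, 1000.0]),
    "EPDS": ([0, 10, 20], [0.0, 500.0, 2000.0]),
    "HADS": ([0, 10], [0.0, 2000.0]),
}

def min_treatment_cost(current, target):
    """Cost of lowering every score to its target (assumed cost model)."""
    total = 0.0
    for scale, (xs, ys) in COST_TABLE.items():
        start = current[scale]
        goal = min(start, target[scale])  # scores already below target cost nothing
        total += interpolate(start, xs, ys) - interpolate(goal, xs, ys)
    return total

# Hypothetical mother: current scores and target scores
# (for example, the average scores in the moderate group).
current = {"CBTS": 8, "EPDS": 15, "HADS": 6}
target = {"CBTS": 5, "EPDS": 10, "HADS": 8}
cost = min_treatment_cost(current, target)
print(cost)
```

With these placeholder numbers, CBTS contributes 300, EPDS contributes 750, and HADS contributes nothing because the current score is already below the target.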

The finished paper is here

The full paper is available through the business card at the end of the article.
 

 

 

Origin blog.csdn.net/weixin_45499067/article/details/132114245