2021 Asia-Pacific Cup Mathematical Contest in Modeling

Key concepts and basic models for data-type problems

I have participated in the Asia-Pacific Cup four times and won the First Prize three times. The other time, my computer burned out halfway through the contest, so I had to stop writing the modeling paper and ended up with the Second Prize. Think of this post as a bridge to your own goal: helping every student win a prize in mathematical modeling competitions by getting a firm grip on the two key tasks, programming and mathematical models.
It can be said that, of all the modeling competitions, the Asia-Pacific Cup is the one in which you are most likely to win an award. Whether this year's problem asks you to study the efficiency of an algorithm, to explore patterns in data, or to produce calculation results that fall within a fixed range, the tried-and-tested basic models described below can be applied:

Basic statistical description of the data

– Purpose
• To better identify the nature of the data and grasp its overall picture.
– Basic statistical description of data
• Measures of central tendency, measures of data dispersion, graphical display of data
– Measures of central tendency
• Mean, weighted arithmetic mean, median, mode, midrange
– Measures of data dispersion
• Range, quantiles and quartiles, variance and standard deviation
– Graphical display of data
• Box plots, pie charts, frequency histograms, and scatter plots

I do this every year, and it is how I won my First Prizes, because data analysis has an order: first, process the data; second, understand the data from a macro perspective, a step called the basic statistical description of the data; third, analyze specific data variables in a targeted way.

1. Measure of central tendency

– Mean
• Let x_1, x_2, ..., x_N be N observed values of a numerical attribute X. The mean of this set of values is given by formula (2-1):
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{x_1 + x_2 + \cdots + x_N}{N} \tag{2-1}$$
– Weighted mean
• For i = 1, ..., N, each value x_i has a weight w_i:
$$\bar{x} = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i}$$
– Grouped median
• First use N/2 to locate the group that contains the median (the first group whose cumulative frequency reaches N/2), then interpolate within that group:
$$M_e = L + \frac{N/2 - S_{m-1}}{f_m} \times d$$
M_e: the median; L: the lower limit of the group that contains the median; S_{m-1}: the cumulative frequency of all groups below the group that contains the median; f_m: the frequency of the group that contains the median; d: the width (class interval) of the group that contains the median.
– Mode: the most frequent value in the data
– Midrange: the arithmetic mean of the maximum and minimum values in the data set
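The measures above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the sample `data`, the helper names, and the equal-width grouping are assumptions made for the example, and the grouped median assumes the standard interpolation formula Me = L + (N/2 − S_{m−1}) / f_m × d built from the symbols defined above.

```python
from collections import Counter
from statistics import mean, median


def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def grouped_median(lower_bounds, width, freqs):
    """Median estimated from equal-width groups: L + (N/2 - S_{m-1}) / f_m * d."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("at least one observation is required")
    cumulative = 0  # S_{m-1}: cumulative frequency of the groups below the current one
    for lower, f in zip(lower_bounds, freqs):
        if cumulative + f >= n / 2:  # first group whose cumulative frequency reaches N/2
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f


def modes(values):
    """All values that occur most often (a data set can have more than one mode)."""
    counts = Counter(values)
    top = max(counts.values())
    return [v for v, c in counts.items() if c == top]


def midrange(values):
    """Arithmetic mean of the maximum and the minimum."""
    return (min(values) + max(values)) / 2


# Hypothetical sample, for illustration only
data = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]

print("mean:", mean(data))
print("median:", median(data))
print("modes:", modes(data))
print("midrange:", midrange(data))
print("weighted mean:", weighted_mean([80, 90], [1, 3]))
# Groups [0,10), [10,20), [20,30), [30,40) with frequencies 2, 5, 8, 5
print("grouped median:", grouped_median([0, 10, 20, 30], 10, [2, 5, 8, 5]))
```

Note how the single large value 110 pulls the mean and the midrange upward while the median barely moves; that contrast is often worth a sentence in a modeling paper.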

2. Data Dispersion Metrics

– Range: the gap between the maximum value and the minimum value in the set, that is, the maximum minus the minimum.
– Quantile: points taken at regular intervals of the data distribution that divide the data into consecutive sets of essentially equal size. Quartiles are the three points that divide the data into four such sets.
– Variance: the mean of the squared differences between each data point and the mean. The standard deviation σ is the square root of the variance.
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2$$
Dividing by N matches the definition above; the unbiased sample variance divides by N − 1 instead.
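As a rough sketch, the dispersion measures can be computed with Python's standard `statistics` module. The sample `data` is hypothetical, and the `"inclusive"` quantile method is just one of several conventions, so quartile values may differ slightly from those produced by other tools.

```python
from statistics import pstdev, pvariance, quantiles, variance

# Hypothetical sample, for illustration only
data = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]

# Range: maximum minus minimum
data_range = max(data) - min(data)

# Quartiles: three cut points that split the sorted data into four parts
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range

# Variance as "mean of squared deviations" (divide by N)
pop_var = pvariance(data)
pop_std = pstdev(data)

# Unbiased sample variance (divide by N - 1), shown for comparison
sample_var = variance(data)

print("range:", data_range)
print("quartiles:", q1, q2, q3, "IQR:", iqr)
print("variance (N):", pop_var, "std:", pop_std)
print("variance (N-1):", sample_var)
```

Whichever divisor you choose, say so in the paper, because the two variances differ noticeably on small samples.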

3. Graphical display of data (data visualization)

Plots are a must in every mathematical modeling competition, but do you really understand the graphical display of data? Do you know how to draw the plots? Do you know when to draw a histogram, when to draw a scatter plot, and when to draw a density histogram? Have you wondered why you drew the plots, got the right results, and still didn't win a prize? **Because you don't know which kind of plot to draw in which situation.** Let me tell you here: without comparison you cannot tell things apart, and if you know only one kind of plot, you effectively know none.

– Box plot: a statistical graphic that describes the distribution of data. It shows descriptive statistics such as the median, the quartiles, and the extreme values of the observed data.
– Pie chart (also known as a circle chart): usually used to show the components of a whole and the proportion of each part, that is, the size of each item in a data series relative to the sum of the items.

– Frequency histogram (also known as a frequency distribution histogram): a graph that represents a frequency distribution in statistics.

– Scatter plot: draws the sample data points in a two-dimensional plane or three-dimensional space, so that the statistical relationship between variables, and its strength, can be studied intuitively from the distribution of the points.
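Before drawing anything, it can help to compute the numbers each chart encodes. The sketch below is one possible way to do that with the standard library only: it derives the values a box plot displays and the bar heights of a frequency histogram. The sample `data`, the function names, the `"inclusive"` quartile method, and the 1.5 × IQR whisker rule are assumptions for illustration; plotting libraries may use slightly different conventions.

```python
from statistics import quantiles


def box_plot_stats(values):
    """Numbers a box plot displays; the 1.5 * IQR outlier rule is an assumed convention."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "min": min(values),
        "q1": q1,
        "median": med,
        "q3": q3,
        "max": max(values),
        "whisker_low": min(inside),   # lowest point that is not an outlier
        "whisker_high": max(inside),  # highest point that is not an outlier
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


def histogram_counts(values, start, width, bins):
    """Frequency of values in each bin [start + k*width, start + (k+1)*width)."""
    counts = [0] * bins
    for v in values:
        k = int((v - start) // width)
        if 0 <= k < bins:  # values outside the binned range are ignored
            counts[k] += 1
    return counts


# Hypothetical sample, for illustration only
data = [30, 36, 47, 50, 52, 52, 56, 60, 63, 70, 70, 110]

stats = box_plot_stats(data)
counts = histogram_counts(data, start=20, width=20, bins=5)
print(stats)
print(counts)
```

If the box plot statistics flag outliers, or the histogram shows an empty bin followed by an isolated bar, that is usually a sign to inspect those points before fitting any model.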

Proximity Measures for Nominal Attributes

− Dissimilarity
$$d(i, j) = \frac{p - m}{p}$$
• p is the total number of attributes describing the objects, and m is the number of matching attributes (that is, the number of attributes for which objects i and j are in the same state)
− Similarity
$$sim(i, j) = 1 - d(i, j) = \frac{m}{p}$$
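A minimal sketch of the nominal-attribute measure, assuming the standard matching-ratio definition d(i, j) = (p − m) / p with sim(i, j) = 1 − d(i, j), where p and m are as defined above. The function names and the two example objects are hypothetical.

```python
def nominal_dissimilarity(obj_i, obj_j):
    """d(i, j) = (p - m) / p for two objects described by nominal attributes."""
    if len(obj_i) != len(obj_j):
        raise ValueError("both objects must have the same number of attributes")
    p = len(obj_i)                                   # total number of attributes
    m = sum(a == b for a, b in zip(obj_i, obj_j))    # number of matching attributes
    return (p - m) / p


def nominal_similarity(obj_i, obj_j):
    """sim(i, j) = 1 - d(i, j) = m / p."""
    return 1 - nominal_dissimilarity(obj_i, obj_j)


# Hypothetical objects described by (color, shape, size, grade)
a = ("red", "round", "small", "A")
b = ("red", "square", "small", "B")

print(nominal_dissimilarity(a, b))
print(nominal_similarity(a, b))
```

Here two of the four attributes match, so the two objects are exactly as similar as they are dissimilar.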
For numeric data, common measures of dissimilarity (distance) are:
Euclidean distance: also known as straight-line distance.
Manhattan distance: also known as city block distance.
Minkowski distance: a generalization of the Euclidean and Manhattan distances.
Chebyshev distance: also known as the supremum distance; the supremum distance between two objects is defined as the maximum absolute difference over their coordinates.
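The four distances can be sketched as follows. This assumes both points are equal-length sequences of numbers; the function names and the parameter `h` (the order of the Minkowski distance) are chosen for this example.

```python
import math


def minkowski(x, y, h):
    """Minkowski distance of order h: (sum |x_k - y_k|^h)^(1/h)."""
    return sum(abs(a - b) ** h for a, b in zip(x, y)) ** (1 / h)


def manhattan(x, y):
    """City block distance: Minkowski distance with h = 1."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean(x, y):
    """Straight-line distance: Minkowski distance with h = 2."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def chebyshev(x, y):
    """Supremum distance: the largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


# Hypothetical points, for illustration only
x, y = (1, 2), (3, 5)

print("Manhattan:", manhattan(x, y))
print("Euclidean:", euclidean(x, y))
print("Minkowski (h=3):", minkowski(x, y, 3))
print("Chebyshev:", chebyshev(x, y))
```

As `h` grows, the Minkowski distance approaches the Chebyshev distance, which is why the latter is called the supremum distance.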

If you want to learn more and earn recognition in competitions, study hard. In your university studies, do not hesitate, do not get confused, and do not get tangled up; focus on the path of learning under your feet, go further and higher, and don't look back. One day you will look down and see the stars shining beneath your feet.

Origin blog.csdn.net/weixin_43292788/article/details/120266070