2023 20th May Day Mathematical Contest in Modeling, Problem B: Super Detailed Ideas

A video version of these detailed ideas has also been released. This is the corresponding text version, and the content is much the same.

Question B: Express delivery demand analysis problem

Question B is not very hard conceptually; the difficulty lies in solving the models for the later sub-questions. Its characteristics are many sub-questions, many models, and a lot of tedious work.

Difficulty A>B>C

Topic selection B>C>A

Given data → data preprocessing (outliers, missing values)

Many questions, semi-open results

Evaluation + Prediction + Optimization

(Introduction to the data files) Attachments 1, 2, and 3 are express transportation data between some cities recorded by a domestic express company, including the shipping date, shipping city, and receiving city. The city names have been replaced by letters (anonymized), and the data for June, November, and December have been removed.

Data preprocessing (missing values and outliers; note that a value of 0 is not a missing value)

Question 1: Attachment 1 is the express transportation data between site cities (shipping city-receiving city) recorded by the express company from April 19, 2018 to April 17, 2019. Considering multiple perspectives such as receipt volume, shipment volume, the growth/decline trend of express quantity, and correlation (and also, for example, maximum receipt volume and maximum shipment volume), establish a mathematical model, rank the importance of each site city (sample) comprehensively, and fill in Table 1 with the names of the top 5 site cities by importance.

The first question, put simply, asks us to choose indicators and build a comprehensive evaluation model. Before building the model we need to preprocess the data: Question B supplies the data files, so missing values and outliers must be considered. For the choice of indicators, the problem statement hints at receipt volume, shipment volume, the growth/decline trend of express quantity, correlation, and so on. You can also start from other angles, such as the maximum receipt volume and the maximum shipment volume.

To calculate the receipt volume, we can use the filter function in WPS to list all rows with the same receiving city and then total them. Alternatively, use MATLAB, Python, etc. with a few for loops to sum the quantities for rows that share the same city; the shipment volume is computed the same way.
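The for-loop aggregation described above can be sketched in Python as follows. This is only a minimal illustration: the record layout `(date, shipping_city, receiving_city, quantity)` and the sample values are assumptions, and the real column layout of Attachment 1 may differ.

```python
from collections import defaultdict

# Hypothetical records in the shape (date, shipping_city, receiving_city, quantity).
# The sample values are made up purely to show the mechanics.
records = [
    ("2018-04-19", "A", "B", 10),
    ("2018-04-19", "A", "C", 5),
    ("2018-04-19", "B", "A", 7),
    ("2018-04-20", "C", "A", 3),
    ("2018-04-20", "A", "B", 12),
]

def city_totals(rows):
    """Return (shipped, received): total quantity per city in each role."""
    shipped = defaultdict(int)
    received = defaultdict(int)
    for _date, src, dst, qty in rows:
        shipped[src] += qty    # quantity sent out by the shipping city
        received[dst] += qty   # quantity arriving at the receiving city
    return dict(shipped), dict(received)

shipped, received = city_totals(records)
print(shipped)
print(received)
```

The two dictionaries can then serve as the shipment-volume and receipt-volume indicators for each city.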

Build a comprehensive evaluation model. For the choice of model we have many options, for example objective comprehensive evaluation methods such as principal component analysis, the rank-sum ratio method, and TOPSIS (the ideal solution method). Comprehensive evaluation is normally open-ended, but here we must fill specific results into a table. The top five cities will probably fall within a certain range, so the result for this question is semi-open. Just pick a comprehensive evaluation model and plug the data into ready-made code.
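As one possible instance of such an objective evaluation, here is a minimal sketch of TOPSIS with entropy weights. The choice of entropy weighting, the two indicators (shipment volume, receipt volume), and the sample numbers are all assumptions made for illustration; every indicator is treated as "larger is better" and must be positive.

```python
import math

def entropy_weights(matrix):
    """Entropy weights for a matrix of positive, benefit-type indicators."""
    n, m = len(matrix), len(matrix[0])
    weights = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        probs = [v / total for v in col]
        # Entropy scaled to [0, 1]; terms with p == 0 contribute nothing.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        weights.append(1.0 - entropy)
    s = sum(weights)  # zero only if every indicator is constant across rows
    return [w / s for w in weights]

def topsis(matrix, weights):
    """Closeness of each row to the ideal solution (higher is better)."""
    m = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(m)]
    scaled = [[weights[j] * row[j] / norms[j] for j in range(m)] for row in matrix]
    best = [max(r[j] for r in scaled) for j in range(m)]
    worst = [min(r[j] for r in scaled) for j in range(m)]
    scores = []
    for r in scaled:
        d_best = math.dist(r, best)
        d_worst = math.dist(r, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical indicator matrix: one row per city, columns = (shipped, received).
cities = ["A", "B", "C", "D"]
indicators = [[27, 10], [7, 22], [3, 5], [15, 15]]

weights = entropy_weights(indicators)
scores = topsis(indicators, weights)
ranking = sorted(zip(cities, scores), key=lambda cs: cs[1], reverse=True)
print(ranking)
```

With real data, more indicator columns (trend, correlation, maximum volumes) would be added to the matrix, and the first five entries of `ranking` would go into Table 1.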

Question 2. Please use the data in Attachment 1 to establish a mathematical model that predicts the express transportation quantity between each "shipping-receiving" site city pair on April 18, 2019 and April 19, 2019, as well as the total express transportation quantity between all "shipping-receiving" site city pairs on each of those days. Fill in Table 2 with the quantity for the designated site city pairs and the daily total over all "shipping-receiving" site city pairs.

Question 2 is in essence a simple prediction problem. The difficulty lies not in choosing the prediction model but in the large amount of data, which forces us to repeat the prediction many times. As for the choice of prediction model, there are many suitable options, as shown in the figure below.

For example, the commonly used LSTM model, thanks to its long short-term memory mechanism, can distinguish long-term trends from seasonal trends well.
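The "repeat the prediction for every city pair" workflow can be sketched as a loop. To keep the example short and dependency-free, simple exponential smoothing stands in for the LSTM here; it is a swapped-in baseline, not the model discussed above, and the `history` dictionary and `alpha` value are hypothetical.

```python
def ses_forecast(series, alpha=0.3):
    """One-step-ahead forecast by simple exponential smoothing."""
    level = series[0]
    for value in series[1:]:
        # New level = weighted mix of the latest observation and the old level.
        level = alpha * value + (1 - alpha) * level
    return level

# Hypothetical daily quantities per (shipping city, receiving city) pair.
history = {
    ("A", "B"): [10, 12, 11, 13],
    ("B", "C"): [5, 5, 5, 5],
}

# One forecast per pair, then the daily total over all pairs.
forecasts = {pair: ses_forecast(values) for pair, values in history.items()}
total = sum(forecasts.values())
print(forecasts, total)
```

Whatever model is finally used, the structure stays the same: one fitted forecast per pair, followed by a sum over all pairs for the daily total.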

This is also a prediction model that I use often, so for convenience I have adapted its code: you only need to replace the initial input data and tune a few parameters to fit the data in order to get results (the places where parameters should be changed are marked). It will be included in a later share for your reference.

For Question 2 there are a great many values to predict. If the team has enough time, the predictions can be made one by one. There is also a faster way, which is to fill in values: run the prediction two or three times to get a rough idea of the range of the results, then write in the remaining results accordingly. Keep in mind that this black-box shortcut cannot be described in the paper; the paper should still state that an XXXX prediction model was established and XXXX results were obtained. The shortcut only speeds up solving the model.

Question 3. Attachment 2 is the express transportation quantity recorded by the express company from April 28, 2020 to April 27, 2023. Because of emergencies, the express routes between some cities could not operate normally, so shipping or receiving between those site cities was not possible (no data means that shipping and receiving were not possible; 0 means there was no shipping demand). Please use the data in Attachment 2 to establish a mathematical model that predicts which city pairs (shipping city-receiving city) can ship and receive normally on April 28, 2023 and April 29, 2023. Judge whether the site city pairs specified in Table 3 can ship normally; if they can, give the corresponding express transportation quantity, and fill in the results in Table 3.

Question 3 is still a prediction problem. Its difficulty is that some routes cannot ship normally. We can approximately treat this situation as a kind of outlier, so judging whether the site city pairs specified in Table 3 can ship normally can be understood as outlier prediction, for which there is plenty of literature, code, and models. For the model in Question 3, we can choose the same basic prediction model as in Question 2, or an advanced prediction model better suited to outliers. I will add more information here later. The prediction of express quantity can reuse the prediction model built in Question 2. As for whether a pair can ship normally, you can build a separate model or continue to use the model from Question 2.
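Whichever model is chosen, the data convention from the problem statement (no data = cannot ship, 0 = no demand) has to be preserved during preprocessing. The sketch below shows one way to keep the two cases apart, together with a deliberately naive availability rule. The rule, the `window` and `threshold` parameters, and the sample series are all hypothetical and only illustrate the idea; they are not a model proposed in this article.

```python
def availability_rate(series, window=7):
    """Share of the last `window` days on which the route had any record.

    None marks "no data" (route could not ship); 0 is a valid record
    meaning the route was open but had no shipping demand.
    """
    recent = series[-window:]
    return sum(v is not None for v in recent) / len(recent)

def can_ship(series, window=7, threshold=0.5):
    """Naive rule: predict 'can ship' if the route was mostly open recently."""
    return availability_rate(series, window) >= threshold

# Hypothetical daily series for two city pairs.
mostly_open = [4, 0, None, None, 5, 0, 3]
mostly_closed = [4, None, None, None, None, None, 0]

print(can_ship(mostly_open), can_ship(mostly_closed))
```

Only pairs predicted as able to ship would then be passed to the quantity prediction model from Question 2.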

Question 4

Remarks: To simplify the calculation, the weight and size of parcels are not distinguished; each parcel is assumed to have unit weight 1. Only transportation cost is considered, and other costs such as transshipment are ignored.

Question 4 is a variant of the TSP. It resembles the 2003 open-pit mine production problem and the problem on the setup and scheduling of traffic police service platforms, and you can refer to how the decision variables were defined in those two problems. I will also collect materials on these two contest problems later. My initial idea is to take the minimum transportation cost as the objective function, the conditions given in the question as the constraints, and Xij as the decision variable, representing the freight volume from the i-th city to the j-th city. This is offered for your reference.
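Written out, this initial idea might look like the generic sketch below. Only the decision variable $x_{ij}$ (the Xij above) and the minimum-cost objective come from the text; the unit cost $c_{ij}$, the route set $E$, the net supply $b_i$ of city $i$, and the flow-balance form of the constraints are assumptions, since the actual conditions of Question 4 are not reproduced here.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{(i,j)\in E} c_{ij}\,x_{ij} \\
\text{s.t.}\quad & \sum_{j:(i,j)\in E} x_{ij} \;-\; \sum_{j:(j,i)\in E} x_{ji} \;=\; b_i, \qquad \forall i, \\
& x_{ij} \ge 0, \qquad \forall (i,j)\in E.
\end{aligned}
```

If the real cost depends nonlinearly on the load of a route, the term $c_{ij}\,x_{ij}$ would be replaced by the cost function given in the question, and the model would no longer be a linear program.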

As for solving the optimization model, those who are familiar with optimization can build the model and solve it. For beginners, this question is extremely hard. This is where my "ten-lesson award guarantee" theory comes in: an appropriate model + an advanced algorithm + reasonable results can win an award. What is a reasonable result? The point is aimed at the many novices who know some models but cannot solve them, and who are therefore missing a result. To obtain one, you can search online or buy one, run a numerical simulation, or, simplest of all, make up a reasonable result. The premise is that it is reasonable! It is usually hard for the judges to tell whether a result has been made up. So when the model cannot be solved, we can choose to make up a reasonable result.

Question 5. Normally, express delivery demand consists of two parts. One part is fixed demand, which comes from necessary everyday online shopping (in general it cannot simply be taken as the minimum of the historical demand data; it is usually smaller than that minimum). The other part is non-fixed demand, which usually fluctuates strongly and is heavily affected by factors such as time. Assume that within the same quarter, the fixed demand of a given "shipping-receiving" site city pair is a definite constant (hereinafter the fixed demand constant), and the non-fixed demand of the same "shipping-receiving" site city pair follows a certain probability distribution (the mean and standard deviation of this distribution are called the non-fixed demand mean and the non-fixed demand standard deviation, respectively). Please use the data in Attachment 2, ignoring data that have been removed, data with no shipping demand, and data for days when shipping was not possible, to solve the following problems.

(1) Establish a mathematical model to estimate the fixed demand constant on a quarterly basis and verify its accuracy. Fill in Table 5 with the fixed demand constant of the designated "shipping-receiving" site city pair in the designated quarter, and the sum of the fixed demand constants of all "shipping-receiving" city pairs in that quarter.

(2) Give a method for estimating the probability distribution of the non-fixed demand. Fill in Table 5 with the non-fixed demand mean and standard deviation of the designated "shipping-receiving" site city pair in the designated quarter, together with the sum of the non-fixed demand means and the sum of the non-fixed demand standard deviations over all "shipping-receiving" city pairs in that quarter.
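Under the assumption stated in the question (observed demand = fixed constant + non-fixed demand), the mean and standard deviation in part (2) follow directly once a fixed demand constant has been estimated. The sketch below only shows that arithmetic: the sample data and the constant `c = 40` are hypothetical, and how to estimate the constant itself is left open.

```python
import statistics

def non_fixed_stats(demand, fixed_constant):
    """Sample mean and standard deviation of demand minus the fixed constant."""
    residual = [d - fixed_constant for d in demand]
    return statistics.mean(residual), statistics.stdev(residual)

# Hypothetical daily demand of one city pair within one quarter.
quarter_demand = [52, 60, 55, 71, 58]

# Hypothetical fixed demand constant; per the question it should lie
# below the observed minimum (here 52).
c = 40

mean_nf, std_nf = non_fixed_stats(quarter_demand, c)
print(mean_nf, std_nf)
```

Subtracting a constant shifts the mean but leaves the spread unchanged, so the non-fixed demand standard deviation equals the sample standard deviation of the observed demand; only the mean depends on the estimated constant.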

For Question 5, I have not yet found a suitable model category, so for now I can only treat it as an ordinary data calculation question. Once I understand it more deeply, I will post an update promptly.

Finally, I wish everyone a smooth competition and a happy holiday!


Origin blog.csdn.net/qq_33690821/article/details/130426832