Analysis of Problem C in the 2021 Mathematical Contest in Modeling (MCM)

Download link

2021 MCM problem download link: https://pan.baidu.com/s/1yFdg3vBMS4MY7CnQ3PMG9Q
Extraction code: 6666

Problem C: Confirming the Buzz about Hornets

In September 2019, a colony of Vespa mandarinia (also known as the Asian giant hornet) was discovered on Vancouver Island in British Columbia, Canada. The nest was quickly destroyed, but news of the event spread rapidly throughout the region. Since then, several confirmed sightings of the pest have occurred in neighboring Washington State, along with a large number of mistaken sightings. See Figure 1 below for a map of detections, hornet watches, and public sightings.

Figure 1: Map depicting detections of Asian giant hornets, hornet watches, and public sightings.

Vespa mandarinia is the largest species of hornet in the world, and the appearance of the nest was alarming. In addition, the giant hornet is a predator of European honeybees, invading and destroying their nests. A small number of hornets can destroy an entire colony of European honeybees in a short time. At the same time, they are voracious predators of other insects that are considered agricultural pests.

The life cycle of this hornet is similar to that of many other wasps. Fertilized queens emerge in the spring and begin a new colony. In the fall, new queens leave the nest and spend the winter in the soil, waiting for spring. A new queen has a range estimated at 30 km for establishing her nest. More detailed information about Asian giant hornets is included in the problem attachments and can also be found online.

Because of the potentially severe impact on local honeybee populations, the presence of Vespa mandarinia can cause a good deal of anxiety. Washington State has set up a hotline and a website for people to report sightings of these hornets. Based on these reports from the public, the state must decide how to prioritize its limited resources for follow-up investigation. While some reports have been confirmed as Vespa mandarinia, many other sightings have turned out to be other types of insects.

The primary questions for this problem are "How can we interpret the data provided by the public reports?" and "What strategies can we use to prioritize these public reports for additional investigation, given the limited resources of government agencies?" Your paper should explore and address the following aspects:

  • Address and discuss whether the spread of this pest over time can be predicted, and with what level of precision.
  • Most reported sightings mistake other hornets for Vespa mandarinia. Use only the provided data set file and (possibly) the provided image files to create, analyze, and discuss a model that predicts the likelihood of a mistaken classification.
  • Use your model to discuss how your classification analyses lead to prioritizing investigation of the reports most likely to be positive sightings.
  • Address how you could update your model given additional new reports over time, and how often the updates should occur.
  • Using your model, what would constitute evidence that the pest has been eradicated in Washington State?

Finally, your report should include a two-page memorandum summarizing your results for the Washington State Department of Agriculture.

Your PDF solution, of no more than 25 total pages, should include:
  • One-page summary sheet.
  • Table of contents.
  • Your solution.
  • Two-page memorandum.
  • Reference list.

Note: The MCM contest now has a 25-page limit. All aspects of your submission count toward the 25-page limit (summary sheet, table of contents, reference list, and any appendices).
You must not use unauthorized images and materials whose use is restricted by copyright law. Be sure to cite the sources of your ideas and the materials used in your report.

General guidelines for Problem C: in addition to the specific requirements listed above, remember that this is a statistical modeling exercise. The submitted materials should follow best practices for the use of data. Some examples of these expectations include, but are not limited to, the following:
  • Define all the metrics and cost functions you use.
  • Any parameter estimate should include an interval estimate.
  • Any result should include an estimate of its goodness of fit.
  • All assumptions should be clearly stated, especially those about the distributions of the data or the errors.
  • All assumptions related to the data should be checked, and the robustness of the technique to these assumptions should be examined.
  • All assumptions related to a method or technique should be clearly stated.

Attachments
We provide the following four materials for this problem. The data files provided contain the only data you should use for this problem.
  1. 2021MCM_ProblemC_Vespamandarinia.pdf
    Background information from Pennsylvania State University describing this insect.
  2. 2021MCM_ProblemC_DataSet.xlsx
    A spreadsheet of 4440 sighting reports with the following fields:
    Global ID: a unique label for each sighting record.
    Detection Date: the date of the reported sighting.
    Notes: comments provided by the person who submitted the report. This can be a member of the public or, occasionally, a state employee.
    Lab Status: the official classification of the sighting by the State Department of Agriculture after analysis. Positive ID means it is confirmed as an Asian giant hornet. Negative ID means it has been ruled out. Unprocessed means it has not yet been classified. Unverified means that no decision was made due to lack of information.
    Lab Comments: notes added to the record by the state entomology laboratory after analysis.
    Submission Date: the date the report was submitted to the state. This date can be significantly later than the detection date.
    Latitude: provided by the state after converting the address given in the report.
    Longitude: provided by the state after converting the address given in the report.
  3. 2021MCM_ProblemC_Files.rar
    A rar file with 3305 images submitted together with the sighting reports.
    The 662MB file can be downloaded from:
    http://www.comapmath.com/MCMICM/2021MCM_ProblemC_Files.rar
    A password is required to open the file: Af6SP7rdm33PxPJmDb4wZq7cw
  4. 2021MCM ProblemC_Images_by_GlobalID.xlsx
    A spreadsheet that maps the images to the sighting records, with the following fields:
    File Name: the name of the image in the rar folder.
    Global ID: a unique label for each sighting record. This is consistent across the two spreadsheets.
    File Type: images arrive as .jpg, .pdf, .png, .jfif, octet stream, Open XML format, or .zip files. Videos arrive as .mp4 or QuickTime files.
References
  [1] Washington State Department of Agriculture. 2020 Asian Giant Hornet Public Dashboard. https://agr.wa.gov/departments/insects-pests-andweeds/insects/hornets/data (accessed 11/5/2020).

Analysis of approach

Core of the problem: spatio-temporal prediction from the Excel data and hornet classification from the image data. This calls for a deep learning background. My personal feeling is that traditional models are weaker in prediction and classification performance, in recall, and in how convincing their explanations are, although they can still be used for modeling.

Brief thoughts: for a big data problem, first spend time understanding the attached data and doing exploratory analysis, then use simple fits to get an early sense of the modeling direction and of what the answers are likely to look like.
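As one example of a simple exploratory feature, the 30 km queen range quoted in the problem statement suggests checking how far each report lies from the nearest confirmed sighting. The following is a minimal sketch, assuming coordinates in decimal degrees; the function names, the Earth radius constant, and the sample coordinates are illustrative assumptions, not values taken from the data set.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius (assumed constant)
QUEEN_RANGE_KM = 30.0      # dispersal range quoted in the problem statement


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_positive_km(report, positives):
    """Distance from one report to the closest confirmed sighting."""
    return min(haversine_km(report[0], report[1], p[0], p[1])
               for p in positives)


# Made-up coordinates for illustration only, not rows from the spreadsheet.
positives = [(48.98, -122.70), (48.93, -122.75)]
reports = {"near": (49.00, -122.65), "far": (47.60, -122.33)}

for name, coords in reports.items():
    d = nearest_positive_km(coords, positives)
    print(name, round(d, 1), "km, within queen range:", d <= QUEEN_RANGE_KM)
```

A feature like this can later be fed to the classifier or used on its own as a first rough filter.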

My personal habit for big data problems is to write up data cleaning separately, as Chapter 4 of the paper. For the specific workflow, refer to the national award-winning papers written by graduate students in my group.
First, analyze the three attachments, extract the useful variables, delete records with missing data, normalize the categorical variables, and so on. Further processing steps will be added in a later update.
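The cleaning steps above can be sketched roughly as follows. This is only an illustration using the Python standard library: the dictionary keys, the ISO date format, and the sample rows are assumptions modeled on the field descriptions in the problem statement, not the actual spreadsheet contents.

```python
from datetime import date

# Hypothetical rows; key names and values are assumptions for illustration.
raw_rows = [
    {"GlobalID": "a1", "Detection Date": "2020-05-27",
     "Submission Date": "2020-06-01", "Lab Status": "Positive ID",
     "Latitude": "48.98", "Longitude": "-122.70"},
    {"GlobalID": "b2", "Detection Date": "2020-07-04",
     "Submission Date": "2020-07-04", "Lab Status": " negative id ",
     "Latitude": "47.60", "Longitude": "-122.33"},
    {"GlobalID": "c3", "Detection Date": "2020-08-10",
     "Submission Date": "2020-08-12", "Lab Status": "Unverified",
     "Latitude": "", "Longitude": ""},
]

# Encode the categorical lab status; other statuses stay unlabeled (None).
STATUS_CODES = {"positive id": 1, "negative id": 0}


def clean(rows):
    cleaned = []
    for row in rows:
        try:
            lat, lon = float(row["Latitude"]), float(row["Longitude"])
            detected = date.fromisoformat(row["Detection Date"])
            submitted = date.fromisoformat(row["Submission Date"])
        except ValueError:
            continue  # drop records with missing or malformed fields
        status = row["Lab Status"].strip().lower()
        cleaned.append({
            "id": row["GlobalID"],
            "lat": lat,
            "lon": lon,
            "label": STATUS_CODES.get(status),
            "report_lag_days": (submitted - detected).days,
        })
    return cleaned


records = clean(raw_rows)
print(records)
```

The reporting lag is kept as a derived variable because the problem statement notes that the submission date can fall well after the detection date.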

(1) Prediction model
Discuss and analyze how hornet sightings change over time. Innovation point: account for differences in spatial distribution.
(2) Classification model
Train an image classification model. This requires setting up a deep learning framework, which is not difficult for students with a deep learning background; reporting the common evaluation metrics is enough to produce a solution. The innovation lies in analyzing recall and accuracy together with other spatial characteristics.
(3) Model evaluation
Do the prediction results (LSTM, RNN, ARIMA, MLR, SVR) and the classification results (CNN, SVM, decision tree) support constructive recommendations, and how can the results be made more useful for those recommendations?
(4) Model optimization
Explain the model's update mechanism, complexity, timeliness, and applicability to different regions.
(5) Suggested solutions
It is relatively difficult to give evidence that the number of hornets has dropped to an accepted safe range. Although the big data problem is attractive, it may be hard for students without solid data analysis experience, because the bar for working with the text data is fairly high. Think carefully about whether you can produce results before choosing it.
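For the recall analysis mentioned under (2) and the prioritization task in the problem statement, a minimal sketch is given below. It assumes some classifier has already produced a score between 0 and 1 for each report. Precision is reported next to recall here as an added assumption, since accuracy alone says little when confirmed positives are rare. The labels, scores, report IDs, and function names are all made up for illustration.

```python
def precision_recall(y_true, scores, threshold):
    """Precision and recall when reports scoring >= threshold are flagged."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for y, p in zip(y_true, predicted) if y == 1 and p)
    fp = sum(1 for y, p in zip(y_true, predicted) if y == 0 and p)
    fn = sum(1 for y, p in zip(y_true, predicted) if y == 1 and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def rank_for_followup(report_ids, scores, budget):
    """Return the `budget` report IDs with the highest scores."""
    ranked = sorted(zip(report_ids, scores),
                    key=lambda pair: pair[1], reverse=True)
    return [rid for rid, _ in ranked[:budget]]


# Made-up labels (1 = confirmed hornet) and classifier scores.
y_true = [1, 0, 0, 1, 0, 0]
scores = [0.90, 0.40, 0.10, 0.35, 0.60, 0.05]
ids = ["r1", "r2", "r3", "r4", "r5", "r6"]

for t in (0.5, 0.3):
    p, r = precision_recall(y_true, scores, t)
    print("threshold", t, "precision", p, "recall", r)

print(rank_for_followup(ids, scores, budget=3))
```

Lowering the threshold trades more follow-up work for fewer missed hornets, which is the trade-off a resource-limited agency has to weigh.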


Origin blog.csdn.net/qq_43475285/article/details/113663986