2023 Certification Cup Mathematical Modeling Network Challenge, Problem B: Text Analysis

2023 "Certification Cup" Mathematics China Mathematical Modeling Network Challenge, first stage, Problem B text

When ancient texts are copied by hand, errors of various kinds creep in, so a single book may be handed down in several versions. Philology usually classifies these errors as "corruption" (e), "omission" (tuo), "interpolation" (yan), and "transposition" (dao). Several kinds may occur together, and errors accumulate from one copy to the next.

  1. "Corruption" (e) refers to alteration of the original text. It covers both the inadvertent miswriting of a single character and the deliberate rewriting of whole words, sentences, or even entire paragraphs according to the copyist's own understanding. For example, the surviving versions of "Dream of Red Mansions" give recipes for the famous eggplant dish that differ widely from one another, so some of them must have been altered by copyists;
  2. "Omission" (tuo) refers to text dropped by mistake, from a single character to a whole paragraph. For example, in the "Encouraging Learning" chapter of "Xunzi", the sentence "Fleabane growing among hemp stands straight without support" has no continuation in the transmitted versions. According to the textual research of the Qing Dynasty scholar Wang Niansun, it should be followed by "white sand in black mud turns black along with it";
  3. "Interpolation" (yan) refers to text added by mistake, from a single character or word to a whole sentence. For example, the Three Kingdoms figure "Shiren" appears as "Fu Shiren" in the popular version of "The Romance of the Three Kingdoms", and some people speculate that the surname "Fu" is an interpolation. An added whole paragraph is rare. It usually arises when a copyist mixes other documents or their own annotations into the main text, which later readers can no longer tell apart from the original;
  4. "Transposition" (dao) refers to text whose order has been exchanged. Swapped individual characters usually result from a copying error, while a displaced long passage or even an entire text usually results from a binding error. For example, Yu Qian's Ming Dynasty poem "Lime Yin" contains the line "I am not afraid of being crushed to pieces", whose phrase "fen gu sui shen" ("bones ground to powder, body shattered") is miswritten in some copies as "fen shen sui gu" ("body ground to powder, bones shattered").
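
As a minimal sketch, the four error types above can be read as single-step edit operations on a string. The function names and the character-level granularity are assumptions made for illustration and are not part of the problem statement; real copying errors may also affect whole words, sentences, or paragraphs.

```python
def corrupt(text: str, i: int, ch: str) -> str:
    """Corruption: replace the character at position i with ch."""
    return text[:i] + ch + text[i + 1:]


def omit(text: str, i: int) -> str:
    """Omission: drop the character at position i."""
    return text[:i] + text[i + 1:]


def interpolate(text: str, i: int, ch: str) -> str:
    """Interpolation: insert ch before position i."""
    return text[:i] + ch + text[i:]


def transpose(text: str, i: int) -> str:
    """Transposition: swap the characters at positions i and i + 1."""
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]


if __name__ == "__main__":
    original = "abcd"
    print(corrupt(original, 1, "x"))      # one character replaced
    print(omit(original, 1))              # one character dropped
    print(interpolate(original, 1, "x"))  # one character added
    print(transpose(original, 1))         # two neighbors swapped
```

Treating each error as one unit-cost operation is only one possible modeling choice. A model could just as well weight the operations differently or let them act on longer spans.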

Ancient copyists are not the only ones who make mistakes: modern communication and storage devices also cannot avoid random errors when a message is forwarded or transcribed many times. Here we recast the problem in a more idealized form. Assume that the original text is sufficiently long and that, while copying, the copyist never checks the work against other versions. Then, over a sufficiently long period of copying or forwarding, different errors are superimposed and a large number of distinct versions may be produced. Please establish a reasonable mathematical model and study the following questions.
First stage questions:

  1. Please design a reasonable scheme for measuring the difference between two versions of a text.
  2. If one version was produced from another through several rounds of copying, we wish to estimate how many rounds of copying separate the two texts. Please analyze and solve this problem. As you build your model, keep in mind: what other essential information is needed to make a valid estimate?
  3. Some schemes for the problems above are conceptually sound but run into practical computational difficulties. Please design an efficient, fast algorithm for the first two questions. Describe the principle of the algorithm, estimate its speed, and give an example.
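
One possible starting point for the first two questions, offered as a sketch rather than as the intended solution, is an edit distance that charges one unit for each substitution, deletion, insertion, or adjacent swap (the optimal string alignment distance). The names `osa_distance`, `estimate_copies`, and `errors_per_copy` are hypothetical. The estimate assumes that errors are single-character, independent, and rarely overlap, and that the average number of errors per copy is known in advance.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between a and b.

    Substitution, deletion, insertion, and adjacent transposition each
    cost 1. Runs in O(len(a) * len(b)) time and keeps three rows in memory.
    """
    prev2 = None                      # row i - 2 of the DP table
    prev = list(range(len(b) + 1))    # row i - 1 (starts as row 0)
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(
                prev[j] + 1,          # deletion (omission)
                cur[j - 1] + 1,       # insertion (interpolation)
                prev[j - 1] + cost,   # substitution (corruption)
            )
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                # adjacent swap (transposition)
                cur[j] = min(cur[j], prev2[j - 2] + 1)
        prev2, prev = prev, cur
    return prev[len(b)]


def estimate_copies(distance: int, errors_per_copy: float) -> float:
    """Rough estimate: observed distance divided by the ASSUMED mean
    number of errors introduced per copy."""
    if errors_per_copy <= 0:
        raise ValueError("errors_per_copy must be positive")
    return distance / errors_per_copy


if __name__ == "__main__":
    d = osa_distance("kitten", "sitting")
    print(d, estimate_copies(d, errors_per_copy=1.5))
```

Because the table has one cell per pair of positions, the running time grows with the product of the two text lengths, which is the kind of practical difficulty the third question points to for long texts. Later errors can also overwrite earlier ones, so the distance tends to undercount the true number of errors once many copies separate the two versions.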

Origin blog.csdn.net/qq_43475285/article/details/130143786