Encoding and decoding issues

Encoding: Represent different real numbers as different binary strings of 0s and 1s. This part is relatively simple.
Decoding: Convert the encoded binary string back into a decimal number that we can interpret. What we need is the inverse of the mapping y = f(x): convert the binary string to a decimal integer and map it into a given interval. That is decoding.

For example, the mapping from binary to decimal works as follows:

For a binary string of length 10, such as [0,0,0,1,0,0,0,0,0,1], the steps to map it to the interval [1,3] are:

  1. First, expand the binary number by weight: $0*2^9 + 0*2^8 + 0*2^7 + 1*2^6 + 0*2^5 + 0*2^4 + 0*2^3 + 0*2^2 + 0*2^1 + 1*2^0 = 65$
  2. Then compress it to the interval [0,1]: $65/(2^{10}-1) \approx 0.0635386$
  3. Finally, map the number in [0,1] to the target interval [1,3]: $0.0635386*(3-1) + 1 \approx 1.12707722$. The general rule is num * (high - low) + low, where num is a number in [0, 1] (here 0.0635386), and high and low are the upper and lower bounds of the target interval (here 3 and 1, respectively).
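
The three steps above can be sketched in Python as follows. This is only a minimal illustration: the function name `decode` and its parameters `bits`, `low`, and `high` are chosen here and do not come from the original post, and the bit list is assumed to be ordered most significant bit first.

```python
def decode(bits, low, high):
    """Map a list of 0/1 bits (most significant bit first) to a real number in [low, high]."""
    n = len(bits)
    # Step 1: expand by weight to get the integer value of the bit string.
    value = sum(bit * 2 ** (n - 1 - i) for i, bit in enumerate(bits))
    # Step 2: compress the integer to the interval [0, 1].
    num = value / (2 ** n - 1)
    # Step 3: stretch and shift [0, 1] onto [low, high].
    return num * (high - low) + low


x = decode([0, 0, 0, 1, 0, 0, 0, 0, 0, 1], low=1, high=3)
print(round(x, 6))  # 1.127077
```

An all-zero string decodes to `low` and an all-one string decodes to `high`, so the endpoints of the interval are always reachable.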

Analysis:
The steps above cover both encoding and decoding: 65 is first encoded as a 10-bit binary number, that number is then compressed to the interval [0, 1], and finally the value in [0, 1] is mapped to [1, 3]. In other words, after encoding we apply an inverse mapping that takes the binary string to a real number in the interval [1, 3]. This process is decoding.
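
For the opposite direction, a real number can be turned into a bit string by reversing the same arithmetic. The sketch below is an assumption rather than something taken from the original post: the helper name `encode` is hypothetical, and rounding to the nearest representable point is one possible choice among several.

```python
def encode(x, low, high, n_bits):
    """Return the n_bits-long list of 0/1 bits whose decoded value is closest to x."""
    # Scale x from [low, high] to [0, 1], then to an integer in [0, 2**n_bits - 1].
    value = round((x - low) / (high - low) * (2 ** n_bits - 1))
    # Write the integer as a zero-padded binary string and split it into bits.
    return [int(c) for c in format(value, f"0{n_bits}b")]


print(encode(1.12707722, low=1, high=3, n_bits=10))
# [0, 0, 0, 1, 0, 0, 0, 0, 0, 1]
```

Because only finitely many points are representable, encoding a real number and decoding it again generally returns the nearest grid point, not the original value.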


Regarding the encoding, one question arises: the example uses a 10-bit binary number. What changes if a longer binary number is used instead?

A 10-bit binary string can represent $2^{10}$ different states, so the decoding in the previous section divides the interval [1, 3] into $2^{10}$ points, each corresponding to one binary code. If we increase the length of the binary string, the interval is divided more finely and the points corresponding to the binary codes lie closer together, which means the decoded decimal number is more precise.

For example:
Take the interval [-3, 3]. The 10-bit binary code 1111111111 (all 1s) expands by weight to 1023 and maps to 3. The code 1111111110 (nine 1s followed by a 0) expands to 1022: $1022/(2^{10}-1) \approx 0.999022$, which decodes to $\frac{1022}{1023}*(3-(-3)) + (-3) \approx 2.994135$. With a 12-bit code, 111111111111 (twelve 1s) again maps to 3, and 111111111110 (eleven 1s followed by a 0) expands to 4094: $4094/(2^{12}-1) \approx 0.999756$, which decodes to $\frac{4094}{4095}*(3-(-3)) + (-3) \approx 2.998535$.

3 − 2.994135 = 0.005865; 3 − 2.998535 = 0.001465,
so with a 10-bit code each single-step change in the binary state changes the decoded real number by about 0.005865, while with a 12-bit code the change drops to about 0.001465. The longer the DNA (the binary string), the more precise the decoded real number.
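
The spacing between neighbouring decoded values can also be computed directly as (high - low) / (2^n - 1). The snippet below is a small sketch of that formula; the function name `step_size` is made up for this illustration, and the interval [-3, 3] is assumed as in the example above.

```python
def step_size(low, high, n_bits):
    """Distance between two adjacent decoded values for an n_bits-long code."""
    return (high - low) / (2 ** n_bits - 1)


# Compare the resolution for a few code lengths on the interval [-3, 3].
for n_bits in (10, 12, 16):
    print(n_bits, step_size(-3, 3, n_bits))
```

Each extra bit roughly halves the step, so going from 10 to 12 bits makes the grid about four times finer.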

Reference:
https://blog.csdn.net/ha_ha_ha233/article/details/91364937

Origin blog.csdn.net/qq_38048756/article/details/115310234