Algorithm and Data Structure Interview Guide - Number Encoding

Number encoding

Sign-magnitude, one's complement, and two's complement

In the table in the previous section, we noticed that all integer types can represent one more negative number than positive numbers. For example, the value range of byte is $[-128, 127]$. This phenomenon is rather counter-intuitive, and its underlying reason involves sign-magnitude, one's complement, and two's complement encodings.

First of all, it should be pointed out that numbers are stored in computers in two's complement form. Before analyzing why, let us define the three encodings.

  • Sign-magnitude (original code): the highest bit of a number's binary representation is the sign bit, where $0$ denotes positive and $1$ denotes negative, and the remaining bits hold the number's magnitude.
  • One's complement (inverse code): the one's complement of a positive number equals its sign-magnitude form; for a negative number, all bits of the sign-magnitude form except the sign bit are inverted.
  • Two's complement (complement code): the two's complement of a positive number equals its sign-magnitude form; for a negative number, it is the one's complement plus $1$.

The figure below shows the conversions between sign-magnitude, one's complement, and two's complement; the code sketch that follows implements the same rules.

(Figure: conversions between sign-magnitude, one's complement, and two's complement)
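To make these rules concrete, here is a minimal Java sketch (the class and method names are illustrative, not any standard API) that encodes a value from $[-127, 127]$ in all three forms:

```java
public class EncodingDemo {
    // Sign-magnitude: highest bit is the sign, the remaining 7 bits hold |value|.
    // Assumes value is in [-127, 127]; -128 has no sign-magnitude form.
    static int signMagnitude(int value) {
        return (value < 0 ? 0b1000_0000 : 0) | Math.abs(value);
    }

    // One's complement: positives unchanged; negatives flip every bit except the sign bit.
    static int onesComplement(int value) {
        int sm = signMagnitude(value);
        return value < 0 ? sm ^ 0b0111_1111 : sm;
    }

    // Two's complement: positives unchanged; negatives are one's complement plus 1.
    static int twosComplement(int value) {
        int oc = onesComplement(value);
        return (value < 0 ? oc + 1 : oc) & 0xFF; // keep 8 bits, discard any carry
    }

    public static void main(String[] args) {
        int v = -2;
        System.out.println(Integer.toBinaryString(signMagnitude(v)));  // 10000010
        System.out.println(Integer.toBinaryString(onesComplement(v))); // 11111101
        System.out.println(Integer.toBinaryString(twosComplement(v))); // 11111110
    }
}
```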

Although sign-magnitude is the most intuitive encoding, it has limitations. On the one hand, negative numbers in sign-magnitude cannot be used directly in arithmetic. For example, computing $1 + (-2)$ in sign-magnitude yields $-3$, which is obviously wrong:

$$\begin{aligned} & 1 + (-2) \\ & \rightarrow 0000\;0001 + 1000\;0010 \\ & = 1000\;0011 \\ & \rightarrow -3 \end{aligned}$$

To solve this problem, computers introduced the one's complement. If we first convert the sign-magnitude codes to one's complement, compute $1 + (-2)$ in that form, and finally convert the result back to sign-magnitude, we obtain the correct result $-1$:

$$\begin{aligned} & 1 + (-2) \\ & \rightarrow 0000\;0001\;\text{(sign-magnitude)} + 1000\;0010\;\text{(sign-magnitude)} \\ & = 0000\;0001\;\text{(one's complement)} + 1111\;1101\;\text{(one's complement)} \\ & = 1111\;1110\;\text{(one's complement)} \\ & = 1000\;0001\;\text{(sign-magnitude)} \\ & \rightarrow -1 \end{aligned}$$
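Using the bit patterns from the derivation above, the same calculation can be replayed as a small fragment (no carry out of the eighth bit occurs in this example, so no end-around carry is needed):

```java
// One's complement of 1 and -2 (from the derivation above): 0000 0001 and 1111 1101.
int sum = (0b0000_0001 + 0b1111_1101) & 0xFF;
System.out.println(Integer.toBinaryString(sum)); // 11111110
// Flipping the non-sign bits converts back to sign-magnitude: 1000 0001, i.e. -1.
```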

On the other hand, zero has two sign-magnitude representations, $+0$ and $-0$. The number zero thus corresponds to two different binary codes, which can cause ambiguity: for example, a conditional test that does not treat positive zero and negative zero as equal may yield an incorrect result. Resolving this ambiguity would require extra comparison logic, which reduces computational efficiency.

$$\begin{aligned} +0 & \rightarrow 0000\;0000 \\ -0 & \rightarrow 1000\;0000 \end{aligned}$$

Like sign-magnitude, one's complement also suffers from the positive and negative zero ambiguity, so computers further introduced the two's complement. Let us first observe the conversion of negative zero from sign-magnitude through one's complement to two's complement:

$$\begin{aligned} -0 \rightarrow \; & 1000\;0000\;\text{(sign-magnitude)} \\ = \; & 1111\;1111\;\text{(one's complement)} \\ = 1\; & 0000\;0000\;\text{(two's complement)} \end{aligned}$$

Adding $1$ to the one's complement of negative zero produces a carry, but the byte type is only 8 bits long, so the $1$ carried into bit 9 is discarded. The two's complement of negative zero is therefore $0000\;0000$, the same as that of positive zero. Two's complement thus has only one representation of zero, and the positive and negative zero ambiguity is resolved.
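This collapse can be reproduced directly on the bit patterns; a short fragment in the same spirit as the sketch above:

```java
int minusZeroSM = 0b1000_0000;               // sign-magnitude -0
int minusZeroOC = minusZeroSM ^ 0b0111_1111; // one's complement: 1111 1111
int minusZeroTC = (minusZeroOC + 1) & 0xFF;  // carry into bit 9 discarded
System.out.println(minusZeroTC);             // 0, identical to +0
```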

One last question remains: the value range of byte is $[-128, 127]$, so where does the extra negative number $-128$ come from? Note that every integer in the interval $[-127, +127]$ has corresponding sign-magnitude, one's complement, and two's complement codes, and these codes can be converted into one another.

However, the two's complement $1000\;0000$ is an exception: it has no corresponding sign-magnitude code. Applying the conversion rules, its sign-magnitude code would come out as $0000\;0000$, which is a contradiction, because that code represents the number $0$, whose two's complement should be itself. Computers therefore assign this special two's complement $1000\;0000$ the value $-128$. In fact, computing $(-1) + (-127)$ in two's complement also yields $-128$:

$$\begin{aligned} & (-127) + (-1) \\ & \rightarrow 1111\;1111\;\text{(sign-magnitude)} + 1000\;0001\;\text{(sign-magnitude)} \\ & = 1000\;0000\;\text{(one's complement)} + 1111\;1110\;\text{(one's complement)} \\ & = 1000\;0001\;\text{(two's complement)} + 1111\;1111\;\text{(two's complement)} \\ & = 1000\;0000\;\text{(two's complement)} \\ & \rightarrow -128 \end{aligned}$$
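This behavior can be observed with Java's byte type, whose arithmetic is performed in two's complement:

```java
byte a = -127, b = -1;
byte sum = (byte) (a + b);          // computed as int, then narrowed back to 8 bits
System.out.println(sum);            // -128, the bit pattern 1000 0000
System.out.println(Byte.MIN_VALUE); // -128
```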

You may have noticed that all the calculations above are additions. This hints at an important fact: the arithmetic circuits inside a computer are designed primarily around addition. Compared with other operations such as multiplication, division, and subtraction, addition is simpler to implement in hardware, easier to parallelize, and faster.

Note that this does not mean computers can only add. By combining addition with basic logical operations, a computer can perform a variety of other mathematical operations. For example, the subtraction $a - b$ can be transformed into the addition $a + (-b)$, as sketched below, and multiplication and division can be reduced to sequences of additions or subtractions.
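As a small illustration of this idea in Java (two's complement negation is just flipping the bits and adding 1):

```java
int a = 7, b = 3;
int negB = ~b + 1;            // two's complement negation of b
System.out.println(a + negB); // 4, identical to a - b
```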

Now we can summarize why computers use two's complement: with this representation, the same circuits and operations handle addition of both positive and negative numbers, no special hardware is needed for subtraction, and the positive and negative zero ambiguity disappears. This greatly simplifies hardware design and improves computational efficiency.

The design of two's complement is quite elegant; due to space constraints we stop here, and interested readers are encouraged to explore it further.

Floating-point encoding

If you are observant, you may have noticed that int and float are both 4 bytes long, yet the value range of float is much larger than that of int. This is counter-intuitive, since float must also represent fractions, so its range ought to be smaller.

In fact, this is because floating-point numbers are represented differently. Denote a 32-bit binary number as:

$$b_{31} b_{30} b_{29} \ldots b_2 b_1 b_0$$

According to the IEEE 754 standard, a 32-bit float consists of the following three parts.

  • Sign bit $\mathrm{S}$: occupies 1 bit, corresponding to $b_{31}$.
  • Exponent bits $\mathrm{E}$: occupy 8 bits, corresponding to $b_{30} b_{29} \ldots b_{23}$.
  • Fraction bits $\mathrm{N}$: occupy 23 bits, corresponding to $b_{22} b_{21} \ldots b_0$.

The value of a float is calculated from its binary representation as follows:

$$\text{val} = (-1)^{b_{31}} \times 2^{\left(b_{30} b_{29} \ldots b_{23}\right)_2 - 127} \times \left(1.b_{22} b_{21} \ldots b_0\right)_2$$

Converted to decimal, the formula becomes:

$$\text{val} = (-1)^{\mathrm{S}} \times 2^{\mathrm{E} - 127} \times (1 + \mathrm{N})$$

The range of each component is:

$$\begin{aligned} & \mathrm{S} \in \{0, 1\}, \quad \mathrm{E} \in \{1, 2, \ldots, 254\} \\ & (1 + \mathrm{N}) = 1 + \sum_{i=1}^{23} b_{23-i} 2^{-i} \in \left[1, 2 - 2^{-23}\right] \end{aligned}$$

(Figure: example calculation of a float value with $\mathrm{S} = 0$, $\mathrm{E} = 124$, $\mathrm{N} = 0.375$)

Observing the figure above, for the example data $\mathrm{S} = 0$, $\mathrm{E} = 124$, $\mathrm{N} = 2^{-2} + 2^{-3} = 0.375$, we have:

$$\text{val} = (-1)^0 \times 2^{124 - 127} \times (1 + 0.375) = 0.171875$$
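This decomposition can be checked in Java with Float.floatToIntBits, which exposes the raw IEEE 754 bits of a float:

```java
float x = 0.171875f;
int bits = Float.floatToIntBits(x); // raw IEEE 754 bits
int s = bits >>> 31;                // sign bit b31
int e = (bits >>> 23) & 0xFF;       // 8 exponent bits
double n = (bits & 0x7FFFFF) / (double) (1 << 23); // 23 fraction bits as a fraction
System.out.println(s + " " + e + " " + n);         // 0 124 0.375
System.out.println(Math.pow(-1, s) * Math.pow(2, e - 127) * (1 + n)); // 0.171875
```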

Now we can answer the original question: the representation of float contains exponent bits, which makes its value range much larger than that of int. From the formula above, the largest positive number float can represent is $2^{254 - 127} \times (2 - 2^{-23}) \approx 3.4 \times 10^{38}$, and the smallest negative number is obtained by flipping the sign bit.
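This matches the constant that Java exposes:

```java
System.out.println(Float.MAX_VALUE); // 3.4028235E38
```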

Although floating-point numbers extend the value range, a side effect is a loss of precision. The integer type int uses all 32 bits to represent values, which are evenly distributed; because of the exponent bits, the larger a floating-point value is, the larger the gap between two adjacent representable numbers tends to be.
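Java's Math.ulp reports exactly this gap between adjacent floats, and shows it growing with magnitude:

```java
System.out.println(Math.ulp(1.0f));   // 1.1920929E-7, the gap to the next float after 1.0
System.out.println(Math.ulp(1.0e8f)); // 8.0, the gap near 10^8
```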

As shown in the table below, the exponent values $\mathrm{E} = 0$ and $\mathrm{E} = 255$ have special meanings: they are used to represent zero, infinity, $\mathrm{NaN}$, and so on.

Table: meaning of the exponent bits

| Exponent bits $\mathrm{E}$ | Fraction bits $\mathrm{N} = 0$ | Fraction bits $\mathrm{N} \ne 0$ | Calculation formula |
| --- | --- | --- | --- |
| $0$ | $\pm 0$ | subnormal number | $(-1)^{\mathrm{S}} \times 2^{-126} \times (0.\mathrm{N})$ |
| $1, 2, \ldots, 254$ | normal number | normal number | $(-1)^{\mathrm{S}} \times 2^{\mathrm{E} - 127} \times (1.\mathrm{N})$ |
| $255$ | $\pm \infty$ | $\mathrm{NaN}$ | |

It is worth mentioning that subnormal numbers significantly improve the precision of floating-point numbers near zero. The smallest positive normal number is $2^{-126}$, while the smallest positive subnormal number is $2^{-126} \times 2^{-23} = 2^{-149}$.
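Both bounds are available as constants in Java:

```java
System.out.println(Float.MIN_NORMAL); // 1.17549435E-38, i.e. 2^-126
System.out.println(Float.MIN_VALUE);  // 1.4E-45, i.e. 2^-149, the smallest subnormal
```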

Double-precision double uses a similar representation to float, which we will not describe in detail here.


Source: blog.csdn.net/zy_dreamer/article/details/132911073