This blog series covers the core content of the Computer Systems (2) course at Shenzhen University. The reference textbook is "In-depth Understanding of Computer Systems" (the Chinese edition of "Computer Systems: A Programmer's Perspective"). If you have questions or spot mistakes, please raise them in the comment area or contact me directly by private message.

Chapter 1: In-depth Understanding of Computer Systems 01 - A Tour of Computer Systems (@李敬如's blog, CSDN)

Chapter 2: In-depth Understanding of Computer Systems 02 - Representing and Manipulating Information

Synopsis

This post summarizes the key points of Chapter 2 of the textbook: how information is represented and processed.

1. Information storage

1. Preliminary knowledge and concepts

① Most computers use blocks of 8 bits, called bytes, as the smallest addressable unit of memory.

② Everything is bits. Each bit is 0 or 1 (easy to implement electronically), and groups of bits are encoded and interpreted in various ways.

2. Hex notation

For describing bit patterns, binary notation is too verbose, and converting between decimal notation and bit patterns is tedious. The alternative is to write bit patterns in base 16 (hexadecimal), where each hex digit stands for exactly 4 bits.

2.1 Base conversion

Hexadecimal to decimal: multiply each digit by the corresponding power of 16 and add up the results

Decimal to hexadecimal: divide by 16 repeatedly and read the remainders from bottom to top (the last remainder is the highest digit)
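The two conversion rules can be sketched in C as follows. This is only an illustration: the function names hex_to_decimal and decimal_to_hex are made up for this post, the input is assumed to contain valid hex digits only, and the hex-to-decimal direction accumulates the powers of 16 by multiplying a running total by 16, which gives the same sum.

```c
#include <stddef.h>

/* Hex string (valid digits 0-9, a-f, A-F only) to its value.
   Multiplying the running total by 16 before adding each digit
   is the same as weighting each digit by its power of 16. */
unsigned long hex_to_decimal(const char *hex) {
    unsigned long value = 0;
    for (size_t i = 0; hex[i] != '\0'; i++) {
        char ch = hex[i];
        unsigned digit;
        if (ch >= '0' && ch <= '9')      digit = (unsigned)(ch - '0');
        else if (ch >= 'a' && ch <= 'f') digit = (unsigned)(ch - 'a') + 10;
        else                             digit = (unsigned)(ch - 'A') + 10;
        value = value * 16 + digit;
    }
    return value;
}

/* Value to hex string: divide by 16 repeatedly. The remainders come out
   lowest digit first, so they are copied in reverse ("bottom to top").
   out must have room for 2 * sizeof(unsigned long) + 1 chars. */
void decimal_to_hex(unsigned long value, char *out) {
    char tmp[2 * sizeof(unsigned long) + 1];
    size_t n = 0;
    do {
        tmp[n++] = "0123456789ABCDEF"[value % 16];
        value /= 16;
    } while (value != 0);
    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];
    out[n] = '\0';
}
```

For example, decimal_to_hex(314156, buf) leaves "4CB2C" in buf, and hex_to_decimal("4CB2C") gives back 314156.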

3. Word data size

Every computer architecture has a word size: a fixed number of bits that indicates the nominal size of integer and pointer data. It is a hardware concept, and the two common word sizes are 32 bits and 64 bits.

Tips: You can use sizeof to get the number of bytes a data type occupies.

Pay special attention to the types whose size differs between systems (long, pointers).
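A small sketch of the sizeof tip. The helper name print_type_sizes is made up here, and the printed numbers depend on the machine and compiler: typically long and pointers are 4 bytes on a 32-bit system and 8 bytes on 64-bit Linux, but this is not guaranteed.

```c
#include <stdio.h>

/* Print the size in bytes of a few types on the current machine.
   Only sizeof(char) == 1 is fixed by the C standard; the other
   values vary between systems. */
void print_type_sizes(void) {
    printf("char:   %zu\n", sizeof(char));
    printf("short:  %zu\n", sizeof(short));
    printf("int:    %zu\n", sizeof(int));
    printf("long:   %zu\n", sizeof(long));
    printf("float:  %zu\n", sizeof(float));
    printf("double: %zu\n", sizeof(double));
    printf("void *: %zu\n", sizeof(void *));
}
```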

4. Addressing and byte order

4.1 Addressing

Programs address memory with logical (virtual) addresses. The address of a multi-byte object is the address of its first byte.

Tips: Each address identifies one byte (8 bits).

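4.2 Byte order (big endian and little endian)

Suppose a variable of type int (4 bytes) is located at address 0x100 and has the hexadecimal value 0x01234567. A big-endian machine stores the most significant byte first, so addresses 0x100 to 0x103 hold 01 23 45 67. A little-endian machine stores the least significant byte first, so they hold 67 45 23 01.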

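The following C code checks the byte order of the current machine:

#include <stdio.h>

int main(void) {
    int num = 0x01234567;
    /* Look at the byte stored at the lowest address of num. */
    unsigned char *c = (unsigned char *)&num;
    if (*c == 0x01)
        printf("big endian\n");
    else if (*c == 0x67)
        printf("little endian\n");
    return 0;
}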

4.3 Compilation options

5. Representing strings and code

① String encoding

A string is encoded as a null-terminated array of characters, with each character stored as its ASCII code.

Example: "123456" = 31 32 33 34 35 36 00 (bytes in hexadecimal)

② Code representation

Programs are encoded as machine instructions. These binary encodings differ between systems, so compiled code is not compatible from one system to another.
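The string example can be checked with a small sketch like the one below. The helper name show_string_bytes is made up for this post, and the output matches the example only on systems that use ASCII.

```c
#include <stdio.h>
#include <string.h>

/* Print every byte of s in hex, including the terminating null byte. */
void show_string_bytes(const char *s) {
    size_t len = strlen(s) + 1;   /* +1 to include the trailing 00 */
    for (size_t i = 0; i < len; i++)
        printf("%.2x ", (unsigned)(unsigned char)s[i]);
    printf("\n");
}
```

Calling show_string_bytes("123456") on an ASCII system prints the seven bytes listed in the example, ending with 00.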

6. Boolean Algebra

Tips: Be clear about whether an operator is logical (&&, ||, !) or bitwise (&, |, ^, ~).
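The difference is easy to see on a pair of values whose bits do not overlap. A minimal sketch, with function names made up for illustration:

```c
/* Bitwise AND works on each bit position separately. */
int bitwise_and(int a, int b) {
    return a & b;
}

/* Logical AND treats any nonzero operand as true and always
   yields 0 or 1. */
int logical_and(int a, int b) {
    return a && b;
}
```

For example, bitwise_and(0x0F, 0xF0) is 0 because no bit is set in both operands, while logical_and(0x0F, 0xF0) is 1 because both operands are nonzero. Likewise 0x69 & 0x55 is 0x41 but 0x69 && 0x55 is 1.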

7. Shift operation in C language

Operators: << and >>, each followed by the number of bit positions to shift

Left shift: shift the bits left and fill the right end with 0s

Right shift:

① Logical right shift: shift the bits right and fill the left end with 0s (used for unsigned integers)

② Arithmetic right shift: shift the bits right and fill the left end with copies of the most significant bit (used for signed integers on almost all machines)
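The two right shifts can be compared on the same 32-bit pattern. In this sketch (function names are made up) the arithmetic shift is emulated on an unsigned value, because the C standard leaves >> on negative signed values implementation-defined.

```c
#include <stdint.h>

/* Logical right shift: the vacated high bits become 0.
   k must be in the range 0..31. */
uint32_t logical_shr(uint32_t bits, unsigned k) {
    return bits >> k;
}

/* Arithmetic right shift of a raw 32-bit pattern: the vacated high
   bits are filled with copies of the original top bit.
   k must be in the range 0..31. */
uint32_t arithmetic_shr(uint32_t bits, unsigned k) {
    uint32_t shifted = bits >> k;
    if ((bits & 0x80000000u) && k > 0)
        shifted |= (uint32_t)~(UINT32_MAX >> k);   /* set the top k bits */
    return shifted;
}
```

With the pattern 0x95000000 and k = 4, the logical shift gives 0x09500000 and the arithmetic shift gives 0xF9500000. For a pattern whose top bit is 0 the two results are identical.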

8. Integer representation

An integer data type represents a finite range of integers, and the range depends on the type and the machine. For signed types the negative range reaches one further than the positive range, for example -128 to 127 for an 8-bit type.
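The asymmetry of the signed range can be checked against the constants in limits.h. A small sketch, assuming a two's complement machine (which covers practically every current system); the function names are made up here.

```c
#include <limits.h>
#include <stdio.h>

/* Returns 1 when int has exactly one more negative value than
   positive values, i.e. INT_MIN == -INT_MAX - 1. */
int negative_range_is_larger(void) {
    return INT_MIN + INT_MAX == -1;
}

/* Print the ranges of a few integer types on this machine. */
void print_int_ranges(void) {
    printf("signed char: %d .. %d\n", SCHAR_MIN, SCHAR_MAX);
    printf("int:         %d .. %d\n", INT_MIN, INT_MAX);
    printf("unsigned:    0 .. %u\n", UINT_MAX);
}
```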

8.1 Unsigned integers

The mapping between bit patterns and values is one-to-one, so there is no ambiguity.

8.2 Two's Complement Encoding (Signed)

8.3 Conversion of unsigned and signed numbers

Tips: 2^31 - 1 = 2147483647 (the largest 32-bit signed value)

Integer constants are signed by default. To make a constant unsigned, append U to the number (for example 12345U).

Conversion: the bit pattern stays the same, only the way it is interpreted (and therefore the value) changes.

If an expression (including a comparison) mixes signed and unsigned operands, the signed operand is implicitly converted to unsigned.

Tips: C90 and C99 (the usual default) give different types to some integer constants, such as 2147483648 on a 32-bit machine, so the same comparison may produce different results.

9. Bit extension and truncation

9.1 Extension

① Unsigned numbers are zero-extended (the new high bits are filled with 0)

② Signed numbers are sign-extended (the new high bits are filled with copies of the most significant bit)
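Both kinds of extension happen automatically in C when a narrower type is converted to a wider one. A minimal sketch using the fixed-width types from stdint.h (the function names are made up):

```c
#include <stdint.h>

/* Unsigned 16-bit to 32-bit: the upper 16 bits are filled with 0. */
uint32_t zero_extend(uint16_t x) {
    return x;
}

/* Signed 16-bit to 32-bit: the upper 16 bits are filled with copies
   of the sign bit, so the value is preserved. */
int32_t sign_extend(int16_t x) {
    return x;
}
```

For example, the 16-bit pattern 0xCFC7 zero-extends to 0x0000CFC7. Read as a signed 16-bit number the same pattern is -12345, and sign_extend(-12345) is still -12345, with the 32-bit pattern 0xFFFFCFC7.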

9.2 Truncation

Drop the high-order bits directly. The remaining low-order bits are interpreted according to the original type (signed or unsigned).
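A sketch of truncating 32-bit values to 16 bits. The function names are made up, and the signed version re-reads the low 16 bits as two's complement by hand so that it does not depend on implementation-defined narrowing conversions.

```c
#include <stdint.h>

/* Keep the low 16 bits and read them as an unsigned number. */
uint16_t truncate_unsigned(uint32_t x) {
    return (uint16_t)x;
}

/* Keep the low 16 bits and read them as a two's complement number:
   if bit 15 is set, the value is the pattern minus 2^16. */
int32_t truncate_signed(int32_t x) {
    uint32_t low = (uint32_t)x & 0xFFFFu;
    return (low & 0x8000u) ? (int32_t)low - 0x10000 : (int32_t)low;
}
```

For example, truncate_unsigned(0x12345678) is 0x5678, and truncate_signed(53191) is -12345 because 53191 is 0xCFC7, whose bit 15 is set.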

Tips: Watch the bit width in addition and subtraction. A result outside the representable range is truncated (a limitation of fixed-width arithmetic).

10. Integer operations

Tips: The bit width matters a great deal in arithmetic (the overflow problem)!
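One way to deal with the overflow problem is to test whether a sum fits before relying on it. The sketch below uses made-up helper names. The signed check is done before adding, because signed overflow is undefined behavior in C.

```c
#include <limits.h>

/* Returns 1 if x + y fits in unsigned. Unsigned addition wraps
   around, so an overflowed sum is smaller than either operand. */
int uadd_ok(unsigned x, unsigned y) {
    return x + y >= x;
}

/* Returns 1 if x + y fits in int. The test compares against the
   limits without ever computing an overflowing sum. */
int tadd_ok(int x, int y) {
    if (y > 0)
        return x <= INT_MAX - y;   /* would exceed INT_MAX otherwise */
    return x >= INT_MIN - y;       /* would fall below INT_MIN otherwise */
}
```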

10.1 Unsigned Integer Addition

10.2 Signed Integer Addition

10.3 Integer multiplication (left shift)

In a high-level language, multiplying two n-bit numbers usually also yields an n-bit result: only the low n bits of the 2n-bit product are kept.
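The truncation of the product can be made visible by computing the full 64-bit product of two 32-bit numbers and comparing its low half with the ordinary 32-bit multiplication. A sketch with made-up function names, assuming the usual case where unsigned int is 32 bits:

```c
#include <stdint.h>

/* Full 2n-bit product of two n-bit numbers (n = 32 here). */
uint64_t mul_full(uint32_t x, uint32_t y) {
    return (uint64_t)x * y;
}

/* Ordinary 32-bit multiplication: only the low 32 bits survive. */
uint32_t mul_native(uint32_t x, uint32_t y) {
    return x * y;
}
```

For example, mul_full(0xFFFFFFFF, 0xFFFFFFFF) is 0xFFFFFFFE00000001, while mul_native on the same operands returns just the low half, 1.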

10.4 Multiplication by constants

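Tips: On many machines integer multiplication is much more expensive than shifts and additions, so multiplication by a constant is often replaced with a combination of shifts and adds.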
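As an illustration, multiplying by the constant 14 can be rewritten with shifts, since 14 = 8 + 4 + 2 = 16 - 2. The function names below are made up, and in practice the compiler chooses such rewrites on its own.

```c
/* x * 14 as x*8 + x*4 + x*2 */
unsigned times14_shift_add(unsigned x) {
    return (x << 3) + (x << 2) + (x << 1);
}

/* x * 14 as x*16 - x*2 */
unsigned times14_shift_sub(unsigned x) {
    return (x << 4) - (x << 1);
}
```

Both versions agree with x * 14 for every unsigned x, including values where the product wraps around.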

10.5 Examples

The example code has the following problems:

① The multiplication that computes the size of the requested space may overflow, so too little space is allocated

② The allocated space is never released

③ Copying starts from the low address, so when the contiguous space is too small the writes may overwrite the data stored after it

④ Storing information in contiguous memory creates a security hole. An attacker can overwrite the original information with prepared data to break in, or even obtain root privileges.
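Problem ① can be avoided by checking the size multiplication before calling malloc. The helper below, alloc_array_checked, is a hypothetical name used only for this sketch and is not taken from the example being discussed.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for count elements of elem_size bytes each.
   Returns NULL when count * elem_size would overflow size_t,
   instead of silently allocating a block that is too small.
   The caller is responsible for calling free() on the result. */
void *alloc_array_checked(size_t count, size_t elem_size) {
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return NULL;
    return malloc(count * elem_size);
}
```

The standard calloc function performs a similar check on its two arguments, so it is another option when zero-filled memory is acceptable.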

10.6 Unsigned division by powers of 2

To divide by 2^k, just logically shift right by k bits (the result is rounded down).
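A one-line sketch of the rule (the function name is made up; k must be smaller than the bit width of unsigned):

```c
/* Unsigned division by 2^k: a logical right shift by k bits.
   The bits shifted out are the remainder, so the result is
   rounded down. */
unsigned udiv_pow2(unsigned x, unsigned k) {
    return x >> k;
}
```

For example, udiv_pow2(12340, 4) is 771, the same as 12340 / 16 with the fractional part 0.25 discarded.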

11. Floating point numbers

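Tips: Representable floating-point values are not evenly spaced on the number line like integers. They are dense near 0 and become sparser as the magnitude grows.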

11.1 Representation and conversion of floating point numbers

Tips: Some decimal fractions (e.g. 0.1) cannot be represented exactly in the computer, so some significant bits are lost (e.g. in double precision 0.1 + 0.2 != 0.3).
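A commonly cited case of this precision loss is 0.1 + 0.2 in double precision. The sketch below assumes IEEE 754 doubles, and the helper name sum_equals is made up. The volatile store is there so that the sum is rounded to double before the comparison.

```c
/* Returns 1 when a + b, rounded to double, compares equal to c. */
int sum_equals(double a, double b, double c) {
    volatile double sum = a + b;
    return sum == c;
}
```

With IEEE 754 doubles, sum_equals(0.1, 0.2, 0.3) is 0, while sum_equals(0.5, 0.25, 0.75) is 1 because 0.5, 0.25 and 0.75 all have the form x/2^k and are stored exactly. This is why floating-point results are usually compared with a tolerance instead of ==.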

Limits of exact representation

① Only numbers of the form x/2^k can be represented exactly

② There is only one setting of the binary point within the w bits, which limits the range of values

Tips: The more significant bits there are, the higher the precision.

11.2 IEEE Representation

11.2.1 Numerical form

11.2.2 Encoding

Tips: Exponent E = exp (read as unsigned) - Bias, where Bias = 2^7 - 1 = 127 for single precision and 2^10 - 1 = 1023 for double precision.
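The three fields of a single-precision number can be pulled apart with a few shifts and masks. This sketch assumes that float is a 32-bit IEEE 754 single; the struct float_fields and the function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* The three fields of an IEEE 754 single-precision number. */
typedef struct {
    unsigned sign;   /* 1 bit */
    unsigned exp;    /* 8 bits, read as unsigned */
    uint32_t frac;   /* 23 bits */
} float_fields;

/* Copy the bits of f into an integer and split them into fields. */
float_fields decode_float(float f) {
    uint32_t bits;
    float_fields out;
    memcpy(&bits, &f, sizeof bits);
    out.sign = bits >> 31;
    out.exp  = (bits >> 23) & 0xFFu;
    out.frac = bits & 0x7FFFFFu;
    return out;
}

/* E = exp - Bias with Bias = 127. Valid for normalized values only
   (exp not equal to 0 or 255). */
int unbiased_exponent(float_fields x) {
    return (int)x.exp - 127;
}
```

For example, 12345.0f has the bit pattern 0x4640E400, so decode_float returns sign 0, exp 140 and frac 0x40E400, and the exponent is E = 140 - 127 = 13.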

11.2.3 Normalized, denormalized, infinity, NaN (different encoding forms)

11.2.4 IEEE Examples

① Hexadecimal to decimal

② Decimal to hexadecimal

Tips: Remember the implicit leading 1 of the mantissa (for normalized values).

11.3 Rounding

The most common mode is round-to-even. It follows two rules. First, round to the nearest representable value. Second, when the value lies exactly halfway between two candidates, look at the last digit that is kept: if it is even, round down (no carry), and if it is odd, round up (carry).
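The two rules can be tried out on binary fractions using integer arithmetic only. In this sketch the value being rounded is x / 2^k, and the function name round_to_even_shift is made up.

```c
/* Round x / 2^k to an integer with round-to-even.
   k must be in the range 1..31. */
unsigned round_to_even_shift(unsigned x, unsigned k) {
    unsigned quotient  = x >> k;                 /* digits that are kept */
    unsigned remainder = x & ((1u << k) - 1u);   /* digits that are dropped */
    unsigned half      = 1u << (k - 1);          /* exactly halfway */
    if (remainder > half || (remainder == half && (quotient & 1u)))
        quotient++;   /* above halfway, or halfway with an odd last digit */
    return quotient;
}
```

With k = 3 the input 20 stands for 2.5 and rounds to 2 (halfway, last digit even), 28 stands for 3.5 and rounds to 4 (halfway, last digit odd), and 21 stands for 2.625 and rounds to 3 (nearest value).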

11.4 Addition of floating-point numbers

Align the exponents, add the mantissas, normalize the result, and round it if it exceeds the available significant digits.
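A consequence of the final rounding step is that a small number can disappear completely when it is added to a very large one, so floating-point addition is not associative. A sketch assuming IEEE 754 doubles (the function name is made up, and volatile forces the intermediate sum to be rounded to double):

```c
/* Computes (small + big) - big. Mathematically this is small, but
   the intermediate sum is rounded to the precision of a double. */
double add_then_subtract(double small, double big) {
    volatile double sum = small + big;
    return sum - big;
}
```

For example, add_then_subtract(3.14, 1e20) is 0.0, because 3.14 is far below the spacing between neighboring doubles near 1e20. Grouping the operations the other way, 3.14 + (1e20 - 1e20), gives 3.14.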

Tips: The exercise assumes three significant digits and uses round-to-even.

11.5 Floating point multiplication

Similar to addition: the exponents are added, the mantissas are multiplied, and the result is normalized and rounded.

11.6 Integer to floating point conversion

Range of representable values: double > float > int

11.7 Floating-point and integer examples

Determine whether the following expressions are always true

Tips

① Floating-point overflow does not change the sign bit

② 2/3 and 2/3.0 follow the usual C rules: 2/3 is integer division (result 0), while 2/3.0 is floating-point division

③ Only the conversions from int or float to double are exact, with no rounding or overflow.

④ double and float round at different precisions (the former has 64 bits and the latter 32 bits), so the same computation can give different results.
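Tips ② and ③ can be checked with a short sketch. It assumes a 32-bit int together with IEEE 754 float and double, the function names are made up, and values close to INT_MAX are deliberately avoided because converting an out-of-range float back to int is undefined.

```c
/* Returns 1 when x survives the round trip int -> float -> int.
   float has only 24 significant bits, so large ints get rounded.
   Do not call this with values near INT_MAX. */
int int_survives_float(int x) {
    volatile float f = (float)x;
    return (int)f == x;
}

/* Returns 1 when x survives int -> double -> int. double has 53
   significant bits, enough for every 32-bit int. */
int int_survives_double(int x) {
    volatile double d = (double)x;
    return (int)d == x;
}

/* 2/3 is integer division (0); 2/3.0 is floating-point division. */
int int_div_equals_float_div(void) {
    return 2 / 3 == 2 / 3.0;
}
```

For example, 16777217 (2^24 + 1) does not survive the trip through float, because it is rounded to 16777216, but it does survive the trip through double. The last function returns 0.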

12. Practice questions

The exercises from Chapter 2 of the book that need to be done are as follows:

Summary

The above covers the core knowledge of Chapter 2 of "In-depth Understanding of Computer Systems": the representation and processing of information. The chapter focuses on bit-level operations and concepts in the C language, and on the representation and arithmetic of integers and floating-point numbers.
