Computer Systems Final 2022

Preface

When reviewing for this course, focus on the knowledge points. The course covers more than 280 of them, and any of them may appear in the final exam. Working through past papers alone is not enough: a past paper only tests some of the knowledge points and gives a rough idea of the question types. Review the knowledge points as completely as possible and connect the related parts of each chapter.

This is not an official answer key; I wrote these answers this semester. Since I am no longer as familiar with all the knowledge points as I was a year ago, I asked several students to work through some of the questions to verify them. Because the course had not finished at that point, I did not ask them to complete the last question and check it against mine. If you have any questions, please contact me privately as soon as possible.

As a reference for checking your review: the first six questions of this paper are simple overall and involve few knowledge points, so you should be able to complete them after reviewing. It is normal for the whole paper to take more than 2 hours. In 2 hours you should be able to finish the first six questions and read through the last question (if you work quickly, you can also do part of it). Don't worry about the number of questions. If you meet a difficult question in the exam, it is enough to make sure the other questions are correct. The last question is a little more difficult, so I have written out the complete address translation process for reference. Understanding the whole address translation process is very important: if you understand the last question, no form of address translation question should be a problem.

Judging from how several students did, question 4 is also error-prone and slightly more difficult than the other questions. Review what the most basic assembly instructions do, to avoid mistakes on basic questions.


1. IEEE floating point numbers

[Image not reproduced]

(1) Prove:

An IEEE floating point number is expressed as:
$$n = (-1)^s \times M \times 2^E$$
The exponent E is:
$$E = \begin{cases} e - bias & \text{normalized number} \\ 1 - bias & \text{denormalized number} \end{cases}$$
Within the scope of the question there are no negative numbers or special values. The sign bit is 0, so only the low 31 bits of the encoding need to be considered. If m<n, then:
$$M_m \times 2^{E_m} < M_n \times 2^{E_n}$$

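  • If Em=En, then Mm<Mn must hold for the inequality above. When m and n are of the same kind, their exponent fields are equal and the fraction field of m is smaller than that of n, so fm<fn. (If m is denormalized and n is normalized with en=1, then em=0<en and the exponent field already decides.)
  • If Em<En, Mm and Mn may take any value:
    • m and n are both normalized numbers: E=e-bias, so em<en; the high bits of the encoding of m are smaller than those of n, so fm<fn
    • m is a normalized number, n is a denormalized number: this case cannot occur, because a denormalized number has the smallest possible E (1-bias) and is smaller than every normalized number
    • m is a denormalized number, n is a normalized number: em (all 0)<en; the high bits of the encoding of m are smaller than those of n, so fm<fn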

Note:

This question does not require a strict proof. It is enough to write down the floating point representation and discuss the cases; the two cases above are sufficient. The case Em>En cannot occur when m<n, so it needs no proof or explanation. As the figure below shows, values with a larger exponent are always larger:

[Image not reproduced]
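The claim in (1) can also be checked mechanically. The following is a minimal sketch, assuming the format in the question is the 32-bit IEEE single-precision `float` of the host machine; the helper names `float_bits` and `order_matches_encoding` are made up for illustration and do not come from the exam.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Return the raw 32-bit encoding of a float (sign, exponent field, fraction).
std::uint32_t float_bits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined way to read the bytes
    return bits;
}

// For two finite floats with sign bit 0, comparing the encodings as unsigned
// integers gives the same answer as comparing the values themselves.
bool order_matches_encoding(float m, float n) {
    // The proof only covers sign bit 0 and no special values.
    assert(!std::signbit(m) && !std::signbit(n));
    assert(std::isfinite(m) && std::isfinite(n));
    return (m < n) == (float_bits(m) < float_bits(n));
}
```

For example, `order_matches_encoding(1.0e-40f, 1.0f)` compares a denormalized value with a normalized one, and `order_matches_encoding(0.5f, 1.0f)` compares two normalized values with different exponents.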

(2) Prove:

Assume that m, n, m+1 and n+1 are all normalized numbers, where m+1 denotes the value whose encoding follows that of m. Measuring the mantissa in units of its last bit, so that adjacent encodings differ by 1 in M:
$$step_m = (m+1) - m = 2^{E_m}(M_m + 1 - M_m) = 2^{E_m}$$
$$step_n = (n+1) - n = 2^{E_n}(M_n + 1 - M_n) = 2^{E_n}$$
From m<n we have Em<=En, that is, stepm<=stepn.

Now assume that m is a denormalized number and m+1 is a normalized number. Both use E=1-bias: a denormalized number has E=1-bias, and the smallest normalized number has e=1, so E=e-bias=1-bias. This is what makes the transition between the two smooth (see the table above). In the same units stepm is therefore 2^(1-bias), the smallest step there is, and stepm<=stepn still holds whatever n and n+1 are.

Note:

This question also does not require a strict proof; a brief explanation of each case is enough.
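The step argument in (2) can be observed directly on real encodings. This is a small sketch under the assumption that the format is 32-bit IEEE single precision; `float_from_bits` and `step_at` are hypothetical helper names. The gap is computed in `double` so that the subtraction of two adjacent floats is exact.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float.
float float_from_bits(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects a 32-bit float");
    float value = 0.0f;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Gap between the float encoded by `bits` and the float encoded by `bits + 1`.
double step_at(std::uint32_t bits) {
    // Both encodings must be non-negative and finite (0x7f7fffff is the largest finite float).
    assert(bits < 0x7f7fffffu);
    return static_cast<double>(float_from_bits(bits + 1)) -
           static_cast<double>(float_from_bits(bits));
}
```

`step_at(0x007fffffu)` is the gap across the denormalized/normalized boundary and equals `step_at(0u)`, the gap between the two smallest encodings. Larger encodings such as `0x3f800000u` (the value 1.0) give larger or equal steps.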


2. Assembly code analysis

[3 images not reproduced]

(1)

​ 23

(2)

​ 6

(3)

​ (j>15) ? 0 : (j-15)

(4)

 array1[i][j] - 2*array2[k]

(5)

​ sum + array1[0][0] + array2[0]


3. Assembly code analysis

[5 images not reproduced]

(1)

 *x > a + 4*y

(2)

 x*y > ua.i

(3)

​ test1(ua.pi, test1(ua.pi, ua.i+10))

Note:

It is also acceptable to write ua.pi and ua.i simply as ua, but in the source program they should be written in full. This question and the previous one are relatively simple and should be completed without errors.


4. Assembly program simulation

[6 images not reproduced]

(1)

The stack frame changes as follows:

| Step | esp | ebp |
| --- | --- | --- |
| Enter the main function and adjust the stack pointer | 0xbffff1e0 | 0xbffff1f8 |
| call P(5), retaddr is pushed onto the stack | 0xbffff1dc | 0xbffff1f8 |
| Control is transferred to P(5) and the stack pointer is adjusted | 0xbffff1c0 | 0xbffff1d8 |
| call P(16), retaddr is pushed onto the stack | 0xbffff1bc | 0xbffff1d8 |
| Control is transferred to P(16) and the stack pointer is adjusted | 0xbffff1a0 | 0xbffff1b8 |
| call P(8), retaddr is pushed onto the stack | 0xbffff19c | 0xbffff1b8 |
| Control is transferred to P(8) and the stack pointer is adjusted | 0xbffff180 | 0xbffff198 |
| call P(4), retaddr is pushed onto the stack (the position required in the question) | 0xbffff17c | 0xbffff198 |
| Control is transferred to P(4) and the stack pointer is adjusted | 0xbffff160 | 0xbffff178 |

At the position the question asks about, call P(4) has just been executed: (%esp) holds the return address 0x804842f, and (%esp+4) holds the argument 4.

Note:

This question tests how the stack and the stack pointers change during program execution. Careful simulation is enough to complete it, though it may take a while. Don't forget that the CALL instruction pushes the return address onto the stack and changes the stack pointer.
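The simulation in (1) can be replayed with a few lines of code. This is only a sketch: the assembly listing is not reproduced here, so the prologue shape (`push %ebp; mov %esp,%ebp; sub $0x18,%esp`) and the frame size `0x18` are assumptions inferred from the differences between the table rows, and the names `StackState`, `after_call`, `after_prologue` and `trace_calls` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct StackState {
    std::uint32_t esp;
    std::uint32_t ebp;
};

// Effect of `call`: the 4-byte return address is pushed, %ebp is untouched.
StackState after_call(StackState s) {
    s.esp -= 4;
    return s;
}

// Effect of the assumed prologue: push %ebp; mov %esp,%ebp; sub $frame_bytes,%esp.
StackState after_prologue(StackState s, std::uint32_t frame_bytes) {
    s.esp -= 4;            // push %ebp
    s.ebp = s.esp;         // mov %esp,%ebp
    s.esp -= frame_bytes;  // sub $frame_bytes,%esp
    return s;
}

// Replay `depth` nested calls, recording the state after every call
// instruction and after every prologue.
std::vector<StackState> trace_calls(StackState start, int depth,
                                    std::uint32_t frame_bytes) {
    assert(depth >= 0);
    std::vector<StackState> trace;
    StackState s = start;
    for (int i = 0; i < depth; ++i) {
        s = after_call(s);
        trace.push_back(s);
        s = after_prologue(s, frame_bytes);
        trace.push_back(s);
    }
    return trace;
}
```

Starting from the state inside main (`esp = 0xbffff1e0`, `ebp = 0xbffff1f8`), `trace_calls(start, 4, 0x18)` yields the eight rows for P(5), P(16), P(8) and P(4); the seventh entry is the state right after call P(4).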

(2)

Modified program:

#include <iostream>

// Iterative version of p: x is updated in place, so no new stack frame
// is created for each step.
void p(int x){
    std::cout << x << std::endl;   // print the starting value
    while (x > 1) {
        if (x % 2 == 0) x /= 2;    // even: halve
        else x = x * 3 + 1;        // odd: 3x + 1
    }
}

A recursive function allocates a new stack frame every time it calls itself. When the recursion goes too deep, the stack space is exhausted and a segmentation fault occurs. In the non-recursive version, only the value of x at one fixed location is modified during processing, so the stack cannot be exhausted this way.


5. CPU design questions

[3 images not reproduced]

(1)

Added parts:

[Image not reproduced]

(2)

Add register B.

[Image not reproduced]

Note:

This question only requires extending the original states and data path. If a CPU design question uses a type of instruction that has come up before, you should be able to solve it quickly.


6. Links and exception control flow

[2 images not reproduced]

(1)

  • An executable object file has no .rel sections. A .rel section stores the relocation entries used during linking; once linking is complete, the executable object file no longer needs them.
  • After relocation, the sections of an executable object file already have their final run-time memory addresses. When the linker performs relocation, it merges all sections of the same type from the relocatable object files into one new aggregate section of that type, assigns run-time memory addresses to the aggregate sections and to each symbol, and relocates the symbol references within the sections.
  • An executable object file has an .init section, whose code is called for initialization after the program is loaded.

(2)

This code has a multiple-definition problem during symbol resolution: both the variable x and the function f have multiple definitions. Functions and initialized global variables are strong symbols; uninitialized global variables are weak symbols. Multiply defined symbols are handled by the following rules:

  • Multiple strong symbols with the same name are not allowed
  • Given one strong symbol and multiple weak symbols, the strong symbol is chosen
  • Given multiple weak symbols, any one of them is chosen

Both definitions of x in this code are weak symbols, so one of them is chosen arbitrarily. f has a strong definition, so the f that assigns a value to x is used. At run time x may have type int and occupy 4 bytes, while f writes 8 bytes to x as a double, which overwrites both x and y in memory.
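The overwrite described above can be imitated inside a single program. The sketch below does not reproduce the linker behavior itself; it only mimics the 8-byte store, assuming a 4-byte `int`, an 8-byte `double`, and that y sits directly after x in memory (in a real program the layout is chosen by the linker). The struct `AdjacentInts` and the function `store_double_at_x` are hypothetical names.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for two adjacent 4-byte globals; the names mirror x and y above.
struct AdjacentInts {
    int x;
    int y;
};

// Mimic an f that believes x is a double: it stores 8 bytes starting at &x.
void store_double_at_x(AdjacentInts& globals, double value) {
    static_assert(sizeof(AdjacentInts) == sizeof(double),
                  "sketch assumes 4-byte int and 8-byte double");
    std::memcpy(&globals, &value, sizeof value);
    // All 8 bytes of x and y now come from the double.
    assert(std::memcmp(&globals, &value, sizeof value) == 0);
}
```

With `AdjacentInts g{1, 2};`, calling `store_double_at_x(g, 1.5)` leaves neither `g.x` nor `g.y` at its old value, because the bit pattern of 1.5 is spread over both fields.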

(3)

Relocation is divided into two steps:

  • Relocating sections and symbol definitions: The linker merges all sections of the same type into a new aggregate section of that type, then assigns run-time memory addresses to the new aggregate sections and to each symbol. After this step, every instruction and global variable has a unique run-time memory address.
  • Relocating symbol references within sections: The linker modifies every symbol reference in the code and data sections so that it points to the correct run-time address. This step relies on the relocation entries.

(4)

After the executable file name is entered, the shell invokes the loader through the execve function. The loader modifies the virtual address space, copies the code and data segments of the file into memory as directed by the segment header table of the executable object file, and then jumps to the program entry point _start. _start calls some initialization routines and calls atexit to register routines that run at termination. Then the main function is called. When it finishes, the _exit function is called to terminate the program and return control to the operating system.

(5)

A SIGCHLD signal is generated and sent to the parent process. The receiver is the shell, and the signal indicates that a child process has stopped or terminated. Signals are not queued: at most one signal of a given type can be pending, and further signals of the same type that arrive in the meantime are simply discarded. This is because pending signals are tracked in a bit vector, which cannot count.
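The effect of tracking pending signals in a bit vector can be shown with a toy model. This is not kernel code, only a sketch of the idea; the class `PendingSignals`, the function `handler_runs_after`, and the constant `kSigchld` (17, the SIGCHLD number on Linux x86, used purely as an example) are illustrative assumptions.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kSigchld = 17;  // example value: SIGCHLD on Linux x86

// Toy model of a pending-signal set: one bit per signal number.
class PendingSignals {
public:
    // Record that `signum` was sent. Sending it again while it is still
    // pending changes nothing, because a single bit cannot count.
    void send(int signum) {
        assert(signum >= 1 && signum <= 31);
        pending_ |= (std::uint32_t{1} << signum);
    }

    // Deliver `signum` if it is pending; returns true when a handler would run.
    bool deliver(int signum) {
        assert(signum >= 1 && signum <= 31);
        const std::uint32_t mask = std::uint32_t{1} << signum;
        if ((pending_ & mask) == 0) return false;
        pending_ &= ~mask;  // clear the bit: the signal is no longer pending
        return true;
    }

private:
    std::uint32_t pending_ = 0;
};

// Number of handler runs after `sends` back-to-back sends of one signal
// with no delivery in between.
int handler_runs_after(int sends, int signum) {
    PendingSignals pending;
    for (int i = 0; i < sends; ++i) pending.send(signum);
    int runs = 0;
    while (pending.deliver(signum)) ++runs;
    return runs;
}
```

In this model `handler_runs_after(3, kSigchld)` yields a single handler run: the second and third sends find the bit already set and are lost.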

Note:

This question is entirely about basic concepts. Part (1) only requires two points, and you should be able to give at least the first two. The other parts cover little material, so answer them as completely as possible.


7. Virtual memory + memory hierarchy

[3 images not reproduced]

A TLB caches translations to speed up address translation. On a TLB miss, the PDE is fetched, the physical address of the page table is computed from it, and then the PTE is found to complete the translation; on a hit, this walk is skipped. As the figure shows, the TLB is four-way set associative, so a lookup first selects the set by the index and then searches it by matching the tag. The TLB is indexed with bits of the VPN. Pages in this question are 4 KB, so the offset (VPO) occupies 12 bits and the upper 20 bits are used for the TLB lookup. These 20 bits are split as follows: the set index is the low 2 bits and the tag is the upper 18 bits.

On a miss, the PDE must be read from physical memory. The address of the PDE is the page directory base address 0x45d000 plus an offset taken from the first 10 bits of the virtual address. The offset must be shifted left by 2 bits because each PDE is 4 bytes; as with array addressing, the address of A[i] is A+4i. The same applies to the page table. The base address field of the level 2 page table obtained from the PDE must be shifted left by 12 bits, because a level 2 page table is 4 KB and the low 12 bits of its base address are 0.

In this question, the TLB directly caches the physical page number, but in the book, the TLB caches the PTE. Note this difference.
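The address split described above can be written out as bit operations. This is a minimal sketch for the layout in this question (32-bit virtual address, 4 KB pages, 4 TLB sets, 10+10-bit two-level page table); the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct VirtualAddressFields {
    std::uint32_t tlbt;  // upper 18 bits of the VPN: TLB tag
    std::uint32_t tlbi;  // lower 2 bits of the VPN: TLB set index (4 sets)
    std::uint32_t vpn1;  // first 10 bits: index into the page directory
    std::uint32_t vpn2;  // middle 10 bits: index into the level 2 page table
    std::uint32_t vpo;   // low 12 bits: offset inside the 4 KB page
};

VirtualAddressFields split_virtual_address(std::uint32_t va) {
    const std::uint32_t vpn = va >> 12;  // upper 20 bits
    return {vpn >> 2, vpn & 0x3u, va >> 22, vpn & 0x3ffu, va & 0xfffu};
}

// Address of the PDE: directory base plus index * 4, since each PDE is 4 bytes.
std::uint32_t pde_address(std::uint32_t page_directory_base, std::uint32_t va) {
    assert((page_directory_base & 0xfffu) == 0);  // base is page aligned
    return page_directory_base + (split_virtual_address(va).vpn1 << 2);
}
```

For the address 0x9fd28c10 this gives TLBT 0x27f4a, TLBI 0, and `pde_address(0x45d000, 0x9fd28c10)` evaluates to 0x45d9fc.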

(1) 0x9fd28c10

The first 20 bits are: 0x9fd28 = 100111111101001010 (TLBT) 00 (TLBI)

  • TLBI: 0
  • TLBT: 0x27f4a

The TLB has an entry with this tag, but its valid bit is 0, so this is a miss. The PDE must be read from memory:

PDEADDR = 0x45d000 + 0x9fc (the first 10 bits of the virtual address, shifted left by 2 bits) = 0x45d9fc

The PDE read from memory is:

0x0df2a237 = 0000 1101 1111 0010 1010 (the first 20 bits, the base address of the level 2 page table, 0x0df2a in hexadecimal) 0010 0011 0111 (the valid bit is 1)

PTEADDR = 0x0df2a000 (base address of the level 2 page table) + 0x4a0 (the middle 10 bits of the virtual address, shifted left by 2 bits) = 0xdf2a4a0

The PTE read from memory is:

0x324236 = 0000 0000 0011 0010 0100 (the first 20 bits, the PPN, 0x00324 in hexadecimal) 0010 0011 0110 (the valid bit is 0)

Since the valid bit is 0, the translation fails. The address of the PTE that caused the failure is 0xdf2a4a0.
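The walk performed after a TLB miss can be sketched as follows. The code assumes what the worked example shows: the valid bit is bit 0 of each entry and the upper 20 bits hold the base address or the PPN. Physical memory is modeled as a map from address to 4-byte word; `PhysicalMemory`, `WalkResult` and `walk_page_tables` are hypothetical names, not part of the exam.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy physical memory: address of a 4-byte word -> its contents.
using PhysicalMemory = std::map<std::uint32_t, std::uint32_t>;

struct WalkResult {
    bool ok;                // true if the translation succeeded
    std::uint32_t address;  // physical address if ok, otherwise the
                            // address of the entry whose valid bit is 0
};

WalkResult walk_page_tables(const PhysicalMemory& memory,
                            std::uint32_t page_directory_base,
                            std::uint32_t va) {
    assert((page_directory_base & 0xfffu) == 0);  // base is page aligned

    // Level 1: index the page directory with the first 10 bits, 4 bytes per PDE.
    const std::uint32_t pde_addr = page_directory_base + ((va >> 22) << 2);
    const std::uint32_t pde = memory.at(pde_addr);  // throws if the word is absent
    if ((pde & 1u) == 0) return {false, pde_addr};

    // Level 2: the upper 20 bits of the PDE give the page table base.
    const std::uint32_t pte_addr =
        (pde & 0xfffff000u) + (((va >> 12) & 0x3ffu) << 2);
    const std::uint32_t pte = memory.at(pte_addr);
    if ((pte & 1u) == 0) return {false, pte_addr};

    // The upper 20 bits of the PTE are the PPN; append the 12-bit offset.
    return {true, (pte & 0xfffff000u) | (va & 0xfffu)};
}
```

With a memory that holds 0x0df2a237 at 0x45d9fc and 0x00324236 at 0xdf2a4a0, walking 0x9fd28c10 from base 0x45d000 returns `ok == false` and the address 0xdf2a4a0, matching the result above.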

(2) 0x0a32fcd0

The first 20 bits are: 0x0a32f = 000010100011001011 (TLBT) 11 (TLBI)

  • TLBI: 3
  • TLBT: 0x028cb

The TLB has no entry matching this TLBT, so this is a miss. The PDE must be read from memory:

PDEADDR = 0x45d000 + 0x0a0 (the first 10 bits of the virtual address, shifted left by 2 bits) = 0x45d0a0

The PDE read from memory is:

0x000c3297 = 0000 0000 0000 1100 0011 (the first 20 bits, the base address of the level 2 page table, 0x000c3 in hexadecimal) 0010 1001 0111 (the valid bit is 1)

PTEADDR = 0x000c3000 (base address of the level 2 page table) + 0xcbc (the middle 10 bits of the virtual address, shifted left by 2 bits) = 0xc3cbc

The PTE read from memory is:

0x34abd237 = 0011 0100 1010 1011 1101 (the first 20 bits, the PPN, 0x34abd in hexadecimal) 0010 0011 0111 (the valid bit is 1)

The physical address is:

PPN + PPO = 0x34abd000 + 0xcd0 = 0x34abdcd0

(3) 0x0d4182c0

The first 20 bits are: 0x0d418 = 000011010100000110 (TLBT) 00 (TLBI)

  • TLBI: 0
  • TLBT: 0x03506

The TLB has an entry with this tag and its valid bit is 1, so this is a hit. The physical page number is 0x98f8a.

The physical address is:

PPN + PPO = 0x98f8a000 + 0x2c0 = 0x98f8a2c0

2023.5.8

Origin blog.csdn.net/Aaron503/article/details/131104909